Skip to article frontmatterSkip to article content
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

the data

Here is a simple dataset of three tables.

import pandas as pd

wifi_table = pd.DataFrame(
    {
        "SSID_ID": [1, 2, 2, 5, 3, 1],
        "BSSID_ID": [1, 2, 2, 5, 3, 1],
        "DBM": [-34, -36, -80, -60, -55, -48],
    },
    index=pd.date_range("3 june 2021", periods=6),
)

ssid_table = pd.DataFrame(
    {
        "ID": [1, 2, 3, 4, 5, 6],
        "SSID": ["box bill", "sam", "orange1884", "FreeWifi", "sfr43", "Eduroam"],
    }
)
bssid_table = pd.DataFrame(
    {
        "ID": [1, 2, 3, 4, 5, 6],
        "BSSID": [
            "10:10:10:10:20",
            "30:30:20:11:15",
            "18:34:26:45:12",
            "11:54:65:55:23",
            "45:43:22:43:54",
            "44:35:33:22:11",
        ],
    }
)
print(f"**wifi_table**\n{wifi_table}\n**ssid_table**\n{ssid_table}\n**bssid_table**\n{bssid_table}")
**wifi_table**
            SSID_ID  BSSID_ID  DBM
2021-06-03        1         1  -34
2021-06-04        2         2  -36
2021-06-05        2         2  -80
2021-06-06        5         5  -60
2021-06-07        3         3  -55
2021-06-08        1         1  -48
**ssid_table**
   ID        SSID
0   1    box bill
1   2         sam
2   3  orange1884
3   4    FreeWifi
4   5       sfr43
5   6     Eduroam
**bssid_table**
   ID           BSSID
0   1  10:10:10:10:20
1   2  30:30:20:11:15
2   3  18:34:26:45:12
3   4  11:54:65:55:23
4   5  45:43:22:43:54
5   6  44:35:33:22:11

merge

Let’s start with a simple merge of wifi_table with ssid_table (on SSID_ID and ID), and with bssid_table (on BSSID_ID and ID). You must keep the index of wifi_table.

# your code

another dataset

Here is a new dataset

df1 = pd.DataFrame({'name': ['Bob', 'Lisa', 'Sue'],
                    'pulse': [70, 63, 81]}, 
                   index=[123, 354, 165])
df2 = pd.DataFrame({'name': ['Eric', 'Bob', 'Marc'],
                    'weight': [60, 100, 70]},
                  index=[654, 123, 664])

merge

Let’s outer merge the tables df1 and df2 on name and preserving both indexes. Name the index of df1 ID_old and the index of df2 ID_new.

# your code

Now, consider that ID_new is the new person ID that must be kept, but that for some person, the new ID is missing. In that case, this missing ID must be replaced with the old person ID ID_old. Finally, set the new person ID as the index.

Hint: consider combine_first

# your code