Python: Pandas Merge With Examples


One of the most common operations that one has to perform while working with pandas is merging of two or more DataFrames. This can be achieved using the merge() function.

We can perform the merge on DataFrames based on either the column or index. Let's take a look at each of them with examples.


Example 1: Pandas: Merge Two DataFrames by Column Names

    import pandas as pd
    
    dataframe_open = pd.DataFrame({'TimeSeries': ["2023-01", "2023-02", "2023-03"], 'Open': [14, 22, 45]})
    dataframe_high = pd.DataFrame({'TimeSeries': ["2023-01", "2023-02", "2023-03"], 'High': [16, 22, 46]})
    
    # Merge based on column name TimeSeries
    merged_timeseries_df = pd.merge(dataframe_open, dataframe_high, on='TimeSeries')
    
    print(merged_timeseries_df)
    

    Output:

     TimeSeries  Open  High
    0    2023-01    14    16
    1    2023-02    22    22
    2    2023-03    45    46

Example 2: Pandas: Merge Multiple DataFrames by Column Names

    import pandas as pd
    
    dataframe_open = pd.DataFrame({'TimeSeries': ["2023-01", "2023-02", "2023-03"], 'Open': [14, 22, 45]})
    dataframe_high = pd.DataFrame({'TimeSeries': ["2023-01", "2023-02", "2023-03"], 'High': [16, 22, 46]})
    data_frame_low = pd.DataFrame({'TimeSeries': ["2023-01", "2023-02", "2023-03"], 'Low': [12, 20, 41]})
    data_frame_close = pd.DataFrame({'TimeSeries': ["2023-01", "2023-02", "2023-03"], 'Close': [15, 21, 42]})
    
    # Merge multiple df based on column name TimeSeries
    merged_timeseries_df = pd.merge(dataframe_open, dataframe_high, on='TimeSeries')
    merged_timeseries_df = pd.merge(merged_timeseries_df, data_frame_low, on='TimeSeries')
    merged_timeseries_df = pd.merge(merged_timeseries_df, data_frame_close, on='TimeSeries')
    
    print(merged_timeseries_df)
    

    Output:

     TimeSeries  Open  High  Low  Close
    0    2023-01    14    16   12     15
    1    2023-02    22    22   20     21
    2    2023-03    45    46   41     42


Example 3: Pandas: Merge DataFrames based on Index

    import pandas as pd
    
    dataframe_open = pd.DataFrame({'Open': [14, 22, 45]}, index=["2023-01", "2023-02", "2023-03"])
    dataframe_high = pd.DataFrame({'High': [16, 22, 46]}, index=["2023-01", "2023-02", "2023-03"])
    dataframe_low = pd.DataFrame({'Low': [12, 20, 41]}, index=["2023-01", "2023-02", "2023-03"])
    dataframe_close = pd.DataFrame({'Close': [15, 21, 42]}, index=["2023-01", "2023-02", "2023-03"])
    
    # Merge the DataFrames based on index
    merged_df = pd.merge(dataframe_open, dataframe_high, left_index=True, right_index=True)
    merged_df = pd.merge(merged_df, dataframe_low, left_index=True, right_index=True)
    merged_df = pd.merge(merged_df, dataframe_close, left_index=True, right_index=True)
    
    print(merged_df)
    

    Output:

            Open  High  Low  Close
    2023-01    14    16   12     15
    2023-02    22    22   20     21
    2023-03    45    46   41     42

Facing issues? Have Questions? Post them here! I am happy to answer!

Author Info:

Rakesh (He/Him) has over 14+ years of experience in Web and Application development. He is the author of insightful How-To articles for Code2care.

Follow him on: X

You can also reach out to him via e-mail: rakesh@code2care.org

Copyright ยฉ Code2care 2024 | Privacy Policy | About Us | Contact Us | Sitemap