load_data_timeline_cancer_presence

Load and return the MSK-IMPACT cancer presence timeline dataset (deidentified).

Returns:

| Name | Type | Description |
| --- | --- | --- |
| data | Bunch | Dictionary-like object with the following attributes. |

  • data : pandas DataFrame. The data matrix.
  • description_columns (Future release) : list. The names of the dataset columns.
  • description_dataset (Future release) : str. The full description of the dataset.
  • filename (Future release) : str. The path to the location of the data.
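
As a rough illustration of what "dictionary-like object with attributes" can mean in practice, the sketch below uses a hypothetical stand-in `Bunch` class, not the one shipped with `msk_cdm`. Whether the real object supports attribute access in addition to key access is an assumption, and the placeholder rows stand in for the pandas DataFrame that the real `data` field holds.

```python
class Bunch(dict):
    """Hypothetical stand-in: a dict whose keys can also be read as attributes."""

    def __getattr__(self, name):
        try:
            return self[name]
        except KeyError as exc:
            raise AttributeError(name) from exc


# Placeholder rows only; the real 'data' value is a pandas DataFrame.
bunch = Bunch(data=[{"placeholder_column": 1}])

# Key access and attribute access return the same underlying object.
by_key = bunch["data"]
by_attr = bunch.data
same_object = by_key is by_attr

# Fields marked "Future release" may not be populated yet, so read them
# defensively with .get() instead of indexing directly.
description = bunch.get("description_dataset")
```

Reading the optional fields with `.get()` avoids a `KeyError` while they are still unavailable.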

Examples

```python
from msk_cdm.datasets import connect_to_db
from msk_cdm.datasets.impact import load_data_timeline_cancer_presence

# Connect to the database
auth_file = 'path/to/config.txt'
connect_to_db(auth_file=auth_file)

# Load the dataset
df_timeline_cancer_presence = load_data_timeline_cancer_presence()

# Access the data
df_cancer_presence = df_timeline_cancer_presence['data']

# Display the first few rows of the data
print(df_cancer_presence.head())
```

Source code in msk_cdm/datasets/impact/datasets_impact.py
def load_data_timeline_cancer_presence() -> Bunch:
    """Load and return the MSK-IMPACT cancer presence timeline dataset (deidentified).

    Returns:
        data: Dictionary-like object, with the following attributes.

            - **data** : pandas DataFrame
                The data matrix.
            - **description_columns** (Future release) : list
                The names of the dataset columns.
            - **description_dataset** (Future release) : str
                The full description of the dataset.
            - **filename** (Future release) : str
                The path to the location of the data.

    Examples
    --------
    ```python
    from msk_cdm.datasets import connect_to_db
    from msk_cdm.datasets.impact import load_data_timeline_cancer_presence

    # Connect to the database
    auth_file = 'path/to/config.txt'
    connect_to_db(auth_file=auth_file)

    # Load the dataset
    df_timeline_cancer_presence = load_data_timeline_cancer_presence()

    # Access the data
    df_cancer_presence = df_timeline_cancer_presence['data']

    # Display the first few rows of the data
    print(df_cancer_presence.head())
    ```
    """
    df = _loader._load_impact_data_timeline_cancer_presence()
    output = Bunch(data=df)
    return output