Load and return the MSK-IMPACT clinical patient dataset (deidentified).
Returns:
Name | Type | Description
data | Bunch | Dictionary-like object, with the following attributes.
- data : pandas DataFrame. The data matrix.
- description_columns : list. The names of the dataset columns. (Future release)
- description_dataset : str. The full description of the dataset. (Future release)
- filename : str. The path to the location of the data. (Future release)
Examples
from msk_cdm.datasets import connect_to_db
from msk_cdm.datasets.impact import load_data_clinical_patient
# Connect to the database
auth_file = 'path/to/config.txt'
connect_to_db(auth_file=auth_file)
# Load the dataset
df_clinical_patient = load_data_clinical_patient()
# Access the data
df_clin_p = df_clinical_patient['data']
# Display the first few rows of the data
print(df_clin_p.head())
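The return value is described as dictionary-like, and three of its attributes are marked "(Future release)", so code that reads them may want to tolerate their absence. The sketch below illustrates that access pattern under stated assumptions. The `Bunch` class here is a hypothetical stand-in (a `dict` subclass with attribute access, similar in spirit to scikit-learn's `Bunch`), not the class used by `msk_cdm`. The rows and column names are made up, and a plain list of dicts takes the place of the pandas DataFrame so the snippet runs without a database connection.

```python
class Bunch(dict):
    """Hypothetical stand-in: a dict whose keys are also readable as attributes."""

    def __getattr__(self, name):
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None


# Made-up rows standing in for the DataFrame that the real loader returns.
rows = [
    {"example_id": "A", "example_value": 1},
    {"example_id": "B", "example_value": 2},
]

# Only `data` is set here, mirroring a result whose other attributes
# are not populated yet.
result = Bunch(data=rows)

# Key access and attribute access reach the same object.
data_by_key = result["data"]
data_by_attr = result.data

# Read optional attributes with .get() so a missing key does not raise.
description = result.get("description_dataset")
filename = result.get("filename", "<not provided>")

print(len(data_by_key))
print(description)
print(filename)
```

If the real `Bunch` is not a `dict` subclass, `getattr(result, "filename", None)` is an alternative way to read an attribute that may be missing.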
Source code in msk_cdm/datasets/impact/datasets_impact.py
def load_data_clinical_patient() -> Bunch:
    """Load and return the MSK-IMPACT clinical patient dataset (deidentified).

    Returns:
        data : Dictionary-like object, with the following attributes.
            - **data** : pandas DataFrame
                The data matrix.
            - **description_columns** : list
                The names of the dataset columns. (Future release)
            - **description_dataset** : str
                The full description of the dataset. (Future release)
            - **filename** : str
                The path to the location of the data. (Future release)

    Examples
    --------
    ```python
    from msk_cdm.datasets import connect_to_db
    from msk_cdm.datasets.impact import load_data_clinical_patient

    # Connect to the database
    auth_file = 'path/to/config.txt'
    connect_to_db(auth_file=auth_file)

    # Load the dataset
    df_clinical_patient = load_data_clinical_patient()

    # Access the data
    df_clin_p = df_clinical_patient['data']

    # Display the first few rows of the data
    print(df_clin_p.head())
    ```
    """
    df = _loader._load_impact_data_clinical_patient()
    output = Bunch(data=df)
    return output