Downloading data and datasets
Downloading from the UI
In uSmart you can download a full dataset in any of the formats the publisher has made available. Once you have found the dataset of interest, open it by selecting its datacard.
Once in the dataset you will see the file options for download. If you only want to consume a subset of the data, or want to consume it in other ways, follow the guidance here.
Downloading data from Real-time Datasets
Real-time datasets are not normally stored as flat files, but files can be requested.
Datasets <10,000,000 rows
For most datasets you can filter the data from within the Data Explorer card, as described here, then request a CSV or JSON of the results.
With larger volumes, compiling the data can take a moment. For larger datasets, use the Bulk File Download method.
Bulk File Download
This method is only available for real-time datasets and must be enabled before it can be used. We recommend enabling it for datasets with more than 10 million rows or more than 1GB of data, where users are expected to download the whole dataset periodically.
Step 1
Navigate to the real-time dataset you wish to enable bulk file download for and select the Real-time Data Access card.
Step 2
Select Bulk File Create to enable bulk file download.
Depending on the size of the data, the system will take a few moments to process the request. During this time the data is being compiled into a file ready for download by multiple users.
Step 3
Once processing is complete, any user with access to the dataset can navigate to the Real-time Data Access card to download the file.
As data is added to the dataset, the file is rebuilt once per day, overnight. Users who periodically analyse the full dataset therefore have it available without the delays associated with compiling very large datasets.
Downloading data from the DCAT
When using a public DCAT, all links it contains are open. Data links will trigger a file download, whether opened from a browser or requested via code. API links will return JSON; for these, review the guide for Querying API connections.
When using a private DCAT, all links associated with private datasets will be revealed. Anyone or any service wanting to access links in private datasets will require an API key attached to the dataset.
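As an illustration of working with DCAT links via code, the following minimal Python sketch collects every distribution entry from a parsed DCAT document. It walks the JSON recursively, so it does not depend on a particular layout. The sample document, including its "dataset" and "distribution" keys, is a hypothetical, trimmed example and may not match the exact shape of a uSmart DCAT.

```python
import json


def collect_distributions(node):
    """Recursively gather every object whose "@type" is "dcat:Distribution"."""
    found = []
    if isinstance(node, dict):
        if node.get("@type") == "dcat:Distribution":
            found.append(node)
        for value in node.values():
            found.extend(collect_distributions(value))
    elif isinstance(node, list):
        for item in node:
            found.extend(collect_distributions(item))
    return found


# Hypothetical, trimmed DCAT document. The "dataset"/"distribution" nesting
# is an assumption made for this sketch.
sample_dcat = json.loads("""
{
  "dataset": [
    {
      "title": "Example dataset",
      "distribution": [
        {
          "@type": "dcat:Distribution",
          "accessURL": "https://api.usmart.io/org/28ccd497-7cad-4470-bd17-721d5cbbd6ef/08ff206b-d424-44c6-a712-ee029af2e689/1/urql",
          "mediaType": "application/json",
          "title": "API"
        },
        {
          "@type": "dcat:Distribution",
          "accessURL": "https://data-staging.usmart.io/org/28ccd497-7cad-4470-bd17-721d5cbbd6ef/resource?resourceGUID=df0fb1eb-e285-4606-84ef-fe8f82871282",
          "mediaType": "text/csv",
          "title": "overall_summary.csv"
        }
      ]
    }
  ]
}
""")

distributions = collect_distributions(sample_dcat)
for distribution in distributions:
    print(distribution["title"], distribution["accessURL"])
```

Each returned entry keeps its accessURL, mediaType, and title, which is the information needed to decide how to request the link.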
API Downloads
API links can be identified where the subdomain is api and the title is API, for example:
{
"@type": "dcat:Distribution",
"accessURL": "https://api.usmart.io/org/28ccd497-7cad-4470-bd17-721d5cbbd6ef/08ff206b-d424-44c6-a712-ee029af2e689/1/urql",
"mediaType": "application/json",
"title": "API"
},
A user can access any API link using the same method as for querying API connections, but they must have an API key attached to that dataset. The query must set the api-key-id and api-key-secret headers to the key's values.
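A minimal Python sketch of such a request, using only the standard library, might look like the following. The header names api-key-id and api-key-secret come from this guide; the function names and the placeholder key values are hypothetical, and the request itself is left commented out because it needs a real key.

```python
import json
import urllib.request


def build_api_request(access_url, api_key_id, api_key_secret):
    """Build a GET request that carries the API key in the request headers."""
    return urllib.request.Request(
        access_url,
        headers={
            "api-key-id": api_key_id,
            "api-key-secret": api_key_secret,
        },
    )


def fetch_json(request, timeout=30):
    """Send the request and decode the JSON body of the response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# The accessURL is taken from the DCAT example; the key values are placeholders.
api_url = (
    "https://api.usmart.io/org/28ccd497-7cad-4470-bd17-721d5cbbd6ef/"
    "08ff206b-d424-44c6-a712-ee029af2e689/1/urql"
)
request = build_api_request(api_url, "YOUR-API-KEY-ID", "YOUR-API-KEY-SECRET")

# Uncomment once a real API key is attached to the dataset:
# rows = fetch_json(request)
```

Keeping the request construction separate from the network call makes it easy to check the headers before anything is sent.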
File downloads
File downloads can be identified where the subdomain begins with data and the title is a file name, for example:
{
"@type": "dcat:Distribution",
"accessURL": "https://data-staging.usmart.io/org/28ccd497-7cad-4470-bd17-721d5cbbd6ef/resource?resourceGUID=df0fb1eb-e285-4606-84ef-fe8f82871282",
"mediaType": "text/csv",
"title": "overall_summary.csv"
},
{
"@type": "dcat:Distribution",
"accessURL": "https://data-staging.usmart.io/org/28ccd497-7cad-4470-bd17-721d5cbbd6ef/resource?resourceGUID=dde36a7f-d237-48ed-8257-c10fb78e7f4f",
"mediaType": "application/json",
"title": "Test Private DCAT.json"
},
{
"@type": "dcat:Distribution",
"accessURL": "https://data-staging.usmart.io/org/28ccd497-7cad-4470-bd17-721d5cbbd6ef/resource?resourceGUID=3b14fda8-5af7-4b62-8fea-31567d3b4151",
"mediaType": "text/csv",
"title": "Test Private DCAT.csv"
},
{
"@type": "dcat:Distribution",
"accessURL": "https://data-staging.usmart.io/org/28ccd497-7cad-4470-bd17-721d5cbbd6ef/resource?resourceGUID=f3d24f65-bc5d-49b2-ac69-e8f5ced36e89",
"mediaType": "application/xml",
"title": "Test Private DCAT.xml"
},
{
"@type": "dcat:Distribution",
"accessURL": "https://data-staging.usmart.io/org/28ccd497-7cad-4470-bd17-721d5cbbd6ef/additionalDocumentation/72474891-427e-4d31-bef4-a40267f19d19/Food_Menu.pdf",
"mediaType": "application/pdf",
"title": "Food_Menu.pdf"
}
There are three types of file download: UI File Downloads, Raw File Downloads, and Additional Documentation File Downloads. The following image is used as an example.
UI File Downloads
These can be identified in the DCAT where the subdomain is data, the media type is “text/csv”, “application/xml”, or “application/json”, the path after the organisationGUID begins with “fileDownload”, and the file name is the name of the dataset. They can be downloaded with an API key connected to the dataset: add the api-key-id and api-key-secret as headers in the HTTPS request.
Raw File Downloads
These can be identified in the DCAT where the subdomain is data, the media type depends on the file type (“text/csv” in the example above), the path after the organisationGUID begins with “fileDownload”, and the file name matches the name of the file uploaded in the UI. They can be downloaded with an API key connected to the dataset: add the api-key-id and api-key-secret as headers in the HTTPS request.
Additional Documentation
These can be identified in the DCAT where the subdomain is data, the media type depends on the file type (“application/pdf” in the example above), and the path after the organisationGUID begins with “additionalDocumentation/”. They can be downloaded with an API key connected to the dataset: add the api-key-id and api-key-secret as headers in the HTTPS request.
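The identification rules can be sketched in Python as follows. This is a rough illustration, not a definitive implementation: it only separates API links, additional documentation, and other file downloads, and it does not try to tell UI files from raw files, since that depends on comparing the file name with the dataset name. The function names and the category labels are hypothetical.

```python
import shutil
import urllib.request
from urllib.parse import urlparse


def classify_distribution(distribution):
    """Return a rough link type based on the subdomain, title, and path."""
    url = urlparse(distribution["accessURL"])
    subdomain = (url.hostname or "").split(".")[0]
    if subdomain == "api" and distribution.get("title") == "API":
        return "api"
    if subdomain.startswith("data"):
        if "/additionalDocumentation/" in url.path:
            return "additional_documentation"
        return "file"
    return "unknown"


def download_file(distribution, destination, api_key_id, api_key_secret, timeout=60):
    """Stream a file link to disk, sending the API key in the request headers."""
    request = urllib.request.Request(
        distribution["accessURL"],
        headers={"api-key-id": api_key_id, "api-key-secret": api_key_secret},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        with open(destination, "wb") as out_file:
            shutil.copyfileobj(response, out_file)


# Two entries copied from the DCAT example.
csv_entry = {
    "@type": "dcat:Distribution",
    "accessURL": "https://data-staging.usmart.io/org/28ccd497-7cad-4470-bd17-721d5cbbd6ef/resource?resourceGUID=df0fb1eb-e285-4606-84ef-fe8f82871282",
    "mediaType": "text/csv",
    "title": "overall_summary.csv",
}
pdf_entry = {
    "@type": "dcat:Distribution",
    "accessURL": "https://data-staging.usmart.io/org/28ccd497-7cad-4470-bd17-721d5cbbd6ef/additionalDocumentation/72474891-427e-4d31-bef4-a40267f19d19/Food_Menu.pdf",
    "mediaType": "application/pdf",
    "title": "Food_Menu.pdf",
}

print(classify_distribution(csv_entry))
print(classify_distribution(pdf_entry))

# With a real API key attached to the dataset, a file could be saved like this:
# download_file(csv_entry, csv_entry["title"], "YOUR-API-KEY-ID", "YOUR-API-KEY-SECRET")
```

Streaming the response with shutil.copyfileobj avoids holding a large file in memory, which matters for bulk downloads.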