Setup
Join our Slack community or Get in touch with the dltHub Customer Success team.
To connect to your SQL database using dlt, follow these steps:
-
Initialize a
dltproject in the current working directory by running the following command:dlt init sql_database duckdbThis will add necessary files and configurations for a
dltpipeline with SQL database as the source and DuckDB as the destination.
If you'd like to use a different destination, simply replace duckdb with the name of your preferred destination.
-
Add credentials for your SQL database
To connect to your SQL database,
dltwould need to authenticate using necessary credentials. To enable this, paste your credentials in thesecrets.tomlfile created inside the.dlt/folder in the following format:[sources.sql_database.credentials]
drivername = "mysql+pymysql" # driver name for the database
database = "Rfam" # database name
username = "rfamro" # username associated with the database
host = "mysql-rfam-public.ebi.ac.uk" # host address
port = "4497" # port required for connectionAlternatively, you can also authenticate using connection strings:
[sources.sql_database]
credentials="mysql+pymysql://rfamro@mysql-rfam-public.ebi.ac.uk:4497/Rfam"To learn more about how to add credentials into your
sql_databasepipeline, see here. -
Add credentials for your destination (if necessary)
Depending on which destination you're loading into, you might also need to add your destination credentials. For more information, read the General Usage: Credentials.
-
Install any necessary dependencies
pip install -r requirements.txtnoteTo load data more efficiently using pyarrow, you'll also need to install
pyarrow,numpy, andpandas.pip install pyarrow numpy pandas -
Run the pipeline
python sql_database_pipeline.pyExecuting this command will run the example script
sql_database_pipeline.pycreated in step 1. In order for this to run successfully, you will need to pass the names of the databases and/or tables you wish to load. See the section on configuring the sql_database source for more details. -
Make sure everything is loaded as expected with
dlt pipeline <pipeline_name> shownoteThe pipeline_name for the above example is
rfam, you may also use any custom name instead.