Posts

Showing posts from January, 2018

Complete SQLite IMDb Database with Python

Create IMDb Database with Python Our goal now is to automate the process of downloading the IMDb data and creating an optimized SQLite database with it. Below are the SQLite statements to create the tables, additional columns, and indices: create_Title = ("CREATE TABLE IF NOT EXISTS Title (" "tconst TEXT PRIMARY KEY, " "title_type TEXT, " "primary_title TEXT, " "original_title TEXT, " "is_adult INTEGER, " "start_year INTEGER, " "end_year INTEGER, " "runtime_minutes INTEGER, " "genres TEXT);") create_Name = ("CREATE TABLE IF NOT EXISTS Name (" "nconst TEXT PRIMARY KEY, " "primary_name TEXT, " "birth_year INTEGER, " "death_year TEXT, " "primary_profession TEXT, " "known_for_titles TEXT);") create_Crew = ("CREATE TABLE IF NOT EXISTS Crew (" "tconst TEXT PRIMARY KEY, "

Create IMDb Database with Python

Create IMDb Database with Python Making an IMDb Database with Python We will use the sqlite3 library in Python to create a database which holds the available data from the IMDb datasets. The IMDb datasets are available here . The descriptions are available here . All of these files are TSVs (tab separated value) as opposed to CSVs. Also, I will omit the file "title.akas.tsv.gz" because I find that its contents are superfluous for this project. Its columns are in the table below. Note that titleId and ordering form a composite key. Name titleId ordering title region language types attributes isOriginalTitle Description IMDb identifier for each title. (tt#######) (TEXT) Distinguishes a different translation of the title. (TEXT) Title of the translation. (TEXT) Country code. (ISO 3166-1 alpha-2) (TEXT) TBD (TEXT) TBD (TEXT)

NULL vs None in sqlite3 for Python

Null vs None in sqlite3, Python The sqlite3 library in Python accepts Python data types for data insertion. This table in the docs shows the type conversion for data when transferred between SQLite and Python. NULL vs None SQLite automatically accepts NULL (and null ) since it is a reserved keyword in. It cannot, however, be accepted as input via parameter substitution. We build a database and explain the distinctions in the following gist: dbpath = "/.../test.db" >>> import sqlite3 >>> connection = sqlite3.connect(dbpath) >>> cursor = connection.cursor() >>> cursor.execute("CREATE TABLE Test (testcolumn TEXT);") >>> cursor.execute("INSERT INTO Test VALUES(NULL);") >>> #We now have exactly one entry in our table. Verify this with SELECT: >>> cursor.execute("SELECT * FROM Test WHERE testcolumn IS NULL;") >>> for item in cursor: print(item) (Non