kuljis.xyz

Importing ASIC Business Names Dataset into SQLite

Jan 14, 2025

376 words

Get the dataset

ASIC Business Names Dataset

Setup the SQLite database

Create a SQLite database and start the command line shell.

sqlite3 data/db.sqlite3

Import the CSV file into a new table.

Now that we are in the shell, first we need to enable tab mode. Then we can create the table using the .import command. It will take a while to import the file, so wait a little while and then verify the table creation by viewing the schema.

.mode tabs
.import data/BUSINESS_NAMES_202501.csv business_names

Verfify table creation and population

Verify the table creation by viewing the created schema.

.schema business_names

Check results

select count(*) from business_names;

Add Full Text Search

SQLite FTS5 Extension

Create virtual table with fts5.

CREATE VIRTUAL TABLE business_search USING fts5(name, abn, state);

Populate the virtual table using insert select.

INSERT INTO
     business_search
     SELECT
          trim(BN_NAME), trim(BN_ABN), trim(BN_STATE_OF_REG)
     FROM
          business_names;

Notes

Line Mode

.mode line

select trim(BN_NAME) as 'Business', BN_ABN as 'ABN' from business_names limit 10;

Outputs

Business =
     ABN = 87387704324

Business = SILENT SCISSORZ
     ABN = 76643277300

Business = LITTLE MIRACLES PRESCHOOL & LONG DAY CARE (POINT CLARE, SWANSEA, MT RIVERVIEW & WAMBERAL)
     ABN = 23979823212

Business = A Cut Above Painting & Texture Coating
     ABN = 86634681397

Business = HOMSAFE
     ABN = 56098948915

Business = COASTAL  EARTH WORKS
     ABN = 88573118334

Business = Easy Settlements
     ABN = 47613497233

Business = Nourishing Your World
     ABN =

I find .mode line easy to read for a few reasons, for example it mitigate line wrapping and every entry is delimited by a new line.

From this example there are a few oddites:


Example Queries

Query business_names table

select count(*) from business_names;
select distinct BN_STATUS from business_names;
select distinct BN_STATE_OF_REG from business_names;
select trim(BN_NAME) as 'Business', BN_ABN as 'ABN' from business_names limit 10;

Query business virtual table with full text search.

select * from businesses WHERE business_name MATCH 'software';