Importing ASIC Business Names Dataset into SQLite

In this article I am going to cover how I imported the ASIC business names dataset into SQLite. Additonally we'll implement full text search using the sqlite FTS5 extension.

  1. Download the dataset.
  2. Import the datset into sqlite.
  3. Run some basic queries
  4. Conclude with implementing FTS5 full text search.

--

The first thing to do is to download the latest zipped dataset.

Then we'll use the special sqlite command to

Setup the SQLite database

Create a SQLite database and start the command line shell.

sqlite3 data/db.sqlite3
.mode tabs
.import data/BUSINESS_NAMES_202501.csv business_names

Verfify table creation and population

Verify the table creation by viewing the created schema.

.schema business_names

Check results

select count(*) from business_names;

SQLite FTS5 Extension

Create virtual table with fts5.

CREATE VIRTUAL TABLE business_search USING fts5(name, abn, state);

Populate the virtual table using insert select.

INSERT INTO
     business_search
     SELECT
          trim(BN_NAME), trim(BN_ABN), trim(BN_STATE_OF_REG)
     FROM
          business_names;

Notes

Line Mode

.mode line

select trim(BN_NAME) as 'Business', BN_ABN as 'ABN' from business_names limit 10;

Outputs

Business =
     ABN = 87387704324

Business = SILENT SCISSORZ
     ABN = 76643277300

Business = LITTLE MIRACLES PRESCHOOL & LONG DAY CARE (POINT CLARE, SWANSEA, MT RIVERVIEW & WAMBERAL)
     ABN = 23979823212

Business = A Cut Above Painting & Texture Coating
     ABN = 86634681397

Business = HOMSAFE
     ABN = 56098948915

Business = COASTAL  EARTH WORKS
     ABN = 88573118334

Business = Easy Settlements
     ABN = 47613497233

Business = Nourishing Your World
     ABN =

I find .mode line easy to read for a few reasons, for example it mitigate line wrapping and every entry is delimited by a new line.

From this example there are a few oddites:

  • Emtpy Business Names
  • Empty Australian Business Numbers

  • use trim() to strip leading and trailing whitespace. This is especially useful for Business Names.

Example Queries

Query business_names table

select count(*) from business_names;
select distinct BN_STATUS from business_names;
select distinct BN_STATE_OF_REG from business_names;
select trim(BN_NAME) as 'Business', BN_ABN as 'ABN' from business_names limit 10;

Query business virtual table with full text search.

select * from businesses WHERE business_name MATCH 'software';

Read More