SQL Index |¦| Indexes in SQL |¦| Database Index

SQL Index |¦| Indexes in SQL |¦| Database Index

Today we are going to run SQL queries against
a table containing ten THOUSAND records. {{ Maniacal laughter }} {{ Phone call }} What is it, I’m in the middle of a video… You don’t say? ALL in RAM?… Well, alrighty then… Today we are going to run SQL queries against
a table containing one .. Hundred .. MILLION records. {{ Maniacal laughter }} But don’t worry. By using indexes, we can rapidly speed up
queries so you do not have to experience the phenomenon known as boredom. We will work with a single table called ‘person’
containing 100 MILLION randomly generated people. The first row is an auto-generated primary
key called “person_id” The other columns are first_name… last_name…
and birthday. To create this table, we randomly generated
names using the 1000 most popular female names, male names, and last names in the United States. We did not weight the names by frequency when
generating our random sample. The datasets and the Python code used to generate
the random names are available for download from Github. Today, we are going to pay close attention
to how long each query takes to execute. So on the right, we will maintain a list of
queries and how many milliseconds each query took to complete. These values will vary depending on the specifications
of your workstation. We will begin by counting the number of rows
in the “person” table… There are indeed 100 MILLION records. Notice how long it took to perform this simple
count. 3 seconds and 562 milliseconds. Or 3,562 milliseconds. Let us put this query and the time required
in our table. As another speed test, count the number of
people with the last name “Smith.” Execute Over 100,000 Smiths… And the query took 4,261 milliseconds. For a second test, count how many people are
named “Emma.” Execute… Nearly 50,000. And note the time: 4,066 milliseconds. What if we count the number of people born
in May, 1980? For this we will use the BETWEEN operator. Execute… 166,000… And the query took 4,480 milliseconds. A pattern is starting to emerge. Let’s quickly gather a few more speed samples. How many Michaels are in our database? 4,127 milliseconds How many people have the last name
‘Hawkins’ or ‘Snow’? 5,315 milliseconds And how many people were born on April 1, 1995? 4,110 milliseconds. We ran 7 queries, and the execution times
were all very similar. This is because in order to answer each query,
the database had to scan all 100 MILLION rows to check each record for a match. Our 7 scans of the entire table took a combined
total of 29,921 milliseconds. This means on average my workstation was able
to scan 23,394,940 rows per second. While this may seem fast, we can do better… MUCH better… And we will do this by building an INDEX. The name ‘index’ was chosen because of
how an index works in a book. For example, if you want to learn about “normal
distributions”, you would not flip through your entire math book page by page, scanning
for the term “normal distribution.” Instead, you would go to the index, find the
term “normal distribution”, and you would see the list of pages. This technique is used by databases. When you create an index, the database will
generate a method to rapidly find data based on one or more columns. As our first example, let us create an index
on the “person” table. To do this, write “CREATE INDEX” and then
give your index a name. There are many different naming conventions
that you can use. The important thing is to be consistent. We will use the table name, followed by the
field name, then add “idx” to indicate this is an index. Next, write “ON” followed by the table
name. Inside parentheses, write the columns to include
in the index. We will start simply and create an index for
the “first_name” column. Execute. That certainly took a while. Remember, the database had to scan 100 MILLION
rows and build a “first name” index from scratch. Let’s test our new index by repeating an
earlier query where we counted the number of people named “Emma.” Execute. This time, it took only 508 milliseconds. If we repeat the count of people with last
name “Smith”, it takes 4,415 milliseconds. What’s going on here? This is about as long as it took last time. It did not speed up because our index was
built for the first name column. The last name was not indexed. And as you might suspect, if we search for
people born on April 1, 1995, the query should not run any faster than last time. Execute. 4,011 milliseconds. No real change. Our index improved the speed of searches by
first name, but had no effect on queries by last name or birthday. However, there are other queries where the
“first name” index will improve performance. For example, count the number of people named
“Julie Andrews.” Execute. This ran in just 514 milliseconds. By contrast, if you count people with last
name ‘Andrews’ and born on June 12, 1992… It takes 4,274 milliseconds. The full name query benefited from the first
name index, because the database was able to use it to quickly find all the “Julies.” From this smaller set of records, it then
completed the job by finding people with last name ‘Andrews’. The second query, however, had no index to
help. It had to perform a full scan of the table
to find all people with last name “Andrews”, and then find those with a matching birthday. Fortunately databases are clever about how
to use indexes. For example, if you count the number of people
born on October 31, 1985 with a first name of “Mia”… It executes quickly: 512 milliseconds… But why? Wouldn’t the database first scan all rows
to find people with a matching birthday, and then filter by first name? No! Just because we wrote it in this order does
not mean the database will do its work in that order. The database will consider all possible ways
to execute your query, then chooses the optimal path. We call each possibility a “query plan”,
and the “query optimizer” picks what it thinks is best. You’ll be happy to know a database can have
more than one index. Better still, each index can be created using
more than one column. Let us create a multi-column index. Write “CREATE INDEX”, then an overly long
but descriptive name. Next, write “ON” then the table name. In parentheses, list the columns you want
to go into this index. Important Note: The order matters. Think of this as sorting the data first by
last_name, then by first_name Execute… It is now time to take this new index for a test run. How many people named “John Williams”
are in the database? Execute… There are 36 people. And it only took 27 milliseconds to run this query. What about Julie Andrews? 26 milliseconds. Well done, index.. Well done… A word of caution. Indexes are not free. Just as indexes in a book require paper, indexes
in a database require storage. Furthermore, when you add data to a database,
it needs to create new records AND update all relevant indexes. So go forth and use indexes, but use them wisely. And before you go, please help us achieve
our goal of being the first channel with
One Hundred BILLION subscribers! {maniacal laughter
hahhahaa please subscribe, really}

Daniel Ostrander

Related Posts

100 thoughts on “SQL Index |¦| Indexes in SQL |¦| Database Index

  1. Abdul Dadashev says:

    Halo from Russia Dagestan

  2. Ahmad Al-Mutawa says:

    I can't get enough of these wonderful videos, and the drama is just perfect! Please make a one-hour-video just like this one

  3. Tony McDonald says:

    Thank you for this video. Would it be possible to have some videos on document databases such as MongoDB please? 🙂

  4. Vipin .R says:

    she is amazing…..

  5. StonedAurelius says:

    this girl got talents… i really enjoy her videos and acting

  6. Priscilla Amorim says:

    Great video! How can I check if a database has index before running a query?

  7. Mohammad Ullah says:

    Can you make the tutorial database specific like postgreasql database.

  8. John Hill Escobar says:

    Love the comedic way you use to teach. Please, do not stop doing it!!

  9. Surajit Sarkar says:

    Best video on database index

  10. Gabriel Moab says:

    Coloquem legenda em Português, seria ótimo .

  11. Martin Wirz says:

    Great and a million of thnks, inexed in some milliseconds!

  12. Mylovelyform says:

    where is the git link to get the python code

  13. Parth Jani says:

    Happy to see you

  14. Alpha Delta says:

    To an AI, 7m 41s is Infinity

  15. Maciej K says:

    I dont know how you do it, but i watch this for entertainment.

  16. Dino Hunter says:

    Nice topic, nice female.

  17. charles stanton says:

    This was a most excellent series. Time well spent. Thank you so much for posting.

  18. Robert Larson says:

    I thoroughly enjoyed the 596,000 ms spent watching this video.

  19. Morshedul Islam says:

    wow! This video worth a lot.
    B4 watching this i never knew index is this essential for faster performance. Well explained. Keep uploading such Videos. Animation were superb to understand quickly. God bless u.

  20. Shanur Rahman says:

    more videos please

  21. Ayejax says:

    @822 if the order matters where you put the on person (last name, first name) how come the order was backward with the create index person firstname_lastname_idk ??

  22. Ayejax says:

    When you add data to a database do you have to create new records and update it manually or does it just need time to automatically update itself?

  23. Sidney LC says:

    Good explanation, I would like you can make an in-depth video about indexes, showing advanced queries and tips. Also which editor or program did you use to run the querys? Seems familiar to me. Thanks

  24. James Lucas says:

    10k records that so funny. Small dataset. Now a billion rows; That's fun.

  25. Aditya Pandey says:

    want django tutorial

  26. MrBlaq says:

    She's creepy, but very informative.

  27. woolfel says:

    You should show query planner to illustrate how RDBMS determines which query plan to use. I'm guess that is already on your list of things to cover. After that, you should cover lock escalation under concurrent load where you have a mix of reads/writes on the same table. Then for the advanced SQL indexing concepts, compare b-tree versus bitmap indexes. If you get to report queries materialized views versus dynamic views for complex multi-dimensional queries 🙂


    Now i know who was the mysterious girl in Python videos of Socratica

  29. Karlos Anderson says:

    Really nice explaughnation!!

  30. 拿科学吓唬我 says:


  31. Investor Jess says:

    Holy shit she’s crazy!! But I like it!!!

  32. JK FRT says:

    Nice video. Looking forward to seeing more advanced SQL in the future. 😉

  33. Manish Jatav says:

    Waiting for more videos on sql

  34. www.poteresfondi.blogspot.com Wallpapers blog says:

    Just rename the channel to Codecratica already 😐

  35. prashanth kumar says:

    scary 😦

  36. Douglas Leite says:

    How much beautiful a woman can be? This video has the answer …

  37. john says:

    I subscribed in this channel because of this video

  38. MyTech says:

    The book analogy seems flawed, unlike a book a database is not a big text file of random strings. As such I am not clear on the structural difference between searching a database and searching an index. To me a database is already similar to a book's index, or possibly comparable to a telephone book, and telephone books do not have indexes because they are already just a big index.

  39. Ben Daniel says:


  40. Lorenzo Ponce says:

    How is this series even better than many online courses I've taken on programming? So impressed

  41. Tajinder Singh says:

    How to make these Videos…Please tell me.

  42. gravitycuda says:

    Python code is missing in your github

  43. Peter Parker says:

    In which application your running that queries

  44. mehtab ahmed says:

    Great effort , I appreciate it

  45. uilsonRJ says:

    How about a video telling the difference/best usage case, between different type of indexes, like b-tree and others.

  46. Vero Ev0 says:

    Fantastic. I'm slightly familiar with SQL because I only very occasionally need to have certain data pulled by our DBAs. Very cool @ 7:25 – I had that question. Thanks Socratica.

  47. veizour says:

    2:38 4127 Michael-Seconds

  48. Developers Club says:

    Thanks Socratica for helping . i really appreciate your hard work for making videos ! You saved almost my 50 Marks ! thankyou so much !

  49. nicopiana1 says:

    what happens if you have more columns?

    does the index help for any query?

    how do I use them wisely? optimizing them? how?

    do they help for any data type?

    does inserting new rows take as long as 4 min like the first time I built the index? how much time does it take inserting new rows then?

    how much time takes a delete query once the index was built?
    can I choose not to update the index on every insert?

  50. Con Man says:

    I’m a simple man. I see a new Socratica video, I click on it.

  51. CptObvious says:

    Try 10 billion records

  52. nobody101 Man2020 says:

    does she has an Instagram? … asking for a friend….

  53. Eric Juma says:

    This is the best coding channel I HAVE EVER SEEN!!!!!!!!!!!

  54. Mish Akopyan says:

    I can't believe i watched the video. i learned something. and totally enjoyed every second of it.

  55. bigchunk1 says:

    Educational AND entertaining. I didn't think it was possible.

  56. Ron Spi says:

    Braces. The best investment in our kids!
    She makes me want to learn more.

  57. Eric Swanson says:

    Why not just get a faster computer

  58. James Laine says:

    Great, but as a read-only database user does this even apply?

  59. Cathy B says:

    Soooo…I build an index, then new rows are added.  The index needs to be rebuilt/rerun?  And the existing index is overwritten? Love the video.

  60. Shortcut says:

    these are the best tutorials because they short and sweet just like my stamina in bed.

  61. Omar Enrique Otero Duarte says:

    what about :

    FROM person
    WHERE last_name = 'Mohanty',
    AND first_name = 'Ulka';

    Result: ONE! and only ONE!
    Messages: Successfully run. We found the best teacher in earth and mars! Total query runtime: 0 msec.

  62. Justinas Beinorius says:

    Good SR!

  63. chevalier says:

    I, hereby declare that Socratica is the obvious front-runner for the first channel to reach 100 billion subs.

  64. jiang ji says:

    one day machine will talk like human, human like machine

  65. sent4dc says:

    There's no winning in SQL. With indexes you get very fast fetches but then seriously cripple inserts and the db size.

  66. Cosmin Popescu says:

    marked it with a like: nice, well documented and scaled video, really useful, but just loose the strange laugh..

    for sure, you can build your market identity with other ways, than this strange and out-of-any-reason laughs,
    just use your intelligence

  67. Stein Gauslaa Strindhaug says:

    Love how you're insisting on reading out all times in milliseconds like a programmer, even though the GUI already has converted it into more human digestible seconds and ms.

  68. Iurie Cojocari says:

    Could you please tell the Socratica to make new series of topology, or any high math

  69. Elvin Diaz says:

    why you create index for text?.. sorry but there is not logic.. use full text search over index.. indexing text does not have logic.. bab use of index

  70. Gustavo Santos says:

    I can believe how easily and clearly you explained this shit. Suscribed!

  71. Ramesh Navi says:

    earned one subscription 😎

  72. Ben McDougall says:

    You just made databases interesting, and 10 mins ago I never even realised what an index was!

  73. Marwan Ghubein says:

    Really useful and easy to understand

  74. Tonoy Saha says:

    If create index than add record to the database are new records will take into consideration? If yes then is it true that we will create index when we have less records on database and the time required for creating index be less?

  75. Sergio Espaillat says:

    Your videos help me understand farrr better than my professors. Thank you so much for all you do!

  76. Caitlin Gibbons says:

    What is the song that begins at 8:11?

  77. Lola Heffernan says:

    I can see how the sci-fi visuals & sound effects are engaging, but I feel like the robotic vocal affect detracts from the quality of this video. I appreciate the writing + examples though.

  78. How to Pakistan says:

    i really appreciate her courses .

  79. Michael Vigato says:

    Excellent tutorial, I'd also like to see something about PostgreSQL isolation levels

  80. pavel bobrovskiy says:

    dats laugh… 😀

  81. alxndrdg8 says:

    i like your evil laugh.

  82. ČHĪĆ ŘĀŢÏ says:

    Great job👍

  83. Srijal Joshi says:

    So if I add a new row to the table, will it take 4 minutes ( in this case ) to rebuild the index? What data structure is used and how can the index be rebuilt in a smarter manner?

  84. b123kotey says:

    So damn hot

  85. Q Turki says:

    One of the greatest explanations I've ever seen! Well done!
    Anybody knows the name of the piano music? 8:11

  86. Un BelExité :-D says:

    I can't find the python code on Github, but I've seen it in action in another video of yours, I don't remember which one 😀

  87. Ed Push & Pop says:

    wow great video ty very very much!!!

  88. AmirHossein Ahmadi says:

    Wow! Well done! This is by far the best video I've ever seen on SQL indexes.
    The demonstration was phenomenal!

  89. Djey Kanjaria says:

    @Socratica Could you please share github link for this SQL file / python script which has 100K Rows. I tried to search on github, but it is not available there.
    You guys are the best. Love your work, Keep it up.

  90. 刘煜琦 says:

    Great video! Thank you so much. Can you make a video of how to use index WISELY?

  91. Anirudh Mahajan says:

    What was that piano tune at 8.27 ?

  92. Pramod Kumar Singh says:

    great explanation,hahahahahahhahahahaha

  93. dhanashri magdum says:

    Well explained…awesome

  94. Anik Ghose says:

    What software are you using? SMSS?

  95. Roberto Enrique Chavez Rodriguez says:

    So good content, thank you so much!! Much love from Russia 🙂

  96. Cool Blue says:

    Wowie!! Thanks for ‘splaining that indexy things makes searchyness fasterer in data things!

  97. Mrflu55y says:

    Wow these are great.

  98. Cameron Larson says:

    Very interesting and thoughtful video.

  99. Edvin Zigmanovic says:

    Extremely helpful video, thanks <3

  100. Nayan Zubeen Bayan says:

    Her laugh🤣🤣🤣🤣🤣

Leave a Reply

Your email address will not be published. Required fields are marked *