Access: Database Design Participation Project – West Virginia K-12 Education Problem

Access: Database Design Participation Project – West Virginia K-12 Education Problem

Okay, welcome to the database design participation
project. Hopefully you’ve already downloaded the data files that we’re going to be working
with today. It’s going to be this big Education Excel workbook. So, I’m just going to right-click
over the zipped folder, right here, and I’m going to click Extract All. And, if I extract
it, you should be in this data folder right here, data-files-wvkep-dd. And, if I double-click on the Education file
right there, this is what should come up. One of the first things that you’ll notice
is, you have all of the data all in this one big worksheet right here, and that’s not ideal
because there’s a ton of, not only duplicate values, but you also have these overlapping
fields. You have fields that don’t need to go together for one thing. And, we can broadly
split this information into two categories, information about statistics like dropout
rate, enrollment, attendance rate, things like that, and then information about the
schools themselves, what county they’re in, school name, the grades that they serve, the
region, etc., etc. So, another thing to note when we’re going
through this is whether Access or Excel is going to be the best choice to work with the
data. So, if you’re looking to do something in Excel, it would be best suited to use Excel
when you’re going to be making charts or visuals or if you’re going to be performing some kind
of intense calculations or doing, like, a what-if analysis. You can also work with redundant data a lot
easier in Excel because there’s this feature, if you’re in the table right here, if you
have everything set up in a table and you go under the Design tab, you can actually
remove duplicates and we’ll get to that a little bit later after we separate things
properly. If there’s large amounts of redundant data,
Access is better to store it but Excel is simpler to work with removing the duplicates,
in this case. In Access, you’d have to make, like, a query or something. If there’s lots
of different categories of data like we have here, Access would be better to store it because
they might be related. If you’re going to form relationships between these different
tables that we’ll be making, then Access will be better to work with it after that’s all
done. So, for today, we’re going to be working primarily
in Excel and we’re going to prepare everything for Access. So, we can broadly classify the
information in front of us into these two big super categories like Schools and Annual
Statistics because you have certain information, for example, like about the school year, the
populate, the enrollment, the dropout rate, stuff like that, the attendance rate. That’s
all statistical information, that’s not necessarily about the school itself. However, the county,
school name, and the–whether the school is like an elementary school, intermediate school,
middle school, high school, that’s all information about the schools themselves. So, I’m going to actually separate those right
now by copying this Education database, or this Education worksheet. And, I’m going to
rename this first one to Schools. And, I’m going to copy it again and by right-clicking
over the worksheet tab and clicking Move or Copy. I’m going to create a copy right there
and I’m going to rename that one to Annual Statistics. So, the information that’s going to go in
the Annual Statistics is going to be the school year and the dropout rate onwards, and getting
the information for these tests, for example, like the WESTEST, social studies, math, reading,
and those scores. But, this information right here, the counties, the school name, grades
served, that’s all going to be superfluous, and we’re going to just further decompose
this information. For example, when we go and take a look at
the dropout rate, we’re actually going to separate that out a little bit later into
its own category. The school’s information, likewise, can be decomposed to information
about the county, and the region and population. You don’t need to know about the population
of a given area or the region. If you’re just looking for information about the school,
you want to know whether it’s an elementary school, middle school, high school, what grades
are served, and that kind of information. We can’t really reorganize the fields–we
can’t really split any of this information up. The regions are all the full name so we
don’t have a north region and a central region, we have like North Central, we have Eastern
Panhandle, Southwestern, so forth like that. It would confuse those actual–the North Central
region with a north and a central region. We can’t split up these numbers, or, like
these Yes/No categories or the names of the schools, for example. So, none of these fields
are really good candidates for that. These fields also can’t be consolidated into one
field. For example, it isn’t appropriate to consolidate
Is Elementary School, Is Intermediate School, Is Middle School, into one big category because
you just end up with Yes, No, No, No, No, No, or No, No, No, No, No, that kind of thing.
That’s not helpful at all. So, what I’m going to do now is actually start
making the other two tables that we’re going to need. So, I’m going to copy the Education
database again. And, this one is going to be called Counties. And, I’m going to copy
it one more time, one more time. And, I’m going to call that one Dropouts. Okay. And,
you want to leave the Education worksheet alone. You don’t want to just rename that
one, just in case anything goes wrong on any of these sheets, you can just copy the Education
database again and just fix it like that. So, let’s start off with the Schools table.
So, the Schools is going to have the county, the school name, the grades served, whether
it’s an elementary school, intermediate school, or middle school, and that’s it. So, I don’t
need the school year. I don’t need the region. I don’t need the population. So, I’m just
clicking the entire columns, right up top here. When my cursor turns into this bold,
black down-pointing arrow and the column itself is green, I’m just right-clicking and I’m
clicking Delete. And then, everything from Dropout Rate, all
the way over to the last WESTEST, can get deleted as well. So, you’re left with these
seven fields right here for Schools. The next one I’m going to do is going to be
this Annual Statistics table. So, what I’m going to do is, I’m going to get the county,
the school name, the school year, which is already here, the enrollment, attendance rate,
graduation rate, and then all these WESTEST scores. So, I don’t need the grades served,
or the region, population, or whether it’s an elementary school, middle school, high
school, or the dropout rate. So, everything from Grades Served, so column D, all the way
over to column K, right here, is going to get deleted. So, this is what you’ll be left
with, right here. And then, the next one we’re going to do is
the Counties. So, this is just going to have the county, region, and population. So, school
year can get deleted, school name, grades served can get deleted, and then pretty much
everything from column D, right now, all the way over to column P is getting deleted. So,
you should be left with these three fields right here, County, Region, and Population. And, the last one is this Dropouts. So, we’re
just going to have the county, the school year, and the dropout–the dropout rate. So,
everything from School Names all the way over to, Is High School, so that’s columns C through
J, are going to get deleted. And, everything from Enrollment all the way over to the last
WESTEST is getting deleted. So, the next thing that we’re going to take a look at is kind
of removing some of the redundancy that’s going on. And, if we take a look in the Schools
table right here, or in Excel it’s called a worksheet, but when we import this into
Access it will be as a table itself, we have all these different columns right here, these
different fields like Is Elementary School, Is Middle School, Is High School, and we’d
really like to consolidate that into one field that kind of says everything about all this
information right here because this also doesn’t take into account schools that are none of
the above. For example, as technical institutes, or, like, if they’re just another type of
school like a community college, perhaps, or something like that. So, what we’re going to do is we’re going
to insert a Table Column to the Right and we’re going to call that School Type. And,
what we’re going to do is we’re actually going to make a new worksheet by clicking the little
plus sign over here, and we’re going to rename this to School Types. And, the first column
that we’re going to have, this first field, is going to be the School Type Abbreviation.
So, School Type Abbrev. Then, if I just hit Tab, I’m going to make another one and it’s
going to be the full name. It will just be the School Type and this is kind of skipping
ahead but it’s going to be really beneficial to do this right now so, what I’m going to
do is this is actually under 12b. I’m just going to kind of move this to the side here
so it’s easy to see. And, what I’m going to do is just type in
these abbreviations and I’m going to just make that the full thing, School Type Name.
Type in the abbreviations right here on the left and then the school type names, the full
names on the right. I’m just going to copy and paste from the instructions. And, I’m going to skip ahead. You can feel
free to pause the video. When I come back, I’m going to have this entire thing filled
in. Okay, hopefully you’ve got all of these school
types entered right here. And, I’m just going to make a table out of these guys by highlighting
all the data so it should be everything from A2 all the way down to B13 if you did it just
like I did. And, I’m going to click Format as a Table under the Home tab. If I expand
this out, it’ll look like this button right here. And, you’re going to make sure that
you select your table has headers. So, it should look like that. And, this way when we go through the schools,
we can make sure that we get the School Type Abbreviation just correct. And, what we’re
going to do first though is, we’re going to eliminate some of these duplicates. So, what
I’m going to do is, I’m going to select these in the first column right here, which is Counties.
If I go under the Design tab under Table Tools, if I click Remove Duplicates and I Select
All right here. I say OK. It will remove these 3400 duplicates and this will be a lot less–a
lot fewer school types to go through and sort through. Alright, so, now what we’re going to do, now
that we have our reduced Schools table right here, we’re actually going to sort through
the school types to figure out exactly what we need. So, we’re going to start off with
the elementary schools, which is abbreviated as ELS. So, that’s something that–a school
that is only an elementary school. So, under the Filters tab, if I just click this dropdown
right here, I’m going to uncheck the No’s. We want only Yes’s. I’m going to make sure
that all of these ones over here, Is an Intermediate School, Is Middle School, I’m going to make
sure that those are all No’s. So, I’m going to uncheck any Yes’s going on right there.
Then, what I’m left with will be only ones that are elementary schools. So, as you can see, Primary School, Elementary
School, so on and so forth going all the way down. And, we’ll see that most of these, like,
about 400 of these schools right here are elementary schools, out of the 700 or so. So, I’m going to type in ELS. And, I’m going
to autofill all the way down to the very last one. And, the next one is just the intermediate
schools. So, if I go back to the Schools table, if I say No for elementary schools and I uncheck
the Yes, I’m going to check only the Yes’s for Is Intermediate School, and make sure
these last two are No’s and they are. So, I’m going to type in INS. The next one will be just the middle schools,
and then just the high schools. Alright, so, I just made sure that these three were No’s
and this one was definitely a Yes. So, this one is MDS for middle school. And, the last
one is a high school so make sure that these first three are No’s and that Is High School
is a Yes. So, it’s HIS for high school. And then, we have these other schools right
here. For example, elementary and middle schools. So, the way we’ll get that is the elementary
schools will be a Yes, and then the high school and intermediate schools will be No’s. This is EMS and then the next one is middle
and high school. So, elementary, Is Elementary School will be No, because we want middle
and high schools. Yeah. So, that intermediate, Is Intermediate School is a No, Is Elementary
School is a No, Is Middle School is a Yes, and Is High School is going to be a Yes. So,
should be the MHS type, middle/high school. Then, we’re going to look for ones that are
elementary school, middle school, and high schools. So, we’re just going to make sure
that Is Elementary School is a Yes and then these other two should be Yes’s from the one
we just did. So, EMH. Then, the only other one we might
be able to get would be the preschool. So, if we remove all these filters by clicking
the Clear Filters, we should see most of these are all filled in but there’s a few, like
here, that are not. And, what we can do is we can take a look at the Grades Served and
if we scroll all the way down, we’ll see that there’s one that is just pre-k. So, if we
uncheck the Select All at the very top of these Grades Served options right here, and
then we scroll down again, we’re just going select Pre-K to Pre-K and we’re left with
these ones. The abbreviation for Pre-K is PRE for preschool. Then, we’re going to remove the filters from
here and we’re going to uncheck the Select All and find only the blanks and see we’re
only left with a couple of these ones where we don’t have information about what grades
are served. So, this one is a technical center, technical center, vocational, technical center.
So, this made it quite a bit easier. There’s only a couple of these ones right here that
we’ll have to label Technical Centers. So, TEC. Then, our last category is, our last two are
Alternative School and Others. We don’t know exactly whether these ones–at least I don’t
know offhand whether these ones are other or alternative schools, so I’m going to leave
those unlabeled at the moment. Alright, so, now for the Schools table, we
can just delete the Is Elementary School, Is Intermediate, Is Middle School, Is High
School. And, we’re just left with the school type abbreviation, which I’m going to just
add the abbrev part right at the end there so we know it’s an abbreviation, not the full
name. And, we’re going to go through these other tables here starting with Annual Statistics,
and we’re going to remove any of the duplicates going on. So, I’m going to select right there
where the header is and then wait and hover over kind of in between the School Year and
that A column right there. When my cursor turns into that black, bold arrow, and go
into the Design tab and I’m going to remove duplicates. It should Select All for you.
You want to make sure that your Data has Headers is checked. You’re going to say OK. So, no duplicates are found. That’s good.
Now, we’re going to go into Counties and do the same thing. So, under here in the Counties
worksheet, going to select this one right here. Remove duplicates. OK. Make sure that
your Data Has Headers is checked and all these fields are selected and there’s 4000 duplicate
values removed. So, as you can see, that’s a lot cleaner than
what we started out with. And, the last one is this Dropouts, right here. So, we’re going
to, again, go under the Design tab, Remove Duplicates, make sure that all these are selected,
and your Data Has Headers is checked. And, you should be left with about 330 actual values
now. So, now that we have the final data that we’re
going to be working with, it’s kind of important to consider what fields, what columns will
be suitable for the primary keys or composite primary keys. So, what the easiest way to
do if you have the data here in Excel is to actually just go uncheck like Select All for
these columns and see what combination of these will give you a unique value for just
like selecting one particular type in that field. So, for example, I’m selecting just Barbour
County right here. And, if it doesn’t give you just one unique value, then you may need
to have multiple fields. So, for example, if I have these two fields, the County and
the School Name for the Schools table, then I can get just one particular value. But,
for example–or, if I just use the school name. So, the reason
I’m not going to use only the school name for this one, even though it would provide
a completely unique value for the Schools table, is because I want to be able to form
a relationship between this table and another table. For example, like, the Counties table
or the Dropouts table or something like that, and Counties comes up in all of these other
tables right here, including Annual Statistics. So, it will be something that, it will be
a point of connection. So, I can form a relationship in Access. So, I’m going to set both of these as the
primary key. It’s going to be a composite primary key. So, because the Counties is found
in a whole bunch of different tables, it’s going to form one point of connection for
pretty much every table except for the School Types. The other thing that’s found in multiple
tables is the School Type Abbreviation that’s found in the Schools table, as well as the
School Types. We’re going to be able to use the school year
as well. That’s found in the Annual Statistics table and it’s also found in the Dropouts
table, as well. So, those are how we’re going to form relationships. But, now we don’t need
the Education worksheet anymore so I’m just going to delete that. And, this is our finished
product. So–alright, so, save it. Submit it.

Daniel Ostrander

Related Posts

Leave a Reply

Your email address will not be published. Required fields are marked *