The Internet, sometimes called simply "the Net," is a worldwide system of computer networks in which a computer (client) can get information from any other computer (server):
The Internet is based on standards so that computers and software from different vendors can talk to each other. This is known as interoperability.
Every computer in internet is identified by a unique IP address (such as 110.22.33.114) using IPV4 which allows for a little more than 4 billion addresses (32 bit). The Internet is slowly progressing to IPV6, which allows for 3.4 * 1038 addresses (128 bit). This is 3.4 followed by 38 zeros addresses. IPV6 address look like: 1A23:120B:0:0:0:7634:AD01:004D.
Servers are computers that are permanently available on the Internet an always have the same static IP address. These addresses are assigned by the Internet Service Provider (ISP) that houses the server. Blocks of IP addresses and naming conventions are were assigned by ICANN
Clients, such as your computer, tablet, or smart phone are off/on and constantly move around on the internet. They have a dynamic IP address that changes each time you turn on the device. A local server on your network know as a Dynamic Host Configuration Protocol (DHCP) server assigns the address when you turn you device on.
The Internet handles data as packets, each packet containing about 1000 characters. Large amounts of data such as images and moves are broken down into packets. Each packet has a "From" IP address and a "To" IP address, similar to a physical letter that is mailed.
Internet use"two protocol standards to move data packets: TCP and IP. Internet Protocol (IP) is the addressing scheme we just discssed. Transmission Control Protocol (TCP) is responsible for breaking down large data items into packet, and reassembling them reliably on the client. It also handles errors and retries.
Special servers know as Domain Name Servers (DNS) are used to give name to the IP Address so that user can locate a computer by a name. For example, a DNS server will resolve the name http://ualr.edu to 144.167.4.20 to uniquely identify the computer on which this website is hosted.
Routers are devices that control the flow of packets throughout the internet. They can determine the best route for a packet (based on the IP address. They can also block malicious or unwanted internet traffic before it reaches a server.
The internet is designed to operate over many different types of media, such as twisted pair copper, coax cable, fiberoptic lines, and wireless radio transmission (satellite, Wi-Fi, and cellular phone networks. There has been an exponential increase in wireless transmission due to the use of mobile devices (smartphones and tablets). Connections between your smartphone and internet using 4G LTE technology is automatically encrypted. Wi-Fi connections are encrypted only if the wireless access point requires a password (well usually).
Uniform Resource Locator (URL) refers to a web address which uniquely identifies a document over the internet. See Example.
This document can be a web page, image, audio, video or anything else present on the web.
A URL can consist of:
For example, http://ualr.edu/informationtechnology/ is an URL to the index.html which is stored on ualr.edu web server under informationtechnology directory.
Assignment (due next class period): Watch the following videos and exercises.
Assignment (due next class period): Review the following exercises:
In your shared Google Sheets Students folder, create "South University Autorama Sales" as follows. See Solution.
Type | January | February | March |
---|---|---|---|
Compacts | 347,999.12 | 376,984.92 | 377,540.57 |
Sedans | 330,373.42 | 289,076.30 | 271,717.67 |
SUVs | 185,835.76 | 161,446.98 | 140,408.67 |
Trucks | 309,725.48 | 272,558.89 | 330,373.58 |
Add a Bar Chart to your "South University Autorama Sales" assignment - See Solution
Review the following topics:
In your shared Google Sheets Students folder, create "South University Autorama Safety Products" as follows. See Solution.
Item In Inventory | Quantity | Average Cost | Retail Price |
---|---|---|---|
Severe Weather Kit | 50 | 33.50 | 41.95 |
Emergency Road Kit | 135 | 23.90 | 29.95 |
First Aid Kit | 75 | 14.30 | 19.96 |
Road Trip Emergency Kit | 35 | 67.96 | 84.95 |
Accident Reporting Camera Kit | 35 | 12.00 | 22.95 |
Premium Auto Safety Kit | 50 | 36.00 | 44.95 |
Super Flashlight | 50 | 23.30 | 27.95 |
='SheetName'!A1
As the city grows, the school district must plan for addition students that will enroll in the elementary, middle, and high schools. Create workbook with 3 worksheets containing the following information.
As the city grows, the school district must plan for addition students that will enroll in the elementary, middle, and high schools. Create a workbook named Schools with 3 worksheets containing the following information. See Solution
Projected Increase | 2% | 3.5% | 7% | 5% | |
---|---|---|---|---|---|
Year | 2016 | 2017 | 2018 | 2019 | 2020 |
Wyndham | 1350 | ||||
Warm Creek | 956 | ||||
Los Seranos | 1175 | ||||
Hidden Canyon | 1465 | ||||
Butterfield | 845 |
Projected Increase | 4.5% | 5.75% | 3% | 3.5% | |
---|---|---|---|---|---|
Year | 2016 | 2017 | 2018 | 2019 | 2020 |
Townsand | 1194 | ||||
Canyon Hills | 1575 | ||||
Los Seranos | 1175 | ||||
Golden Springs | 1392 |
Projected Increase | 6.0% | 7.5% | 4% | 2.75% | |
---|---|---|---|---|---|
Year | 2016 | 2017 | 2018 | 2019 | 2020 |
Poppy Hills | 2276 | ||||
Diamond Branch | 3150 |
=SUM(A:A)
where A is any column name. This way you can continually add data without changing the formula. By putting the sum in a sidebar instead of the bottom of a column, you will not have change your formulas that use the value of the SUM. If you have column headers, you can start from below the headers to the bottom of the spreadsheet using =SUM(A4:A)
, indicating that you want to sum from column A4 to the bottom of the sheet.You should have the following assignments completed:
Today's Lecture
A good database design starts with a list of the data that you want to include in your database and what you want to be able to do with the database later on.
The types of information that are saved in the database are called 'entities'. These entities exist in four kinds: people, things, events, and locations. Everything you could want to put in a database fits into one of these categories. If the information you want to include doesn't fit into these categories, than it is probably not an entity but a property of an entity, an attribute.
For example, imagine that you are creating a website for a store, what kind of information do you have to deal with? In a store you sell your products to customers. The "Store" is a location; "Sale" is an event; "Products" are things; and "Customers" are people. These are all entities that need to be included in your database.
But what other things are happening when selling a product? A customer comes into the store, approaches the salesperson, asks a question and gets an answer. "Salespersons" also participate, and because salespersons are people, we need a salespersons entity.
The next step is to determine the relationships between the entities and to determine the cardinality of each relationship. The relationship is the connection between the entities, just like in the real world: what does one entity do with the other, how do they relate to each other? For example, customers buy products, products are sold to customers, a sale comprises products, a sale happens in a store.
The cardinality shows how much of one side of the relationship belongs to how much of the other side of the relationship. First, you need to state for each relationship, how much of one side belongs to exactly 1 of the other side. For example: How many customers belong to 1 sale?; How many sales belong to 1 customer?; How many sales take place in 1 store?
Did we mention all relationships? There are four entities and each entity has a relationship with every other entity, so each entity must have three relationships, and also appear on the left end of the relationship three times. Above, 12 relationships were mentioned, which is 4*3, so we can conclude that all relationships were mentioned.
Now we'll put the data together to find the cardinality of the whole relationship. In order to do this, we'll draft the cardinalities per relationship. To make this easy to do, we'll adjust the notation a bit, by noting the 'backward'-relationship the other way around:
The second relationship we will turn around so it has the same entity order as the first. Please notice the arrow that is now faced the other way!
Cardinality exists in four types: one-to-one, one-to-many, many-to-one, and many-to-many. In a database design this is indicated as: 1:1, 1:N, M:1, and M:N. To find the right indication just leave the '1'. If there is a 'many' on the left side, this will be indicated with 'M', if there is a 'many' on the right side it is indicated with 'N'.
The true cardinality can be calculated through assigning the biggest values for left and right, for which 'N' or 'M' are greater than '1'. In this example, in both cases there is a '1' on the left side. On the right side, there is a 'N' and a '1', the 'N' is the biggest value. The total cardinality is therefore '1:N'. A customer can make multiple 'sales', but each 'sale' has just one customer.
If we do this for the other relationships too, we'll get:
Signup for a free educational account with Lucid Charts. Following the instructions at the bottom of "Pricing" page where it says: "Free for Individual Teachers & Students"
Relationships have two indicators. The first one (often called multiplicity) refers to the maximum number of times that an instance of one entity can be associated with instances in the related entity. It can be one or many.
The second describes the minimum number of times one instance can be related to others. It can be zero or one, and accordingly describes the relationship as optional or mandatory.
The combination of these two indicators is always in a specific order. Placed on the outside edge of the relationship, the symbol of multiplicity comes first. The symbol indicating whether the relationship is mandatory or optional is shown after the symbol of multiplicity.
In crow’s foot notation:
This result in four possible relationships:
A good database design starts with a list of the data that you want to include in your database and what you want to be able to do with the database later on.
Between the entities there may be a mutual dependency. This means that the one item cannot exist if the other item does not exist. For example, there cannot be a sale if there are no customers, and there cannot be a sale if there are no products.
The relationships Sales → Customers, and Sales → Products are mandatory, but the other way around this is not the case. A customer can exist without sale, and also a product can exist without sale. This is of importance for the next step.
Sometimes in your model you will get a 'redundant relationship'. These are relationships that are already indicated by other relationships, although not directly.
In the case of our example there is a direct relationships between customers and products. But there are also relationships from customers to sales and from sales to products, so indirectly there already is a relationship between customers and products through sales. The relationship 'Customers ↔ Products' is made twice, and one of them is therefore redundant. In this case, products are only purchased through a sale, so the relationships 'Customers ↔ Products' can be deleted. The model will then look like this:
Many-to-many relationships (M:N) are not directly possible in a database. What a M:N relationship says is that a number of records from one table belongs to a number of records from another table. Somewhere you need to save which records these are and the solution is to split the relationship up in two one-to-many relationships.
This can be done by creating a new entity that is in between the related entities. In our example, there is a many-to-many relationship between sales and products. This can be solved by creating a new entity: sales-products. This entity has a many-to-one relationship with Sales, and a many-to-one relationship with Products. In logical models this is called an associative entity and in physical database terms this is called a link table or junction table.
In the example there are two many-to-many relationships that need to be solved: 'Products ↔ Sales', and 'Products ↔ Stores'. For both situations there needs to be created a new entity, but what is that entity?
For the Products ↔ Sales relationship, every sale includes more products. The relationship shows the content of the sale. In other words, it gives details about the sale. So the entity is called 'Sales details'. You could also name it 'sold products'.
The Products ↔ Stores relationship shows which products are available in which the Stores, also known as 'stock'. Our model would now look like this:
The data elements that you want to save for each entity are called 'attributes'.
About the products that you sell, you want to know, for example, what the price is, what the name of the manufacturer is, and what the type number is. About the customers you know their customer number, their name, and address. About the Stores you know the location code, the name, the address. Of the sales you know when they happened, in which store, what products were sold, and the sum total of the sale. Of the Salesperson you know his staff number, name, and address. What will be included precisely is not of importance yet; it is still only about what you want to save.
Derived data is data that is derived from the other data that you have already saved. In this case the 'sum total' is a classical case of derived data. You know exactly what has been sold and what each product costs, so you can always calculate how much the sum total of the sales is. So really it is not necessary to save the sum total.
So why is it saved here? Well, because it is a sale, and the price of the product can vary over time. A product can be priced at $10 today and at $8 next month, and for your administration you need to know what it cost at the time of the sale, and the easiest way to do this is to save it here.
Using Lucid Charts, try to reproduce "Figure 2: Relationships between entities"
A primary key (PK) is one or more data attributes that uniquely identify an entity. A key that consists of two or more attributes is called a composite key. All attributes part of a primary key must have a value in every record (which cannot be left empty) and the combination of the values within these attributes must be unique in the table.
In the example there are a few obvious candidates for the primary key. Customers all have a customer number, products all have a unique product number and the sales have a sales number. Each of these data is unique and each record will contain a value, so these attributes can be a primary key. Often an integer column is used for the primary key so a record can be easily found through its number.
Link-entities usually refer to the primary key attributes of the entities that they link. The primary key of a link-entity is usually a collection of these reference-attributes. For example in the Sales_details entity we could use the combination of the PK's of the sales and products entities as the PK of Sales_details. In this way we enforce that the same product (type) can only be used once in the same sale. Multiple items of the same product type in a sale must be indicated by the quantity.
In the ERD the primary key attributes are indicated by the text 'PK' behind the name of the attribute. In the example only the entity 'store' does not have an obvious candidate for the PK, so we will introduce a new attribute for that entity: storenumber.
The Foreign Key (FK) in an entity is the reference to the primary key of another entity. In the ERD that attribute will be indicated with 'FK' behind its name. The foreign key of an entity can also be part of the primary key, in that case the attribute will be indicated with 'PF' behind its name. This is usually the case with the link-entities, because you usually link two instances only once together (with 1 sale only 1 product type is sold 1 time).
If we put all link-entities, PK's and FK's into the ERD, we get the model as shown below. Please note that the attribute 'products' is no longer necessary in 'Sales', because 'sold products' is now included in the link-table. In the link-table another field was added, 'quantity', that indicates how many products were sold. The quantity field was also added in the stock-table, to indicate how many products are still in store.
Now it is time to figure out which data types need to be used for the attributes. There are a lot of different data types. A few are standardized, but many databases have their own data types that all have their own advantages. Some databases offerthe possibility to define your own data types, in case the standard types cannot do the things you need.
Okay, time for a crash course:
The standard data types that every database knows, and are most-used, are: CHAR, VARCHAR, TEXT, FLOAT, DOUBLE, and INT. Normally, we use signed integers instead of unsigned integers.
Text:
Numbers:
Date and Time:
Other types:
For our example the data types are as follows:
Homework
Review: Crash Course in Binary Numbers
Normalization makes your data model flexible and reliable. It does generate some overhead because you usually get more tables, but it enables you to do many things with your data model without having to adjust it.
The first form of normalization states that there may be no repeating groups of columns in an entity. We could have created an entity 'sales' with attributes for each of the products that were bought. This would look like this:
What is wrong about this is that now only 3 products can be sold. If you would have to sell 4 products, than you would have to start a second sale or adjust your data model by adding 'product4' attributes. Both solutions are unwanted. In these cases you should always create a new entity that you link to the old one via a one-to-many relationship.
The second form of normalization states that all attributes of an entity should be fully dependent on the whole primary key. This means that each attribute of an entity can only be identified through the whole primary key. Suppose we had the date in the Sales_details entity:
This entity is not according the second normalization form, because in order to be able to look up the date of a sale, I do not have to know what is sold (productnumber), the only thing I need to know is the sales number. This was solved by splitting up the tables into the sales and the Sales_details table:
Now each attribute of the entities is dependent on the whole PK of the entity. The date is dependent on the sales number, and the quantity is dependent on the sales number and the sold product.
The third form of normalization states that all attributes need to be directly dependent on the primary key, and not on other attributes. This seems to be what the second form of normalization states, but in the second form is actually stated the opposite. In the second form of normalization you point out attributes through the PK, in the third form of normalization every attribute needs to be dependent on the PK, and nothing else.
In this case the price of a loose product is dependent on the ordering number, and the ordering number is dependent on the product number and the sales number. This is not according to the third form of normalization. Again, splitting up the tables solves this.
There are more normalization forms than the three forms mentioned above, but those are not of great interest for the average user. These other forms are highly specialized for certain applications. If you stick to the design rules and the normalization, you will create a design that works great for most applications.
If you apply the normalization rules, you will find that the 'manufacturer' in the product table should also be a separate table:
Files you will need for this exercise:
When analyzing data, you want to develop a spreadsheet that can be used for different periods of data. Often you get data monthly, and you would like to import the data without having to re-enter the calculations.
In this analysis, we are going to look at the number of products sold and the dollar amount of the products sold. Look at your ERD diagram and see the OrderDetailt table. It has the counts and unit price of each item sold.
As previously stated, don't want to do any direct calculations in the data, so we will make a copy of the data in the Temp sheet. Enter the following in the Temp Sheet
=OrderDetails!A2
=OrderDetails!B2
=OrderDetails!C2
=OrderDetails!D2
=OrderDetails!E2
Now we will calculate the Amount of the Order Detail. It is the Quantity times the Unit Price minus the discount:
=(C2*D2) - ((C2*D2)*E2)
Let's work on the category of the product. We already have the Product ID (Column B in Temp sheet). We can then go to the Products sheet, lookup a product using the Product ID (Colunn A in Products sheet) and find the CategoryID (Column D in the Products sheet). The CategoryID is now in column G of the Temp sheet.
=INDEX(Products!$D:$D,MATCH(B2,Products!$A:$A))
Now that we have the CategoryID in column G of the Temp sheet, we can find the CategoryName. We can then go to the Categories sheet, lookup a category using the CategoryID (Colunn G in Categories sheet) and find the CategoryName (Column B in the Categories sheet). The CategoryName is now in column H of the Temp sheet.
=INDEX(Categories!$B:$B,MATCH(G2,Categories!$A:$A))
Now duplicate all of your calculations in the Temp sheet, down to row 2156.
Let's put together the body of the report. Go to the Report sheet and enter the following:
=SORT(Categories!B2:B,1,true)
=SUMIFS(Temp!$D$2:$D,temp!$H$2:$H,"="&A2)
=SUMIFS(Temp!$F$2:$F,temp!$H$2:$H,"="&A2)
=RANK(C2,$C$2:$C)
- Also, add a conditional format that highlights the 1st, 2nd, and 3rd highest ranking=SUM($B$2:$B)
=SUM($C$2:$C)
We have a great looking report, but now we would like to add a date range based on the order date. We can go to the Temp sheet, lookup a an order using the OrderID (Column A in Orders sheet) and find the OrderDate (Column D in the Orders sheet). The OrderDate is now in column I of the Temp sheet.
=INDEX(Orders!$D:$D,MATCH(A2,Orders!$A:$A))
Let's go back to the Report Sheet and add the following to use the OrderDate
=SUMIFS(Temp!$D$2:$D,Temp!$H$2:$H,"="&A2,Temp!$I$2:$I,">="&$G$2,Temp!$I$2:$I,"<="&G$3)
=SUMIFS(Temp!$F$2:$F,Temp!$H$2:$H,"="&A2,Temp!$I$2:$I,">="&$G$2,Temp!$I$2:$I,"<="&G$3)
Almost done. We would also limit the data to specific customer. We can go to the Temp sheet, lookup a an order using the OrderID (Column A in Orders sheet) and find the CustomerID (Column B in the Orders sheet). The CustomerID is now in column J of the Temp sheet. While it is nice to have the CustomerID, it would be more readable to have the CompanyName. We can go to the Temp sheet, lookup an Customer using the CustomerID (Column A in Customers sheet) and find the CompanyName (Column B in the Customers sheet). The CompanyName is now in column K of the Temp sheet.
=INDEX(Orders!$B:$B,MATCH(A2,Orders!$A:$A))
=INDEX(Customers!$B:$B,MATCH(J2,Customers!$A:$A))
=SUMIFS(Temp!$D$2:$D,Temp!$H$2:$H,"="&A2,Temp!$I$2:$I,">="&$G$2,Temp!$I$2:$I,"<="&G$3,Temp!$K$2:$K,"="&$G$4)
=SUMIFS(Temp!$F$2:$F,Temp!$H$2:$H,"="&A2,Temp!$I$2:$I,">="&$G$2,Temp!$I$2:$I,"<="&G$3,temp!$K$2:$K,"="&$G$4)
Everything looks good, except that the dropdown for CompanyNames are not sorted. Let's create a sorted column:
=Sort(K2:K,1,true)
One final enhancement. Add a sheet called "Chart". Create a pie chart that displays the categories and sales amount.
Using the Northwind data and ERD, Create a spreadsheet that shows number of units and sales by employee. You should be able to enter from and to date for the date of the sales order and select a specifc product. Conditionally format the rank of the top 3 employess. Create a pie chart of the sales and employees.
Practice Assessment Solution
Order Detail Calculations
=OrderDetails!A2
=OrderDetails!B2
=OrderDetails!C2
=OrderDetails!D2
=OrderDetails!E2
=(C2*D2) - ((C2*D2)*E2)
Use Orders sheet to get the EmployeeID.
=INDEX(Orders!$C:$C,MATCH(A2,Orders!$A:$A))
Use Employees sheet to get the employee's LastName.
=INDEX(Employees!$B:$B,MATCH(G2,Employees!$A:$A))
Use Orders sheet to get the Order Date
=INDEX(Orders!$D:$D,MATCH(A2,Orders!$A:$A))
Use Products sheet to get the ProductName.
=INDEX(Products!$B:$B,MATCH(B2,Products!$A:$A))
=SORT(J2:J,1,true)
Duplicate Columns A-J in the Temp sheet, down to row 2156.
Report Sheet:
=Sort(Employees!B2:B,1,true)
=SUMIFS(Temp!$D$2:$D,Temp!$H$2:$H,"="&A2,Temp!$I$2:$I,">="&$G$2,Temp!$I$2:$I,"<="&$G$3,Temp!$J$2:$J,"="&$G$4)
=SUMIFS(Temp!$F$2:$F, Temp!$H$2:$H,"="&A2,Temp!$I$2:$I,">="&$G$2,Temp!$I$2:$I,"<="&$G$3,Temp!$J$2:$J,"="&$G$4)
=RANK(C2,$C$2:$C)
- Also, add a conditional format that highlights the 1st, 2nd, and 3rd highest ranking=SUM($B$2:$B)
=SUM($C$2:$C)
One final enhancement. Add a sheet called "Chart". Create a pie chart that displays the categories and sales amount.
Files you will need for this exercise:
Review the following:
In Class Assignment - Open myPHPAdmin and navigate to your database. Run the MySQL procedure that creates the Northwind database in Northwind.MySQL5.sql
Hint: When using phpMyAdmin, use the "back tic" mark (key is located to the left of the Number 1 on your keyboard) when the table name or the column name contains a blank, as in:
Try to create the following SQL statements
Try to create the following SQL Statements:
Try to create the following SQL Statements:
Try to create the following SQL Statements: