ACIS 1504 - Introduction to Data Analytics & Business Intelligence
Text Mining
Data Cleaning
Concept Map
Text Mining
Implementation
Mixed Cell References
Design: Accuracy
Random
Search, Left, Right, Mid, Len, &
Paste Values
Objectives
• Define text mining.
• Demonstrate Excel features that support text mining.
Segment A: Text Mining
Text Analytics / Text Mining
• Software that searches vast amounts of unstructured textual data to identify patterns.
Nestle
• Nestle processes social media
http://uk.reuters.com/article/video/idUKBRE89P07Q20121026?videoId=238680321
Segment B: Text Functions
Text Mining
• Search: SEARCH
• Parse: LEFT, MID, RIGHT, LEN
• Concatenate: &
Name Example
Open Grades Textfile.xlsx.
Divide Last Name, First Name into two separate columns.
1. Locate the comma (SEARCH)
2. Extract all characters to left of comma (LEFT)
3. Locate end of full name (LEN)
4. Extract almost all characters between comma and end of name (RIGHT)
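The four steps can be sketched in Python, with each line mirroring the Excel function named on the slide. The sample names and the assumption that each value has one comma followed by a single space are illustrative and do not come from the workbook.

```python
from typing import Tuple


def split_name(full_name: str) -> Tuple[str, str]:
    """Split 'Last, First' into (last, first), mirroring SEARCH/LEFT/LEN/RIGHT."""
    # Step 1: locate the comma. Excel positions are 1-based, so add 1
    # to Python's 0-based index (like SEARCH(",", name)).
    comma_pos = full_name.index(",") + 1
    # Step 2: everything to the left of the comma (like LEFT(name, comma_pos - 1)).
    last = full_name[:comma_pos - 1]
    # Step 3: total length of the full name (like LEN(name)).
    total_len = len(full_name)
    # Step 4: the characters after the comma, minus the space that follows it
    # (like RIGHT(name, total_len - comma_pos - 1)).
    n = total_len - comma_pos - 1
    first = full_name[total_len - n:] if n > 0 else ""
    return last, first


print(split_name("Smith, Jane"))  # ('Smith', 'Jane')
```

The "almost all" in step 4 corresponds to the extra `- 1`, which skips the space after the comma.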
SEARCH Function
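SEARCH returns the 1-based position of one text string inside another and ignores case. A rough Python equivalent is sketched below. The helper name `excel_search` is hypothetical, wildcard matching is left out, and a `ValueError` stands in for the `#VALUE!` error Excel shows when nothing is found.

```python
def excel_search(find_text: str, within_text: str, start_num: int = 1) -> int:
    """Approximate Excel's SEARCH: 1-based, case-insensitive, no wildcards."""
    if start_num < 1:
        raise ValueError("start_num must be 1 or greater")
    # Compare lowercase copies so the match ignores case, as SEARCH does.
    pos = within_text.lower().find(find_text.lower(), start_num - 1)
    if pos == -1:
        raise ValueError("text not found")  # Excel would show #VALUE!
    return pos + 1  # convert Python's 0-based index to a 1-based position


print(excel_search(",", "Smith, Jane"))  # 6
```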
LEFT Function
LEN or Length Function
RIGHT Function
MID Function
Extract the first initial of first name.
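MID takes a starting position and a number of characters. In a "Last, First" value, the first initial sits two positions past the comma (after the comma itself and the space). A minimal sketch, assuming a single space follows the comma; the helper names are hypothetical.

```python
def excel_mid(text: str, start_num: int, num_chars: int) -> str:
    """Approximate Excel's MID, where start_num is 1-based."""
    return text[start_num - 1:start_num - 1 + num_chars]


def first_initial(full_name: str) -> str:
    """Return the first letter of the first name in a 'Last, First' string."""
    comma_pos = full_name.index(",") + 1           # like SEARCH(",", name)
    return excel_mid(full_name, comma_pos + 2, 1)  # like MID(name, comma_pos + 2, 1)


print(first_initial("Smith, Jane"))  # J
```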
Concatenate
• Combine First Name, space and Last Name.
• & is the concatenate symbol
• Quotes are required around constant strings of text
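In Excel this could look like `=B2&" "&A2`, where the cell references are assumed for illustration (first name in B2, last name in A2). The same idea in Python, where `+` plays the role of `&` and the space is a quoted constant:

```python
def full_name(first: str, last: str) -> str:
    """Join first name, a literal space, and last name (like first & " " & last)."""
    return first + " " + last


print(full_name("Jane", "Smith"))  # Jane Smith
```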
Student ID Example
Extract each student’s PID from their email address.
Create a new student identifier by combining the first three letters of the last name with the last four digits of the student ID number.
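Both tasks combine the functions above. The sketch below assumes the PID is the part of the email address before the "@" and that the student ID number is stored as text. The sample email and ID are made up for illustration.

```python
def pid_from_email(email: str) -> str:
    """Return the characters before '@' (assumed here to be the PID)."""
    at_pos = email.index("@") + 1  # like SEARCH("@", email)
    return email[:at_pos - 1]      # like LEFT(email, at_pos - 1)


def new_identifier(last_name: str, student_id: str) -> str:
    """First three letters of the last name plus the last four digits of the ID."""
    return last_name[:3] + student_id[-4:]  # like LEFT(last, 3) & RIGHT(id, 4)


print(pid_from_email("jsmith@example.edu"))  # jsmith
print(new_identifier("Smith", "905123456"))  # Smi3456
```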
Segment C: Data Cleaning & Generation
Data Cleaning
• Delete Unnecessary Columns & Rows
• Resize Columns
• Format Numeric Values
• Separate Distinct Values
• Shorten Lengthy Values
• Data Validation for Future Entries
• Generate Values
Favorite Pie Example
1. Ensure pie flavor data is consistent.
2. Replace confidential clicker ID # with randomly generated 6-digit number.
3. Ensure new ID number is static and unique.
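Steps 2 and 3 can be sketched in Python as follows. This is not the spreadsheet method: instead of generating numbers and then checking for duplicates, it draws without replacement so collisions cannot occur, and it stores the result in a dictionary so the numbers stay fixed once assigned. The function name, the sample clicker IDs, and the seed are hypothetical.

```python
import random


def assign_random_ids(clicker_ids, seed=None):
    """Map each distinct clicker ID to a unique random 6-digit number."""
    rng = random.Random(seed)
    # Keep first-seen order and drop repeated clicker IDs.
    distinct = list(dict.fromkeys(clicker_ids))
    # sample() draws without replacement, so no two new IDs can be equal.
    new_ids = rng.sample(range(100000, 1000000), len(distinct))
    return dict(zip(distinct, new_ids))


mapping = assign_random_ids(["A1F3", "B77C", "A1F3", "09DE"], seed=1504)
print(mapping)
```

The same clicker ID always maps to the same new number within one mapping, which keeps each respondent's rows linked after the confidential ID is removed.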
Favorite Pie Example
Original → Sorted → Consistent
Random Number Functions
• =RAND()
• =RANDBETWEEN(low#, high#)
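RAND returns a decimal number greater than or equal to 0 and less than 1, and RANDBETWEEN returns a whole number between the two bounds, both included. Both recalculate whenever the worksheet does, which is why results that must stay fixed need to be pasted as values. A rough Python analogue, with hypothetical wrapper names:

```python
import random


def rand() -> float:
    """Like =RAND(): a float r with 0 <= r < 1."""
    return random.random()


def randbetween(low: int, high: int) -> int:
    """Like =RANDBETWEEN(low, high): an integer n with low <= n <= high."""
    return random.randint(low, high)


print(rand())
print(randbetween(100000, 999999))  # always a 6-digit integer
```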
Paste Special - Values
Mac: Edit menu, Paste Special
Exam Feedback Example
Open Exam Feedback.xlsx