Advanced Python

Regular Expressions, Dictionary, Writing to CSV File

This question has multiple parts and will take 20+ hours to complete, depending on your Python proficiency level. Knowing these skills will be extremely beneficial during the first few weeks of the bootcamp. Note: Do not use the Pandas library to complete this section.

For Part I, use of regular expressions is optional.

The data file represents the Biostats Faculty List at the University of Pennsylvania.

This data is available in this file: faculty.csv

Part I - Regular Expressions

Q1. Find how many different degrees there are, and their frequencies: Ex: PhD, ScD, MD, MPH, BSEd, MS, JD, etc.

REPLACE THIS WITH YOUR RESPONSE
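
One possible way to approach Q1 is sketched below. The `degree_cells` values are made up for illustration (the real cells come from faculty.csv), and the idea that the same degree may appear with and without periods is an assumption, not something the data file is known to do.

```python
import re
from collections import Counter

# Hypothetical degree cells for illustration only; the real values
# would be read from the degree column of faculty.csv.
degree_cells = ["Ph.D.", "Ph.D.", "Sc.D.", "MD MPH", "PhD"]


def count_degrees(cells):
    """Count degree tokens, treating 'Ph.D.' and 'PhD' as the same degree."""
    counts = Counter()
    for cell in cells:
        # Remove periods, then treat each run of letters as one degree.
        counts.update(re.findall(r"[A-Za-z]+", cell.replace(".", "")))
    return counts


degree_counts = count_degrees(degree_cells)
print(len(degree_counts), "distinct degrees")
for degree, n in degree_counts.most_common():
    print(degree, n)
```

A `Counter` keeps the frequency bookkeeping short; a plain dictionary with `dict.get(key, 0) + 1` would work just as well.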

Q2. Find how many different titles there are, and their frequencies: Ex: Assistant Professor, Professor

REPLACE THIS WITH YOUR RESPONSE
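
A minimal sketch for Q2, under the assumption that a title cell starts with the rank and may carry extra words after it. The first five titles are the ones shown in the Part III examples; the last entry, with a trailing suffix, is hypothetical.

```python
import re
from collections import Counter

# Titles taken from the Part III examples, plus one hypothetical cell
# with a trailing suffix to show why a pattern can help.
title_cells = [
    "Professor",
    "Professor",
    "Assistant Professor",
    "Associate Professor",
    "Professor",
    "Professor of Biostatistics",  # hypothetical variant
]

# Capture an optional rank prefix followed by the word "Professor".
TITLE_RE = re.compile(r"^\s*((?:Assistant |Associate )?Professor)\b")

title_counts = Counter()
for cell in title_cells:
    match = TITLE_RE.match(cell)
    if match:
        title_counts[match.group(1)] += 1

print(len(title_counts), "distinct titles")
for title, n in title_counts.most_common():
    print(title, n)
```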

Q3. Search for email addresses and put them in a list. Print the list of email addresses.

REPLACE THIS WITH YOUR RESPONSE
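
For Q3, a simple pattern is usually enough. The sketch below runs on a small inline sample built from the names and addresses in the Part III examples; the header row and the column order (name, degree, title, email) are assumptions about faculty.csv, and the pattern is deliberately loose rather than a full email validator.

```python
import re

# Inline sample built from the Part III examples. The header and the
# column order are assumptions about the layout of faculty.csv.
sample_text = """name,degree,title,email
Susan Ellenberg,Ph.D.,Professor,sellenbe@upenn.edu
Jonas Ellenberg,Ph.D.,Professor,jellenbe@mail.med.upenn.edu
Yimei Li,Ph.D.,Assistant Professor,liy3@email.chop.edu
"""

# Local part, "@", then at least two dot-separated domain labels.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")

emails = EMAIL_RE.findall(sample_text)
print(emails)
```

With the real file, `sample_text` could be replaced by the result of reading faculty.csv with `open(...).read()`.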

Q4. Find how many different email domains there are (Ex: mail.med.upenn.edu, upenn.edu, email.chop.edu, etc.). Print the list of unique email domains.

REPLACE THIS WITH YOUR RESPONSE
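
Q4 can build on the list from Q3. As a rough sketch, assuming each address contains exactly one `@`, the domain is everything after it, and a set removes duplicates. The addresses below are the ones from the Part III examples.

```python
import re

# Addresses taken from the Part III examples in this assignment.
emails = [
    "sellenbe@upenn.edu",
    "jellenbe@mail.med.upenn.edu",
    "liy3@email.chop.edu",
    "mingyao@mail.med.upenn.edu",
    "hongzhe@upenn.edu",
]

# Capture everything after the "@" and collect the results in a set.
domains = {re.search(r"@(.+)$", email).group(1) for email in emails}

unique_domains = sorted(domains)
print(len(unique_domains), "unique domains")
print(unique_domains)
```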

Place your code in this file: advanced_python_regex.py

Part II - Write to CSV File

Q5. Write the email addresses from Part I to a CSV file.

Place your code in this file: advanced_python_csv.py

The emails.csv file you create should be added and committed to your forked repository.

Your file, emails.csv, will look like this:

bellamys@mail.med.upenn.edu
warren@upenn.edu
bryanma@upenn.edu
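
One way to produce a file in this shape is the standard `csv` module, writing one address per row. This is only a sketch: the helper name `write_emails` is made up for illustration, and the list is hard-coded from the sample output above rather than taken from Part I.

```python
import csv

# Hard-coded from the sample output above; in the assignment this list
# would come from the Part I email search.
emails = [
    "bellamys@mail.med.upenn.edu",
    "warren@upenn.edu",
    "bryanma@upenn.edu",
]


def write_emails(path, addresses):
    """Write one email address per row to a CSV file at `path`."""
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        for address in addresses:
            writer.writerow([address])


write_emails("emails.csv", emails)
```

Each call to `writerow` takes a list, so a single address is wrapped in `[address]`; passing the bare string would split it into one column per character.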

Part III - Dictionary

Q6. Create a dictionary in the format below:

faculty_dict = { 'Ellenberg': [['Ph.D.', 'Professor', 'sellenbe@upenn.edu'], ['Ph.D.', 'Professor', 'jellenbe@mail.med.upenn.edu']],
              'Li': [['Ph.D.', 'Assistant Professor', 'liy3@email.chop.edu'], ['Ph.D.', 'Associate Professor', 'mingyao@mail.med.upenn.edu'], ['Ph.D.', 'Professor', 'hongzhe@upenn.edu']]}

Print the first three key-value pairs of the dictionary:

REPLACE THIS WITH YOUR RESPONSE
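
A possible sketch for Q6 follows. The `rows` list stands in for parsed lines of faculty.csv and uses the people from the example above; the tuple layout (name, degree, title, email) and the rule that the last whitespace-separated word of the name is the last name are both assumptions.

```python
from itertools import islice

# Stand-in for parsed rows of faculty.csv, using the example entries.
# The (name, degree, title, email) layout is an assumption.
rows = [
    ("Susan Ellenberg", "Ph.D.", "Professor", "sellenbe@upenn.edu"),
    ("Jonas Ellenberg", "Ph.D.", "Professor", "jellenbe@mail.med.upenn.edu"),
    ("Yimei Li", "Ph.D.", "Assistant Professor", "liy3@email.chop.edu"),
    ("Mingyao Li", "Ph.D.", "Associate Professor", "mingyao@mail.med.upenn.edu"),
    ("Hongzhe Li", "Ph.D.", "Professor", "hongzhe@upenn.edu"),
]

faculty_dict = {}
for name, degree, title, email in rows:
    # Assume the last word of the name is the last name.
    last_name = name.split()[-1]
    # Group every record that shares a last name under one key.
    faculty_dict.setdefault(last_name, []).append([degree, title, email])

# islice stops after three pairs (or earlier if there are fewer).
for last_name, records in islice(faculty_dict.items(), 3):
    print(last_name, records)
```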

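Q7. The previous dictionary does not have the best design for keys, because professors who share a last name are grouped under a single key. Create a new dictionary keyed by (first name, last name) tuples: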

professor_dict = {('Susan', 'Ellenberg'): ['Ph.D.', 'Professor', 'sellenbe@upenn.edu'],
                  ('Jonas', 'Ellenberg'): ['Ph.D.', 'Professor', 'jellenbe@mail.med.upenn.edu'],
                  ('Yimei', 'Li'): ['Ph.D.', 'Assistant Professor', 'liy3@email.chop.edu'],
                  ('Mingyao', 'Li'): ['Ph.D.', 'Associate Professor', 'mingyao@mail.med.upenn.edu'],
                  ('Hongzhe', 'Li'): ['Ph.D.', 'Professor', 'hongzhe@upenn.edu']}

Print the first three key-value pairs of the dictionary:

REPLACE THIS WITH YOUR RESPONSE
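
For Q7, the same kind of loop can build tuple keys instead. In this sketch the `rows` list again stands in for parsed lines of faculty.csv, and taking the first and last words of the name as first and last name is an assumption that would drop any middle names or initials.

```python
from itertools import islice

# Stand-in for parsed rows of faculty.csv, using the example entries.
# The (name, degree, title, email) layout is an assumption.
rows = [
    ("Susan Ellenberg", "Ph.D.", "Professor", "sellenbe@upenn.edu"),
    ("Jonas Ellenberg", "Ph.D.", "Professor", "jellenbe@mail.med.upenn.edu"),
    ("Yimei Li", "Ph.D.", "Assistant Professor", "liy3@email.chop.edu"),
    ("Mingyao Li", "Ph.D.", "Associate Professor", "mingyao@mail.med.upenn.edu"),
    ("Hongzhe Li", "Ph.D.", "Professor", "hongzhe@upenn.edu"),
]

professor_dict = {}
for name, degree, title, email in rows:
    parts = name.split()
    # Tuples are hashable, so (first, last) can serve as a dictionary key.
    professor_dict[(parts[0], parts[-1])] = [degree, title, email]

for key, record in islice(professor_dict.items(), 3):
    print(key, record)
```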

Q8. It looks like the current dictionary is printing by first name. Print the dictionary's key-value pairs in alphabetical order by the professors' last names.

REPLACE THIS WITH YOUR RESPONSE
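
One way to handle Q8 is `sorted` with a key function that looks at the last name first. The sketch below uses the example dictionary from Q7; breaking ties by first name is an extra choice made here, not something the question requires.

```python
# Example dictionary copied from Q7.
professor_dict = {
    ("Susan", "Ellenberg"): ["Ph.D.", "Professor", "sellenbe@upenn.edu"],
    ("Jonas", "Ellenberg"): ["Ph.D.", "Professor", "jellenbe@mail.med.upenn.edu"],
    ("Yimei", "Li"): ["Ph.D.", "Assistant Professor", "liy3@email.chop.edu"],
    ("Mingyao", "Li"): ["Ph.D.", "Associate Professor", "mingyao@mail.med.upenn.edu"],
    ("Hongzhe", "Li"): ["Ph.D.", "Professor", "hongzhe@upenn.edu"],
}

# Each item is ((first, last), record); sort on last name, then first name.
sorted_items = sorted(
    professor_dict.items(),
    key=lambda item: (item[0][1], item[0][0]),
)

for key, record in sorted_items:
    print(key, record)
```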

Place your code in this file: advanced_python_dict.py

If you're all done and looking for an extra challenge, then try the problem below:

Markov (Optional)