{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Files & Directories" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## An example of using the OS function to create a directory and move a file:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "\n", "# Create a directory and move a file from one directory to another\n", "# using low-level OS functions.\n", "\n", "import os\n", "\n", "# Check to see if a directory named \"test1\" exists under the current\n", "# directory. If not, create it:\n", "dest_dir = os.path.join(os.getcwd(), \"test1\")\n", "if not os.path.exists(dest_dir):\n", " os.mkdir(dest_dir)\n", "\n", "\n", "# Construct source and destination paths:\n", "src_file = os.path.join(os.getcwd(), \"sample_data\", \"README.md\")\n", "dest_file = os.path.join(os.getcwd(), \"test1\", \"README.md\")\n", "\n", "\n", "# Move the file from its original location to the destination:\n", "os.rename(src_file, dest_file)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Here is an example of using Pathlib to create a directory and move a file:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Create a directory and move a file from one directory to another\n", "# using Pathlib.\n", "\n", "from pathlib import Path\n", "\n", "# Check to see if the \"test1\" subdirectory exists. 
If not, create it:\n", "dest_dir = Path(\"./test1/\")\n", "if not dest_dir.exists():\n", "    dest_dir.mkdir()\n", "\n", "# Construct source and destination paths:\n", "src_file = Path(\"./sample_data/README.md\")\n", "dest_file = dest_dir / \"README.md\"\n", "\n", "# Move the file from its original location to the destination:\n", "src_file.rename(dest_file)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## The os module\n", "Python’s os module, which provides miscellaneous operating system interfaces, is very useful for working with files, directories, and permissions. Let’s take a look at each.\n", "\n", "## File operations\n", "A file name can be thought of as two parts separated by a dot. For example, in helloworld.txt, helloworld is the name and txt is the extension, which indicates the file type. Together with the built-in open() function, the os module lets you create, read, update, and delete files. Some of the basic operations include:\n", "\n", "- Opening and closing files\n", "- Reading from and writing to files\n", "- Appending to files\n", "\n", "## Directories\n", "The os module also provides functions to create, read, update, and delete directories, as well as change directories and list files. Knowing how to use these functions is key to working with files. For example, os.listdir(path) returns a list of the names of all files and subdirectories in a directory.\n", "\n", "## Permissions\n", "Being able to update file permissions is an important part of performing installations from a terminal window. The os.chmod() function changes the permission mode of a file or directory for its owner, its group, and other users.\n", "\n", "## Things to keep in mind\n", "One thing to be aware of is that Python treats text and binary files differently. Because Python is cross-platform, it automatically translates platform-specific line endings when a file is opened in text mode. 
If you’re processing a binary file, make sure to open it in binary mode so Python doesn’t try to “fix” newlines in a binary file.\n", "\n", "A best practice is to always close() a file when you’re done reading or writing to it. Even though Python usually closes them for you, it’s a good signal to other people reading your code that you’re done with that file. Make sure to catch any potential errors from filesystem calls, such as permission denied, file not found, and so on. Generally, you wrap them in try/except to handle those errors.\n", "\n", "## Key takeaways\n", "There are several ways to manage files and directories in Python. One way is to use low-level functions in the OS and SYS modules that closely mimic standard Linux commands. Another way is to utilize the Pathlib module, which provides an object-oriented interface to working with the file systems. \n", "\n", "## Resources for more information\n", "More information about files and directories can be found in several resources provided below: \n", "- https://docs.python.org/3/library/os.html\n", "- https://docs.python.org/3/library/os.path.html\n", "- https://en.wikipedia.org/wiki/Unix_time" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Reading And Writing CSV" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Importing CSV" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import csv\n", "f = open(\"csv_file.txt\")\n", "csv_f = csv.reader(f)\n", "for row in csv_f:\n", " name, phone, role = row\n", " print(\"Name: {}, Phone: {}, Role: {}\".format(name, phone, role))\n", "f.close()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Reading CSV" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import csv\n", "f = open(\"csv_file.txt\")\n", "csv_f = csv.reader(f)\n", "for row in csv_f:\n", " name, phone, role = row\n", " print(\"Name: {}, Phone: {}, Role: 
{}\".format(name, phone, role))\n", "f.close()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Generating CSV" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import csv\n", "\n", "hosts = [[\"workstation.local\", \"192.168.25.46\"],[\"webserver.cloud\", \"10.2.5.6\"]]\n", "# newline='' lets the csv module control line endings itself\n", "with open('hosts.csv', 'w', newline='') as hosts_csv:\n", "    writer = csv.writer(hosts_csv)\n", "    writer.writerows(hosts)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Reading and writing CSV files with dictionaries" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "import csv\n", "\n", "users = [ {\"name\": \"Sol Mansi\", \"username\": \"solm\", \"department\": \"IT infrastructure\"}, \n", " {\"name\": \"Lio Nelson\", \"username\": \"lion\", \"department\": \"User Experience Research\"}, \n", " {\"name\": \"Charlie Grey\", \"username\": \"greyc\", \"department\": \"Development\"}]\n", "keys = [\"name\", \"username\", \"department\"]\n", "with open('by_department.csv', 'w', newline='') as by_department:\n", "    writer = csv.DictWriter(by_department, fieldnames=keys)\n", "    writer.writeheader()\n", "    writer.writerows(users)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Study guide: .csv files\n", "The most common format for importing and exporting spreadsheet data is the .csv format. A Comma Separated Values (.csv) file is a plain text file that, as the name suggests, uses commas to separate each piece of data. You may already be familiar with .csv files if you have saved a spreadsheet in the .csv format. Here is a simple example of a .csv file displaying employee information:\n", "\n", "Name, Department, Salary\n", "\n", "Aisha Khan, Engineering, 80000\n", "\n", "Jules Lee, Marketing, 67000\n", "\n", "Queenie Corbit, Human Resources, 90000\n", "\n", "Notice that each row represents an employee’s information, and the values are separated by commas. 
\n", "\n", "In this reading, you will examine several functions, classes, and attributes to use when working with .csv files in Python, along with links to more information.\n", "\n", "### Module contents\n", "The csv module is a built-in Python module used to read and write .csv files. Let’s look at how the csv module defines some of its functions and classes:\n", "\n", "csv.reader: This function returns a reader object that iterates over lines in the .csv file.\n", "\n", "csv.writer: This function returns a writer object that’s responsible for converting the user’s data into delimited strings on the given file-like object.\n", "\n", "csv.DictReader: This class creates an object that operates like a regular reader but maps the information in each row to a dictionary whose keys are given by the optional fieldnames parameter.\n", "\n", "### Dialects and formatting parameters\n", "Dialects are sets of rules that define how a .csv file is structured. Formatting parameters control the behavior of the reader and writer, and they can be grouped together in a dialect. Dialects support the following attributes, among others:\n", "\n", "Dialect.delimiter: This attribute is a one-character string used to separate fields and defaults to a comma.\n", "\n", "Dialect.quotechar: This attribute is a one-character string used to quote fields containing special characters and defaults to '\"'.\n", "\n", "Dialect.strict: This attribute’s default is False, but when True, the csv.Error exception is raised on bad CSV input.\n", "\n", "### Reader objects\n", "A reader object contains the following public methods and attributes:\n", "\n", "`csvreader.__next__()`: This method returns the next row of the reader’s iterable object as a list (for a reader object) or a dictionary (for a DictReader instance), parsed according to the current dialect. 
Typically, you would call this as next(reader).\n", "\n", "csvreader.dialect: This attribute is a read-only description of the dialect in use by the parser.\n", "\n", "### Writer objects\n", "Writer objects let you write data to a .csv file. Let’s look at a couple of public methods and attributes for writer objects:\n", "\n", "csvwriter.writerows(rows): This method writes all elements in rows to the writer’s file object, formatted according to the current dialect.\n", "\n", "csvwriter.dialect: This attribute is a read-only description of the dialect being used by the writer.\n", "\n", "### Key takeaways\n", "If you haven’t worked with .csv files yet, it’s only a matter of time. Become familiar with the csv module’s reader and writer objects to work more efficiently with .csv files. The functions, classes, and attributes in this reading are only some of the tools that can be used while working with .csv files.\n", "\n", "### Resources for more information\n", "- https://docs.python.org/3/library/csv.html provides additional information on the csv module’s functions for reading and writing .csv files.\n", "- https://realpython.com/python-csv/ provides additional information on what a .csv file is, how to parse .csv files with Python’s built-in csv library, and how to parse .csv files with the pandas library." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Question 1\n", "We're working with a list of flowers and some information about each one. The create_file function writes this information to a CSV file. The contents_of_file function reads this file into records and returns the information in a nicely formatted block. Fill in the gaps of the contents_of_file function to read each row of the CSV file into a dictionary using DictReader." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import os\n", "import csv\n", "\n", "# Create a file with data in it\n", "def create_file(filename):\n", "    with open(filename, \"w\") as file:\n", "        file.write(\"name,color,type\\n\")\n", "        file.write(\"carnation,pink,annual\\n\")\n", "        file.write(\"daffodil,yellow,perennial\\n\")\n", "        file.write(\"iris,blue,perennial\\n\")\n", "        file.write(\"poinsettia,red,perennial\\n\")\n", "        file.write(\"sunflower,yellow,annual\\n\")\n", "\n", "\n", "# Read the file contents and format the information about each row\n", "def contents_of_file(filename):\n", "    return_string = \"\"\n", "\n", "    # Call the function to create the file\n", "    create_file(filename)\n", "\n", "    # Open the file\n", "    with open(filename, 'r') as open_file:\n", "        # Read each row of the file into a dictionary keyed by the header fields\n", "        rows = csv.DictReader(open_file)\n", "        # Process each row dictionary\n", "        for row in rows:\n", "            return_string += \"a {} {} is {}\\n\".format(row[\"color\"], row[\"name\"], row[\"type\"])\n", "    return return_string\n", "\n", "\n", "# Call the function\n", "print(contents_of_file(\"flowers.csv\"))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Using the CSV file of flowers again, fill in the gaps of the contents_of_file function to process the data without turning it into a dictionary. How do you skip over the header record with the field names?" 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import os\n", "import csv\n", "\n", "# Create a file with data in it\n", "def create_file(filename):\n", "    with open(filename, \"w\") as file:\n", "        file.write(\"name,color,type\\n\")\n", "        file.write(\"carnation,pink,annual\\n\")\n", "        file.write(\"daffodil,yellow,perennial\\n\")\n", "        file.write(\"iris,blue,perennial\\n\")\n", "        file.write(\"poinsettia,red,perennial\\n\")\n", "        file.write(\"sunflower,yellow,annual\\n\")\n", "\n", "# Read the file contents and format the information about each row\n", "def contents_of_file(filename):\n", "    return_string = \"\"\n", "\n", "    # Call the function to create the file\n", "    create_file(filename)\n", "\n", "    # Open the file\n", "    with open(filename, 'r') as open_file:\n", "        # Read the rows of the file\n", "        rows = csv.reader(open_file)\n", "        # Skip the header record with the field names\n", "        next(rows, None)\n", "        # Process each row\n", "        for row in rows:\n", "            # flower_type avoids shadowing the built-in type()\n", "            name, color, flower_type = row\n", "            # Format the return string for data rows only\n", "            return_string += \"a {} {} is {}\\n\".format(name, color, flower_type)\n", "    return return_string\n", "\n", "# Call the function\n", "print(contents_of_file(\"flowers.csv\"))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Qwiklabs - Handle Files\n", "# cd data\n", "# ls\n", "# cat employees.csv\n", "\n", "''' input:\n", "Full Name, Username, Department\n", "Audrey Miller, audrey, Development\n", "Arden Garcia, ardeng, Sales\n", "Bailey Thomas, baileyt, Human Resources\n", "Blake Sousa, sousa, IT infrastructure\n", "Cameron Nguyen, nguyen, Marketing\n", "Charlie Grey, greyc, Development\n", "Chris Black, chrisb, User Experience Research\n", "Courtney Silva, silva, IT infrastructure\n", "Darcy Johnsonn, darcy, IT infrastructure\n", "Elliot Lamb, elliotl, Development\n", "Emery Halls, halls, Sales\n", "Flynn McMillan, flynn, Marketing\n", "Harley Klose, harley, Human Resources\n", "Jean May 
Coy, jeanm, Vendor operations\n", "Kay Stevens, kstev, Sales\n", "Lio Nelson, lion, User Experience Research\n", "Logan Tillas, tillas, Vendor operations\n", "Micah Lopes, micah, Development\n", "Sol Mansi, solm, IT infrastructure\n", "'''\n", "\n", "#!/usr/bin/env python3\n", "import csv\n", "\n", "def read_employees(csv_file_location):\n", "    # Dialect classes can be registered by name so that callers of the CSV module\n", "    # don't need to know the parameter settings in advance. We will now register a\n", "    # dialect empDialect.\n", "    csv.register_dialect('empDialect', skipinitialspace=True, strict=True)\n", "    # Append the dictionaries to an empty initialised list employee_list as you\n", "    # iterate over the CSV file. The with block closes the file when done.\n", "    employee_list = []\n", "    with open(csv_file_location) as csv_file:\n", "        employee_file = csv.DictReader(csv_file, dialect='empDialect')\n", "        for data in employee_file:\n", "            employee_list.append(dict(data))\n", "    return employee_list\n", "\n", "def process_data(employee_list):\n", "    # Now, initialize a new list called department_list, iterate over employee_list,\n", "    # and add only the departments into the department_list.\n", "    department_list = []\n", "    for employee_data in employee_list:\n", "        department_list.append(employee_data['Department'])\n", "    # The department_list should now have a redundant list of all the department\n", "    # names. We now have to remove the redundancy and return a dictionary. 
We will\n", "    # return this dictionary in the format department:amount, where amount is the\n", "    # number of employees in that particular department.\n", "    department_data = {}\n", "    for department_name in set(department_list):\n", "        department_data[department_name] = department_list.count(department_name)\n", "    return department_data\n", "\n", "def write_report(dictionary, report_file):\n", "    # The with block closes the file automatically, so no explicit close() is needed.\n", "    with open(report_file, \"w\") as f:\n", "        for k in sorted(dictionary):\n", "            f.write(str(k) + ':' + str(dictionary[k]) + '\\n')\n", "\n", "employee_list = read_employees('/home/student/data/employees.csv')\n", "dictionary = process_data(employee_list)\n", "write_report(dictionary, '/home/student/data/report.txt')\n", "\n", "''' output1:\n", "[{'Full Name': 'Audrey Miller', 'Username': 'audrey', 'Department': 'Development'}, {'Full Name': 'Arden Garcia', 'Username': 'ardeng', 'Department': 'Sales'}, {'Full Name': 'Bailey Thomas', 'Username': 'baileyt', 'Department': 'Human Resources'}, {'Full Name': 'Blake Sousa', 'Username': 'sousa', 'Department': 'IT infrastructure'}, {'Full Name': 'Cameron Nguyen', 'Username': 'nguyen', 'Department': 'Marketing'}, {'Full Name': 'Charlie Grey', 'Username': 'greyc', 'Department': 'Development'}, {'Full Name': 'Chris Black', 'Username': 'chrisb', 'Department': 'User Experience Research'}, {'Full Name': 'Courtney Silva', 'Username': 'silva', 'Department': 'IT infrastructure'}, {'Full Name': 'Darcy Johnsonn', 'Username': 'darcy', 'Department': 'IT infrastructure'}, {'Full Name': 'Elliot Lamb', 'Username': 'elliotl', 'Department': 'Development'}, {'Full Name': 'Emery Halls', 'Username': 'halls', 'Department': 'Sales'}, {'Full Name': 'Flynn McMillan', 'Username': 'flynn', 'Department': 'Marketing'}, {'Full Name': 'Harley Klose', 'Username': 'harley', 'Department': 'Human Resources'}, {'Full Name': 'Jean May Coy', 'Username': 'jeanm', 'Department': 'Vendor operations'}, {'Full Name': 'Kay Stevens', 'Username': 'kstev', 
'Department': 'Sales'}, {'Full Name': 'Lio Nelson', 'Username': 'lion', 'Department': 'User Experience Research'}, {'Full Name': 'Logan Tillas', 'Username': 'tillas', 'Department': 'Vendor operations'}, {'Full Name': 'Micah Lopes', 'Username': 'micah', 'Department': 'Development'}, {'Full Name': 'Sol Mansi', 'Username': 'solm', 'Department': 'IT infrastructure'}]\n", "'''\n", "\n", "'''output2:\n", "{'Sales': 3, 'Human Resources': 2, 'Development': 4, 'Marketing': 2, 'User Experience Research': 2, 'Vendor operations': 2, 'IT infrastructure': 4}\n", "'''\n", "\n", "'''output3:\n", "Development:4\n", "Human Resources:2\n", "IT infrastructure:4\n", "Marketing:2\n", "Sales:3\n", "User Experience Research:2\n", "Vendor operations:2\n", "'''" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.3" } }, "nbformat": 4, "nbformat_minor": 2 }