{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# HW4 Deception (V2)\n", "## STEP 1: Get Data" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "import os\n", "\n", "def get_data(file, path):\n", "    # Read one review file; the context manager closes it even on error\n", "    with open(os.path.join(path, file)) as f:\n", "        return f.read()\n", "\n", "def get_data_from_files(path):\n", "    # Load every file in the directory as one string per file\n", "    return [get_data(file, path) for file in os.listdir(path)]\n", "\n", "# pos = get_data_from_files('../pos_cornell//')\n", "# neg = get_data_from_files('../neg_cornell/')\n", "pos = get_data_from_files('../hw4_lie_false/')\n", "neg = get_data_from_files('../hw4_lie_true/')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### STEP 1b: Build neg/pos DataFrames, then merge them" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", " | 0 | \n", "PoN | \n", "
---|---|---|
0 | \n", "? | \n", "N | \n", "
1 | \n", "Twin Trees Cicero NY HUGE salad bar and high q... | \n", "N | \n", "
2 | \n", "The worst restaurant that I have ever eaten in... | \n", "N | \n", "
3 | \n", "? | \n", "N | \n", "
4 | \n", "I have been to a Asian restaurant in New York ... | \n", "N | \n", "
... | \n", "... | \n", "... | \n", "
87 | \n", "Mikes Pizza High Point NY Service was very slo... | \n", "P | \n", "
88 | \n", "After I went shopping with some of my friend w... | \n", "P | \n", "
89 | \n", "I entered the restaurant and a waitress came b... | \n", "P | \n", "
90 | \n", "Carlos Plate Shack was the worst dining experi... | \n", "P | \n", "
91 | \n", "Olive Oil Garden was very disappointing. I exp... | \n", "P | \n", "
92 rows × 2 columns
\n", "