nltk.app.chartparser_app module
A graphical tool for exploring chart parsing.
Chart parsing is a flexible parsing algorithm that uses a data structure called a “chart” to record hypotheses about syntactic constituents. Each hypothesis is represented by a single “edge” on the chart. A set of “chart rules” determines when new edges can be added to the chart. This set of rules controls the overall behavior of the parser (e.g. whether it parses top-down or bottom-up).
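The same data structures can be inspected from the interpreter. The following is a minimal sketch (the toy grammar and sentence are invented for illustration) that parses a short sentence with nltk.ChartParser and prints the edges recorded on the resulting chart:

    import nltk

    # A toy grammar and sentence, invented for illustration.
    grammar = nltk.CFG.fromstring("""
        S -> NP VP
        NP -> Det N
        VP -> V NP
        Det -> 'the'
        N -> 'dog' | 'cat'
        V -> 'chased'
    """)
    tokens = "the dog chased the cat".split()

    parser = nltk.ChartParser(grammar)

    # chart_parse() returns the chart itself; each edge is one hypothesis
    # about a constituent spanning part of the sentence.
    chart = parser.chart_parse(tokens)
    for edge in chart.edges():
        print(edge)

    # Complete parses are read off the chart.
    for tree in chart.parses(grammar.start()):
        print(tree)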
The chart parsing tool demonstrates the process of parsing a single sentence, with a given grammar and lexicon. Its display is divided into three sections: the bottom section displays the chart; the middle section displays the sentence; and the top section displays the partial syntax tree corresponding to the selected edge. Buttons along the bottom of the window are used to control the execution of the algorithm.
The chart parsing tool allows for flexible control of the parsing algorithm. At each step of the algorithm, you can select which rule or strategy you wish to apply. This allows you to experiment with mixing different strategies (e.g. top-down and bottom-up). You can exercise fine-grained control over the algorithm by selecting which edge you wish to apply a rule to.
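A similar comparison of strategies can be made programmatically. The sketch below assumes the TD_STRATEGY and BU_STRATEGY rule lists defined in nltk.parse.chart, reuses the toy grammar from the example above, and compares how many edges each strategy adds to the chart:

    import nltk
    from nltk.parse.chart import ChartParser, TD_STRATEGY, BU_STRATEGY

    grammar = nltk.CFG.fromstring("""
        S -> NP VP
        NP -> Det N
        VP -> V NP
        Det -> 'the'
        N -> 'dog' | 'cat'
        V -> 'chased'
    """)
    tokens = "the dog chased the cat".split()

    # Parse the same sentence with a purely top-down and a purely
    # bottom-up rule set, and compare the number of edges on each chart.
    for name, strategy in [("top-down", TD_STRATEGY), ("bottom-up", BU_STRATEGY)]:
        chart = ChartParser(grammar, strategy).chart_parse(tokens)
        print(name, len(chart.edges()), "edges,",
              len(list(chart.parses(grammar.start()))), "parse(s)")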
import json
import os

import pandas as pd


def process_and_append_data(file_path, category, file_format='csv', output_file='data.csv'):
    try:
        # Load the data according to the file format
        if file_format == 'csv':
            df = pd.read_csv(file_path)
        elif file_format == 'excel':
            df = pd.read_excel(file_path)
        elif file_format == 'json':
            with open(file_path, 'r', encoding='utf-8') as file:
                data = json.load(file)
            df = pd.DataFrame(data)
        else:
            raise ValueError("Unsupported file format. Use 'csv', 'excel' or 'json'.")

        # Rename the columns
        df = df.rename(columns={
            'Country': 'country',
            'Year': 'year',
            'Total Waste (Tons)': 'waste',
            'Economic Loss (Million $)': 'loss',
            'Avg Waste per Capita (Kg)': 'capita',
            'Population (Million)': 'population',
            'Household Waste (%)': 'household',
            'Carbon Footprint (Тонн CO₂ экв.)': 'carbon footprint'
        })

        # Add the 'category' column
        df['category'] = category

        # Check whether the output file already exists
        if os.path.exists(output_file):
            combined_df = pd.read_csv(output_file)  # read the previously accumulated data
        else:
            combined_df = pd.DataFrame()

        # Select the required columns
        df = df[['country', 'year', 'category', 'waste', 'loss', 'capita',
                 'population', 'household', 'carbon footprint']]

        # Concatenate the old and new data
        combined_df = pd.concat([combined_df, df], ignore_index=True)

        # Save the result to the output file
        combined_df.to_csv(output_file, index=False)
    except Exception as e:
        print(f"Error while processing file {file_path}: {e}")


# Example usage
process_and_append_data('out_parts/Beverages.csv', 'Beverages', file_format='csv')
process_and_append_data('out_parts/Dairy_Products.xlsx', 'Dairy Products', file_format='excel')
process_and_append_data('out_parts/Frozen_Food.csv', 'Frozen Food', file_format='csv')
process_and_append_data('out_parts/Fruits___Vegetables.csv', 'Fruits & Vegetables', file_format='csv')
process_and_append_data('out_parts/Grains___Cereals.xlsx', 'Grains & Cereals', file_format='excel')
process_and_append_data('out_parts/Bakery_Items.json', 'Bakery Items', file_format='json')
process_and_append_data('out_parts/Meat___Seafood.json', 'Meat & Seafood', file_format='json')
process_and_append_data('out_parts/Prepared_Food.json', 'Prepared Food', file_format='json')
process_and_append_data('out_parts/unknown.csv', '', file_format='csv')
process_and_append_data('out_parts/UNKNOWN.xlsx', '', file_format='excel')
nltk.app.chartparser_app.app()
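A sketch of launching the demo from Python (it requires a working Tkinter installation, since it opens a graphical window):

    from nltk.app.chartparser_app import app

    # Open the interactive chart-parser window; execution resumes when
    # the window is closed.
    app()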