зеркало из https://github.com/microsoft/topologic.git
121 строка
3.4 KiB
Plaintext
121 строка
3.4 KiB
Plaintext
{
|
|
"cells": [
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"pycharm": {
|
|
"name": "#%% md\n"
|
|
}
|
|
},
|
|
"source": [
|
|
"# Bipartite Graphs\n",
|
|
"\n",
|
|
"Bipartite graphs are networks of 2 disjoint, independent sets of vertices. Sometimes it is useful to generate a graph\n",
|
|
"between vertices within one set by using their mutual relationships from the other set.\n",
|
|
"\n",
|
|
"`topologic` provides a facility for materializing a graph of this form, and this notebook will show you how. \n",
|
|
"\n",
|
|
"# Data\n",
|
|
"The data used is not a particularly interesting dataset. It is hand crafted and miniscule in size, but it will \n",
|
|
"illustrate the utility. The data is located in `test_data`, colocated in the same directory as this notebook.\n",
|
|
"\n",
|
|
"The format of the file is:\n",
|
|
"```\n",
|
|
"Role,Movie,Person\n",
|
|
"Cast,Apollo 13,Tom Hanks\n",
|
|
"Cast,Apollo 13,Bill Paxton\n",
|
|
"Cast,Apollo 13,Kevin Bacon\n",
|
|
"Cast,Apollo 13,Kathleen Quinlan\n",
|
|
"Cast,Planes Trains & Automobiles,Kevin Bacon\n",
|
|
"Cast,Planes Trains & Automobiles,Steve Martin\n",
|
|
"Cast,Planes Trains & Automobiles,John Candy\n",
|
|
"Executive Producer,Mamma Mia! Here We Go Again,Tom Hanks\n",
|
|
"Cast,Forrest Gump,Tom Hanks\n",
|
|
"Cast,Forrest Gump,Sally Field\n",
|
|
"```\n",
|
|
"\n",
|
|
"What we expect to see is an undirected graph of People who have worked together in some capacity or another."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 1,
|
|
"metadata": {
|
|
"pycharm": {
|
|
"is_executing": false,
|
|
"name": "#%%\n"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"Sally Field worked on a movie with Tom Hanks? True\n",
|
|
"Graph is directed: False\n",
|
|
"['Steve Martin', 'Kathleen Quinlan', 'Tom Hanks', 'Kevin Bacon', 'Sally Field', 'John Candy', 'Bill Paxton']\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"import topologic as tc\n",
|
|
"\n",
|
|
"bipartite_actors = \"test_data/actor_bipartite_graph_reordered.csv\"\n",
|
|
"\n",
|
|
"with open(bipartite_actors, \"r\") as bipartite_io:\n",
|
|
" csv_dataset = tc.io.CsvDataset(\n",
|
|
" bipartite_io,\n",
|
|
" has_headers=True,\n",
|
|
" dialect=\"excel\"\n",
|
|
" )\n",
|
|
" graph = tc.io.bipartite_graph_consolidator.consolidate_bipartite(\n",
|
|
" csv_dataset=csv_dataset, \n",
|
|
" vertex_column_index=2,\n",
|
|
" pivot_column_index=1\n",
|
|
" )\n",
|
|
" \n",
|
|
"print(f\"Sally Field worked on a movie with Tom Hanks? {'Sally Field' in graph['Tom Hanks']}\")\n",
|
|
"print(f\"Graph is directed: {graph.is_directed()}\")\n",
|
|
"print(f\"{graph.nodes()}\")\n"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": []
|
|
}
|
|
],
|
|
"metadata": {
|
|
"kernelspec": {
|
|
"display_name": "Python 3",
|
|
"language": "python",
|
|
"name": "python3"
|
|
},
|
|
"language_info": {
|
|
"codemirror_mode": {
|
|
"name": "ipython",
|
|
"version": 3
|
|
},
|
|
"file_extension": ".py",
|
|
"mimetype": "text/x-python",
|
|
"name": "python",
|
|
"nbconvert_exporter": "python",
|
|
"pygments_lexer": "ipython3",
|
|
"version": "3.7.3"
|
|
},
|
|
"pycharm": {
|
|
"stem_cell": {
|
|
"cell_type": "raw",
|
|
"metadata": {
|
|
"collapsed": false
|
|
},
|
|
"source": []
|
|
}
|
|
}
|
|
},
|
|
"nbformat": 4,
|
|
"nbformat_minor": 2
|
|
}
|