Cleanup: Got unit tests running again, updated docs

Andrew Head 2019-02-27 15:58:18 -08:00
Parent 1f067987da
Commit f3d735c038
19 changed files: 211 additions and 280 deletions

118
README.md
View file

@@ -1,67 +1,95 @@
# Code Gathering Tools
# Gather - Code Cleanup for Jupyter Notebooks
Tool for gathering, recalling, comparing implicit versions of code in Jupyter Lab. Read the paper [here](dead link).
Tool for cleaning code, recovering lost code, and version
control in Jupyter Lab.
## Download the Jupyter Lab extension
Download the extension with one command:
```bash
# This download link is currently dead
jupyter labextension install gathering-tools
jupyter labextension install gather
```
## Development
Read the docs [here](https://microsoft.github.io/gather).
And read our academic paper on the design of the tool
[here](https://people.eecs.berkeley.edu/~andrewhead/pdf/notebooks.pdf).
For a development install (requires npm version 4 or later), do the following in the repository directory:
## Contributing
To set up the code for this repository, run:
```bash
npm install # download dependencies
jupyter labextension link . # install this package in Jupyter Lab
jlpm run watch # automatically recompile sources
jupyter lab --watch # launch Jupyter Lab, automatically re-load extension
git clone <this-repository-url> # clone the repository
npm install # download dependencies
jupyter labextension link . # install this package in Jupyter Lab
jlpm run watch # automatically recompile source code
jupyter lab --watch # launch Jupyter Lab, automatically re-load extension
```
These setup instructions have been successfully completed with Node v9.5.0.
This requires npm version 4 or later, and was tested most
recently with Node v9.5.0.
### Testing the extension
#### Sharing a compiled version
The tests assume you have Google Chrome installed on your
computer. Because this plugin depends on Jupyter Lab and in
turn on browser functionality, some of these tests need a
browser to run in.
To run the tests from the command line, call:
```bash
npm run test
```
Wait a few seconds while the code compiles, and then you
should see the results of running the tests. The process
will continue to live after the tests finish running---it
will recompile and re-run the tests whenever the test code
changes. Type Ctrl+C to abort the command at any time.
To debug the tests, call:
```bash
npm run test:debug
```
This will launch a Chrome window. Click the **DEBUG**
button in the page that opens. Launch the Chrome developer
tools (View -> Developer -> Developer Tools). The "Console"
will show the test results, with one line for each test. In
the "Sources" tab, you can open scripts using the file prompt
(Cmd + P on Mac, Ctrl + P on Windows) and set breakpoints in
the code. When you refresh the page, the tests will be run
again, and the debugger will trigger when the first
breakpoint is reached.
### Packaging the project for beta users
Package up the project as follows:
```bash
npm pack # output: <package-name>-<version>.tgz
```
Then send the package to someone else, and have them install
it using this command:
```bash
npm pack # output is <package-name>-<version>.tgz
# Then, on the installer's computer
jupyter labextension install <package-name>-<version>.tgz
```
#### Publishing to a private repository
### Publishing to a private repository
If you want to test publishing the package to npm, you can
use the following commands.
```bash
npm login
npm login # requires credentials for a valid npm account
npm publish --access=restricted # make this public eventually
```
### Pre-alpha Jupyter notebook version
This project was initially developed as a Jupyter notebook extension. It is not being maintained, as it requires access to the internal API, including parts that change across minor versions. Still, if you want to build and install the notebook version, run these commands:
```bash
npm run build
npm run build_nb_extension
npm run install_nb_extension
```
### Troubleshooting
##### The extension UI doesn't get loaded
Sometimes you might reload the page and see that the buttons on the page are missing. I haven't been able to track the cause of the issue. [This Stack Overflow post](https://stackoverflow.com/questions/11991218/undefined-object-being-passed-via-requirejs) suggests the issue might be with circular `require` dependencies. The problem has disappeared when I have:
* Deleted the virtual environment containing Jupyter, and installing it globally, or
* Removed what I thought might be circular dependencies in the project
But I don't know if either of these *really* fixed the issue. They're worth trying if the gathering UI disappears.
Then run `jupyter notebook` and the extension will be running.
### `500` message when launching Jupyter notebook
Install these versions of Jupyter notebook and dependencies
@@ -72,17 +100,3 @@ nbconvert==5.3.1
nbformat==4.4.0
notebook==5.6.0
```
### Backend (logging) extension (optional)
If you want to add logging to the project, look in the `src/nb/python` directory. This Python plugin needs to be installed to receive logging requests and save them to file (`~/.jupyter/events.txt`). To register this Python extension in Jupyter notebook or lab, see this guide: https://jupyter-notebook.readthedocs.io/en/latest/extending/handlers.html. As of the time of this writing, installation involves:
```bash
pip install portalocker # dependency for this package
cd src/nb/python
python setup.py install # build this package
jupyter serverextension enable --py gather_logger # enable the package
```
We aren't yet including the frontend extension in the server extension, nor do we have a good way to develop the plugin in development mode yet. To do either of these two things, follow the instructions here:
https://jupyter-notebook.readthedocs.io/en/latest/examples/Notebook/Distributing%20Jupyter%20Extensions%20as%20Python%20Packages.html .

View file

@@ -1,33 +1,34 @@
{
"name": "gather",
"version": "0.2.1",
"description": "Tools for cleaning and recovering code in Jupyter Lab",
"author": "",
"main": "lib/lab/index.js",
"version": "0.2.2",
"description": "Spit shine for computational notebooks",
"keywords": [
"jupyter",
"jupyterlab"
],
"author": {
"name": "Andrew Head",
"email": "andrewhead@berkeley.edu",
"url": "http://andrewhead.info"
},
"homepage": "https://microsoft.github.io/gather/",
"scripts": {
"build_parser": "node ./node_modules/jison/lib/cli.js --outfile src/parsers/python/python3.js src/parsers/python/python3.jison",
"build": "tsc",
"build_nb_extension": "npm run build && npx webpack",
"watch_nb_extension": "npx webpack -w",
"install_nb_extension": "jupyter nbextension install dist && jupyter nbextension enable dist/gather",
"prepack": "npm run clean && npm run build_parser && npm run build",
"clean": "rm -rf lib",
"build": "npm run build:parser && tsc",
"build:parser": "node ./node_modules/jison/lib/cli.js --outfile src/parsers/python/python3.js src/parsers/python/python3.jison",
"watch": "tsc -w",
"test": "mocha -r ts-node/register src/test/*.test.ts"
"prepack": "npm run build",
"test": "karma start src/test/karma.conf.js",
"test:debug": "karma start src/test/karma.debug.conf.js"
},
"jupyterlab": {
"extension": true,
"schemaDir": "schema"
},
"files": [
"schema/*.json",
"lib/**/*.{d.ts,eot,gif,html,jpg,js,js.map,json,png,svg,woff2,ttf}",
"style/**/*.{css,eot,gif,html,jpg,json,png,svg,woff2,ttf}"
],
"jupyterlab": {
"extension": true,
"schemaDir": "schema"
},
"dependencies": {
"@jupyterlab/application": "^0.19.1",
"@jupyterlab/apputils": "^0.19.1",
@@ -54,15 +55,18 @@
"@types/mocha": "^5.2.4",
"@types/node": "^8.0.51",
"chai": "^4.1.2",
"cloc": "^2.3.3",
"css-loader": "^1.0.0",
"file-loader": "^1.1.11",
"jison": "^0.4.18",
"karma": "^4.0.0",
"karma-chai": "^0.1.0",
"karma-chrome-launcher": "^2.2.0",
"karma-mocha": "^1.3.0",
"karma-webpack": "^3.0.5",
"mocha": "^5.2.0",
"raw-loader": "^0.5.1",
"rimraf": "^2.6.1",
"style-loader": "^0.21.0",
"ts-node": "8.0.2",
"ts-loader": "^5.3.3",
"typescript": "3.3.1",
"webpack": "^4.16.1",
"webpack-cli": "^3.1.0"

View file

@@ -107,6 +107,6 @@ function _loadExecutionFromJson(executionJson: JSONObject): CellExecution {
/**
* TODO(andrewhead): Update with Kunal's code for serializing and deserializing outputs.
*/
let cell = new SimpleCell(id, executionCount, hasError, text, [], persistentId);
let cell = new SimpleCell({ id, executionCount, hasError, text, persistentId });
return new CellExecution(cell, executionTime);
}

View file

@@ -1,7 +1,7 @@
import { GatherModel } from "../packages/gather";
import { IObservableList } from "@jupyterlab/observables";
import { ICellModel, CodeCellModel } from "@jupyterlab/cells";
import { LabCell, copyICodeCellModel } from "./LabCell";
import { LabCell } from "./LabCell";
import { NotebookPanel } from "@jupyterlab/notebook";
export class ExecutionLogger {
@@ -20,8 +20,7 @@ export class ExecutionLogger {
if (cellModel.type !== 'code') { return; }
cellModel.stateChanged.connect((changedCell, cellStateChange) => {
if (changedCell instanceof CodeCellModel && cellStateChange.name === "executionCount" && cellStateChange.newValue !== undefined && cellStateChange.newValue !== null) {
let cellClone = copyICodeCellModel(changedCell);
const cell = new LabCell(cellClone);
let cell = new LabCell(changedCell).deepCopy();
this._gatherModel.executionLog.logExecution(cell);
this._gatherModel.lastExecutedCell = cell;
}
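
The change above logs `new LabCell(changedCell).deepCopy()` rather than the live cell. A minimal sketch of why a snapshot might be preferred follows; `SketchCell` and `SketchExecutionLog` are hypothetical stand-ins, not the project's `LabCell` or execution log, and the motivation is an assumption rather than something the commit states.

```typescript
// Hypothetical stand-in for a notebook cell; not the project's LabCell.
class SketchCell {
    id: string;
    text: string;
    executionCount: number;

    constructor(id: string, text: string, executionCount: number) {
        this.id = id;
        this.text = text;
        this.executionCount = executionCount;
    }

    // Return an independent copy, so later edits to this cell do not affect it.
    deepCopy(): SketchCell {
        return new SketchCell(this.id, this.text, this.executionCount);
    }
}

// Hypothetical log that stores whatever it is given, without copying.
class SketchExecutionLog {
    readonly entries: SketchCell[] = [];

    logExecution(cell: SketchCell): void {
        this.entries.push(cell);
    }
}

const log = new SketchExecutionLog();
const liveCell = new SketchCell("cell-1", "a = 1", 1);

// Log a snapshot rather than the live cell.
log.logExecution(liveCell.deepCopy());

// The user keeps editing the live cell after running it.
liveCell.text = "a = 2";

console.log(log.entries[0].text); // prints "a = 1"
```

Had the live cell been logged directly, the log entry would have followed the later edit and no longer matched the code that was executed.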

View file

@@ -145,12 +145,17 @@ function getCellsJsonForSlice(slice: SlicedExecution, outputSelections?: OutputS
.map((cellSlice) => {
let slicedCell = cellSlice.cell;
if (SHOULD_SLICE_CELLS) {
slicedCell = slicedCell.copy();
slicedCell = slicedCell.deepCopy();
slicedCell.text = cellSlice.textSliceLines;
}
let cellJson = slicedCell.serialize();
// This new cell hasn't been executed yet. So don't mark it as having been executed.
cellJson.execution_count = null;
for (let output of cellJson.outputs) {
if (nbformat.isExecuteResult(output)) {
output.execution_count = null;
}
}
// Add a flag to distinguish gathered cells from other cells.
if (!cellJson.hasOwnProperty("metadata")) {
cellJson.metadata = {};
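
The hunk above clears `execution_count` on the cell and on each of its `execute_result` outputs. A self-contained sketch of that step is shown here. The `SketchOutput` and `SketchCodeCell` interfaces and the `markAsUnexecuted` helper are hypothetical and only mirror the notebook JSON field names; they are not the `nbformat` typings. Unlike the diff, which mutates `cellJson` in place, this sketch returns a copy.

```typescript
// Minimal, hypothetical shapes modelled on notebook JSON; not the real nbformat typings.
interface SketchOutput {
    output_type: string;
    execution_count?: number | null;
}

interface SketchCodeCell {
    source: string;
    execution_count: number | null;
    outputs: SketchOutput[];
}

// Return a copy of the cell that looks as if it has never been run.
function markAsUnexecuted(cell: SketchCodeCell): SketchCodeCell {
    const outputs = cell.outputs.map((output) => {
        const clone: SketchOutput = { ...output };
        // Only "execute_result" outputs carry an execution count.
        if (clone.output_type === "execute_result") {
            clone.execution_count = null;
        }
        return clone;
    });
    return { ...cell, execution_count: null, outputs };
}

const executed: SketchCodeCell = {
    source: "1 + 1",
    execution_count: 7,
    outputs: [
        { output_type: "stream" },
        { output_type: "execute_result", execution_count: 7 },
    ],
};

const gathered = markAsUnexecuted(executed);
console.log(JSON.stringify(gathered.outputs));
```

Copying first keeps the original cell's counts intact, which may matter if the same cell object is also held elsewhere, for example in an execution log.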

View file

@@ -5,14 +5,7 @@ import { nbformat } from "@jupyterlab/coreutils";
import { AbstractCell } from "../packages/cell";
/**
* Create a new cell with the same ID and content.
*/
export function copyICodeCellModel(cell: ICodeCellModel): ICodeCellModel {
return new CodeCellModel({ id: cell.id, cell: cell.toJSON() });
}
/**
* Implementation of SliceableCell for Jupyter Lab. Wrapper around the ICodeCellModel.
* Abstract interface to data of a Jupyter Lab code cell.
*/
export class LabCell extends AbstractCell {
@@ -78,9 +71,8 @@ export class LabCell extends AbstractCell {
return this._model.metadata.get("gathered") as boolean;
}
copy(): LabCell {
let clonedModel = copyICodeCellModel(this._model);
return new LabCell(clonedModel);
deepCopy(): LabCell {
return new LabCell(new CodeCellModel({ id: this.id, cell: this.toJSON() }));
}
serialize(): any {

View file

@@ -35,10 +35,15 @@ export interface ICell {
readonly is_cell: boolean;
/**
* Create a deep copy of the cell. Copies will have all the same properties, except for the
* persistent ID, which should be entirely new.
* Create a deep copy of the cell.
*/
copy: () => ICell;
deepCopy: () => ICell;
/**
* Create a new cell from this cell. The new cell will have null execution counts, and a new
* ID and persistent ID.
*/
copyToNewCell: () => ICell;
/**
* Serialize this ICell to JSON that can be stored in a notebook file, or which can be used to
@@ -61,7 +66,7 @@ export abstract class AbstractCell implements ICell {
abstract text: string;
abstract gathered: boolean;
abstract outputs: nbformat.IOutput[];
abstract copy(): AbstractCell;
abstract deepCopy(): AbstractCell;
/**
* This method is called by the logger to sanitize cell data before logging it. This method
@@ -79,6 +84,21 @@ export abstract class AbstractCell implements ICell {
};
}
copyToNewCell(): ICell {
let clonedOutputs = this.outputs.map((output) => {
let clone = JSON.parse(JSON.stringify(output)) as nbformat.IOutput;
if (nbformat.isExecuteResult(clone)) {
clone.execution_count = undefined;
}
return clone;
});
return new SimpleCell({
text: this.text,
hasError: this.hasError,
outputs: clonedOutputs
});
}
serialize(): nbformat.ICodeCell {
return {
id: this.id,
@@ -96,21 +116,23 @@ export abstract class AbstractCell implements ICell {
export class SimpleCell extends AbstractCell {
constructor(id: string, executionCount: number,
hasError: boolean, text: string, outputs: nbformat.IOutput[], persistentId?: string) {
constructor(cellData: {
id?: string, persistentId?: string, executionCount?: number, hasError?: boolean,
text?: string, outputs?: nbformat.IOutput[]
}) {
super();
this.is_cell = true;
this.id = id;
this.persistentId = persistentId ? persistentId : UUID.uuid4.toString();
this.executionCount = executionCount;
this.hasError = hasError;
this.text = text;
this.outputs = outputs;
this.id = cellData.id || UUID.uuid4();
this.persistentId = cellData.persistentId || UUID.uuid4();
this.executionCount = cellData.executionCount || undefined;
this.hasError = cellData.hasError || false;
this.text = cellData.text || "";
this.outputs = cellData.outputs || [];
this.gathered = false;
}
copy(): SimpleCell {
return new SimpleCell(this.id, this.executionCount, this.hasError, this.text, this.outputs);
deepCopy(): AbstractCell {
return new SimpleCell(this);
}
readonly is_cell: boolean;
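
The `SimpleCell` constructor above moves from positional arguments to a single options object with defaults. A minimal sketch of that pattern follows. `SketchCellData`, `SketchSimpleCell`, and `newId` are hypothetical names, and `newId` is a plain counter standing in for a UUID generator. The commit fills defaults with `||`, which also replaces falsy values such as `0` or `""`; whether that matters in this project is not something the diff settles, so the sketch compares against `undefined` instead.

```typescript
// Hypothetical options type and class; names are illustrative, not the project's.
interface SketchCellData {
    id?: string;
    executionCount?: number;
    hasError?: boolean;
    text?: string;
}

let nextId = 0;
// Stand-in for a UUID generator, to keep the sketch dependency-free.
function newId(): string {
    nextId += 1;
    return "cell-" + nextId;
}

class SketchSimpleCell {
    readonly id: string;
    readonly executionCount: number | undefined;
    readonly hasError: boolean;
    readonly text: string;

    constructor(cellData: SketchCellData = {}) {
        // Compare against undefined so that falsy-but-valid values (0, "") are kept.
        this.id = cellData.id !== undefined ? cellData.id : newId();
        this.executionCount = cellData.executionCount;
        this.hasError = cellData.hasError !== undefined ? cellData.hasError : false;
        this.text = cellData.text !== undefined ? cellData.text : "";
    }
}

// Callers name only the fields they care about; the rest fall back to defaults.
const named = new SketchSimpleCell({ text: "a = 1", executionCount: 0 });
const blank = new SketchSimpleCell();
console.log(named.executionCount, blank.text === "", named.id !== blank.id);
```

This is the same shape the updated tests rely on when they call `new SimpleCell({ executionCount, persistentId, text })` without spelling out every property.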

View file

@@ -38,7 +38,7 @@ export class SlicedExecution {
let cell = cellSlice.cell;
if (!cellSlices.hasOwnProperty(cell.persistentId)) cellSlices[cell.persistentId] = {};
if (!cellSlices[cell.persistentId].hasOwnProperty(cell.executionCount)) {
let newCellSlice = new CellSlice(cell.copy(), new LocationSet(), cellSlice.executionTime);
let newCellSlice = new CellSlice(cell.deepCopy(), new LocationSet(), cellSlice.executionTime);
cellSlices[cell.persistentId][cell.executionCount] = newCellSlice;
mergedCellSlices.push(newCellSlice);
}

View file

@@ -1,7 +1,6 @@
import { NumberSet, range } from "./Set";
import { NumberSet, range, Set } from "./Set";
import { ControlFlowGraph } from "./ControlFlowAnalysis";
import { ILocation, parse, IModule } from "../parsers/python/python_parser";
import { Set } from "./Set";
import { DataflowAnalyzer } from "./DataflowAnalysis";
export enum DataflowDirection { Forward, Backward };

View file

@@ -1,14 +1,11 @@
import { expect } from 'chai';
import { CellSlice } from '../packages/cell';
import { CellSlice, SimpleCell } from '../packages/cell';
import { LocationSet } from '../slicing/Slice';
describe('CellSlice', () => {
it('yields a text slice based on a set of locations', () => {
let cellSlice = new CellSlice({
is_cell: true,
id: "id",
persistentId: "persistent-id",
let cellSlice = new CellSlice(new SimpleCell({
text: [
"a = 1",
"b = 2",
@@ -16,13 +13,8 @@ describe('CellSlice', () => {
"d = 4",
""
].join("\n"),
hasError: false,
executionCount: 1,
outputs: [],
gathered: false,
copy: () => null,
serialize: () => null
}, new LocationSet(
}), new LocationSet(
{ first_line: 1, first_column: 0, last_line: 1, last_column: 5 },
{ first_line: 2, first_column: 4, last_line: 3, last_column: 4 }
));
@@ -34,10 +26,7 @@ describe('CellSlice', () => {
});
it('yields entire lines if requested', () => {
let cellSlice = new CellSlice({
is_cell: true,
id: "id",
persistentId: "persistent-id",
let cellSlice = new CellSlice(new SimpleCell({
text: [
"a = 1",
"b = 2",
@@ -45,13 +34,8 @@ describe('CellSlice', () => {
"d = 4",
""
].join("\n"),
hasError: false,
executionCount: 1,
outputs: [],
gathered: false,
copy: () => null,
serialize: () => null
}, new LocationSet(
}), new LocationSet(
{ first_line: 1, first_column: 0, last_line: 1, last_column: 5 },
{ first_line: 2, first_column: 4, last_line: 3, last_column: 4 }
));

View file

@@ -0,0 +1,16 @@
var webpackConfig = require('./webpack.test.config.js');
module.exports = {
frameworks: ['mocha', 'chai'],
files: [
'**/*.test.ts'
],
preprocessors: {
'**/*.ts': ['webpack']
},
webpack: webpackConfig,
reporters: ['progress'],
colors: true,
autoWatch: true,
concurrency: Infinity
}

8
src/test/karma.conf.js Normal file
View file

@@ -0,0 +1,8 @@
var baseKarmaConfig = require('./karma.base.conf.js');
var conf = baseKarmaConfig;
conf['browsers'] = ['ChromeHeadless'];
module.exports = function(config) {
config.set(conf);
}

View file

@@ -0,0 +1,8 @@
var baseKarmaConfig = require('./karma.base.conf.js');
var conf = baseKarmaConfig;
conf['browsers'] = ['Chrome'];
module.exports = function(config) {
config.set(conf);
}
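
Both Karma configs above set `browsers` directly on the object exported by the base file. That is fine as long as each config is loaded by its own `karma start` process, which the `test` and `test:debug` scripts suggest. As a sketch under that caveat, variants can also be derived by copying the base; `SketchKarmaConfig` and `withBrowsers` are hypothetical names, and the config shape is trimmed to a few keys.

```typescript
// Hypothetical, trimmed-down config shape; real Karma configs accept many more keys.
interface SketchKarmaConfig {
    frameworks: string[];
    files: string[];
    browsers?: string[];
}

const baseConfig: SketchKarmaConfig = {
    frameworks: ["mocha", "chai"],
    files: ["**/*.test.ts"],
};

// Build a variant from a shallow copy, so the base object itself is left untouched.
function withBrowsers(base: SketchKarmaConfig, browsers: string[]): SketchKarmaConfig {
    return { ...base, browsers };
}

const headlessConfig = withBrowsers(baseConfig, ["ChromeHeadless"]);
const debugConfig = withBrowsers(baseConfig, ["Chrome"]);

console.log(headlessConfig.browsers, debugConfig.browsers, baseConfig.browsers);
```

The copy is shallow, so nested arrays such as `frameworks` are still shared between the variants; that is enough here because only the top-level `browsers` key differs.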

View file

@@ -1,23 +1,12 @@
import { expect } from "chai";
import { LocationSet } from "../slicing/Slice";
import { SlicedExecution } from "../slicing/ExecutionSlicer";
import { ICell, CellSlice } from "../packages/cell";
import { ICell, CellSlice, SimpleCell } from "../packages/cell";
describe('SlicedExecution', () => {
function cell(id: string, executionCount: number, ...codeLines: string[]): ICell {
return {
is_cell: true,
id: id,
persistentId: "persistent-id",
executionCount: executionCount,
text: codeLines.join('\n'),
hasError: false,
outputs: [],
gathered: false,
copy: () => null,
serialize: () => null
};
function cell(persistentId: string, executionCount: number, ...codeLines: string[]): ICell {
return new SimpleCell({ executionCount, text: codeLines.join("\n"), persistentId });
}
function cellSlice(cell: ICell, slice: LocationSet): CellSlice {
@@ -54,8 +43,8 @@ describe('SlicedExecution', () => {
new LocationSet(location(1, 0, 1, 5))
));
let merged = slice1.merge(slice2);
expect(merged.cellSlices[0].cell.id).to.equal("1");
expect(merged.cellSlices[1].cell.id).to.equal("2");
expect(merged.cellSlices[0].cell.persistentId).to.equal("1");
expect(merged.cellSlices[1].cell.persistentId).to.equal("2");
});
it('will not include the same locations from the same cell twice', () => {

View file

@@ -1,14 +1,13 @@
import { expect } from "chai";
import { ProgramBuilder } from "../slicing/ProgramBuilder";
import { ICell } from '../packages/cell';
import { ICell, SimpleCell } from '../packages/cell';
describe('program builder', () => {
function createCell(id: string, executionCount: number, ...codeLines: string[]): ICell {
function createCell(persistentId: string, executionCount: number, ...codeLines: string[]): ICell {
let text = codeLines.join("\n");
return { is_cell: true, id, executionCount, persistentId: "persistent-id", text: text,
hasError: false, gathered: false, outputs: [], copy: () => null, serialize: () => null };
return new SimpleCell({ executionCount, persistentId, text });
}
let programBuilder: ProgramBuilder;

View file

@@ -1,67 +0,0 @@
import { parse } from '../../parsers/python/python_parser';
import { ControlFlowGraph } from '../../slicing/ControlFlowAnalysis';
import { DataflowAnalyzer } from '../../slicing/DataflowAnalysis';
import * as fs from 'fs';
import * as path from 'path';
if (process.argv.length <= 2) {
console.log(`usage: ${__filename} path/to/directory`);
process.exit(-1);
}
let failCount = 0;
function testInDir(rootDir: string) {
function isPyFile(filename: string) { return /.py$/.test(filename); }
const items = fs.readdirSync(rootDir);
for (let item of items.slice(0, 500)) {
const itemPath = path.join(rootDir, item);
const stats = fs.statSync(itemPath);
if (stats.isFile() && isPyFile(item)) {
const text = fs.readFileSync(itemPath).toString().replace(/\r\n/g, '\n')
+ '\n'; // ⚠️ the parser freaks without a final newline
console.log(itemPath);
try {
const ast = parse(text);
if (!ast) {
// empty file
continue;
}
const cfg = new ControlFlowGraph(ast);
if (!cfg || !cfg.blocks) {
console.log('CFG FAIL');
continue;
}
const analyzer = new DataflowAnalyzer();
const dfa = analyzer.analyze(cfg);
if (!dfa) {
console.log('DFA FAIL');
continue;
}
} catch (e) {
const py2ErrorPatterns = [
/except .*,.*:/,
/print /,
/exec /,
/[0-9]+L/,
/Expecting 'NAME', got 'False'/,
/Expecting ':', 'as', got ','/,
/[r.]aise [^,]+,/,
/[^0-9A-Za-z_]0[0-9]+/,
/ur["']/,
/0x[0-9A-Fa-f]L/,
/<>/,
];
if (!py2ErrorPatterns.some(pat => pat.test(e.message))) {
console.log('FAIL', e);
failCount++;
}
}
} else if (stats.isDirectory()) {
testInDir(itemPath);
}
}
}
testInDir(process.argv[2]);
console.log('TOTAL FAILURES', failCount);

View file

@@ -0,0 +1,27 @@
module.exports = {
mode: 'development',
node: {
fs: 'empty'
},
module: {
rules: [{
test: /\.txt/,
use: ['raw-loader']
}, {
test: /\.css$/,
use: ['style-loader', 'css-loader']
}, {
test: /\.tsx?$/,
use: ['ts-loader']
}, {
test: /\.png$/,
use: ['file-loader']
}]
},
resolve: {
extensions: ['.ts', '.js', '.tsx', '.jsx', '.css'],
modules: [
'node_modules'
]
}
};

View file

@@ -1,28 +0,0 @@
:root {
--brand-color5: #f0f9ff;
--brand-color4: #e3f2fd;
--brand-color3: #bbdefb;
--brand-color2: #90caf9;
--brand-color1: #64b5f6;
--brand-color0: #42a5f5;
}
.p-Widget.jp-Notebook-revisionbrowser {
position: absolute;
top: 0;
left: 0;
width: 100%;
height: 100%;
z-index: 1000;
padding-top: 4%;
background-color: rgba(0, 0, 0, .7);
}
.p-Widget.jp-Notebook-revisionbrowser-exit {
color: white;
font-size: xx-large;
cursor: pointer;
position: absolute;
right: 30px;
top: 8px;
}

View file

@@ -1,40 +0,0 @@
const path = require('path');
module.exports = {
entry: './lib/nb/index.js',
output: {
path: path.resolve(__dirname, 'dist'),
filename: 'gather.js',
libraryTarget: 'amd'
},
mode: 'none',
devtool: "inline-source-map",
externals: {
"base/js/namespace": "base/js/namespace",
"base/js/utils": "base/js/utils"
},
node: {
fs: 'empty'
},
module: {
rules: [{
test: /\.txt/,
use: ['raw-loader']
}, {
test: /\.css$/,
use: ['style-loader', 'css-loader']
}, {
test: /\.png$/,
use: ['file-loader']
}]
},
optimization: {
minimize: false
},
resolve: {
extensions: ['.js', '.css'],
modules: [
'node_modules'
]
}
};