add CI/CD
Markus Cozowicz 2021-09-17 15:10:14 +02:00
Parent c093687595
Commit c2ac332f56
4 changed files: 103 additions and 9 deletions

.github/workflows/publish-to-test-pypi.yml Vendored Normal file

@@ -0,0 +1,43 @@
name: Publish Python 🐍 distributions 📦 to PyPI and TestPyPI
on: push
jobs:
  build-n-publish:
    name: Build and publish Python 🐍 distributions 📦 to PyPI and TestPyPI
    runs-on: ubuntu-18.04
    steps:
    - uses: actions/checkout@master
    - name: Set up Python 3.8
      uses: actions/setup-python@v1
      with:
        python-version: 3.8
    - name: Install pypa/build
      run: >-
        python -m
        pip install
        build
        --user
    - name: Build a binary wheel and a source tarball
      run: >-
        python -m
        build
        --sdist
        --wheel
        --outdir dist/
        .
    - name: Publish distribution 📦 to Test PyPI
      uses: pypa/gh-action-pypi-publish@master
      with:
        password: ${{ secrets.TEST_PYPI_API_TOKEN }}
        repository_url: https://test.pypi.org/legacy/
    - name: Publish distribution 📦 to PyPI
      if: startsWith(github.ref, 'refs/tags')
      uses: pypa/gh-action-pypi-publish@master
      with:
        password: ${{ secrets.PYPI_API_TOKEN }}


@@ -1,14 +1,60 @@
# Project
# Cyclic Boosting Machines
> This repo has been populated by an initial template to help get you started. Please
> make sure to update the content to build a great experience for community-building.
This is an efficient and Scikit-learn compatible implementation of the machine learning algorithm [Cyclic Boosting -- an explainable supervised machine learning algorithm](https://arxiv.org/abs/2002.03425), specifically for predicting count-data, such as sales and demand.
As the maintainer of this project, please make a few updates:
## Usage
- Improving this README.MD file to provide a great experience
- Updating SUPPORT.MD with content about this project's support experience
- Understanding the security reporting process in SECURITY.MD
- Remove this section from the README
```python
import cbm
import numpy as np
x_train: np.ndarray = ... # will be cast to uint8, so make sure you featurize beforehand
y_train: np.ndarray = ... # will be cast to uint32
model = cbm.CBM()
model.fit(x_train, y_train)
x_test: np.ndarray = ...
y_pred = model.predict(x_test)
```
## Explainability
The CBM model predicts by multiplying the global mean by the weight estimated for the bin that each feature value falls into. The weights can therefore be read as a relative increase or decrease from the global mean. For example, a weight of 1.2 for the bin _Monday_ of the feature _Day-of-Week_ corresponds to a 20% increase of the target.
<img src="https://render.githubusercontent.com/render/math?math=\hat{y}_i = \mu \cdot \prod^{p}_{j=1} f^k_j"> with <img src="https://render.githubusercontent.com/render/math?math=k = \{x_{j,i} \in b^k_j \}">
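As a rough illustration of the formula above, the following sketch computes a single prediction by hand. The global mean, the weight table, and the helper `predict_one` are hypothetical values and names chosen for this example; they are not part of the `cbm` API, and a fitted model would supply its own weights.

```python
import numpy as np

# Hypothetical numbers for illustration, not taken from a fitted model.
global_mean = 10.0

# weights[j][k] is the multiplicative factor for bin k of feature j.
weights = [
    np.array([0.8, 1.0, 1.2]),  # feature 0, e.g. Day-of-Week with three bins
    np.array([0.5, 1.5]),       # feature 1 with two bins
]


def predict_one(x_row, global_mean, weights):
    """Multiply the global mean by the weight of the bin each feature falls into."""
    factors = [weights[j][k] for j, k in enumerate(x_row)]
    return global_mean * np.prod(factors)


# One sample: feature 0 falls into bin 2 (weight 1.2), feature 1 into bin 1 (weight 1.5).
x = np.array([2, 1], dtype=np.uint8)
y_hat = predict_one(x, global_mean, weights)  # 10 * 1.2 * 1.5, i.e. about 18
```

Because the model is purely multiplicative, a weight of exactly 1.0 leaves the prediction unchanged for that feature.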
```python
model = cbm.CBM()
model.fit(x_train, y_train)

import matplotlib.pyplot as plt

# x_train is used as a pandas DataFrame with categorical columns here
# (.iloc, .cat.categories and .columns are pandas APIs).
fig, axes = plt.subplots(2,
                         int(np.ceil(x_train.shape[1] / 2)),
                         figsize=(25, 20),
                         sharex=True)

for feature in np.arange(x_train.shape[1]):
    w = model.weights[feature]
    ax = axes[feature % 2, feature // 2]

    # subtract 1 so bars extend up or down from zero (weight 1.0 = no change)
    ax.barh(x_train.iloc[:, feature].cat.categories.astype(str),
            np.array(w) - 1)

    ax.set_title(x_train.columns[feature])
    ax.xaxis.set_tick_params(which='both', labelbottom=True)

fig.tight_layout()
```
## Featurization
Categorical features can be passed as 0-based indices. At most 255 categories per feature are currently supported.
Continuous features need to be discretized. [pandas.qcut](https://pandas.pydata.org/docs/reference/api/pandas.qcut.html) for equal-sized bins or [numpy.interp](https://numpy.org/doc/stable/reference/generated/numpy.interp.html) for equally spaced bins have worked well for us.
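A minimal featurization sketch, assuming the raw data lives in a pandas DataFrame. The column names and values are made up for illustration, and only the `pandas.qcut` route is shown.

```python
import numpy as np
import pandas as pd

# Hypothetical raw data; column names are illustrative only.
df = pd.DataFrame({
    "day_of_week": ["Mon", "Tue", "Mon", "Sun", "Tue", "Sun"],
    "price": [1.0, 2.5, 3.0, 4.5, 6.0, 10.0],
})

# Categorical feature: 0-based integer codes, one per category.
day_codes = pd.Categorical(df["day_of_week"]).codes

# Continuous feature: quantile bins of roughly equal size.
# labels=False returns the 0-based bin index instead of interval labels.
price_bins = pd.qcut(df["price"], q=3, labels=False)

x = np.column_stack([day_codes, price_bins])

# Guard before casting: missing values would show up as -1, and 0-based
# indices for at most 255 categories means codes 0..254.
assert x.min() >= 0 and x.max() < 255
x = x.astype(np.uint8)
```

The resulting `x` can then be passed to `fit` in place of `x_train` from the usage example.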
## Contributing

pyproject.toml Normal file

@@ -0,0 +1,5 @@
[build-system]
# These are the assumed default build requirements from pip:
# https://pip.pypa.io/en/stable/reference/pip/#pep-517-and-518-support
requires = ["setuptools>=40.8.0", "wheel"]
build-backend = "setuptools.build_meta"


@@ -15,7 +15,7 @@ class get_pybind_include(object):
 setup(
-    name="cbm",
+    name="cyclicbm",
     version="0.0.1",
     description="Cyclic Boosting Machines",
     url="https://github.com/Microsoft/CBM",