Граф коммитов

15 Коммитов

Автор SHA1 Сообщение Дата
Li Jiang 700ff05874
Add RetrieveChat (#1158)
* Add RetrieveChat notebook, RetrieveAssistantAgent and RetrieveUserProxyAgent

* Update according to comments

* Add output

* Add tests, merge main, address comments

* Fix tests

* Merge main

* Remove unnecessary code

* Update test

* Update notebook, some functions

* Fix print issue

* Update notebook

* Update notebook

* Update notebook

* Improve retrieve utils and update notebook

* Update vector db creation method

* Update notebook

* Update notebook

* Add terminate if no more context

* Update prompt and notebook, add example for update context

* Update results

* Update results

* Update results of update context

* Fix typo

* Add table of contents

* Update table of contents
2023-08-13 12:51:54 +00:00
Chi Wang 595f5a8025
gpt-4 support; openai workflow fix; model str; timeout; voting (#958)
* workflow; model str; timeout

* voting

* notebook

* pull request

* recover workflow

* voted answer

* aoai

* ignore None answer

* default config

* note

* gpt-4

* n=5

* cleanup

* config name

* introduction

* readme

* avoid None

* add output/ to gitignore

* openai version

* invalid var

* comment long running cells
2023-03-26 17:13:06 +00:00
Li Jiang 50334f2c52
Support spark dataframe as input dataset and spark models as estimators (#934)
* add basic support to Spark dataframe

add support to SynapseML LightGBM model

update to pyspark>=3.2.0 to leverage pandas_on_Spark API

* clean code, add TODOs

* add sample_train_data for pyspark.pandas dataframe, fix bugs

* improve some functions, fix bugs

* fix dict change size during iteration

* update model predict

* update LightGBM model, update test

* update SynapseML LightGBM params

* update synapseML and tests

* update TODOs

* Added support to roc_auc for spark models

* Added support to score of spark estimator

* Added test for automl score of spark estimator

* Added cv support to pyspark.pandas dataframe

* Update test, fix bugs

* Added tests

* Updated docs, tests, added a notebook

* Fix bugs in non-spark env

* Fix bugs and improve tests

* Fix uninstall pyspark

* Fix tests error

* Fix java.lang.OutOfMemoryError: Java heap space

* Fix test_performance

* Update test_sparkml to test_0sparkml to use the expected spark conf

* Remove unnecessary widgets in notebook

* Fix iloc java.lang.StackOverflowError

* fix pre-commit

* Added params check for spark dataframes

* Refactor code for train_test_split to a function

* Update train_test_split_pyspark

* Refactor if-else, remove unnecessary code

* Remove y from predict, remove mem control from n_iter compute

* Update workflow

* Improve _split_pyspark

* Fix test failure of too short training time

* Fix typos, improve docstrings

* Fix index errors of pandas_on_spark, add spark loss metric

* Fix typo of ndcgAtK

* Update NDCG metrics and tests

* Remove unuseful logger

* Use cache and count to ensure consistent indexes

* refactor for merge maain

* fix errors of refactor

* Updated SparkLightGBMEstimator and cache

* Updated config2params

* Remove unused import

* Fix unknown parameters

* Update default_estimator_list

* Add unit tests for spark metrics
2023-03-25 19:59:46 +00:00
Jirka Borovec 2ff1035733
precommit: end-of-file-fixer (#929)
* precommit: end-of-file-fixer

* exclude .gitignore

* apply

---------

Co-authored-by: Shaokun <shaokunzhang529@gmail.com>
2023-02-28 16:27:14 +00:00
levscaut c6a2440348
add PySparkOvertimeMonitor to avoid exceeding time budget (#923)
* merging

* clean commit

* Delete mylearner.py

This file is not needed.

* fix py4j import error

* more tolerant cancelling time

* fix problems following suggestions

* Update flaml/tune/spark/utils.py

Co-authored-by: Li Jiang <bnujli@gmail.com>

* remove redundant model

* Update test/spark/custom_mylearner.py

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* add docstr

* reverse change in gitignore

* Update test/spark/custom_mylearner.py

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

---------

Co-authored-by: Li Jiang <bnujli@gmail.com>
Co-authored-by: Chi Wang <wang.chi@microsoft.com>
2023-02-24 08:07:00 +00:00
Xueqing Liu 5f97532986
adding evaluation (#495)
* adding automl.score

* fixing the metric name in train_with_config

* adding pickle after score

* fixing a bug in automl.pickle
2022-03-25 17:00:08 -04:00
Chi Wang efd85b4c86
Deploy a new doc website (#338)
A new documentation website. And:

* add actions for doc

* update docstr

* installation instructions for doc dev

* unify README and Getting Started

* rename notebook

* doc about best_model_for_estimator #340

* docstr for keep_search_state #340

* DNN

Co-authored-by: Qingyun Wu <qingyun.wu@psu.edu>
Co-authored-by: Z.sk <shaokunzhang@psu.edu>
2021-12-16 17:11:33 -08:00
Xueqing Liu 42de3075e9
Make NLP tasks available from AutoML.fit() (#210)
Sequence classification and regression: "seq-classification" and "seq-regression"

Co-authored-by: Chi Wang <wang.chi@microsoft.com>
2021-11-16 11:06:20 -08:00
Xueqing Liu 926589bdda
exception, coverage for autohf (#106)
* increase coverage

* fixing exception messages

* fixing import
2021-06-14 14:11:40 -07:00
Xueqing Liu a4049ad9b6
autohf (#43)
automate huggingface transformer
2021-06-09 08:37:03 -07:00
Chi Wang 4a8110c87b
pickle the AutoML object (#37)
* pickle the AutoML object

* get best model per estimator

* test deberta

* stateless API

* Add Gitter badge (#41)

* prevent divide by zero

* test roberta

* BlendSearchTuner

Co-authored-by: Chi Wang (MSR) <chiw@microsoft.com>
Co-authored-by: The Gitter Badger <badger@gitter.im>
2021-03-16 22:13:35 -07:00
Chi Wang 1560a6e52a
V0.2.7 (#35)
* bug fix

* admissible region

* use CFO's init point as backup

* step lower bound

* test electra
2021-03-05 23:39:14 -08:00
Chi Wang 6ff0ed434b
v0.2.5 (#30)
* test distillbert

* import check

* complete partial config

* None check

* init config is not suggested by bo

* badge

* notebook for lightgbm
2021-02-22 22:10:41 -08:00
Chi Wang 776aa55189
V0.2.2 (#19)
* v0.2.2

separate the HPO part into the module flaml.tune
enhanced implementation of FLOW^2, CFO and BlendSearch
support parallel tuning using ray tune
add support for sample_weight and generic fit arguments
enable mlflow logging

Co-authored-by: Chi Wang (MSR) <chiw@microsoft.com>
Co-authored-by: qingyun-wu <qw2ky@virginia.edu>
2021-02-05 21:41:14 -08:00
Chi Wang (MSR) 492990655d v0.1.0 2020-12-04 09:40:27 -08:00