Aleksei Smirnov
12411fc678
Fix broken inheritance from DataFrameColumn class ( #7324 )
2024-11-27 10:45:01 -07:00
Xiaoyun Zhang
a4c67fe26f
[GenAI] SFT Example ( #7316 )
...
* implement sft
* add causalLMDataset
* update
* add SFT trainer
* update
* update
* disable x64 test on non-x64 machine
* support batch
2024-11-25 13:47:15 -08:00
Emmanuel Ferdman
5d0dafbf9c
Update dynamic loading report reference ( #7321 )
...
Signed-off-by: Emmanuel Ferdman <emmanuelferdman@gmail.com>
2024-11-20 10:29:19 -07:00
Aleksei Smirnov
cfc1fb5df2
Update System.Numerics.Tensors version ( #7322 )
...
* Update System.Numerics.Tensors version
* Fix build
2024-11-20 10:28:50 -07:00
Eric StJohn
d4bc05db15
Update version for 5.0 ( #7311 )
2024-11-18 12:36:03 -08:00
Michael Sharp
509032755b
Fixing tokenizers version ( #7309 )
...
* fixing tokenizers version
* fixes
2024-11-13 16:16:30 -07:00
Michael Sharp
442d51cffb
mp updates ( #7304 )
2024-11-12 17:39:11 -07:00
Michael Sharp
26b9d65da9
4.0 release notes ( #7302 )
2024-11-12 13:19:13 -07:00
Carlos Sánchez López
4a6ce50e80
Update dependencies from maintenance-packages to latest versions ( #7301 )
...
* Update dependencies from maintenance-packages to latest preview
* Fix sni.dll failure
2024-11-11 22:23:47 -07:00
Michael Sharp
e968d3226c
Fixing native lookup ( #7282 )
...
* enable m1 tests and fixing native lookup
* Mac testing
* mac test
* macos test
* mac testing
* reverting mac changes
* testing mac temp fix
* fixed cmake
* print DYLD libraries
* osx lightgbm test
* lightgbm dyld log
* excluding osx x64 from some tests
2024-11-11 15:47:59 -07:00
Michael Sharp
3e2ba817bd
Updated remote executor ( #7295 )
...
* updated remote executor
* fixing test reference issue
* full version string
* old memory version
* removed nowarn
2024-11-11 14:25:02 -07:00
Tarek Mahmoud Sayed
86112114fd
Final tokenizer's cleanup ( #7291 )
2024-11-08 10:37:27 -08:00
Xiaoyun Zhang
3659a485a0
[GenAI] Introduce CausalLMPipelineChatClient for MEAI.IChatClient ( #7270 )
...
* leverage MEAI abstraction
* Update src/Microsoft.ML.GenAI.LLaMA/Llama3CausalLMChatClient.cs
Co-authored-by: Stephen Toub <stoub@microsoft.com>
* Update src/Microsoft.ML.GenAI.LLaMA/Llama3CausalLMChatClient.cs
Co-authored-by: Stephen Toub <stoub@microsoft.com>
* Update src/Microsoft.ML.GenAI.Phi/Phi3/Phi3CausalLMChatClient.cs
Co-authored-by: Stephen Toub <stoub@microsoft.com>
* fix comments
* Update Microsoft.ML.GenAI.Core.csproj
---------
Co-authored-by: Stephen Toub <stoub@microsoft.com>
2024-11-05 09:27:05 -08:00
Michael Sharp
5b4981a134
Update To MacOS 13 ( #7285 )
...
* mac 13 testing
* build fixes
2024-11-04 17:41:25 -07:00
Tarek Mahmoud Sayed
5c503195e8
Add Timeout to Regex used in the tokenizers ( #7284 )
...
* Add Timeout to Regex used in the tokenizers
* Address the feedback
2024-11-04 08:06:54 -08:00
Tarek Mahmoud Sayed
7cce7535b7
Add the components governance file `cgmanifest.json` for tokenizer's vocab files ( #7283 )
...
* Add the governance file cgmanifest.json for tokenizer's vocab files
* Address the feedback
* apply more schema requirements on the doc
2024-11-01 15:20:59 -07:00
Tarek Mahmoud Sayed
a9b4212eb3
Address the feedback regarding Bert tokenizer ( #7280 )
...
* Address the feedback regarding Bert tokenizer
* Small fix
2024-10-26 16:07:20 -07:00
Michael Sharp
a7a6d88b05
Adds in a way to add settings for the MLContext. ( #7273 )
...
* api, no tests
* updates from pr
* fixed rebase errors and pr comments
* updates based on ONNX team
2024-10-22 16:27:04 -06:00
Michael Sharp
869dc9fcfa
fixing osx ci ( #7279 )
2024-10-22 14:46:20 -06:00
Tarek Mahmoud Sayed
81122c4c48
Introducing WordPiece and Bert tokenizers ( #7275 )
...
* Introducing WordPiece and Bert tokenizers
* Fix corner case in WordPiece
2024-10-22 12:33:58 -07:00
Michael Sharp
32bac5e395
fixing apple silicon official build ( #7278 )
2024-10-22 10:55:12 -06:00
Eui Jung
f385b06aa0
Fixes #7271 AOT for ML.Tokenizers ( #7272 )
...
* AOT for ML.Tokenizers
* Forgot to add ModelSourceGenerationContext
* Update src/Microsoft.ML.Tokenizers/Model/ModelSourceGenerationContext.cs
Co-authored-by: Eirik Tsarpalis <eirik.tsarpalis@gmail.com>
---------
Co-authored-by: Eirik Tsarpalis <eirik.tsarpalis@gmail.com>
2024-10-17 14:16:22 -07:00
Tarek Mahmoud Sayed
823fc175f4
Misc Changes ( #7264 )
...
* Add o1 model support
* Replace Usage of tuples with Range in EncodedToken and Remove TorchSharp Range/Index implementation
* Rename SentencePieceBpeTokenizer to allow adding more models to it in the future.
* Make Tokenizer.Decode returns non-nullable string
* Make BPE tokenizer support added tokens
* add net9 package source to the nuget.config file
* Rename TiktokenPreTokenizer to RegexPreTokenizer
2024-10-11 16:06:22 -07:00
Michael Sharp
e7943427f4
Load onnx model from Stream working ( #7254 )
2024-10-09 11:07:20 -06:00
Aleksei Smirnov
1843bcdfa6
Fix dataframe incorrectly parse CSV when renameDuplicatedColumns is true ( #7242 )
...
* Fix dataframe incorrectly parse CSV when renameDuplicatedColumns is true
* Update System.Numerics.Tensors dependency
2024-10-09 09:46:43 -06:00
Eric StJohn
9baf26b58d
Update wording ( #7253 )
2024-10-07 14:33:42 -07:00
Weihan Li
f7d4184f27
docs: update nuget package badge ( #7236 )
2024-10-07 13:41:06 -06:00
dotnet-maestro[bot]
4919b43668
[main] Update dependencies from dotnet/arcade ( #7235 )
...
* Update dependencies from https://github.com/dotnet/arcade build 20240909.1
Microsoft.DotNet.Arcade.Sdk , Microsoft.DotNet.Build.Tasks.Feed , Microsoft.DotNet.Helix.Sdk , Microsoft.DotNet.SignTool , Microsoft.DotNet.SwaggerGenerator.MSBuild , Microsoft.DotNet.XliffTasks , Microsoft.DotNet.XUnitExtensions
From Version 10.0.0-beta.24430.1 -> To Version 10.0.0-beta.24459.1
* Update dependencies from https://github.com/dotnet/arcade build 20240917.1
Microsoft.DotNet.Arcade.Sdk , Microsoft.DotNet.Build.Tasks.Feed , Microsoft.DotNet.Helix.Sdk , Microsoft.DotNet.SignTool , Microsoft.DotNet.SwaggerGenerator.MSBuild , Microsoft.DotNet.XliffTasks , Microsoft.DotNet.XUnitExtensions
From Version 10.0.0-beta.24459.1 -> To Version 10.0.0-beta.24467.1
* Update dependencies from https://github.com/dotnet/arcade build 20240926.2
Microsoft.DotNet.Arcade.Sdk , Microsoft.DotNet.Build.Tasks.Feed , Microsoft.DotNet.Helix.Sdk , Microsoft.DotNet.SignTool , Microsoft.DotNet.SwaggerGenerator.MSBuild , Microsoft.DotNet.XliffTasks , Microsoft.DotNet.XUnitExtensions
From Version 10.0.0-beta.24467.1 -> To Version 10.0.0-beta.24476.2
* Update dependencies from https://github.com/dotnet/arcade build 20241004.4
Microsoft.DotNet.Arcade.Sdk , Microsoft.DotNet.Build.Tasks.Feed , Microsoft.DotNet.Helix.Sdk , Microsoft.DotNet.SignTool , Microsoft.DotNet.SwaggerGenerator.MSBuild , Microsoft.DotNet.XliffTasks , Microsoft.DotNet.XUnitExtensions
From Version 10.0.0-beta.24476.2 -> To Version 10.0.0-beta.24504.4
---------
Co-authored-by: dotnet-maestro[bot] <dotnet-maestro[bot]@users.noreply.github.com>
2024-10-07 13:40:50 -06:00
Stephen Toub
3f042f632b
Update tiktoken regexes ( #7255 )
2024-10-07 13:39:15 -06:00
Xiaoyun Zhang
62c4a13f4d
[GenAI] Support Llama 3.2 1B and 3B model ( #7245 )
...
* update
* enable llama3_2
* fix tests
2024-10-06 00:16:36 -07:00
Tarek Mahmoud Sayed
1e914273d3
Move the Tokenizer's data into separate packages. ( #7248 )
...
* Move the Tokenizer's data into separate packages.
* Address the feedback
* More feedback addressing
* More feedback addressing
* Trimming/AoT support
* Make data types internal
2024-10-04 14:47:37 -07:00
Eric StJohn
189ba24641
Enable SDL tools ( #7247 )
2024-10-01 10:38:15 -07:00
Eric StJohn
eba09d977c
Add Service Tree ID for .NET Libraries ( #7252 )
2024-10-01 10:37:34 -07:00
Xiaoyun Zhang
be1e428d41
[GenAI] pack GenAI core package ( #7246 )
...
* update
* enable llama3_2
* fix tests
* pack GenAI core
2024-09-27 13:10:30 -07:00
Xiaoyun Zhang
817a77f89a
[GenAI] Add Mistral 7B Instruction V0.3 ( #7231 )
...
* add mistral and tests
* add test and sample
* add tool call support
* update autogen to v 0.1.0
* update autogen to 0.1.0
* remove tests on non-x64 machine
* add file header
* update
* update
* update ml tokenizer test version
* fix build error
* remove .receive.txt
* Update docs/samples/Microsoft.ML.GenAI.Samples/Mistral/Mistral_7B_Instruct.cs
Co-authored-by: Weihan Li <7604648+WeihanLi@users.noreply.github.com>
* update
* set t to 0
* fix test
* Update Microsoft.ML.GenAI.Mistral.csproj
---------
Co-authored-by: Weihan Li <7604648+WeihanLi@users.noreply.github.com>
2024-09-25 20:15:59 -07:00
Xiaoyun Zhang
e14f22fe0d
set <IsPackable>true</IsPackable> to GenAI.Llama and Phi ( #7237 )
2024-09-16 11:53:03 -07:00
Tarek Mahmoud Sayed
87a41fae4a
Fix decoding special tokens in SentencePiece tokenizer ( #7233 )
...
* Fix decoding special tokens in SentencePiece tokenizer
* Apply the change to the other Decode method overload
2024-09-09 15:02:38 -07:00
dotnet-maestro[bot]
4e364e4bda
[main] Update dependencies from dotnet/arcade ( #7218 )
...
* Update dependencies from https://github.com/dotnet/arcade build 20240808.2
Microsoft.DotNet.Arcade.Sdk , Microsoft.DotNet.Build.Tasks.Feed , Microsoft.DotNet.Helix.Sdk , Microsoft.DotNet.SignTool , Microsoft.DotNet.SwaggerGenerator.MSBuild , Microsoft.DotNet.XliffTasks , Microsoft.DotNet.XUnitExtensions
From Version 9.0.0-beta.24368.9 -> To Version 9.0.0-beta.24408.2
* Update dependencies from https://github.com/dotnet/arcade build 20240816.2
Microsoft.DotNet.Arcade.Sdk , Microsoft.DotNet.Build.Tasks.Feed , Microsoft.DotNet.Helix.Sdk , Microsoft.DotNet.SignTool , Microsoft.DotNet.SwaggerGenerator.MSBuild , Microsoft.DotNet.XliffTasks , Microsoft.DotNet.XUnitExtensions
From Version 9.0.0-beta.24408.2 -> To Version 9.0.0-beta.24416.2
* Update dependencies from https://github.com/dotnet/arcade build 20240821.2
Microsoft.DotNet.Arcade.Sdk , Microsoft.DotNet.Build.Tasks.Feed , Microsoft.DotNet.Helix.Sdk , Microsoft.DotNet.SignTool , Microsoft.DotNet.SwaggerGenerator.MSBuild , Microsoft.DotNet.XliffTasks , Microsoft.DotNet.XUnitExtensions
From Version 9.0.0-beta.24416.2 -> To Version 9.0.0-beta.24421.2
* Update dependencies from https://github.com/dotnet/arcade build 20240823.2
Microsoft.DotNet.Arcade.Sdk , Microsoft.DotNet.Build.Tasks.Feed , Microsoft.DotNet.Helix.Sdk , Microsoft.DotNet.SignTool , Microsoft.DotNet.SwaggerGenerator.MSBuild , Microsoft.DotNet.XliffTasks , Microsoft.DotNet.XUnitExtensions
From Version 9.0.0-beta.24421.2 -> To Version 9.0.0-beta.24423.2
* Update dependencies from https://github.com/dotnet/arcade build 20240830.1
Microsoft.DotNet.Arcade.Sdk , Microsoft.DotNet.Build.Tasks.Feed , Microsoft.DotNet.Helix.Sdk , Microsoft.DotNet.SignTool , Microsoft.DotNet.SwaggerGenerator.MSBuild , Microsoft.DotNet.XliffTasks , Microsoft.DotNet.XUnitExtensions
From Version 9.0.0-beta.24423.2 -> To Version 10.0.0-beta.24430.1
---------
Co-authored-by: dotnet-maestro[bot] <dotnet-maestro[bot]@users.noreply.github.com>
2024-09-02 15:03:10 -06:00
Xiaoyun Zhang
7c937bf81a
[GenAI] Add generateEmbedding API to CausalLMPipeline ( #7227 )
...
* add embedding
* add frompretrain api to phi3 model
* fix bug
* Update CausalLMPipeline.cs
2024-08-30 10:30:46 -07:00
Xiaoyun Zhang
1d1cc997d5
Directly refer sql data client 4.8.6 package in GenAI tests to fix security vulnerable package ( #7228 )
...
* update approvaltest to 7.0
* Update Versions.props
* add direct reference to sql client in GenAI test packages
2024-08-30 10:30:32 -07:00
MohammadReza SH
1612edc770
Refactor Namespace and Sealed Classes in Microsoft.ML.AutoML.SourceGenerator Project ( #7223 )
...
* Make all classes in Microsoft.ML.AutoML.SourceGenerator project `sealed`
* Refactor namespace from Microsoft.ML.ModelBuilder.SweepableEstimator.CodeGenerator to Microsoft.ML.AutoML.SourceGenerator
2024-08-29 14:09:44 -07:00
Xiaoyun Zhang
70e5ab1e27
[GenAI] Add LLaMA support ( #7220 )
...
* add llama
* add test for tokenizer
* make llama 3.1 working
* update
* add shape test for 70b and 405b
* clean up
* add tests
* update
* fix error
* calculate rotary embedding in model layer
* remove rotary_emb from attention
* update feed
* update .csproj
* Update NuGet.config
* fix test
* pass device
* fix test
* update constructor
* disable 405b test
* update
* disable 70b test
* use windows only fact
* revert change
* rename test to LLaMA3_1
2024-08-28 15:35:31 -07:00
Michael Sharp
fa8c822045
Update dependency versions. ( #7216 )
...
* version updates and fixes
* fixed torch sharp test nuget
* skipping test again
* fixing typo
* consolidating extensions versioning
* fixed remote executor app
2024-08-07 18:12:44 -06:00
Xiaoyun Zhang
5f5a199a5a
[GenAI] Add readme to Microsoft.ML.GenAI.Phi ( #7206 )
...
* add readme
* Update src/Microsoft.ML.GenAI.Phi/README.md
Co-authored-by: Luis Quintanilla <46974588+luisquintanilla@users.noreply.github.com>
* Update src/Microsoft.ML.GenAI.Phi/README.md
Co-authored-by: Luis Quintanilla <46974588+luisquintanilla@users.noreply.github.com>
* Update src/Microsoft.ML.GenAI.Phi/README.md
Co-authored-by: Luis Quintanilla <46974588+luisquintanilla@users.noreply.github.com>
* Update src/Microsoft.ML.GenAI.Phi/README.md
Co-authored-by: Luis Quintanilla <46974588+luisquintanilla@users.noreply.github.com>
* Update src/Microsoft.ML.GenAI.Phi/README.md
Co-authored-by: Luis Quintanilla <46974588+luisquintanilla@users.noreply.github.com>
* Update src/Microsoft.ML.GenAI.Phi/README.md
Co-authored-by: Luis Quintanilla <46974588+luisquintanilla@users.noreply.github.com>
---------
Co-authored-by: Luis Quintanilla <46974588+luisquintanilla@users.noreply.github.com>
2024-08-05 13:36:00 -07:00
Xiaoyun Zhang
9ffb3a3175
increase cancelling time ( #7207 )
2024-08-02 12:44:00 -06:00
Xiaoyun Zhang
6d1f7e2bf8
Add Microsoft.ML.GenAI.Phi, test package and sample project. ( #7184 )
...
* add genai.phi and tests
* formatter
* refactor Phi3Tokenizer
* update
* add configuration for phi-series
* add semantic kernel and autogen integration
* update
* add Microsoft.ML.GenAI.Sample
* use tokenizer model from testTokenizer package
* use defaults
* add quantize linear
* use version string
* remove special token from CreatePhi2 API
* set up quantize sample
* initialize linear with zeros
* update sample
* add 6.0 to targetframework
* fix tests
* update
* remove Phi3Tokenizer and use LlamaTokenizer instead
* revert change in tokenizer package
* run test on x64
* fix tests
* check in approved file
* run test in net6.0
* use meta device
* copy approval tests to output folder
* set up approval test file location
* fix comment
* rename to AddGenAITextGeneration and AddGenAIChatCompletion
* Update job-template.yml
* add mit license
* add reference
* bump code coverage version
* add <PreserveCompilationContext>true</PreserveCompilationContext>
* add runtime package
* remove flag
* add flag
* fix build error
* update
* update
2024-07-30 15:47:31 -07:00
Tarek Mahmoud Sayed
f72c9d22fc
Reduce Tiktoken Creation Memory Allocation ( #7202 )
2024-07-28 15:55:03 -07:00
dotnet-maestro[bot]
34eb579d4b
[main] Update dependencies from dotnet/arcade ( #7165 )
...
* Update dependencies from https://github.com/dotnet/arcade build 20240531.1
Microsoft.DotNet.Arcade.Sdk , Microsoft.DotNet.Build.Tasks.Feed , Microsoft.DotNet.Helix.Sdk , Microsoft.DotNet.SignTool , Microsoft.DotNet.SwaggerGenerator.MSBuild , Microsoft.DotNet.XliffTasks , Microsoft.DotNet.XUnitExtensions
From Version 9.0.0-beta.24272.5 -> To Version 9.0.0-beta.24281.1
* Update dependencies from https://github.com/dotnet/arcade build 20240606.4
Microsoft.DotNet.Arcade.Sdk , Microsoft.DotNet.Build.Tasks.Feed , Microsoft.DotNet.Helix.Sdk , Microsoft.DotNet.SignTool , Microsoft.DotNet.SwaggerGenerator.MSBuild , Microsoft.DotNet.XliffTasks , Microsoft.DotNet.XUnitExtensions
From Version 9.0.0-beta.24272.5 -> To Version 9.0.0-beta.24306.4
* Update dependencies from https://github.com/dotnet/arcade build 20240614.1
Microsoft.DotNet.Arcade.Sdk , Microsoft.DotNet.Build.Tasks.Feed , Microsoft.DotNet.Helix.Sdk , Microsoft.DotNet.SignTool , Microsoft.DotNet.SwaggerGenerator.MSBuild , Microsoft.DotNet.XliffTasks , Microsoft.DotNet.XUnitExtensions
From Version 9.0.0-beta.24272.5 -> To Version 9.0.0-beta.24314.1
* Update dependencies from https://github.com/dotnet/arcade build 20240621.4
Microsoft.DotNet.Arcade.Sdk , Microsoft.DotNet.Build.Tasks.Feed , Microsoft.DotNet.Helix.Sdk , Microsoft.DotNet.SignTool , Microsoft.DotNet.SwaggerGenerator.MSBuild , Microsoft.DotNet.XliffTasks , Microsoft.DotNet.XUnitExtensions
From Version 9.0.0-beta.24272.5 -> To Version 9.0.0-beta.24321.4
* Update dependencies from https://github.com/dotnet/arcade build 20240627.1
Microsoft.DotNet.Arcade.Sdk , Microsoft.DotNet.Build.Tasks.Feed , Microsoft.DotNet.Helix.Sdk , Microsoft.DotNet.SignTool , Microsoft.DotNet.SwaggerGenerator.MSBuild , Microsoft.DotNet.XliffTasks , Microsoft.DotNet.XUnitExtensions
From Version 9.0.0-beta.24272.5 -> To Version 9.0.0-beta.24327.1
* Update dependencies from https://github.com/dotnet/arcade build 20240702.2
Microsoft.DotNet.Arcade.Sdk , Microsoft.DotNet.Build.Tasks.Feed , Microsoft.DotNet.Helix.Sdk , Microsoft.DotNet.SignTool , Microsoft.DotNet.SwaggerGenerator.MSBuild , Microsoft.DotNet.XliffTasks , Microsoft.DotNet.XUnitExtensions
From Version 9.0.0-beta.24272.5 -> To Version 9.0.0-beta.24352.2
* Update dependencies from https://github.com/dotnet/arcade build 20240710.4
Microsoft.DotNet.Arcade.Sdk , Microsoft.DotNet.Build.Tasks.Feed , Microsoft.DotNet.Helix.Sdk , Microsoft.DotNet.SignTool , Microsoft.DotNet.SwaggerGenerator.MSBuild , Microsoft.DotNet.XliffTasks , Microsoft.DotNet.XUnitExtensions
From Version 9.0.0-beta.24272.5 -> To Version 9.0.0-beta.24360.4
* Update dependencies from https://github.com/dotnet/arcade build 20240718.9
Microsoft.DotNet.Arcade.Sdk , Microsoft.DotNet.Build.Tasks.Feed , Microsoft.DotNet.Helix.Sdk , Microsoft.DotNet.SignTool , Microsoft.DotNet.SwaggerGenerator.MSBuild , Microsoft.DotNet.XliffTasks , Microsoft.DotNet.XUnitExtensions
From Version 9.0.0-beta.24272.5 -> To Version 9.0.0-beta.24368.9
* fixed async void tests
* fix bpe test
* fixed auto ml test
* skipping test
---------
Co-authored-by: dotnet-maestro[bot] <dotnet-maestro[bot]@users.noreply.github.com>
Co-authored-by: Michael Sharp <misharp@microsoft.com>
2024-07-27 18:33:59 -06:00
Erik
b4a4b10954
create unique temporary directories to prevent permission issues ( #7173 )
...
* minor fix to prevent multiple mldotnet directories
* forgot missing increment
2024-07-26 21:04:17 -06:00
Michael Sharp
ba94d7abff
Add package readmes ( #7200 )
...
* minor formatting issues, use preview 5, and package readmes
* fixing async void test
* test fixes
* added in ReadExactly extension to stream
* fixes from PR comments
* added types to fast tree
* fixed overflow
* fixed package md file
2024-07-26 21:03:35 -06:00