Microsoft 2019-07-19 12:39:48 -07:00
Commit 0fee8b3684
5 changed files: 231 additions and 0 deletions

3
.gitignore vendored Normal file

@@ -0,0 +1,3 @@
# build output
*.docx
*.pdf


@@ -0,0 +1,64 @@
# Computational Use of Data Agreement v0.1
# Annotated-Discussion-DRAFT-20190722
This is the Computational Use of Data Agreement, Version 0.1 (the “C-UDA”). Capitalized terms are defined in Section 5. Data Provider and you agree as follows:
**Comment**: _The C-UDA was developed for use by a Data Provider that owns or controls Data, or has assembled Data from lawfully accessed, publicly available sources, and wishes to limit the use for computational purposes to be consistent with copyright laws. The C-UDA is not intended for Data that may include personal data. Specifically, it is not appropriate for data sets that might include materials subject to privacy laws such as the GDPR or HIPAA._
1. **Provision of the Data**
1.1. You may use, modify, and distribute the Data made available to you by the Data Provider under this C-UDA for Computational Use if you follow the C-UDA's terms.
**Comment**: _The C-UDA permits data to be used for computational use only, but it also allows the Data to be modified and redistributed so long as the Downstream Recipient also complies with the C-UDA's terms. Because the rights (whether copyright, database rights, or merely access rights) that may potentially apply to content included in a data set can vary around the world, the C-UDA is styled to permit a set of uses recognized under law rather than to grant specific rights that may or may not be applicable._
1.2. Data Provider will not sue you or any Downstream Recipient for any claim arising out of the use, modification, or distribution of the Data provided you meet the terms of the C-UDA.
**Comment**: _This is a promise by the Data Provider not to sue the user so long as the user complies with the C-UDA's requirements. It doesn't allow a Data Provider to terminate a permitted use of the Data, but it does allow the Data Provider to bring an action to enforce the C-UDA's terms._
2. **Restrictions**
2.1. You agree that you will use the Data solely for Computational Use.
**Comment**: _The agreement allows data provided under this C-UDA to be used for computational use (e.g., to train AI models), while not permitting other uses that might conflict with rights that may be held by third parties in the material within a database, such as broad rights to copy and distribute expressive works._
2.2. The C-UDA does not impose any restriction with respect to the use, modification, or distribution of Outputs.
3. **Redistribution of Data**
3.1. You may redistribute the Data, so long as:
3.1.1. You include with any Data you redistribute all credit or attribution information that you received with the Data, and your terms require any Downstream Recipient to do the same; and
3.1.2. You bind each recipient to whom you redistribute the Data to the terms of the C-UDA.
**Comment**: _The only requirements for redistributing Data are to maintain attribution (if any) and use restrictions under the C-UDA, so that Downstream Recipients are bound by it for their use. These two restrictions apply only to the Data, but not to Output. Maintaining attribution is an accepted practice for sharing data to indicate its source or provenance. Requiring the use of the C-UDA for subsequent distribution provides further certainty to a Data Provider that the authorized downstream uses of Data will be limited to computational use._
4. **No Warranty, Limitation of Liability**
4.1. Data Provider does not represent or warrant that it has any rights whatsoever in the Data.
**Comment**: _We have chosen a broad disclaimer of representations and warranties, which may not be appropriate in commercial contexts. Because the Data Provider makes no claims that they have rights in the data, the Downstream Recipient must ensure that its use of the Data conforms to applicable laws or regulations. A Data Provider should not use the C-UDA for Data that it knows should not be distributed, or data that contains sensitive or private information._
4.2. THE DATA IS PROVIDED ON AN “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, EITHER EXPRESS OR IMPLIED INCLUDING, WITHOUT LIMITATION, ANY WARRANTIES OR CONDITIONS OF TITLE, NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
4.3. NEITHER DATA PROVIDER NOR ANY UPSTREAM DATA PROVIDER SHALL HAVE ANY LIABILITY FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING WITHOUT LIMITATION LOST PROFITS), HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE DATA OR OUTPUTS, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
**Comment**: _These disclaimers are necessary to limit the Data Provider's liability. They disclaim both express and implied warranties or representations, and the disclaimer applies to all Upstream Data Providers. These limitations of liability are common in the open data context to encourage possessors of data to share the data without requiring them to accept liability to a user for downstream uses of the data, but may not be appropriate in commercial contexts._
5. **Definitions**
5.1. “Computational Use” means activities necessary to enable the use of Data (alone or along with other material) for analysis by a computer.
5.2. “Data” means the material you receive under the C-UDA in modified or unmodified form, but not including Output.
**Comment**: _The term "Data" encompasses both the initial data made available to "you" as well as any later modifications made to that data by a Data Provider or a Downstream Recipient that redistributes the data. This also means that Downstream Recipients remain free to modify the data. Data specifically excludes Outputs._
5.3. “Data Provider” means the source from which you receive the Data and with whom you enter into the C-UDA.
5.4. “Downstream Recipient” means any person or persons who receives the Data directly or indirectly from you in accordance with the C-UDA.
5.5. “Output” means the outcomes or results that you obtain from your use of Data that do not include more than a de minimis portion of the Data on which the use is based. Output may include de minimis portions of the Data necessary to report on or explain use that has been conducted with the Data, such as figures in scientific papers, but does not include more. Artificial intelligence models trained on Data (and which do not include more than a de minimis portion of Data) are Output.
**Comment**: _The C-UDA defines “Output” to clarify that any AI model produced from the use of the Data (e.g., as a training set) should not typically be considered to be subject to the C-UDA's restrictions. The C-UDA considers that Outputs are not a "derivative" or a "modification" of the Data if they don't contain more than a de minimis portion of the Data. This is also intended to clarify that research papers and accompanying figures that may include only a de minimis part of the Data are not subject to any restriction in the C-UDA._
5.6. “Upstream Data Providers” means the source or sources from which the Data Provider directly or indirectly received, under the terms of the C-UDA, material that is included in the Data.

116
LICENSE Normal file

@@ -0,0 +1,116 @@
CC0 1.0 Universal
Statement of Purpose
The laws of most jurisdictions throughout the world automatically confer
exclusive Copyright and Related Rights (defined below) upon the creator and
subsequent owner(s) (each and all, an "owner") of an original work of
authorship and/or a database (each, a "Work").
Certain owners wish to permanently relinquish those rights to a Work for the
purpose of contributing to a commons of creative, cultural and scientific
works ("Commons") that the public can reliably and without fear of later
claims of infringement build upon, modify, incorporate in other works, reuse
and redistribute as freely as possible in any form whatsoever and for any
purposes, including without limitation commercial purposes. These owners may
contribute to the Commons to promote the ideal of a free culture and the
further production of creative, cultural and scientific works, or to gain
reputation or greater distribution for their Work in part through the use and
efforts of others.
For these and/or other purposes and motivations, and without any expectation
of additional consideration or compensation, the person associating CC0 with a
Work (the "Affirmer"), to the extent that he or she is an owner of Copyright
and Related Rights in the Work, voluntarily elects to apply CC0 to the Work
and publicly distribute the Work under its terms, with knowledge of his or her
Copyright and Related Rights in the Work and the meaning and intended legal
effect of CC0 on those rights.
1. Copyright and Related Rights. A Work made available under CC0 may be
protected by copyright and related or neighboring rights ("Copyright and
Related Rights"). Copyright and Related Rights include, but are not limited
to, the following:
i. the right to reproduce, adapt, distribute, perform, display, communicate,
and translate a Work;
ii. moral rights retained by the original author(s) and/or performer(s);
iii. publicity and privacy rights pertaining to a person's image or likeness
depicted in a Work;
iv. rights protecting against unfair competition in regards to a Work,
subject to the limitations in paragraph 4(a), below;
v. rights protecting the extraction, dissemination, use and reuse of data in
a Work;
vi. database rights (such as those arising under Directive 96/9/EC of the
European Parliament and of the Council of 11 March 1996 on the legal
protection of databases, and under any national implementation thereof,
including any amended or successor version of such directive); and
vii. other similar, equivalent or corresponding rights throughout the world
based on applicable law or treaty, and any national implementations thereof.
2. Waiver. To the greatest extent permitted by, but not in contravention of,
applicable law, Affirmer hereby overtly, fully, permanently, irrevocably and
unconditionally waives, abandons, and surrenders all of Affirmer's Copyright
and Related Rights and associated claims and causes of action, whether now
known or unknown (including existing as well as future claims and causes of
action), in the Work (i) in all territories worldwide, (ii) for the maximum
duration provided by applicable law or treaty (including future time
extensions), (iii) in any current or future medium and for any number of
copies, and (iv) for any purpose whatsoever, including without limitation
commercial, advertising or promotional purposes (the "Waiver"). Affirmer makes
the Waiver for the benefit of each member of the public at large and to the
detriment of Affirmer's heirs and successors, fully intending that such Waiver
shall not be subject to revocation, rescission, cancellation, termination, or
any other legal or equitable action to disrupt the quiet enjoyment of the Work
by the public as contemplated by Affirmer's express Statement of Purpose.
3. Public License Fallback. Should any part of the Waiver for any reason be
judged legally invalid or ineffective under applicable law, then the Waiver
shall be preserved to the maximum extent permitted taking into account
Affirmer's express Statement of Purpose. In addition, to the extent the Waiver
is so judged Affirmer hereby grants to each affected person a royalty-free,
non transferable, non sublicensable, non exclusive, irrevocable and
unconditional license to exercise Affirmer's Copyright and Related Rights in
the Work (i) in all territories worldwide, (ii) for the maximum duration
provided by applicable law or treaty (including future time extensions), (iii)
in any current or future medium and for any number of copies, and (iv) for any
purpose whatsoever, including without limitation commercial, advertising or
promotional purposes (the "License"). The License shall be deemed effective as
of the date CC0 was applied by Affirmer to the Work. Should any part of the
License for any reason be judged legally invalid or ineffective under
applicable law, such partial invalidity or ineffectiveness shall not
invalidate the remainder of the License, and in such case Affirmer hereby
affirms that he or she will not (i) exercise any of his or her remaining
Copyright and Related Rights in the Work or (ii) assert any associated claims
and causes of action with respect to the Work, in either case contrary to
Affirmer's express Statement of Purpose.
4. Limitations and Disclaimers.
a. No trademark or patent rights held by Affirmer are waived, abandoned,
surrendered, licensed or otherwise affected by this document.
b. Affirmer offers the Work as-is and makes no representations or warranties
of any kind concerning the Work, express, implied, statutory or otherwise,
including without limitation warranties of title, merchantability, fitness
for a particular purpose, non infringement, or the absence of latent or
other defects, accuracy, or the present or absence of errors, whether or not
discoverable, all to the greatest extent permissible under applicable law.
c. Affirmer disclaims responsibility for clearing rights of other persons
that may apply to the Work or any use thereof, including without limitation
any person's Copyright and Related Rights in the Work. Further, Affirmer
disclaims responsibility for obtaining any necessary consents, permissions
or other rights required for any use of the Work.
d. Affirmer understands and acknowledges that Creative Commons is not a
party to this document and has no duty or obligation with respect to this
CC0 or use of the Work.
For more information, please see
<http://creativecommons.org/publicdomain/zero/1.0/>

46
README.md Normal file

@@ -0,0 +1,46 @@
# Computational Use of Data Agreement (C-UDA)
## Goal
Sharing data can help address some of society's biggest challenges and can help individuals and organizations be more innovative, efficient, and productive. We want to make it easier for individuals and organizations that want to share data to do so. We're working with companies, academics, and researchers to build better processes and tools. As a first step, we've taken a closer look at a specific data use scenario with this Computational Use of Data Agreement (C-UDA), intended to complement the [Open Use of Data Agreement (O-UDA)](https://github.com/microsoft/Open-Use-of-Data-Agreement/). The goal of the C-UDA is to define a use, for AI training purposes, of data sets that contain third party materials, in a manner consistent with law. We hope to gather community input that evolves the agreement for broad use. Our aim is to release a v1 of the C-UDA in Fall 2019. Please provide feedback by October 1, 2019.
For more information on Microsoft's resources for Removing Barriers to Data Innovation, visit [here](https://news.microsoft.com/datainnovation).
## Overview
The C-UDA is a simple agreement that allows the data holder to make data available to anyone for computational use purposes, such as artificial intelligence, machine learning, and text and data mining. In short:
* It is intended for data sets that may include material not owned or controlled by the data distributor.
* It addresses data that is assembled from lawfully accessed, publicly available sources to be used for computational analysis.
* Redistribution of the Output from use of the data under the agreement—including results of analysis of the data or ML models trained with the data—carries no obligations.
* Redistribution of data under the agreement—modified or unmodified—requires use of the C-UDA.
* The redistribution obligations are designed to encourage sharing by limiting the liability of the data provider and ensuring that those downstream can identify where the data came from.
## Contemplated use case
We envision that this agreement is suitable for situations where the original data provider owns or has lawfully acquired the material in the data set (because they have express permission to use the material), or where they have assembled materials from lawfully and publicly accessible sources and the data is appropriate for distribution for computational use purposes. Permission to redistribute this material is limited to computational analysis to remain compliant with legal precedent and statutory exceptions and to respect the legitimate interests of third party rights owners.
This agreement is not recommended where the data provider includes material in the data set that (i) was not lawfully accessed and is not appropriate for distribution for computational use purposes, (ii) is subject to a legally binding restriction that restricts its further distribution, or (iii) raises privacy concerns arising from its distribution. Data Providers may need to consider whether additional measures are appropriate to ensure that data is not made available for use beyond legally permissible computational uses.
A limitation of this agreement is that it does not authorize uses beyond computational use that may otherwise be legally permissible.
With this agreement, Microsoft is not giving legal advice. Please consider your own circumstances and seek your own legal counsel as needed.
## The C-UDA does not meet the Open Data Definition
The C-UDA is not intended to be and should not be described as an open data license. Specifically, it does not permit use for any purpose as described in Sections 2.1.1 and 2.1.8 of the [Open Definition](https://opendefinition.org/od/2.1/en/). The C-UDA is intended to address situations in which data cannot be shared under an open license, but it is possible for a data provider to permit computational use. For situations in which an open data license is appropriate, see the O-UDA and [other open data licenses](https://github.com/microsoft/Open-Use-of-Data-Agreement#why-a-new-license).
## Why a "computation" only agreement?
We developed the C-UDA to address a gap among current public agreements. Data that is useful for computational analysis may often include copyrightable content, and global legal precedent and legislation have confirmed that copyrighted works may be used for computational use without express consent of the owner. However, continued perceived uncertainty over copyright law has caused many data providers to resort to limitations that significantly restrict who can use data, or how the data can be used, in ways that may be more restrictive than those permitted by applicable law or legislation. These restrictions may create uncertainty or cause confusion among users, which greatly limits the usefulness and benefit of data sets containing copyrighted works in artificial intelligence activities, such as machine learning. The C-UDA does not restrict who can use such data, but it limits the use of data to computational analysis to be consistent with applicable law and legislation, and to respect the legitimate interest of rights holders.
## Contributing
This project welcomes contributions and suggestions under [CC0-1.0](https://creativecommons.org/share-your-work/public-domain/cc0/). To suggest edits, open a [Pull Request](https://help.github.com/en/articles/editing-files-in-another-users-repository); to start a discussion, open an [Issue](https://help.github.com/en/articles/creating-an-issue). If you prefer to submit comments via email, please send them to [datainno@microsoft.com](mailto:datainno@microsoft.com). If you wish your comments to remain anonymous, please submit them by email and say so in the first line of the email.
This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/).
For more information see the [Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/) or
contact [opencode@microsoft.com](mailto:opencode@microsoft.com) with any additional questions or comments.
## Legal Notices
Microsoft and any contributors grant you a license to content in this repository under CC0-1.0, see the [LICENSE](LICENSE) file.

2
build-docs.ps1 Normal file

@@ -0,0 +1,2 @@
pandoc .\C-UDA-0.1_annotated_discussion-draft.md -o .\C-UDA-0.1_annotated_discussion-draft.docx
pandoc .\C-UDA-0.1_annotated_discussion-draft.md -o .\C-UDA-0.1_annotated_discussion-draft.pdf
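
The two pandoc invocations in build-docs.ps1 could also be driven from a cross-platform script. The following is a minimal sketch and not part of this commit: the source file name comes from the script above, while `build_commands`, `FORMATS`, and `main` are hypothetical names introduced for illustration. It assumes `pandoc` is on the PATH; pandoc's PDF output typically also needs a PDF engine (such as a LaTeX distribution) to be installed.

```python
import shutil
import subprocess
from pathlib import Path

# File name taken from build-docs.ps1; all other names are illustrative.
SOURCE = Path("C-UDA-0.1_annotated_discussion-draft.md")
FORMATS = ("docx", "pdf")  # the two outputs ignored by .gitignore


def build_commands(source, formats=FORMATS):
    """Return one pandoc argument list per output format."""
    return [
        ["pandoc", str(source), "-o", str(source.with_suffix("." + fmt))]
        for fmt in formats
    ]


def main():
    # Skip quietly when the tool or the source file is unavailable.
    if shutil.which("pandoc") is None:
        print("pandoc not found on PATH; nothing built")
        return
    if not SOURCE.exists():
        print(f"{SOURCE} not found; nothing built")
        return
    for cmd in build_commands(SOURCE):
        # check=True raises CalledProcessError if pandoc exits non-zero.
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    main()
```

Keeping command construction separate from execution makes it possible to inspect the exact arguments without running pandoc.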