language:
  - en
license: apache-2.0
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - dense
  - generated_from_trainer
  - dataset_size:391
  - loss:MatryoshkaLoss
  - loss:MultipleNegativesRankingLoss
base_model: intfloat/multilingual-e5-large
widget:
  - source_sentence: What does 'personal data breach' entail?
    sentences:
      - >-
        1.Processing of personal data revealing racial or ethnic origin,
        political opinions, religious or philosophical beliefs, or trade union
        membership, and the processing of genetic data, biometric data for the
        purpose of uniquely identifying a natural person, data concerning health
        or data concerning a natural person's sex life or sexual orientation
        shall be prohibited.

        2.Paragraph 1 shall not apply if one of the following applies: (a)  the
        data subject has given explicit consent to the processing of those
        personal data for one or more specified purposes, except where Union or
        Member State law provide that the prohibition referred to in paragraph 1
        may not be lifted by the data subject; (b)  processing is necessary for
        the purposes of carrying out the obligations and exercising specific
        rights of the controller or of the data subject in the field of
        employment and social security and social protection law in so far as it
        is authorised by Union or Member State law or a collective agreement
        pursuant to Member State law providing for appropriate safeguards for
        the fundamental rights and the interests of the data subject; (c) 
        processing is necessary to protect the vital interests of the data
        subject or of another natural person where the data subject is
        physically or legally incapable of giving consent; (d)  processing is
        carried out in the course of its legitimate activities with appropriate
        safeguards by a foundation, association or any other not-for-profit body
        with a political, philosophical, religious or trade union aim and on
        condition that the processing relates solely to the members or to former
        members of the body or to persons who have regular contact with it in
        connection with its purposes and that the personal data are not
        disclosed outside that body without the consent of the data subjects;
        (e)  processing relates to personal data which are manifestly made
        public by the data subject; (f)  processing is necessary for the
        establishment, exercise or defence of legal claims or whenever courts
        are acting in their judicial capacity; (g)  processing is necessary for
        reasons of substantial public interest, on the basis of Union or Member
        State law which shall be proportionate to the aim pursued, respect the
        essence of the right to data protection and provide for suitable and
        specific measures to safeguard the fundamental rights and the interests
        of the data subject; (h)  processing is necessary for the purposes of
        preventive or occupational medicine, for the assessment of the working
        capacity of the employee, medical diagnosis, the provision of health or
        social care or treatment or the management of health or social care
        systems and services on the basis of Union or Member State law or
        pursuant to contract with a health professional and subject to the
        conditions and safeguards referred to in paragraph 3; (i)  processing is
        necessary for reasons of public interest in the area of public health,
        such as protecting against serious cross-border threats to health or
        ensuring high standards of quality and safety of health care and of
        medicinal products or medical devices, on the basis of Union or Member
        State law which provides for suitable and specific measures to safeguard
        the rights and freedoms of the data subject, in particular professional
        secrecy; (j)  processing is necessary for archiving
        purposes in the public interest, scientific or historical research
        purposes or statistical purposes in accordance with Article 89(1) based
        on Union or Member State law which shall be proportionate to the aim
        pursued, respect the essence of the right to data protection and provide
        for suitable and specific measures to safeguard the fundamental rights
        and the interests of the data subject.

        3.Personal data referred to in paragraph 1 may be processed for the
        purposes referred to in point (h) of paragraph 2 when those data are
        processed by or under the responsibility of a professional subject to
        the obligation of professional secrecy under Union or Member State law
        or rules established by national competent bodies or by another person
        also subject to an obligation of secrecy under Union or Member State law
        or rules established by national competent bodies.

        4.Member States may maintain or introduce further conditions, including
        limitations, with regard to the processing of genetic data, biometric
        data or data concerning health.
      - >-
        (1) 'personal data' means any information relating to an identified or
        identifiable natural person ('data subject'); an identifiable natural
        person is one who can be identified, directly or indirectly, in
        particular by reference to an identifier such as a name, an
        identification number, location data, an online identifier or to one or
        more factors specific to the physical, physiological, genetic, mental,
        economic, cultural or social identity of that natural person;

        (2) ‘processing’ means any operation or set of operations which is
        performed on personal data or on sets of personal data, whether or not
        by automated means, such as collection, recording, organisation,
        structuring, storage, adaptation or alteration, retrieval, consultation,
        use, disclosure by transmission, dissemination or otherwise making
        available, alignment or combination, restriction, erasure or
        destruction;

        (3) ‘restriction of processing’ means the marking of stored personal
        data with the aim of limiting their processing in the future;

        (4) ‘profiling’ means any form of automated processing of personal data
        consisting of the use of personal data to evaluate certain personal
        aspects relating to a natural person, in particular to analyse or
        predict aspects concerning that natural person's performance at work,
        economic situation, health, personal preferences, interests,
        reliability, behaviour, location or movements;

        (5) ‘pseudonymisation’ means the processing of personal data in such a
        manner that the personal data can no longer be attributed to a specific
        data subject without the use of additional information, provided that
        such additional information is kept separately and is subject to
        technical and organisational measures to ensure that the personal data
        are not attributed to an identified or identifiable natural person;

        (6) ‘filing system’ means any structured set of personal data which are
        accessible according to specific criteria, whether centralised,
        decentralised or dispersed on a functional or geographical basis;

        (7) ‘controller’ means the natural or legal person, public authority,
        agency or other body which, alone or jointly with others, determines the
        purposes and means of the processing of personal data; where the
        purposes and means of such processing are determined by Union or Member
        State law, the controller or the specific criteria for its nomination
        may be provided for by Union or Member State law;

        (8) ‘processor’ means a natural or legal person, public authority,
        agency or other body which processes personal data on behalf of the
        controller;

        (9) ‘recipient’ means a natural or legal person, public authority,
        agency or another body, to which the personal data are disclosed,
        whether a third party or not. However, public authorities which may
        receive personal data in the framework of a particular inquiry in
        accordance with Union or Member State law shall not be regarded as
        recipients; the processing of those data by those public authorities
        shall be in compliance with the applicable data protection rules
        according to the purposes of the processing;

        (10) ‘third party’ means a natural or legal person, public authority,
        agency or body other than the data subject, controller, processor and
        persons who, under the direct authority of the controller or processor,
        are authorised to process personal data;

        (11) ‘consent’ of the data subject means any freely given, specific,
        informed and unambiguous indication of the data subject's wishes by
        which he or she, by a statement or by a clear affirmative action,
        signifies agreement to the processing of personal data relating to him
        or her;

        (12) ‘personal data breach’ means a breach of security leading to the
        accidental or unlawful destruction, loss, alteration, unauthorised
        disclosure of, or access to, personal data transmitted, stored or
        otherwise processed;

        (13) ‘genetic data’ means personal data relating to the inherited or
        acquired genetic characteristics of a natural person which give unique
        information about the physiology or the health of that natural person
        and which result, in particular, from an analysis of a biological sample
        from the natural person in question;

        (14) ‘biometric data’ means personal data resulting from specific
        technical processing relating to the physical, physiological or
        behavioural characteristics of a natural person, which allow or confirm
        the unique identification of that natural person, such as facial images
        or dactyloscopic data;

        (15) ‘data concerning health’ means personal data related to the
        physical or mental health of a natural person, including the provision
        of health care services, which reveal information about his or her
        health status;

        (16) ‘main establishment’ means: (a) as regards a controller with
        establishments in more than one Member State, the place of its central
        administration in the Union, unless the decisions on the purposes and
        means of the processing of personal data are taken in another
        establishment of the controller in the Union and the latter
        establishment has the power to have such decisions implemented, in which
        case the establishment having taken such decisions is to be considered
        to be the main establishment; (b) as regards a processor with
        establishments in more than one Member State, the place of its central
        administration in the Union, or, if the processor has no central
        administration in the Union, the establishment of the processor in the
        Union where the main processing activities in the context of the
        activities of an establishment of the processor take place to the extent
        that the processor is subject to specific obligations under this
        Regulation;

        (17) ‘representative’ means a natural or legal person established in the
        Union who, designated by the controller or processor in writing pursuant
        to Article 27, represents the controller or processor with regard to
        their respective obligations under this Regulation;

        (18) ‘enterprise’ means a natural or legal person engaged in an economic
        activity, irrespective of its legal form, including partnerships or
        associations regularly engaged in an economic activity;

        (19) ‘group of undertakings’ means a controlling undertaking and its
        controlled undertakings;

        (20) ‘binding corporate rules’ means personal data protection policies
        which are adhered to by a controller or processor established on the
        territory of a Member State for transfers or a set of transfers of
        personal data to a controller or processor in one or more third
        countries within a group of undertakings, or group of enterprises
        engaged in a joint economic activity;

        (21) ‘supervisory authority’ means an independent public authority which
        is established by a Member State pursuant to Article 51;

        (22) ‘supervisory authority concerned’ means a supervisory authority
        which is concerned by the processing of personal data because: (a) the
        controller or processor is established on the territory of the Member
        State of that supervisory authority; (b) data subjects residing in the
        Member State of that supervisory authority are substantially affected or
        likely to be substantially affected by the processing; or (c) a
        complaint has been lodged with that supervisory authority;

        (23) ‘cross-border processing’ means either: (a) processing of personal
        data which takes place in the context of the activities of
        establishments in more than one Member State of a controller or
        processor in the Union where the controller or processor is established
        in more than one Member State; or (b) processing of personal data which
        takes place in the context of the activities of a single establishment
        of a controller or processor in the Union but which substantially
        affects or is likely to substantially affect data subjects in more than
        one Member State.

        (24) ‘relevant and reasoned objection’ means an objection to a draft
        decision as to whether there is an infringement of this Regulation, or
        whether envisaged action in relation to the controller or processor
        complies with this Regulation, which clearly demonstrates the
        significance of the risks posed by the draft decision as regards the
        fundamental rights and freedoms of data subjects and, where applicable,
        the free flow of personal data within the Union;

        (25) ‘information society service’ means a service as defined in point
        (b) of Article 1(1) of Directive (EU) 2015/1535 of the European
        Parliament and of the Council (1);

        (26) ‘international organisation’ means an organisation and its
        subordinate bodies governed by public international law, or any other
        body which is set up by, or on the basis of, an agreement between two or
        more countries.
      - >-
        Any processing of personal data should be lawful and fair. It should be
        transparent to natural persons that personal data concerning them are
        collected, used, consulted or otherwise processed and to what extent the
        personal data are or will be processed. The principle of transparency
        requires that any information and communication relating to the
        processing of those personal data be easily accessible and easy to
        understand, and that clear and plain language be used. That principle
        concerns, in particular, information to the data subjects on the
        identity of the controller and the purposes of the processing and
        further information to ensure fair and transparent processing in respect
        of the natural persons concerned and their right to obtain confirmation
        and communication of personal data concerning them which are being
        processed. Natural persons should be made aware of risks, rules,
        safeguards and rights in relation to the processing of personal data and
        how to exercise their rights in relation to such processing. In
        particular, the specific purposes for which personal data are processed
        should be explicit and legitimate and determined at the time of the
        collection of the personal data. The personal data should be adequate,
        relevant and limited to what is necessary for the purposes for which
        they are processed. This requires, in particular, ensuring that the
        period for which the personal data are stored is limited to a strict
        minimum. Personal data should be processed only if the purpose of the
        processing could not reasonably be fulfilled by other means. In order to
        ensure that the personal data are not kept longer than necessary, time
        limits should be established by the controller for erasure or for a
        periodic review. Every reasonable step should be taken to ensure that
        personal data which are inaccurate are rectified or deleted. Personal
        data should be processed in a manner that ensures appropriate security
        and confidentiality of the personal data, including for preventing
        unauthorised access to or use of personal data and the equipment used
        for the processing.
  - source_sentence: >-
      In what situations could providing information to the data subject be
      considered impossible or involve a disproportionate effort?
    sentences:
      - >-
        1.The controller shall consult the supervisory authority prior to
        processing where a data protection impact assessment under Article 35
        indicates that the processing would result in a high risk in the absence
        of measures taken by the controller to mitigate the risk.

        2.Where the supervisory authority is of the opinion that the intended
        processing referred to in paragraph 1 would infringe this Regulation, in
        particular where the controller has insufficiently identified or
        mitigated the risk, the supervisory authority shall, within period of up
        to eight weeks of receipt of the request for consultation, provide
        written advice to the controller and, where applicable to the processor,
        and may use any of its powers referred to in Article 58. That period may
        be extended by six weeks, taking into account the complexity of the
        intended processing. The supervisory authority shall inform the
        controller and, where applicable, the processor, of any such extension
        within one month of receipt of the request for consultation together
        with the reasons for the delay. Those periods may be suspended until the
        supervisory authority has obtained information it has requested for the
        purposes of the consultation.

        3.When consulting the supervisory authority pursuant to paragraph 1, the
        controller shall provide the supervisory authority with: (a)  where
        applicable, the respective responsibilities of the controller, joint
        controllers and processors involved in the processing, in particular for
        processing within a group of undertakings; (b)  the purposes and means
        of the intended processing; (c)  the measures and safeguards provided to
        protect the rights and freedoms of data subjects pursuant to this
        Regulation; (d)  where applicable, the contact details of the data
        protection officer; (e)  the data protection impact
        assessment provided for in Article 35; and (f)  any other information
        requested by the supervisory authority.

        4.Member States shall consult the supervisory authority during the
        preparation of a proposal for a legislative measure to be adopted by a
        national parliament, or of a regulatory measure based on such a
        legislative measure, which relates to processing.

        5.Notwithstanding paragraph 1, Member State law may require controllers
        to consult with, and obtain prior authorisation from, the supervisory
        authority in relation to processing by a controller for the performance
        of a task carried out by the controller in the public interest,
        including processing in relation to social protection and public health.
      - >-
        1.The Member States, the supervisory authorities, the Board and the
        Commission shall encourage, in particular at Union level, the
        establishment of data protection certification mechanisms and of data
        protection seals and marks, for the purpose of demonstrating compliance
        with this Regulation of processing operations by controllers and
        processors. The specific needs of micro, small and medium-sized
        enterprises shall be taken into account.

        2.In addition to adherence by controllers or processors subject to this
        Regulation, data protection certification mechanisms, seals or marks
        approved pursuant to paragraph 5 of this Article may be established for
        the purpose of demonstrating the existence of appropriate safeguards
        provided by controllers or processors that are not subject to this
        Regulation pursuant to Article 3 within the framework of personal data
        transfers to third countries or international organisations under the
        terms referred to in point (f) of Article 46(2). Such controllers or
        processors shall make binding and enforceable commitments, via
        contractual or other legally binding instruments, to apply those
        appropriate safeguards, including with regard to the rights of data
        subjects.

        3.The certification shall be voluntary and available via a process that
        is transparent.

        4.A certification pursuant to this Article does not reduce the
        responsibility of the controller or the processor for compliance with
        this Regulation and is without prejudice to the tasks and powers of the
        supervisory authorities which are competent pursuant to Article 55 or 56.

        5.A certification pursuant to this Article shall be issued by the
        certification bodies referred to in Article 43 or by the competent
        supervisory authority, on the basis of criteria approved by that
        competent supervisory authority pursuant to Article 58(3) or by the
        Board pursuant to Article 63. Where the criteria are approved by the
        Board, this may result in a common certification, the European Data
        Protection Seal.

        6.The controller or processor which submits its processing to the
        certification mechanism shall provide the certification body referred to
        in Article 43, or where applicable, the competent supervisory authority,
        with all information and access to its processing activities which are
        necessary to conduct the certification procedure.

        7.Certification shall be issued to a controller or processor for a
        maximum period of three years and may be renewed, under the same
        conditions, provided that the relevant requirements continue to be met.
        Certification shall be withdrawn, as applicable, by the certification
        bodies referred to in Article 43 or by the competent supervisory
        authority where the requirements for the certification are not or are no
        longer met.

        8.The Board shall collate all certification mechanisms and data
        protection seals and marks in a register and shall make them publicly
        available by any appropriate means.
      - >-
        However, it is not necessary to impose the obligation to provide
        information where the data subject already possesses the information,
        where the recording or disclosure of the personal data is expressly laid
        down by law or where the provision of information to the data subject
        proves to be impossible or would involve a disproportionate effort. The
        latter could in particular be the case where processing is carried out
        for archiving purposes in the public interest, scientific or historical
        research purposes or statistical purposes. In that regard, the number of
        data subjects, the age of the data and any appropriate safeguards
        adopted should be taken into consideration.
  - source_sentence: >-
      What is the data subject provided with prior to further processing of
      personal data?
    sentences:
      - >-
        1.Where personal data relating to a data subject are collected from the
        data subject, the controller shall, at the time when personal data are
        obtained, provide the data subject with all of the following
        information: (a)  the identity and the contact details of the controller
        and, where applicable, of the controller's representative; (b)  the
        contact details of the data protection officer, where applicable; (c) 
        the purposes of the processing for which the personal data are intended
        as well as the legal basis for the processing; (d) 
        where the processing is based on point (f) of Article 6(1), the
        legitimate interests pursued by the controller or by a third party; (e) 
        the recipients or categories of recipients of the personal data, if any;
        (f)  where applicable, the fact that the controller intends to transfer
        personal data to a third country or international organisation and the
        existence or absence of an adequacy decision by the Commission, or in
        the case of transfers referred to in Article 46 or 47, or the second
        subparagraph of Article 49(1), reference to the appropriate or suitable
        safeguards and the means by which to obtain a copy of them or where they
        have been made available.

        2.In addition to the information referred to in paragraph 1, the
        controller shall, at the time when personal data are obtained, provide
        the data subject with the following further information necessary to
        ensure fair and transparent processing: (a)  the period for which the
        personal data will be stored, or if that is not possible, the criteria
        used to determine that period; (b)  the existence of the right to
        request from the controller access to and rectification or erasure of
        personal data or restriction of processing concerning the data subject
        or to object to processing as well as the right to data portability;
        (c)  where the processing is based on point (a) of Article 6(1) or point
        (a) of Article 9(2), the existence of the right to withdraw consent at
        any time, without affecting the lawfulness of processing based on
        consent before its withdrawal; (d)  the right to lodge a complaint with
        a supervisory authority; (e)  whether the provision of personal data is
        a statutory or contractual requirement, or a requirement necessary to
        enter into a contract, as well as whether the data subject is obliged to
        provide the personal data and of the possible consequences of failure to
        provide such data; (f)  the existence of automated decision-making,
        including profiling, referred to in Article 22(1) and (4) and, at least
        in those cases, meaningful information about the logic involved, as well
        as the significance and the envisaged consequences of such processing
        for the data subject.

        3.Where the controller intends to further process the personal data for
        a purpose other than that for which the personal data were collected,
        the controller shall provide the data subject prior to that further
        processing with information on that other purpose and with any relevant
        further information as referred to in paragraph 2.

        4.Paragraphs 1, 2 and 3 shall not apply where and insofar as the data
        subject already has the information.
      - >-
        This Regulation respects and does not prejudice the status under
        existing constitutional law of churches and religious associations or
        communities in the Member States, as recognised in Article 17 TFEU.
      - >-
        (1) 'personal data' means any information relating to an identified or
        identifiable natural person ('data subject'); an identifiable natural
        person is one who can be identified, directly or indirectly, in
        particular by reference to an identifier such as a name, an
        identification number, location data, an online identifier or to one or
        more factors specific to the physical, physiological, genetic, mental,
        economic, cultural or social identity of that natural person;

        (2) ‘processing’ means any operation or set of operations which is
        performed on personal data or on sets of personal data, whether or not
        by automated means, such as collection, recording, organisation,
        structuring, storage, adaptation or alteration, retrieval, consultation,
        use, disclosure by transmission, dissemination or otherwise making
        available, alignment or combination, restriction, erasure or
        destruction;

        (3) ‘restriction of processing’ means the marking of stored personal
        data with the aim of limiting their processing in the future;

        (4) ‘profiling’ means any form of automated processing of personal data
        consisting of the use of personal data to evaluate certain personal
        aspects relating to a natural person, in particular to analyse or
        predict aspects concerning that natural person's performance at work,
        economic situation, health, personal preferences, interests,
        reliability, behaviour, location or movements;

        (5) ‘pseudonymisation’ means the processing of personal data in such a
        manner that the personal data can no longer be attributed to a specific
        data subject without the use of additional information, provided that
        such additional information is kept separately and is subject to
        technical and organisational measures to ensure that the personal data
        are not attributed to an identified or identifiable natural person;

        (6) ‘filing system’ means any structured set of personal data which are
        accessible according to specific criteria, whether centralised,
        decentralised or dispersed on a functional or geographical basis;

        (7) ‘controller’ means the natural or legal person, public authority,
        agency or other body which, alone or jointly with others, determines the
        purposes and means of the processing of personal data; where the
        purposes and means of such processing are determined by Union or Member
        State law, the controller or the specific criteria for its nomination
        may be provided for by Union or Member State law;

        (8) ‘processor’ means a natural or legal person, public authority,
        agency or other body which processes personal data on behalf of the
        controller;

        (9) ‘recipient’ means a natural or legal person, public authority,
        agency or another body, to which the personal data are disclosed,
        whether a third party or not. However, public authorities which may
        receive personal data in the framework of a particular inquiry in
        accordance with Union or Member State law shall not be regarded as
        recipients; the processing of those data by those public authorities
        shall be in compliance with the applicable data protection rules
        according to the purposes of the processing;

        (10) ‘third party’ means a natural or legal person, public authority,
        agency or body other than the data subject, controller, processor and
        persons who, under the direct authority of the controller or processor,
        are authorised to process personal data;

        (11) ‘consent’ of the data subject means any freely given, specific,
        informed and unambiguous indication of the data subject's wishes by
        which he or she, by a statement or by a clear affirmative action,
        signifies agreement to the processing of personal data relating to him
        or her;

        (12) ‘personal data breach’ means a breach of security leading to the
        accidental or unlawful destruction, loss, alteration, unauthorised
        disclosure of, or access to, personal data transmitted, stored or
        otherwise processed;

        (13) ‘genetic data’ means personal data relating to the inherited or
        acquired genetic characteristics of a natural person which give unique
        information about the physiology or the health of that natural person
        and which result, in particular, from an analysis of a biological sample
        from the natural person in question;

        (14) ‘biometric data’ means personal data resulting from specific
        technical processing relating to the physical, physiological or
        behavioural characteristics of a natural person, which allow or confirm
        the unique identification of that natural person, such as facial images
        or dactyloscopic data;

        (15) ‘data concerning health’ means personal data related to the
        physical or mental health of a natural person, including the provision
        of health care services, which reveal information about his or her
        health status;

        (16) ‘main establishment’ means: (a) as regards a controller with
        establishments in more than one Member State, the place of its central
        administration in the Union, unless the decisions on the purposes and
        means of the processing of personal data are taken in another
        establishment of the controller in the Union and the latter
        establishment has the power to have such decisions implemented, in which
        case the establishment having taken such decisions is to be considered
        to be the main establishment; (b) as regards a processor with
        establishments in more than one Member State, the place of its central
        administration in the Union, or, if the processor has no central
        administration in the Union, the establishment of the processor in the
        Union where the main processing activities in the context of the
        activities of an establishment of the processor take place to the extent
        that the processor is subject to specific obligations under this
        Regulation;

        (17) ‘representative’ means a natural or legal person established in the
        Union who, designated by the controller or processor in writing pursuant
        to Article 27, represents the controller or processor with regard to
        their respective obligations under this Regulation;

        (18) ‘enterprise’ means a natural or legal person engaged in an economic
        activity, irrespective of its legal form, including partnerships or
        associations regularly engaged in an economic activity;

        (19) ‘group of undertakings’ means a controlling undertaking and its
        controlled undertakings;

        (20) ‘binding corporate rules’ means personal data protection policies
        which are adhered to by a controller or processor established on the
        territory of a Member State for transfers or a set of transfers of
        personal data to a controller or processor in one or more third
        countries within a group of undertakings, or group of enterprises
        engaged in a joint economic activity;

        (21) ‘supervisory authority’ means an independent public authority which
        is established by a Member State pursuant to Article 51;

        (22) ‘supervisory authority concerned’ means a supervisory authority
        which is concerned by the processing of personal data because: (a) the
        controller or processor is established on the territory of the Member
        State of that supervisory authority; (b) data subjects residing in the
        Member State of that supervisory authority are substantially affected or
        likely to be substantially affected by the processing; or (c) a
        complaint has been lodged with that supervisory authority;

        (23) ‘cross-border processing’ means either: (a) processing of personal
        data which takes place in the context of the activities of
        establishments in more than one Member State of a controller or
        processor in the Union where the controller or processor is established
        in more than one Member State; or (b) processing of personal data which
        takes place in the context of the activities of a single establishment
        of a controller or processor in the Union but which substantially
        affects or is likely to substantially affect data subjects in more than
        one Member State.

        (24) ‘relevant and reasoned objection’ means an objection to a draft
        decision as to whether there is an infringement of this Regulation, or
        whether envisaged action in relation to the controller or processor
        complies with this Regulation, which clearly demonstrates the
        significance of the risks posed by the draft decision as regards the
        fundamental rights and freedoms of data subjects and, where applicable,
        the free flow of personal data within the Union;

        (25) ‘information society service’ means a service as defined in point
        (b) of Article 1(1) of Directive (EU) 2015/1535 of the European
        Parliament and of the Council (1);

        (26) ‘international organisation’ means an organisation and its
        subordinate bodies governed by public international law, or any other
        body which is set up by, or on the basis of, an agreement between two or
        more countries.
  - source_sentence: >-
      What type of data may be processed for purposes related to point (h) of
      paragraph 2?
    sentences:
      - >-
        1.Processing of personal data revealing racial or ethnic origin,
        political opinions, religious or philosophical beliefs, or trade union
        membership, and the processing of genetic data, biometric data for the
        purpose of uniquely identifying a natural person, data concerning health
        or data concerning a natural person's sex life or sexual orientation
        shall be prohibited.

        2.Paragraph 1 shall not apply if one of the following applies: (a)  the
        data subject has given explicit consent to the processing of those
        personal data for one or more specified purposes, except where Union or
        Member State law provide that the prohibition referred to in paragraph 1
        may not be lifted by the data subject; (b)  processing is necessary for
        the purposes of carrying out the obligations and exercising specific
        rights of the controller or of the data subject in the field of
        employment and social security and social protection law in so far as it
        is authorised by Union or Member State law or a collective agreement
        pursuant to Member State law providing for appropriate safeguards for
        the fundamental rights and the interests of the data subject; (c) 
        processing is necessary to protect the vital interests of the data
        subject or of another natural person where the data subject is
        physically or legally incapable of giving consent; (d)  processing is
        carried out in the course of its legitimate activities with appropriate
        safeguards by a foundation, association or any other not-for-profit body
        with a political, philosophical, religious or trade union aim and on
        condition that the processing relates solely to the members or to former
        members of the body or to persons who have regular contact with it in
        connection with its purposes and that the personal data are not
        disclosed outside that body without the consent of the data subjects;
        (e)  processing relates to personal data which are manifestly made
        public by the data subject; (f)  processing is necessary for the
        establishment, exercise or defence of legal claims or whenever courts
        are acting in their judicial capacity; (g)  processing is necessary for
        reasons of substantial public interest, on the basis of Union or Member
        State law which shall be proportionate to the aim pursued, respect the
        essence of the right to data protection and provide for suitable and
        specific measures to safeguard the fundamental rights and the interests
        of the data subject; (h)  processing is necessary for the purposes of
        preventive or occupational medicine, for the assessment of the working
        capacity of the employee, medical diagnosis, the provision of health or
        social care or treatment or the management of health or social care
        systems and services on the basis of Union or Member State law or
        pursuant to contract with a health professional and subject to the
        conditions and safeguards referred to in paragraph 3; (i)  processing is
        necessary for reasons of public interest in the area of public health,
        such as protecting against serious cross-border threats to health or
        ensuring high standards of quality and safety of health care and of
        medicinal products or medical devices, on the basis of Union or Member
        State law which provides for suitable and specific measures to safeguard
        the rights and freedoms of the data subject, in particular professional
        secrecy; (j)  processing is necessary for archiving
        purposes in the public interest, scientific or historical research
        purposes or statistical purposes in accordance with Article 89(1) based
        on Union or Member State law which shall be proportionate to the aim
        pursued, respect the essence of the right to data protection and provide
        for suitable and specific measures to safeguard the fundamental rights
        and the interests of the data subject.

        3.Personal data referred to in paragraph 1 may be processed for the
        purposes referred to in point (h) of paragraph 2 when those data are
        processed by or under the responsibility of a professional subject to
        the obligation of professional secrecy under Union or Member State law
        or rules established by national competent bodies or by another person
        also subject to an obligation of secrecy under Union or Member State law
        or rules established by national competent bodies.

        4.Member States may maintain or introduce further conditions, including
        limitations, with regard to the processing of genetic data, biometric
        data or data concerning health.
      - >-
        1.The data protection officer shall have at least the following tasks:
        (a)  to inform and advise the controller or the processor and the
        employees who carry out processing of their obligations pursuant to this
        Regulation and to other Union or Member State data protection
        provisions; (b)  to monitor compliance with this Regulation, with other
        Union or Member State data protection provisions and with the policies
        of the controller or processor in relation to the protection of personal
        data, including the assignment of responsibilities, awareness-raising
        and training of staff involved in processing operations, and the related
        audits; (c)  to provide advice where requested as regards the data
        protection impact assessment and monitor its performance pursuant to
        Article 35; (d)  to cooperate with the supervisory authority; (e)  to
        act as the contact point for the supervisory authority on issues
        relating to processing, including the prior consultation referred to in
        Article 36, and to consult, where appropriate, with regard to any other
        matter.

        2.The data protection officer shall in the performance of his or her
        tasks have due regard to the risk associated with processing operations,
        taking into account the nature, scope, context and purposes of
        processing.
      - >-
        Processing should be lawful where it is necessary in the context of a
        contract or the intention to enter into a contract.
  - source_sentence: >-
      What may impede authorities in the discharge of their responsibilities
      under Union law?
    sentences:
      - >-
        1.The controller and the processor shall designate a data protection
        officer in any case where: (a)  the processing is carried out by a
        public authority or body, except for courts acting in their judicial
        capacity; (b)  the core activities of the controller or the processor
        consist of processing operations which, by virtue of their nature, their
        scope and/or their purposes, require regular and systematic monitoring
        of data subjects on a large scale; or (c)  the core activities of the
        controller or the processor consist of processing on a large scale of
        special categories of data pursuant to Article 9 and personal data
        relating to criminal convictions and offences referred to in Article 10

        2.A group of undertakings may appoint a single data protection officer
        provided that a data protection officer is easily accessible from each
        establishment.

        3.Where the controller or the processor is a public authority or body, a
        single data protection officer may be designated for several such
        authorities or bodies, taking account of their organisational structure
        and size.

        4.In cases other than those referred to in paragraph 1, the controller
        or processor or associations and other bodies representing categories of
        controllers or processors may or, where required by Union or Member
        State law shall, designate a data protection officer. The data
        protection officer may act for such associations and other bodies
        representing controllers or processors.

        5.The data protection officer shall be designated on the basis of
        professional qualities and, in particular, expert knowledge of data
        protection law and practices and the ability to fulfil the tasks
        referred to in Article 39

        6.The data protection officer may be a staff member of the controller or
        processor, or fulfil the tasks on the basis of a service contract.

        7.The controller or the processor shall publish the contact details of
        the data protection officer and communicate them to the supervisory
        authority.
      - >-
        This Regulation is without prejudice to international agreements
        concluded between the Union and third countries regulating the transfer
        of personal data including appropriate safeguards for the data subjects.
        Member States may conclude international agreements which involve the
        transfer of personal data to third countries or international
        organisations, as far as such agreements do not affect this Regulation
        or any other provisions of Union law and include an appropriate level of
        protection for the fundamental rights of the data subjects.
      - >-
        The objectives and principles of Directive 95/46/EC remain sound, but it
        has not prevented fragmentation in the implementation of data protection
        across the Union, legal uncertainty or a widespread public perception
        that there are significant risks to the protection of natural persons,
        in particular with regard to online activity. Differences in the level
        of protection of the rights and freedoms of natural persons, in
        particular the right to the protection of personal data, with regard to
        the processing of personal data in the Member States may prevent the
        free flow of personal data throughout the Union. Those differences may
        therefore constitute an obstacle to the pursuit of economic activities
        at the level of the Union, distort competition and impede authorities in
        the discharge of their responsibilities under Union law. Such a
        difference in levels of protection is due to the existence of
        differences in the implementation and application of Directive 95/46/EC.
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
  - cosine_accuracy@1
  - cosine_accuracy@3
  - cosine_accuracy@5
  - cosine_accuracy@10
  - cosine_precision@1
  - cosine_precision@3
  - cosine_precision@5
  - cosine_precision@10
  - cosine_recall@1
  - cosine_recall@3
  - cosine_recall@5
  - cosine_recall@10
  - cosine_ndcg@10
  - cosine_mrr@10
  - cosine_map@100
model-index:
  - name: multilingual-e5-large
    results:
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: dim 1024
          type: dim_1024
        metrics:
          - type: cosine_accuracy@1
            value: 0.3290653008962868
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.3348271446862996
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.3559539052496799
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 0.3886043533930858
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.3290653008962868
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.32885189927443453
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.31869398207426375
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.28380281690140846
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.04062540337753272
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.11937529555421877
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.17929032559391017
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 0.2609802153031206
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.34967137880514326
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.3392725037091231
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.4165482880126111
            name: Cosine Map@100
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: dim 768
          type: dim_768
        metrics:
          - type: cosine_accuracy@1
            value: 0.3290653008962868
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.3348271446862996
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.3565941101152369
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 0.3911651728553137
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.3290653008962868
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.32885189927443453
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.31907810499359796
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.2860435339308579
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.040070803135958795
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.11769625185650755
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.17699013287798807
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 0.2600215922621299
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.35038934007937644
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.3396022600247949
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.41513115137941903
            name: Cosine Map@100
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: dim 512
          type: dim_512
        metrics:
          - type: cosine_accuracy@1
            value: 0.324583866837388
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.33034571062740076
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.3553137003841229
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 0.3886043533930858
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.324583866837388
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.3243704652155356
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.31549295774647884
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.28425096030729835
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.039408176645563966
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.11569400881462148
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.17452688474231048
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 0.2588290716980974
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.34755602204164354
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.3356639839034201
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.4105799203347045
            name: Cosine Map@100
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: dim 256
          type: dim_256
        metrics:
          - type: cosine_accuracy@1
            value: 0.3111395646606914
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.31882202304737517
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.3418693982074264
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 0.36619718309859156
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.3111395646606914
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.31156636790439607
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.30371318822023047
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.2723431498079385
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.03702717845490271
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.10903486138141442
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.16522998831931382
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 0.24388584743594785
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.3316834258973034
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.32126318517163555
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.3876570902519949
            name: Cosine Map@100
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: dim 128
          type: dim_128
        metrics:
          - type: cosine_accuracy@1
            value: 0.3028169014084507
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.3066581306017926
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.3258642765685019
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 0.354033290653009
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.3028169014084507
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.30217669654289375
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.29334186939820744
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.26325224071702946
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.03581534845465155
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.10498018962345104
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.15825094621698793
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 0.23457162530017844
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.3200987320599894
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.31138141983212375
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.3737489129899149
            name: Cosine Map@100
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: dim 64
          type: dim_64
        metrics:
          - type: cosine_accuracy@1
            value: 0.264404609475032
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.26952624839948786
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.2912932138284251
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 0.3220230473751601
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.264404609475032
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.2639778062313273
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.2573623559539053
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.23399487836107555
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.03137978486480133
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.09184879304327909
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.13906413978147564
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 0.2079536154587263
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.28363892738216534
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.27433820092270755
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.3356052539796121
            name: Cosine Map@100

multilingual-e5-large

This is a sentence-transformers model fine-tuned from intfloat/multilingual-e5-large. It maps sentences and paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, and clustering.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: intfloat/multilingual-e5-large
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 1024 dimensions
  • Similarity Function: Cosine Similarity
  • Language: en
  • License: apache-2.0

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False, 'architecture': 'XLMRobertaModel'})
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
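
The pooling and normalization stages listed above can be illustrated with a small pure-Python sketch. It is not the library's implementation: the function names `mean_pool` and `l2_normalize` and the toy 4-dimensional token vectors are assumptions made for illustration. It only shows what "mean tokens" pooling followed by `Normalize()` amounts to: average the non-padding token vectors, then scale the result to unit length.

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average token vectors, skipping positions where the mask is 0 (padding)."""
    dim = len(token_embeddings[0])
    kept = [vec for vec, m in zip(token_embeddings, attention_mask) if m]
    count = max(len(kept), 1)  # guard against an all-padding input
    return [sum(vec[i] for vec in kept) / count for i in range(dim)]

def l2_normalize(vector):
    """Scale a vector to unit Euclidean length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)

# Toy input: three token vectors of width 4; the last one is padding.
tokens = [[1.0, 2.0, 0.0, 2.0], [3.0, 0.0, 0.0, 2.0], [9.0, 9.0, 9.0, 9.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)
sentence_embedding = l2_normalize(pooled)
print(pooled)
print(sentence_embedding)
```

Because the final stage normalizes every embedding to unit length, the cosine similarity between two embeddings equals their dot product.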

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
sentences = [
    'What may impede authorities in the discharge of their responsibilities under Union law?',
    'The objectives and principles of Directive 95/46/EC remain sound, but it has not prevented fragmentation in the implementation of data protection across the Union, legal uncertainty or a widespread public perception that there are significant risks to the protection of natural persons, in particular with regard to online activity. Differences in the level of protection of the rights and freedoms of natural persons, in particular the right to the protection of personal data, with regard to the processing of personal data in the Member States may prevent the free flow of personal data throughout the Union. Those differences may therefore constitute an obstacle to the pursuit of economic activities at the level of the Union, distort competition and impede authorities in the discharge of their responsibilities under Union law. Such a difference in levels of protection is due to the existence of differences in the implementation and application of Directive 95/46/EC.',
    'This Regulation is without prejudice to international agreements concluded between the Union and third countries regulating the transfer of personal data including appropriate safeguards for the data subjects. Member States may conclude international agreements which involve the transfer of personal data to third countries or international organisations, as far as such agreements do not affect this Regulation or any other provisions of Union law and include an appropriate level of protection for the fundamental rights of the data subjects.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 1024)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.5237, 0.3440],
#         [0.5237, 1.0000, 0.5061],
#         [0.3440, 0.5061, 1.0000]])
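
The metadata lists MatryoshkaLoss and reports results at 1024, 768, 512, 256, 128 and 64 dimensions, which suggests the embeddings are meant to remain usable when only their leading dimensions are kept. The sketch below shows that idea on toy vectors in plain Python; the helper names and the 4-dimensional vectors are hypothetical, and it does not call the model. Recent versions of Sentence Transformers also accept a `truncate_dim` argument when constructing `SentenceTransformer`, which may be the more convenient route in practice.

```python
import math

def truncate_and_normalize(vector, dim):
    """Keep the first `dim` components, then rescale to unit length."""
    head = list(vector[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm > 0 else head

def cosine(a, b):
    """Dot product of two unit-length vectors, which equals their cosine similarity."""
    return sum(x * y for x, y in zip(a, b))

# Toy stand-ins for a query embedding and a passage embedding.
query = [0.5, 0.5, 0.5, 0.5]
passage = [0.5, 0.5, -0.5, 0.5]

full = cosine(truncate_and_normalize(query, 4), truncate_and_normalize(passage, 4))
short = cosine(truncate_and_normalize(query, 2), truncate_and_normalize(passage, 2))
print(full, short)
```

Re-normalizing after truncation matters: a truncated vector is no longer unit length, so a plain dot product would understate the cosine similarity.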

Evaluation

Metrics

Information Retrieval (dataset: dim_1024)

Metric Value
cosine_accuracy@1 0.3291
cosine_accuracy@3 0.3348
cosine_accuracy@5 0.356
cosine_accuracy@10 0.3886
cosine_precision@1 0.3291
cosine_precision@3 0.3289
cosine_precision@5 0.3187
cosine_precision@10 0.2838
cosine_recall@1 0.0406
cosine_recall@3 0.1194
cosine_recall@5 0.1793
cosine_recall@10 0.261
cosine_ndcg@10 0.3497
cosine_mrr@10 0.3393
cosine_map@100 0.4165

Information Retrieval (dim_768)

Metric Value
cosine_accuracy@1 0.3291
cosine_accuracy@3 0.3348
cosine_accuracy@5 0.3566
cosine_accuracy@10 0.3912
cosine_precision@1 0.3291
cosine_precision@3 0.3289
cosine_precision@5 0.3191
cosine_precision@10 0.286
cosine_recall@1 0.0401
cosine_recall@3 0.1177
cosine_recall@5 0.177
cosine_recall@10 0.26
cosine_ndcg@10 0.3504
cosine_mrr@10 0.3396
cosine_map@100 0.4151

Information Retrieval (dim_512)

Metric Value
cosine_accuracy@1 0.3246
cosine_accuracy@3 0.3303
cosine_accuracy@5 0.3553
cosine_accuracy@10 0.3886
cosine_precision@1 0.3246
cosine_precision@3 0.3244
cosine_precision@5 0.3155
cosine_precision@10 0.2843
cosine_recall@1 0.0394
cosine_recall@3 0.1157
cosine_recall@5 0.1745
cosine_recall@10 0.2588
cosine_ndcg@10 0.3476
cosine_mrr@10 0.3357
cosine_map@100 0.4106

Information Retrieval (dim_256)

Metric Value
cosine_accuracy@1 0.3111
cosine_accuracy@3 0.3188
cosine_accuracy@5 0.3419
cosine_accuracy@10 0.3662
cosine_precision@1 0.3111
cosine_precision@3 0.3116
cosine_precision@5 0.3037
cosine_precision@10 0.2723
cosine_recall@1 0.037
cosine_recall@3 0.109
cosine_recall@5 0.1652
cosine_recall@10 0.2439
cosine_ndcg@10 0.3317
cosine_mrr@10 0.3213
cosine_map@100 0.3877

Information Retrieval (dim_128)

Metric Value
cosine_accuracy@1 0.3028
cosine_accuracy@3 0.3067
cosine_accuracy@5 0.3259
cosine_accuracy@10 0.354
cosine_precision@1 0.3028
cosine_precision@3 0.3022
cosine_precision@5 0.2933
cosine_precision@10 0.2633
cosine_recall@1 0.0358
cosine_recall@3 0.105
cosine_recall@5 0.1583
cosine_recall@10 0.2346
cosine_ndcg@10 0.3201
cosine_mrr@10 0.3114
cosine_map@100 0.3737

Information Retrieval (dim_64)

Metric Value
cosine_accuracy@1 0.2644
cosine_accuracy@3 0.2695
cosine_accuracy@5 0.2913
cosine_accuracy@10 0.322
cosine_precision@1 0.2644
cosine_precision@3 0.264
cosine_precision@5 0.2574
cosine_precision@10 0.234
cosine_recall@1 0.0314
cosine_recall@3 0.0918
cosine_recall@5 0.1391
cosine_recall@10 0.208
cosine_ndcg@10 0.2836
cosine_mrr@10 0.2743
cosine_map@100 0.3356
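
To make the metric names in the tables above concrete, here is a minimal sketch of how accuracy@k, precision@k, recall@k, MRR@k and NDCG@k are commonly computed for a single query with binary relevance. It follows the usual textbook definitions and is not taken from the evaluator that produced these numbers. The function name `ir_metrics` and the document IDs are hypothetical.

```python
import math


def ir_metrics(ranked_ids, relevant_ids, k):
    """Compute common retrieval metrics at cutoff k for one query."""
    top_k = ranked_ids[:k]
    hits = [1 if doc_id in relevant_ids else 0 for doc_id in top_k]

    # accuracy@k: 1 if at least one relevant document is in the top k.
    accuracy = 1.0 if any(hits) else 0.0
    # precision@k: share of the top k that is relevant.
    precision = sum(hits) / k
    # recall@k: share of all relevant documents found in the top k.
    recall = sum(hits) / len(relevant_ids)

    # MRR@k: reciprocal rank of the first relevant document, 0 if none.
    mrr = 0.0
    for rank, hit in enumerate(hits, start=1):
        if hit:
            mrr = 1.0 / rank
            break

    # NDCG@k with binary gains: DCG divided by the best achievable DCG.
    dcg = sum(hit / math.log2(rank + 1) for rank, hit in enumerate(hits, start=1))
    ideal_hits = min(len(relevant_ids), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    ndcg = dcg / idcg if idcg > 0 else 0.0

    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "mrr": mrr,
        "ndcg": ndcg,
    }


# Toy query with eight relevant passages, two of which appear in the top 3.
ranked = ["d3", "d7", "d1", "d9", "d4"]
relevant = {"d0", "d1", "d2", "d3", "d5", "d6", "d8", "d10"}
metrics = ir_metrics(ranked, relevant, k=3)
print(metrics)
```

The toy query shows why precision can sit well above recall: when a query has many relevant passages, a short result list can be mostly correct and still cover only a small fraction of them. The gap between precision@1 and recall@1 in the tables would be consistent with that situation, although the card does not state how many relevant passages each query has.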

Training Details

Training Dataset

Unnamed Dataset

  • Size: 391 training samples
  • Columns: anchor and positive
  • Approximate statistics based on the first 391 samples:
    • anchor (type: string): min 8 tokens, mean 16.9 tokens, max 30 tokens
    • positive (type: string): min 27 tokens, mean 372.91 tokens, max 512 tokens
  • Samples:
    Sample 1
    anchor: On what date did the act occur?
    positive: Court (Civil/Criminal): Civil
    Provisions: Directive 2015/366, Law 4537/2018
    Time of the act: 31.08.2022
    Outcome (not guilty, guilty): Partially accepts the claim.
    Reasoning: The Athens Peace Court ordered the bank to return the amount that was withdrawn from the plaintiffs' account and to pay additional compensation for the moral damage they suffered.
    Facts: The case concerns plaintiffs who fell victim to electronic fraud via phishing, resulting in the withdrawal of money from their bank account. The plaintiffs claimed that the bank did not take the necessary security measures to protect their accounts and sought compensation for the financial loss and moral damage they suffered. The court determined that the bank is responsible for the loss of the money, as it did not prove that the transactions were authorized by the plaintiffs. Furthermore, the court recognized that the bank's refusal to return the funds constitutes an infringement of the plaintiffs' personal rights, as it...

    Sample 2
    anchor: For what purposes can more specific rules be provided regarding the employment context?
    positive: 1.Member States may, by law or by collective agreements, provide for more specific rules to ensure the protection of the rights and freedoms in respect of the processing of employees' personal data in the employment context, in particular for the purposes of the recruitment, the performance of the contract of employment, including discharge of obligations laid down by law or by collective agreements, management, planning and organisation of work, equality and diversity in the workplace, health and safety at work, protection of employer's or customer's property and for the purposes of the exercise and enjoyment, on an individual or collective basis, of rights and benefits related to employment, and for the purpose of the termination of the employment relationship.
    2.Those rules shall include suitable and specific measures to safeguard the data subject's human dignity, legitimate interests and fundamental rights, with particular regard to the transparency of processing, the transfer of p...

    Sample 3
    anchor: On which date were transactions detailed in the provided text conducted?
    positive: Court (Civil/Criminal): Civil

    Provisions:

    Time of commission of the act:

    Outcome (not guilty, guilty):

    Rationale:

    Facts:
    The plaintiff holds credit card number ............ with the defendant banking corporation. Based on the application for alternative networks dated 19/7/2015 with number ......... submitted at a branch of the defendant, he was granted access to the electronic banking service (e-banking) to conduct banking transactions (debit, credit, updates, payments) remotely. On 30/11/2020, the plaintiff fell victim to electronic fraud through the "phishing" method, whereby an unknown perpetrator managed to withdraw a total amount of €3,121.75 from the aforementioned credit card. Specifically, the plaintiff received an email at 1:35 PM on 29/11/2020 from sender ...... with address ........, informing him that due to an impending system change, he needed to verify the mobile phone number linked to the credit card, urging him to complete the verification...
  • Loss: MatryoshkaLoss with these parameters:
    {
        "loss": "MultipleNegativesRankingLoss",
        "matryoshka_dims": [
            1024,
            768,
            512,
            256,
            128,
            64
        ],
        "matryoshka_weights": [
            1,
            1,
            1,
            1,
            1,
            1
        ],
        "n_dims_per_step": -1
    }
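
The configuration above combines two ideas: an in-batch ranking loss, applied to each truncated prefix of the embedding and summed with the listed weights. The NumPy sketch below illustrates that combination under stated assumptions. It is not the library's implementation. The similarity scale of 20 is assumed to be the default of MultipleNegativesRankingLoss and is not listed in this card. The function names and the random inputs are hypothetical.

```python
import numpy as np


def normalize(x: np.ndarray) -> np.ndarray:
    """Rescale each row to unit length so dot products are cosine similarities."""
    return x / np.clip(np.linalg.norm(x, axis=1, keepdims=True), 1e-12, None)


def in_batch_ranking_loss(anchors, positives, scale=20.0):
    """Cross-entropy over scaled cosine similarities.

    Positive i is the target for anchor i; every other positive in the
    batch acts as a negative. `scale` is an assumed value.
    """
    scores = scale * normalize(anchors) @ normalize(positives).T
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    log_probs = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))


def matryoshka_loss(anchors, positives, dims, weights):
    """Weighted sum of the ranking loss over each truncated prefix."""
    total = 0.0
    for dim, weight in zip(dims, weights):
        total += weight * in_batch_ranking_loss(anchors[:, :dim], positives[:, :dim])
    return total


# Values taken from the loss parameters listed above.
dims = [1024, 768, 512, 256, 128, 64]
weights = [1, 1, 1, 1, 1, 1]

# Random stand-ins for a batch of anchor and positive embeddings.
rng = np.random.default_rng(0)
anchors = rng.normal(size=(4, 1024))
positives = anchors + 0.1 * rng.normal(size=(4, 1024))

loss = matryoshka_loss(anchors, positives, dims, weights)
print(loss)
```

With `n_dims_per_step` set to -1, every listed dimension contributes at every step, which is what the loop over `dims` models. Note that with a per-device batch size of 2, as used for this model, each anchor would see only one in-batch negative per step.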
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: epoch
  • per_device_train_batch_size: 2
  • per_device_eval_batch_size: 2
  • gradient_accumulation_steps: 2
  • learning_rate: 2e-05
  • num_train_epochs: 20
  • lr_scheduler_type: cosine
  • warmup_ratio: 0.1
  • bf16: True
  • load_best_model_at_end: True
  • optim: adamw_torch_fused
  • batch_sampler: no_duplicates
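
The arithmetic implied by these settings can be worked through as follows. This is a sketch that assumes a single device and that the batch sampler drops no samples; the schedule function follows the standard linear-warmup plus cosine-decay formula rather than code taken from the training script. The name `lr_at` is hypothetical.

```python
import math

# Values listed in this card.
num_samples = 391
per_device_train_batch_size = 2
gradient_accumulation_steps = 2
num_train_epochs = 20
learning_rate = 2e-5
warmup_ratio = 0.1

# Assumption: one device, no samples dropped by the batch sampler.
batches_per_epoch = math.ceil(num_samples / per_device_train_batch_size)
steps_per_epoch = max(batches_per_epoch // gradient_accumulation_steps, 1)
total_steps = steps_per_epoch * num_train_epochs
warmup_steps = math.ceil(total_steps * warmup_ratio)
effective_batch_size = per_device_train_batch_size * gradient_accumulation_steps


def lr_at(step: int) -> float:
    """Linear warmup followed by a half-cosine decay to zero."""
    if step < warmup_steps:
        return learning_rate * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return learning_rate * 0.5 * (1.0 + math.cos(math.pi * progress))


print(steps_per_epoch, total_steps, warmup_steps, effective_batch_size)
for step in (0, 98, 196, 1078, 1960):
    print(step, lr_at(step))
```

Under these assumptions one epoch corresponds to 98 optimizer steps, which agrees with the training logs, where step 98 is reported at epoch 1.0.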

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: epoch
  • prediction_loss_only: True
  • per_device_train_batch_size: 2
  • per_device_eval_batch_size: 2
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 2
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 20
  • max_steps: -1
  • lr_scheduler_type: cosine
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • tp_size: 0
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

Epoch Step Training Loss dim_1024_cosine_ndcg@10 dim_768_cosine_ndcg@10 dim_512_cosine_ndcg@10 dim_256_cosine_ndcg@10 dim_128_cosine_ndcg@10 dim_64_cosine_ndcg@10
0.0102 1 9.6954 - - - - - -
0.0204 2 11.5048 - - - - - -
0.0306 3 2.1575 - - - - - -
0.0408 4 2.6843 - - - - - -
0.0510 5 0.0364 - - - - - -
0.0612 6 0.705 - - - - - -
0.0714 7 1.9957 - - - - - -
0.0816 8 0.9938 - - - - - -
0.0918 9 0.3187 - - - - - -
0.1020 10 0.1435 - - - - - -
0.1122 11 0.0818 - - - - - -
0.1224 12 0.6535 - - - - - -
0.1327 13 0.3915 - - - - - -
0.1429 14 0.5493 - - - - - -
0.1531 15 0.7231 - - - - - -
0.1633 16 0.0715 - - - - - -
0.1735 17 5.8663 - - - - - -
0.1837 18 0.2586 - - - - - -
0.1939 19 0.9353 - - - - - -
0.2041 20 2.5843 - - - - - -
0.2143 21 2.0583 - - - - - -
0.2245 22 6.9121 - - - - - -
0.2347 23 1.0921 - - - - - -
0.2449 24 5.4863 - - - - - -
0.2551 25 0.0549 - - - - - -
0.2653 26 2.345 - - - - - -
0.2755 27 4.264 - - - - - -
0.2857 28 2.4847 - - - - - -
0.2959 29 0.7634 - - - - - -
0.3061 30 2.047 - - - - - -
0.3163 31 0.694 - - - - - -
0.3265 32 0.7417 - - - - - -
0.3367 33 1.9942 - - - - - -
0.3469 34 2.8978 - - - - - -
0.3571 35 0.0126 - - - - - -
0.3673 36 1.9776 - - - - - -
0.3776 37 1.5667 - - - - - -
0.3878 38 5.5693 - - - - - -
0.3980 39 1.6802 - - - - - -
0.4082 40 0.2144 - - - - - -
0.4184 41 0.1797 - - - - - -
0.4286 42 5.7559 - - - - - -
0.4388 43 2.6372 - - - - - -
0.4490 44 1.8447 - - - - - -
0.4592 45 2.8156 - - - - - -
0.4694 46 3.1588 - - - - - -
0.4796 47 0.0552 - - - - - -
0.4898 48 3.3053 - - - - - -
0.5 49 2.8332 - - - - - -
0.5102 50 1.1961 - - - - - -
0.5204 51 1.0106 - - - - - -
0.5306 52 2.4593 - - - - - -
0.5408 53 3.4849 - - - - - -
0.5510 54 0.0338 - - - - - -
0.5612 55 1.5319 - - - - - -
0.5714 56 0.0419 - - - - - -
0.5816 57 0.1098 - - - - - -
0.5918 58 0.0457 - - - - - -
0.6020 59 0.0273 - - - - - -
0.6122 60 1.2946 - - - - - -
0.6224 61 3.4121 - - - - - -
0.6327 62 2.6015 - - - - - -
0.6429 63 2.0358 - - - - - -
0.6531 64 7.3114 - - - - - -
0.6633 65 6.8888 - - - - - -
0.6735 66 1.6606 - - - - - -
0.6837 67 5.2343 - - - - - -
0.6939 68 2.1977 - - - - - -
0.7041 69 0.1702 - - - - - -
0.7143 70 3.5715 - - - - - -
0.7245 71 1.4736 - - - - - -
0.7347 72 1.0967 - - - - - -
0.7449 73 1.2098 - - - - - -
0.7551 74 1.9541 - - - - - -
0.7653 75 4.0992 - - - - - -
0.7755 76 0.0145 - - - - - -
0.7857 77 0.0079 - - - - - -
0.7959 78 0.1081 - - - - - -
0.8061 79 1.7446 - - - - - -
0.8163 80 0.6343 - - - - - -
0.8265 81 4.7374 - - - - - -
0.8367 82 3.1082 - - - - - -
0.8469 83 0.0144 - - - - - -
0.8571 84 0.0057 - - - - - -
0.8673 85 0.7656 - - - - - -
0.8776 86 1.5191 - - - - - -
0.8878 87 0.1942 - - - - - -
0.8980 88 0.2429 - - - - - -
0.9082 89 7.0608 - - - - - -
0.9184 90 0.1635 - - - - - -
0.9286 91 0.057 - - - - - -
0.9388 92 3.1796 - - - - - -
0.9490 93 2.4068 - - - - - -
0.9592 94 0.9694 - - - - - -
0.9694 95 0.4878 - - - - - -
0.9796 96 0.4105 - - - - - -
0.9898 97 4.5006 - - - - - -
1.0 98 2.2675 0.3722 0.3678 0.3627 0.3436 0.3239 0.2694
1.0102 99 0.9602 - - - - - -
1.0204 100 5.0193 - - - - - -
1.0306 101 1.1252 - - - - - -
1.0408 102 0.7896 - - - - - -
1.0510 103 1.2793 - - - - - -
1.0612 104 0.3422 - - - - - -
1.0714 105 0.0204 - - - - - -
1.0816 106 0.018 - - - - - -
1.0918 107 0.0082 - - - - - -
1.1020 108 6.0895 - - - - - -
1.1122 109 0.0115 - - - - - -
1.1224 110 0.2657 - - - - - -
1.1327 111 0.0232 - - - - - -
1.1429 112 1.4261 - - - - - -
1.1531 113 5.6396 - - - - - -
1.1633 114 0.2395 - - - - - -
1.1735 115 0.001 - - - - - -
1.1837 116 1.053 - - - - - -
1.1939 117 0.0335 - - - - - -
1.2041 118 1.9711 - - - - - -
1.2143 119 1.7967 - - - - - -
1.2245 120 0.0046 - - - - - -
1.2347 121 0.0002 - - - - - -
1.2449 122 0.0585 - - - - - -
1.2551 123 0.3547 - - - - - -
1.2653 124 6.193 - - - - - -
1.2755 125 0.0073 - - - - - -
1.2857 126 0.3095 - - - - - -
1.2959 127 0.0026 - - - - - -
1.3061 128 0.0065 - - - - - -
1.3163 129 0.0326 - - - - - -
1.3265 130 0.0121 - - - - - -
1.3367 131 2.081 - - - - - -
1.3469 132 0.0329 - - - - - -
1.3571 133 4.8144 - - - - - -
1.3673 134 1.8287 - - - - - -
1.3776 135 0.0016 - - - - - -
1.3878 136 2.7057 - - - - - -
1.3980 137 0.0087 - - - - - -
1.4082 138 0.7368 - - - - - -
1.4184 139 0.1354 - - - - - -
1.4286 140 0.0446 - - - - - -
1.4388 141 0.2849 - - - - - -
1.4490 142 6.2924 - - - - - -
1.4592 143 0.4827 - - - - - -
1.4694 144 7.8315 - - - - - -
1.4796 145 6.0618 - - - - - -
1.4898 146 1.0472 - - - - - -
1.5 147 0.0007 - - - - - -
1.5102 148 0.0433 - - - - - -
1.5204 149 1.116 - - - - - -
1.5306 150 1.5491 - - - - - -
1.5408 151 0.2423 - - - - - -
1.5510 152 0.4355 - - - - - -
1.5612 153 0.0043 - - - - - -
1.5714 154 0.059 - - - - - -
1.5816 155 0.0175 - - - - - -
1.5918 156 2.8813 - - - - - -
1.6020 157 0.4372 - - - - - -
1.6122 158 0.0611 - - - - - -
1.6224 159 4.6339 - - - - - -
1.6327 160 2.1581 - - - - - -
1.6429 161 1.9109 - - - - - -
1.6531 162 10.7888 - - - - - -
1.6633 163 4.4287 - - - - - -
1.6735 164 4.1106 - - - - - -
1.6837 165 3.8159 - - - - - -
1.6939 166 0.0468 - - - - - -
1.7041 167 0.0023 - - - - - -
1.7143 168 0.0031 - - - - - -
1.7245 169 3.0379 - - - - - -
1.7347 170 0.0058 - - - - - -
1.7449 171 0.0097 - - - - - -
1.7551 172 0.114 - - - - - -
1.7653 173 0.0376 - - - - - -
1.7755 174 0.0006 - - - - - -
1.7857 175 1.7519 - - - - - -
1.7959 176 3.5166 - - - - - -
1.8061 177 2.073 - - - - - -
1.8163 178 0.1532 - - - - - -
1.8265 179 2.0969 - - - - - -
1.8367 180 1.867 - - - - - -
1.8469 181 18.7505 - - - - - -
1.8571 182 2.5291 - - - - - -
1.8673 183 2.8375 - - - - - -
1.8776 184 0.0902 - - - - - -
1.8878 185 0.0139 - - - - - -
1.8980 186 0.0356 - - - - - -
1.9082 187 0.0838 - - - - - -
1.9184 188 0.0391 - - - - - -
1.9286 189 1.2579 - - - - - -
1.9388 190 9.3381 - - - - - -
1.9490 191 0.094 - - - - - -
1.9592 192 0.0638 - - - - - -
1.9694 193 4.3027 - - - - - -
1.9796 194 0.002 - - - - - -
1.9898 195 0.9772 - - - - - -
2.0 196 0.0053 0.3497 0.3504 0.3476 0.3317 0.3201 0.2836

Framework Versions

  • Python: 3.12.11
  • Sentence Transformers: 5.1.0
  • Transformers: 4.51.3
  • PyTorch: 2.8.0+cu126
  • Accelerate: 1.10.1
  • Datasets: 4.0.0
  • Tokenizers: 0.21.4

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MatryoshkaLoss

@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}