{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef _cond_k(self, word, suffix_len):\n        return (len(word) - suffix_len >= 3) and (\n            word[-suffix_len - 1] in {'i', 'l'}\n            or (word[-suffix_len - 3] == 'u' and word[-suffix_len - 1] == 'e')\n        )", "response": "Return Lovins condition K."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef _cond_n(self, word, suffix_len):\n        if len(word) - suffix_len >= 3:\n            if word[-suffix_len - 3] == 's':\n                if len(word) - suffix_len >= 4:\n                    return True\n            else:\n                return True\n        return False", "response": "Return Lovins condition N."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef _cond_s(self, word, suffix_len):\n        return word[-suffix_len - 2 : -suffix_len] == 'dr' or (\n            word[-suffix_len - 1] == 't'\n            and word[-suffix_len - 2 : -suffix_len] != 'tt'\n        )", "response": "Return Lovins condition S."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef _cond_x(self, word, suffix_len):\n        return word[-suffix_len - 1] in {'i', 'l'} or (\n            word[-suffix_len - 3 : -suffix_len - 2] == 'u'\n            and word[-suffix_len - 1] == 'e'\n        )", "response": "Return Lovins condition X."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef _cond_bb(self, word, suffix_len):\n        return (\n            len(word) - suffix_len >= 3\n            and word[-suffix_len - 3 : -suffix_len] != 'met'\n            and word[-suffix_len - 4 : -suffix_len] != 'ryst'\n        )", "response": "Return Lovins condition BB."}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\nreturns the stemmed version of the word.", "response": "def stem(self, word):\n        \"\"\"Return Lovins stem.\n\n        Parameters\n        ----------\n        word : str\n            The word to stem\n\n        Returns\n        -------\n        str\n            Word stem\n\n        Examples\n        --------\n        >>> stmr = Lovins()\n        >>> stmr.stem('reading')\n        'read'\n        >>> stmr.stem('suspension')\n        'suspens'\n        >>> stmr.stem('elusiveness')\n        'elus'\n\n        \"\"\"\n        # lowercase, normalize, and compose\n        word = normalize('NFC', text_type(word.lower()))\n\n        for suffix_len in range(11, 0, -1):\n            ending = word[-suffix_len:]\n            if (\n                ending in self._suffix\n                and len(word) - suffix_len >= 2\n                and (\n                    self._suffix[ending] is None\n                    or self._suffix[ending](word, suffix_len)\n                )\n            ):\n                word = word[:-suffix_len]\n                break\n\n        if word[-2:] in {\n            'bb',\n            'dd',\n            'gg',\n            'll',\n            'mm',\n            'nn',\n            'pp',\n            'rr',\n            'ss',\n            'tt',\n        }:\n            word = word[:-1]\n\n        for ending, replacement in self._recode:\n            if word.endswith(ending):\n                if callable(replacement):\n                    word = replacement(word)\n                else:\n                    word = word[: -len(ending)] + replacement\n\n        return word"}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nreturn the Damerau - Levenshtein distance between two strings src and tar.", "response": "def dist_abs(self, src, tar, cost=(1, 1, 1, 1)):\n        \"\"\"Return the Damerau-Levenshtein distance between two strings.\n\n        Parameters\n        ----------\n        src : str\n            Source string for comparison\n        tar : str\n            Target string for comparison\n        cost : tuple\n            A 4-tuple representing the cost of the four possible edits:\n            inserts, deletes, substitutions, and transpositions, respectively\n            (by default: (1, 1, 1, 1))\n\n        Returns\n        -------\n        int (may return a float if cost has float values)\n            The Damerau-Levenshtein distance between src & tar\n\n        Raises\n        ------\n        ValueError\n            Unsupported cost assignment; the cost of two transpositions must\n            not be less than the cost of an insert plus a delete.\n\n        Examples\n        --------\n        >>> cmp = DamerauLevenshtein()\n        >>> cmp.dist_abs('cat', 'hat')\n        1\n        >>> cmp.dist_abs('Niall', 'Neil')\n        3\n        >>> cmp.dist_abs('aluminum', 'Catalan')\n        7\n        >>> cmp.dist_abs('ATCG', 'TAGC')\n        2\n\n        \"\"\"\n        ins_cost, del_cost, sub_cost, trans_cost = cost\n\n        if src == tar:\n            return 0\n        if not src:\n            return len(tar) * ins_cost\n        if not tar:\n            return len(src) * del_cost\n\n        if 2 * trans_cost < ins_cost + del_cost:\n            raise ValueError(\n                'Unsupported cost assignment; the cost of two transpositions '\n                + 'must not be less than the cost of an insert plus a delete.'\n            )\n\n        d_mat = np_zeros((len(src)) * (len(tar)), dtype=np_int).reshape(\n            (len(src), len(tar))\n        )\n\n        if src[0] != tar[0]:\n      
      d_mat[0, 0] = min(sub_cost, ins_cost + del_cost)\n\n        src_index_by_character = {src[0]: 0}\n        for i in range(1, len(src)):\n            del_distance = d_mat[i - 1, 0] + del_cost\n            ins_distance = (i + 1) * del_cost + ins_cost\n            match_distance = i * del_cost + (\n                0 if src[i] == tar[0] else sub_cost\n            )\n            d_mat[i, 0] = min(del_distance, ins_distance, match_distance)\n\n        for j in range(1, len(tar)):\n            del_distance = (j + 1) * ins_cost + del_cost\n            ins_distance = d_mat[0, j - 1] + ins_cost\n            match_distance = j * ins_cost + (\n                0 if src[0] == tar[j] else sub_cost\n            )\n            d_mat[0, j] = min(del_distance, ins_distance, match_distance)\n\n        for i in range(1, len(src)):\n            max_src_letter_match_index = 0 if src[i] == tar[0] else -1\n            for j in range(1, len(tar)):\n                candidate_swap_index = (\n                    -1\n                    if tar[j] not in src_index_by_character\n                    else src_index_by_character[tar[j]]\n                )\n                j_swap = max_src_letter_match_index\n                del_distance = d_mat[i - 1, j] + del_cost\n                ins_distance = d_mat[i, j - 1] + ins_cost\n                match_distance = d_mat[i - 1, j - 1]\n                if src[i] != tar[j]:\n                    match_distance += sub_cost\n                else:\n                    max_src_letter_match_index = j\n\n                if candidate_swap_index != -1 and j_swap != -1:\n                    i_swap = candidate_swap_index\n\n                    if i_swap == 0 and j_swap == 0:\n                        pre_swap_cost = 0\n                    else:\n                        pre_swap_cost = d_mat[\n                            max(0, i_swap - 1), max(0, j_swap - 1)\n                        ]\n                    swap_distance = (\n                        pre_swap_cost\n     
                   + (i - i_swap - 1) * del_cost\n                        + (j - j_swap - 1) * ins_cost\n                        + trans_cost\n                    )\n                else:\n                    swap_distance = maxsize\n\n                d_mat[i, j] = min(\n                    del_distance, ins_distance, match_distance, swap_distance\n                )\n            src_index_by_character[src[i]] = i\n\n        return d_mat[len(src) - 1, len(tar) - 1]"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef dist(self, src, tar):\n        if src == tar:\n            return 0.0\n\n        src = src.encode('utf-8')\n        tar = tar.encode('utf-8')\n\n        self._compressor.compress(src)\n        src_comp = self._compressor.flush(zlib.Z_FULL_FLUSH)\n        self._compressor.compress(tar)\n        tar_comp = self._compressor.flush(zlib.Z_FULL_FLUSH)\n        self._compressor.compress(src + tar)\n        concat_comp = self._compressor.flush(zlib.Z_FULL_FLUSH)\n        self._compressor.compress(tar + src)\n        concat_comp2 = self._compressor.flush(zlib.Z_FULL_FLUSH)\n\n        return (\n            min(len(concat_comp), len(concat_comp2))\n            - min(len(src_comp), len(tar_comp))\n        ) / max(len(src_comp), len(tar_comp))", "response": "Return the NCD between two strings using zlib compression."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\nreturn the Phonex code for a word.", "response": "def phonex(word, max_length=4, zero_pad=True):\n    \"\"\"Return the Phonex code for a word.\n\n    This is a wrapper for :py:meth:`Phonex.encode`.\n\n    Parameters\n    ----------\n    word : str\n        The word to transform\n    max_length : int\n        The length of the code returned (defaults to 4)\n    zero_pad : bool\n        Pad the end of the return value with 0s to achieve a max_length string\n\n    Returns\n    -------\n    str\n        The Phonex value\n\n    Examples\n    --------\n    >>> phonex('Christopher')\n    'C623'\n    >>> phonex('Niall')\n    'N400'\n    >>> phonex('Schmidt')\n    'S253'\n    >>> phonex('Smith')\n    'S530'\n\n    \"\"\"\n    return Phonex().encode(word, max_length, zero_pad)"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef encode(self, word, max_length=4, zero_pad=True):\n        name = unicode_normalize('NFKD', text_type(word.upper()))\n        name = name.replace('\u00df', 'SS')\n\n        # Clamp max_length to [4, 64]\n        if max_length != -1:\n            max_length = min(max(4, max_length), 64)\n        else:\n            max_length = 64\n\n        name_code = last = ''\n\n        # Deletions effected by replacing with next letter which\n        # will be ignored due to duplicate handling of Soundex code.\n        # This is faster than 'moving' all subsequent letters.\n\n        # Remove any trailing Ss\n        while name[-1:] == 'S':\n            name = name[:-1]\n\n        # Phonetic equivalents of first 2 characters\n        # Works since duplicate letters are ignored\n        if name[:2] == 'KN':\n            name = 'N' + name[2:]  # KN.. == N..\n        elif name[:2] == 'PH':\n            name = 'F' + name[2:]  # PH.. == F.. (H ignored anyway)\n        elif name[:2] == 'WR':\n            name = 'R' + name[2:]  # WR.. 
== R..\n\n        if name:\n            # Special case, ignore H first letter (subsequent Hs ignored\n            # anyway)\n            # Works since duplicate letters are ignored\n            if name[0] == 'H':\n                name = name[1:]\n\n        if name:\n            # Phonetic equivalents of first character\n            if name[0] in self._uc_vy_set:\n                name = 'A' + name[1:]\n            elif name[0] in {'B', 'P'}:\n                name = 'B' + name[1:]\n            elif name[0] in {'V', 'F'}:\n                name = 'F' + name[1:]\n            elif name[0] in {'C', 'K', 'Q'}:\n                name = 'C' + name[1:]\n            elif name[0] in {'G', 'J'}:\n                name = 'G' + name[1:]\n            elif name[0] in {'S', 'Z'}:\n                name = 'S' + name[1:]\n\n            name_code = last = name[0]\n\n        # Modified Soundex code\n        for i in range(1, len(name)):\n            code = '0'\n            if name[i] in {'B', 'F', 'P', 'V'}:\n                code = '1'\n            elif name[i] in {'C', 'G', 'J', 'K', 'Q', 'S', 'X', 'Z'}:\n                code = '2'\n            elif name[i] in {'D', 'T'}:\n                if name[i + 1 : i + 2] != 'C':\n                    code = '3'\n            elif name[i] == 'L':\n                if name[i + 1 : i + 2] in self._uc_vy_set or i + 1 == len(\n                    name\n                ):\n                    code = '4'\n            elif name[i] in {'M', 'N'}:\n                if name[i + 1 : i + 2] in {'D', 'G'}:\n                    name = name[: i + 1] + name[i] + name[i + 2 :]\n                code = '5'\n            elif name[i] == 'R':\n                if name[i + 1 : i + 2] in self._uc_vy_set or i + 1 == len(\n                    name\n                ):\n                    code = '6'\n\n            if code != last and code != '0' and i != 0:\n                name_code += code\n\n            last = name_code[-1]\n\n        if zero_pad:\n            name_code += '0' * 
max_length\n        if not name_code:\n            name_code = '0'\n        return name_code[:max_length]", "response": "Return the Phonex code for a word."}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nreturns Pylint badge color.", "response": "def pylint_color(score):\n    \"\"\"Return Pylint badge color.\n\n    Parameters\n    ----------\n    score : float\n        A Pylint score\n\n    Returns\n    -------\n    str\n        Badge color\n\n    \"\"\"\n    # These are the score cutoffs for each color above.\n    # I.e. score==10 -> brightgreen, down to 7.5 > score >= 5 -> orange\n    score_cutoffs = (10, 9.5, 8.5, 7.5, 5)\n    for i in range(len(score_cutoffs)):\n        if score >= score_cutoffs[i]:\n            return BADGE_COLORS[i]\n    # and score < 5 -> red\n    return BADGE_COLORS[-1]"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef pydocstyle_color(score):\n    # These are the score cutoffs for each color above.\n    # I.e. score==0 -> brightgreen, down to 50 < score <= 100 -> orange\n    score_cutoffs = (0, 10, 25, 50, 100)\n    for i in range(len(score_cutoffs)):\n        if score <= score_cutoffs[i]:\n            return BADGE_COLORS[i]\n    # and score > 100 -> red\n    return BADGE_COLORS[-1]", "response": "Return pydocstyle badge color."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef flake8_color(score):\n    # These are the score cutoffs for each color above.\n    # I.e. score==0 -> brightgreen, down to 100 < score <= 200 -> orange\n    score_cutoffs = (0, 20, 50, 100, 200)\n    for i in range(len(score_cutoffs)):\n        if score <= score_cutoffs[i]:\n            return BADGE_COLORS[i]\n    # and score > 200 -> red\n    return BADGE_COLORS[-1]", "response": "Return flake8 badge color."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef gotoh(src, tar, gap_open=1, gap_ext=0.4, sim_func=sim_ident):\n    return Gotoh().dist_abs(src, tar, gap_open, gap_ext, sim_func)", "response": "Return the Gotoh score of two strings."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef dist_abs(self, src, tar, gap_open=1, gap_ext=0.4, sim_func=sim_ident):\n        d_mat = np_zeros((len(src) + 1, len(tar) + 1), dtype=np_float32)\n        p_mat = np_zeros((len(src) + 1, len(tar) + 1), dtype=np_float32)\n        q_mat = np_zeros((len(src) + 1, len(tar) + 1), dtype=np_float32)\n\n        d_mat[0, 0] = 0\n        p_mat[0, 0] = float('-inf')\n        q_mat[0, 0] = float('-inf')\n        for i in range(1, len(src) + 1):\n            d_mat[i, 0] = float('-inf')\n            p_mat[i, 0] = -gap_open - gap_ext * (i - 1)\n            q_mat[i, 0] = float('-inf')\n            q_mat[i, 1] = -gap_open\n        for j in range(1, len(tar) + 1):\n            d_mat[0, j] = float('-inf')\n            p_mat[0, j] = float('-inf')\n            p_mat[1, j] = -gap_open\n            q_mat[0, j] = -gap_open - gap_ext * (j - 1)\n\n        for i in range(1, len(src) + 1):\n            for j in range(1, len(tar) + 1):\n                sim_val = sim_func(src[i - 1], tar[j - 1])\n                d_mat[i, j] = max(\n                    d_mat[i - 1, j - 1] + sim_val,\n                    p_mat[i - 1, j - 1] + sim_val,\n                    q_mat[i - 1, j - 1] + sim_val,\n                )\n\n                p_mat[i, j] = max(\n                    d_mat[i - 1, j] - gap_open, p_mat[i - 1, j] - gap_ext\n                )\n\n                q_mat[i, j] = max(\n                    d_mat[i, j - 1] - gap_open, q_mat[i, j - 1] - gap_ext\n                )\n\n        i, j = (n - 1 for n in d_mat.shape)\n        return max(d_mat[i, j], p_mat[i, j], q_mat[i, j])", "response": "Return the Gotoh score of two strings."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nreturn the bag distance between two strings.", "response": "def dist_abs(self, src, tar):\n        \"\"\"Return the bag distance between two strings.\n\n        Parameters\n        ----------\n        src : str\n            Source string for comparison\n        tar : str\n            Target string for comparison\n\n        Returns\n        -------\n        int\n            Bag distance\n\n        Examples\n        --------\n        >>> cmp = Bag()\n        >>> cmp.dist_abs('cat', 'hat')\n        1\n        >>> cmp.dist_abs('Niall', 'Neil')\n        2\n        >>> cmp.dist_abs('aluminum', 'Catalan')\n        5\n        >>> cmp.dist_abs('ATCG', 'TAGC')\n        0\n        >>> cmp.dist_abs('abcdefg', 'hijklm')\n        7\n        >>> cmp.dist_abs('abcdefg', 'hijklmno')\n        8\n\n        \"\"\"\n        if tar == src:\n            return 0\n        elif not src:\n            return len(tar)\n        elif not tar:\n            return len(src)\n\n        src_bag = Counter(src)\n        tar_bag = Counter(tar)\n        return max(\n            sum((src_bag - tar_bag).values()),\n            sum((tar_bag - src_bag).values()),\n        )"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nreturning the normalized bag distance between two strings.", "response": "def dist(self, src, tar):\n        \"\"\"Return the normalized bag distance between two strings.\n\n        Bag distance is normalized by dividing by :math:`max( |src|, |tar| )`.\n\n        Parameters\n        ----------\n        src : str\n            Source string for comparison\n        tar : str\n            Target string for comparison\n\n        Returns\n        -------\n        float\n            Normalized bag distance\n\n        Examples\n        --------\n        >>> cmp = Bag()\n        >>> cmp.dist('cat', 'hat')\n        0.3333333333333333\n        >>> cmp.dist('Niall', 'Neil')\n        0.4\n        >>> cmp.dist('aluminum', 'Catalan')\n        0.625\n        >>> cmp.dist('ATCG', 'TAGC')\n        0.0\n\n        \"\"\"\n        if tar == src:\n            return 0.0\n        if not src or not tar:\n            return 1.0\n\n        max_length = max(len(src), len(tar))\n\n        return self.dist_abs(src, tar) / max_length"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\nreturn the 'CLEF German stemmer plus' stem of a word.", "response": "def stem(self, word):\n        \"\"\"Return 'CLEF German stemmer plus' stem.\n\n        Parameters\n        ----------\n        word : str\n            The word to stem\n\n        Returns\n        -------\n        str\n            Word stem\n\n        Examples\n        --------\n        >>> stmr = CLEFGermanPlus()\n        >>> stmr.stem('lesen')\n        'les'\n        >>> stmr.stem('graues')\n        'grau'\n        >>> stmr.stem('buchstabieren')\n        'buchstabi'\n\n        \"\"\"\n        # lowercase, normalize, and compose\n        word = normalize('NFC', text_type(word.lower()))\n\n        # remove umlauts\n        word = word.translate(self._accents)\n\n        # Step 1\n        wlen = len(word) - 1\n        if wlen > 4 and word[-3:] == 'ern':\n            word = word[:-3]\n        elif wlen > 3 and word[-2:] in {'em', 'en', 'er', 'es'}:\n            word = word[:-2]\n        elif wlen > 2 and (\n            word[-1] == 'e'\n            or (word[-1] == 's' and word[-2] in self._st_ending)\n        ):\n            word = word[:-1]\n\n        # Step 2\n        wlen = len(word) - 1\n        if wlen > 4 and word[-3:] == 'est':\n            word = word[:-3]\n        elif wlen > 3 and (\n            word[-2:] in {'er', 'en'}\n            or (word[-2:] == 'st' and word[-3] in self._st_ending)\n        ):\n            word = word[:-2]\n\n        return word"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nreturn the MLIPNS similarity of two strings.", "response": "def sim_mlipns(src, tar, threshold=0.25, max_mismatches=2):\n    \"\"\"Return the MLIPNS similarity of two strings.\n\n    This is a wrapper for :py:meth:`MLIPNS.sim`.\n\n    Parameters\n    ----------\n    src : str\n        Source string for comparison\n    tar : str\n        Target string for comparison\n    threshold : float\n        A number [0, 1] indicating the maximum similarity score, below which\n        the strings are considered 'similar' (0.25 by default)\n    max_mismatches : int\n        A number indicating the allowable number of mismatches to remove before\n        declaring two strings not similar (2 by default)\n\n    Returns\n    -------\n    float\n        MLIPNS similarity\n\n    Examples\n    --------\n    >>> sim_mlipns('cat', 'hat')\n    1.0\n    >>> sim_mlipns('Niall', 'Neil')\n    0.0\n    >>> sim_mlipns('aluminum', 'Catalan')\n    0.0\n    >>> sim_mlipns('ATCG', 'TAGC')\n    0.0\n\n    \"\"\"\n    return MLIPNS().sim(src, tar, threshold, max_mismatches)"}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nreturns the MLIPNS distance between two strings.", "response": "def dist_mlipns(src, tar, threshold=0.25, max_mismatches=2):\n    \"\"\"Return the MLIPNS distance between two strings.\n\n    This is a wrapper for :py:meth:`MLIPNS.dist`.\n\n    Parameters\n    ----------\n    src : str\n        Source string for comparison\n    tar : str\n        Target string for comparison\n    threshold : float\n        A number [0, 1] indicating the maximum similarity score, below which\n        the strings are considered 'similar' (0.25 by default)\n    max_mismatches : int\n        A number indicating the allowable number of mismatches to remove before\n        declaring two strings not similar (2 by default)\n\n    Returns\n    -------\n    float\n        MLIPNS distance\n\n    Examples\n    --------\n    >>> dist_mlipns('cat', 'hat')\n    0.0\n    >>> dist_mlipns('Niall', 'Neil')\n    1.0\n    >>> dist_mlipns('aluminum', 'Catalan')\n    1.0\n    >>> dist_mlipns('ATCG', 'TAGC')\n    1.0\n\n    \"\"\"\n    return MLIPNS().dist(src, tar, threshold, max_mismatches)"}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nreturn the MLIPNS similarity of two strings.", "response": "def sim(self, src, tar, threshold=0.25, max_mismatches=2):\n        \"\"\"Return the MLIPNS similarity of two strings.\n\n        Parameters\n        ----------\n        src : str\n            Source string for comparison\n        tar : str\n            Target string for comparison\n        threshold : float\n            A number [0, 1] indicating the maximum similarity score, below\n            which the strings are considered 'similar' (0.25 by default)\n        max_mismatches : int\n            A number indicating the allowable number of mismatches to remove\n            before declaring two strings not similar (2 by default)\n\n        Returns\n        -------\n        float\n            MLIPNS similarity\n\n        Examples\n        --------\n        >>> cmp = MLIPNS()\n        >>> cmp.sim('cat', 'hat')\n        1.0\n        >>> cmp.sim('Niall', 'Neil')\n        0.0\n        >>> cmp.sim('aluminum', 'Catalan')\n        0.0\n        >>> cmp.sim('ATCG', 'TAGC')\n        0.0\n\n        \"\"\"\n        if tar == src:\n            return 1.0\n        if not src or not tar:\n            return 0.0\n\n        mismatches = 0\n        ham = Hamming().dist_abs(src, tar, diff_lens=True)\n        max_length = max(len(src), len(tar))\n        while src and tar and mismatches <= max_mismatches:\n            if (\n                max_length < 1\n                or (1 - (max_length - ham) / max_length) <= threshold\n            ):\n                return 1.0\n            else:\n                mismatches += 1\n                ham -= 1\n                max_length -= 1\n\n        if max_length < 1:\n            return 1.0\n        return 0.0"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nreturn a similarity of two strings.", "response": "def sim(src, tar, method=sim_levenshtein):\n    \"\"\"Return a similarity of two strings.\n\n    This is a generalized function for calling other similarity functions.\n\n    Parameters\n    ----------\n    src : str\n        Source string for comparison\n    tar : str\n        Target string for comparison\n    method : function\n        Specifies the similarity metric (:py:func:`sim_levenshtein` by default)\n\n    Returns\n    -------\n    float\n        Similarity according to the specified function\n\n    Raises\n    ------\n    AttributeError\n        Unknown similarity function\n\n    Examples\n    --------\n    >>> round(sim('cat', 'hat'), 12)\n    0.666666666667\n    >>> round(sim('Niall', 'Neil'), 12)\n    0.4\n    >>> sim('aluminum', 'Catalan')\n    0.125\n    >>> sim('ATCG', 'TAGC')\n    0.25\n\n    \"\"\"\n    if callable(method):\n        return method(src, tar)\n    else:\n        raise AttributeError('Unknown similarity function: ' + str(method))"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\nreturning a distance between two strings.", "response": "def dist(src, tar, method=sim_levenshtein):\n    \"\"\"Return a distance between two strings.\n\n    This is a generalized function for calling other distance functions.\n\n    Parameters\n    ----------\n    src : str\n        Source string for comparison\n    tar : str\n        Target string for comparison\n    method : function\n        Specifies the similarity metric (:py:func:`sim_levenshtein` by default)\n        -- Note that this takes a similarity metric function, not a distance\n        metric function.\n\n    Returns\n    -------\n    float\n        Distance according to the specified function\n\n    Raises\n    ------\n    AttributeError\n        Unknown distance function\n\n    Examples\n    --------\n    >>> round(dist('cat', 'hat'), 12)\n    0.333333333333\n    >>> round(dist('Niall', 'Neil'), 12)\n    0.6\n    >>> dist('aluminum', 'Catalan')\n    0.875\n    >>> dist('ATCG', 'TAGC')\n    0.75\n\n    \"\"\"\n    if callable(method):\n        return 1 - method(src, tar)\n    else:\n        raise AttributeError('Unknown distance function: ' + str(method))"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nencoding an MRA personal numeric identifier for a word.", "response": "def encode(self, word):\n        \"\"\"Return the MRA personal numeric identifier (PNI) for a word.\n\n        Parameters\n        ----------\n        word : str\n            The word to transform\n\n        Returns\n        -------\n        str\n            The MRA PNI\n\n        Examples\n        --------\n        >>> pe = MRA()\n        >>> pe.encode('Christopher')\n        'CHRPHR'\n        >>> pe.encode('Niall')\n        'NL'\n        >>> pe.encode('Smith')\n        'SMTH'\n        >>> pe.encode('Schmidt')\n        'SCHMDT'\n\n        \"\"\"\n        if not word:\n            return word\n        word = word.upper()\n        word = word.replace('\u00df', 'SS')\n        word = word[0] + ''.join(\n            c for c in word[1:] if c not in self._uc_v_set\n        )\n        word = self._delete_consecutive_repeats(word)\n        if len(word) > 6:\n            word = word[:3] + word[-3:]\n        return word"}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\nreturns the m-degree of the term in the Porter stemmer.", "response": "def _m_degree(self, term):\n        \"\"\"Return Porter helper function _m_degree value.\n\n        m-degree is equal to the number of V to C transitions\n\n        Parameters\n        ----------\n        term : str\n            The word for which to calculate the m-degree\n\n        Returns\n        -------\n        int\n            The m-degree as defined in the Porter stemmer definition\n\n        \"\"\"\n        mdeg = 0\n        last_was_vowel = False\n        for letter in term:\n            if letter in self._vowels:\n                last_was_vowel = True\n            else:\n                if last_was_vowel:\n                    mdeg += 1\n                last_was_vowel = False\n        return mdeg"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef _has_vowel(self, term):\n        for letter in term:\n            if letter in self._vowels:\n                return True\n        return False", "response": "Return True iff a vowel exists in the term."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef _ends_in_doubled_cons(self, term):\n        return (\n            len(term) > 1\n            and term[-1] not in self._vowels\n            and term[-2] == term[-1]\n        )", "response": "Return True iff the stem ends in a doubled consonant."}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nreturn True iff the stem ends in cvc", "response": "def _ends_in_cvc(self, term):\n        \"\"\"Return Porter helper function _ends_in_cvc value.\n\n        Parameters\n        ----------\n        term : str\n            The word to scan for cvc\n\n        Returns\n        -------\n        bool\n            True iff the stem ends in cvc (as defined in the Porter stemmer\n            definition)\n\n        \"\"\"\n        return len(term) > 2 and (\n            term[-1] not in self._vowels\n            and term[-2] in self._vowels\n            and term[-3] not in self._vowels\n            and term[-1] not in tuple('wxY')\n        )"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\nreturning the stemmed version of the given word.", "response": "def stem(self, word, early_english=False):\n        \"\"\"Return Porter stem.\n\n        Parameters\n        ----------\n        word : str\n            The word to stem\n        early_english : bool\n            Set to True in order to remove -eth & -est (2nd & 3rd person\n            singular verbal agreement suffixes)\n\n        Returns\n        -------\n        str\n            Word stem\n\n        Examples\n        --------\n        >>> stmr = Porter()\n        >>> stmr.stem('reading')\n        'read'\n        >>> stmr.stem('suspension')\n        'suspens'\n        >>> stmr.stem('elusiveness')\n        'elus'\n\n        >>> stmr.stem('eateth', early_english=True)\n        'eat'\n\n        \"\"\"\n        # lowercase, normalize, and compose\n        word = normalize('NFC', text_type(word.lower()))\n\n        # Return word if stem is shorter than 2\n        if len(word) < 3:\n            return word\n\n        # Re-map consonantal y to Y (Y will be C, y will be V)\n        if word[0] == 'y':\n            word = 'Y' + word[1:]\n        for i in range(1, len(word)):\n            if word[i] == 'y' and word[i - 1] in self._vowels:\n                word = word[:i] + 'Y' + word[i + 1 :]\n\n        # Step 1a\n        if word[-1] == 's':\n            if word[-4:] == 'sses':\n                word = word[:-2]\n            elif word[-3:] == 'ies':\n                word = word[:-2]\n            elif word[-2:] == 'ss':\n                pass\n            else:\n                word = word[:-1]\n\n        # Step 1b\n        step1b_flag = False\n        if word[-3:] == 'eed':\n            if self._m_degree(word[:-3]) > 0:\n                word = word[:-1]\n        elif word[-2:] == 'ed':\n            if self._has_vowel(word[:-2]):\n                word = word[:-2]\n                step1b_flag = True\n        elif word[-3:] == 'ing':\n       
     if self._has_vowel(word[:-3]):\n                word = word[:-3]\n                step1b_flag = True\n        elif early_english:\n            if word[-3:] == 'est':\n                if self._has_vowel(word[:-3]):\n                    word = word[:-3]\n                    step1b_flag = True\n            elif word[-3:] == 'eth':\n                if self._has_vowel(word[:-3]):\n                    word = word[:-3]\n                    step1b_flag = True\n\n        if step1b_flag:\n            if word[-2:] in {'at', 'bl', 'iz'}:\n                word += 'e'\n            elif self._ends_in_doubled_cons(word) and word[-1] not in {\n                'l',\n                's',\n                'z',\n            }:\n                word = word[:-1]\n            elif self._m_degree(word) == 1 and self._ends_in_cvc(word):\n                word += 'e'\n\n        # Step 1c\n        if word[-1] in {'Y', 'y'} and self._has_vowel(word[:-1]):\n            word = word[:-1] + 'i'\n\n        # Step 2\n        if len(word) > 1:\n            if word[-2] == 'a':\n                if word[-7:] == 'ational':\n                    if self._m_degree(word[:-7]) > 0:\n                        word = word[:-5] + 'e'\n                elif word[-6:] == 'tional':\n                    if self._m_degree(word[:-6]) > 0:\n                        word = word[:-2]\n            elif word[-2] == 'c':\n                if word[-4:] in {'enci', 'anci'}:\n                    if self._m_degree(word[:-4]) > 0:\n                        word = word[:-1] + 'e'\n            elif word[-2] == 'e':\n                if word[-4:] == 'izer':\n                    if self._m_degree(word[:-4]) > 0:\n                        word = word[:-1]\n            elif word[-2] == 'g':\n                if word[-4:] == 'logi':\n                    if self._m_degree(word[:-4]) > 0:\n                        word = word[:-1]\n            elif word[-2] == 'l':\n                if word[-3:] == 'bli':\n                    if 
self._m_degree(word[:-3]) > 0:\n                        word = word[:-1] + 'e'\n                elif word[-4:] == 'alli':\n                    if self._m_degree(word[:-4]) > 0:\n                        word = word[:-2]\n                elif word[-5:] == 'entli':\n                    if self._m_degree(word[:-5]) > 0:\n                        word = word[:-2]\n                elif word[-3:] == 'eli':\n                    if self._m_degree(word[:-3]) > 0:\n                        word = word[:-2]\n                elif word[-5:] == 'ousli':\n                    if self._m_degree(word[:-5]) > 0:\n                        word = word[:-2]\n            elif word[-2] == 'o':\n                if word[-7:] == 'ization':\n                    if self._m_degree(word[:-7]) > 0:\n                        word = word[:-5] + 'e'\n                elif word[-5:] == 'ation':\n                    if self._m_degree(word[:-5]) > 0:\n                        word = word[:-3] + 'e'\n                elif word[-4:] == 'ator':\n                    if self._m_degree(word[:-4]) > 0:\n                        word = word[:-2] + 'e'\n            elif word[-2] == 's':\n                if word[-5:] == 'alism':\n                    if self._m_degree(word[:-5]) > 0:\n                        word = word[:-3]\n                elif word[-7:] in {'iveness', 'fulness', 'ousness'}:\n                    if self._m_degree(word[:-7]) > 0:\n                        word = word[:-4]\n            elif word[-2] == 't':\n                if word[-5:] == 'aliti':\n                    if self._m_degree(word[:-5]) > 0:\n                        word = word[:-3]\n                elif word[-5:] == 'iviti':\n                    if self._m_degree(word[:-5]) > 0:\n                        word = word[:-3] + 'e'\n                elif word[-6:] == 'biliti':\n                    if self._m_degree(word[:-6]) > 0:\n                        word = word[:-5] + 'le'\n\n        # Step 3\n        if word[-5:] in 'icate':\n            if 
self._m_degree(word[:-5]) > 0:\n                word = word[:-3]\n        elif word[-5:] == 'ative':\n            if self._m_degree(word[:-5]) > 0:\n                word = word[:-5]\n        elif word[-5:] in {'alize', 'iciti'}:\n            if self._m_degree(word[:-5]) > 0:\n                word = word[:-3]\n        elif word[-4:] == 'ical':\n            if self._m_degree(word[:-4]) > 0:\n                word = word[:-2]\n        elif word[-3:] == 'ful':\n            if self._m_degree(word[:-3]) > 0:\n                word = word[:-3]\n        elif word[-4:] == 'ness':\n            if self._m_degree(word[:-4]) > 0:\n                word = word[:-4]\n\n        # Step 4\n        if word[-2:] == 'al':\n            if self._m_degree(word[:-2]) > 1:\n                word = word[:-2]\n        elif word[-4:] in {'ance', 'ence'}:\n            if self._m_degree(word[:-4]) > 1:\n                word = word[:-4]\n        elif word[-2:] in {'er', 'ic'}:\n            if self._m_degree(word[:-2]) > 1:\n                word = word[:-2]\n        elif word[-4:] in {'able', 'ible'}:\n            if self._m_degree(word[:-4]) > 1:\n                word = word[:-4]\n        elif word[-3:] == 'ant':\n            if self._m_degree(word[:-3]) > 1:\n                word = word[:-3]\n        elif word[-5:] == 'ement':\n            if self._m_degree(word[:-5]) > 1:\n                word = word[:-5]\n        elif word[-4:] == 'ment':\n            if self._m_degree(word[:-4]) > 1:\n                word = word[:-4]\n        elif word[-3:] == 'ent':\n            if self._m_degree(word[:-3]) > 1:\n                word = word[:-3]\n        elif word[-4:] in {'sion', 'tion'}:\n            if self._m_degree(word[:-3]) > 1:\n                word = word[:-3]\n        elif word[-2:] == 'ou':\n            if self._m_degree(word[:-2]) > 1:\n                word = word[:-2]\n        elif word[-3:] in {'ism', 'ate', 'iti', 'ous', 'ive', 'ize'}:\n            if self._m_degree(word[:-3]) > 1:\n               
 word = word[:-3]\n\n        # Step 5a\n        if word[-1] == 'e':\n            if self._m_degree(word[:-1]) > 1:\n                word = word[:-1]\n            elif self._m_degree(word[:-1]) == 1 and not self._ends_in_cvc(\n                word[:-1]\n            ):\n                word = word[:-1]\n\n        # Step 5b\n        if word[-2:] == 'll' and self._m_degree(word) > 1:\n            word = word[:-1]\n\n        # Change 'Y' back to 'y' if it survived stemming\n        for i in range(len(word)):\n            if word[i] == 'Y':\n                word = word[:i] + 'y' + word[i + 1 :]\n\n        return word"}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nreturns the SoundexBR encoding of a word.", "response": "def soundex_br(word, max_length=4, zero_pad=True):\n    \"\"\"Return the SoundexBR encoding of a word.\n\n    This is a wrapper for :py:meth:`SoundexBR.encode`.\n\n    Parameters\n    ----------\n    word : str\n        The word to transform\n    max_length : int\n        The length of the code returned (defaults to 4)\n    zero_pad : bool\n        Pad the end of the return value with 0s to achieve a max_length string\n\n    Returns\n    -------\n    str\n        The SoundexBR code\n\n    Examples\n    --------\n    >>> soundex_br('Oliveira')\n    'O416'\n    >>> soundex_br('Almeida')\n    'A453'\n    >>> soundex_br('Barbosa')\n    'B612'\n    >>> soundex_br('Ara\u00fajo')\n    'A620'\n    >>> soundex_br('Gon\u00e7alves')\n    'G524'\n    >>> soundex_br('Goncalves')\n    'G524'\n\n    \"\"\"\n    return SoundexBR().encode(word, max_length, zero_pad)"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\nreturning the SoundexBR encoding of a word.", "response": "def encode(self, word, max_length=4, zero_pad=True):\n        \"\"\"Return the SoundexBR encoding of a word.\n\n        Parameters\n        ----------\n        word : str\n            The word to transform\n        max_length : int\n            The length of the code returned (defaults to 4)\n        zero_pad : bool\n            Pad the end of the return value with 0s to achieve a max_length\n            string\n\n        Returns\n        -------\n        str\n            The SoundexBR code\n\n        Examples\n        --------\n        >>> soundex_br('Oliveira')\n        'O416'\n        >>> soundex_br('Almeida')\n        'A453'\n        >>> soundex_br('Barbosa')\n        'B612'\n        >>> soundex_br('Ara\u00fajo')\n        'A620'\n        >>> soundex_br('Gon\u00e7alves')\n        'G524'\n        >>> soundex_br('Goncalves')\n        'G524'\n\n        \"\"\"\n        word = unicode_normalize('NFKD', text_type(word.upper()))\n        word = ''.join(c for c in word if c in self._uc_set)\n\n        if word[:2] == 'WA':\n            first = 'V'\n        elif word[:1] == 'K' and word[1:2] in {'A', 'O', 'U'}:\n            first = 'C'\n        elif word[:1] == 'C' and word[1:2] in {'I', 'E'}:\n            first = 'S'\n        elif word[:1] == 'G' and word[1:2] in {'E', 'I'}:\n            first = 'J'\n        elif word[:1] == 'Y':\n            first = 'I'\n        elif word[:1] == 'H':\n            first = word[1:2]\n            word = word[1:]\n        else:\n            first = word[:1]\n\n        sdx = first + word[1:].translate(self._trans)\n        sdx = self._delete_consecutive_repeats(sdx)\n        sdx = sdx.replace('0', '')\n\n        if zero_pad:\n            sdx += '0' * max_length\n\n        return sdx[:max_length]"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef filter_symlog(y, base=10.0):\n    log_base = np.log(base)\n    sign = np.sign(y)\n    logs = np.log(np.abs(y)) / log_base\n    return sign * logs", "response": "Symmetrical logarithmic scale.\n\n    Optional arguments:\n\n    *base*:\n        The base of the logarithm."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef filter_savitzky_golay(y, window_size=5, order=2, deriv=0, rate=1):\n    try:\n        window_size = abs(int(window_size))\n        order = abs(int(order))\n    except ValueError:\n        raise ValueError('window_size and order must be integers')\n\n    if window_size % 2 != 1 or window_size < 1:\n        raise ValueError('window_size must be a positive odd number')\n    if window_size < order + 2:\n        raise ValueError('window_size is too small for the polynomial order')\n\n    order_range = range(order + 1)\n    half_window = (window_size - 1) // 2\n\n    # precompute limits\n    minimum = np.min(y)\n    maximum = np.max(y)\n    # precompute coefficients\n    b = np.array([\n        [k ** i for i in order_range]\n        for k in range(-half_window, half_window + 1)\n    ])\n    m = np.linalg.pinv(b)[deriv] * rate ** deriv * math.factorial(deriv)\n    # pad the signal at the extremes with values taken from the original signal\n    firstvals = y[0] - np.abs(y[1:half_window+1][::-1] - y[0])\n    lastvals = y[-1] + np.abs(y[-half_window-1:-1][::-1] - y[-1])\n    y = np.concatenate((firstvals, y, lastvals))\n    return np.clip(\n        np.convolve(m[::-1], y, mode='valid'),\n        minimum,\n        maximum,\n    )", "response": "Smooth and optionally differentiate with a Savitzky-Golay filter."}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\nshows usage and available curve functions.", "response": "def usage_function(parser):\n    \"\"\"Show usage and available curve functions.\"\"\"\n    parser.print_usage()\n    print('')\n    print('available functions:')\n    for function in sorted(FUNCTION):\n        doc = FUNCTION[function].__doc__.strip().splitlines()[0]\n        print('    %-12s %s' % (function + ':', doc))\n\n    return 0"}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nshow usage and available palettes.", "response": "def usage_palette(parser):\n    \"\"\"Show usage and available palettes.\"\"\"\n    parser.print_usage()\n    print('')\n    print('available palettes:')\n    for palette in sorted(PALETTE):\n        print('    %-12s' % (palette,))\n\n    return 0"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef run():\n    import argparse\n\n    parser = argparse.ArgumentParser(\n        description=(\n            'Text mode diagrams using UTF-8 characters and fancy colors.'\n        ),\n        epilog=\"\"\"\n    (1): only works for the horizontal bar graph, the first argument is the key\n    and the second value is the data point.\n\"\"\",\n    )\n\n    group = parser.add_argument_group('optional drawing mode')\n    group.add_argument(\n        '-G', '--graph',\n        dest='mode', action='store_const', const='g',\n        help='axis drawing mode (default)',\n    )\n    group.add_argument(\n        '-H', '--horizontal-bars',\n        dest='mode', action='store_const', const='h',\n        help='horizontal drawing mode',\n    )\n    group.add_argument(\n        '-V', '--vertical-bars',\n        dest='mode', action='store_const', const='v',\n        help='vertical drawing mode',\n    )\n\n    group = parser.add_argument_group('optional drawing arguments')\n    group.add_argument(\n        '-a', '--axis',\n        dest='axis', action='store_const', const=True, default=True,\n        help='draw axis (default: yes)',\n    )\n    group.add_argument(\n        '-A', '--no-axis',\n        dest='axis', action='store_const', const=False,\n        help=\"don't draw axis\",\n    )\n    group.add_argument(\n        '-c', '--color',\n        dest='color', action='store_const', const=True, default=True,\n        help='use colors (default: yes)',\n    )\n    group.add_argument(\n        '-C', '--no-color',\n        dest='color', action='store_const', const=False,\n        help=\"don't use colors\",\n    )\n    group.add_argument(\n        '-l', '--legend',\n        dest='legend', action='store_const', const=True, default=True,\n        help='draw y-axis legend (default: yes)',\n    )\n    group.add_argument(\n        '-L', '--no-legend',\n        dest='legend', 
action='store_const', const=False,\n        help=\"don't draw y-axis legend\",\n    )\n    group.add_argument(\n        '-f', '--function',\n        default=None, metavar='function',\n        help='curve manipulation function, use \"help\" for a list',\n    )\n    group.add_argument(\n        '-p', '--palette',\n        default='default', metavar='palette',\n        help='palette name, use \"help\" for a list',\n    )\n    group.add_argument(\n        '-x', '--width',\n        default=0, type=int, metavar='characters',\n        help='drawing width (default: auto)',\n    )\n    group.add_argument(\n        '-y', '--height',\n        default=0, type=int, metavar='characters',\n        help='drawing height (default: auto)',\n    )\n    group.add_argument(\n        '-r', '--reverse',\n        default=False, action='store_true',\n        help='reverse draw graph',\n    )\n    group.add_argument(\n        '--sort-by-column',\n        default=0, type=int, metavar='index',\n        help='sort input data based on given column',\n    )\n\n    group = parser.add_argument_group('optional input and output arguments')\n    group.add_argument(\n        '-b', '--batch',\n        default=False, action='store_true',\n        help='batch mode (default: no)',\n    )\n    group.add_argument(\n        '-k', '--keys',\n        default=False, action='store_true',\n        help='input are key-value pairs (default: no) (1)',\n    )\n    group.add_argument(\n        '-s', '--sleep',\n        default=0, type=float,\n        help='batch poll sleep time (default: none)',\n    )\n    group.add_argument(\n        '-i', '--input',\n        default='-', metavar='file',\n        help='input file (default: stdin)',\n    )\n    group.add_argument(\n        '-o', '--output',\n        default='-', metavar='file',\n        help='output file (default: stdout)',\n    )\n    group.add_argument(\n        '-e', '--encoding',\n        dest='encoding', default='',\n        help='output encoding (default: 
auto)',\n    )\n\n    option = parser.parse_args()\n\n    if option.function == 'help':\n        return usage_function(parser)\n\n    if option.palette == 'help':\n        return usage_palette(parser)\n\n    option.mode = option.mode or 'g'\n    option.size = Point((option.width, option.height))\n\n    if option.input in ['-', 'stdin']:\n        istream = sys.stdin\n    else:\n        istream = open(option.input, 'r')\n\n    if option.output in ['-', 'stdout']:\n        try:\n            ostream = sys.stdout.buffer\n        except AttributeError:\n            ostream = sys.stdout\n    else:\n        ostream = open(option.output, 'wb')\n\n    option.encoding = option.encoding or Terminal().encoding\n\n    if option.mode == 'g':\n        engine = AxisGraph(option.size, option)\n\n    elif option.mode == 'h':\n        engine = HorizontalBarGraph(option.size, option)\n\n    elif option.mode == 'v':\n        engine = VerticalBarGraph(option.size, option)\n\n    else:\n        parser.error('invalid mode')\n        return 1\n\n    engine.consume(istream, ostream, batch=option.batch)", "response": "Parse the command-line options and run the selected graph engine (axis graph, horizontal bars, or vertical bars) on the input stream."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef size(self):\n        for fd in range(3):\n            cr = self._ioctl_GWINSZ(fd)\n            if cr:\n                break\n        if not cr:\n            try:\n                fd = os.open(os.ctermid(), os.O_RDONLY)\n                cr = self._ioctl_GWINSZ(fd)\n                os.close(fd)\n            except Exception:\n                pass\n\n        if not cr:\n            env = os.environ\n            cr = (env.get('LINES', 25), env.get('COLUMNS', 80))\n\n        return int(cr[1]), int(cr[0])", "response": "Get the current terminal size."}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nreturn the escape sequence for indexed color index.", "response": "def color(self, index):\n        \"\"\"Get the escape sequence for indexed color ``index``.\n\n        The ``index`` is a color index in the 256 color space. The color space\n        consists of:\n\n        * 0x00-0x0f: default EGA colors\n        * 0x10-0xe7: 6x6x6 RGB cubes\n        * 0xe8-0xff: gray scale ramp\n        \"\"\"\n        if self.colors == 16:\n            if index >= 8:\n                return self.csi('bold') + self.csi('setaf', index - 8)\n            else:\n                return self.csi('sgr0') + self.csi('setaf', index)\n        else:\n            return self.csi('setaf', index)"}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nreturning the escape sequence for the selected Control Sequence.", "response": "def csi(self, capname, *args):\n        \"\"\"Return the escape sequence for the selected Control Sequence.\"\"\"\n        value = curses.tigetstr(capname)\n        if value is None:\n            return b''\n        else:\n            return curses.tparm(value, *args)"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\nreturning a value wrapped in the selected CSI and does a reset.", "response": "def csi_wrap(self, value, capname, *args):\n        \"\"\"Return a value wrapped in the selected CSI and does a reset.\"\"\"\n        if isinstance(value, str):\n            value = value.encode('utf-8')\n        return b''.join([\n            self.csi(capname, *args),\n            value,\n            self.csi('sgr0'),\n        ])"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef consume(self, istream, ostream, batch=False):\n        datapoints = []  # List of 2-tuples\n\n        if batch:\n            sleep = max(0.01, self.option.sleep)\n            fd = istream.fileno()\n            while True:\n                try:\n                    if select.select([fd], [], [], sleep):\n                        try:\n                            line = istream.readline()\n                            if line == '':\n                                break\n                            datapoints.append(self.consume_line(line))\n                        except ValueError:\n                            continue\n\n                        if self.option.sort_by_column:\n                            datapoints = sorted(datapoints, key=itemgetter(self.option.sort_by_column - 1))\n\n                        if len(datapoints) > 1:\n                            datapoints = datapoints[-self.maximum_points:]\n                            self.update([dp[0] for dp in datapoints], [dp[1] for dp in datapoints])\n                            self.render(ostream)\n\n                        time.sleep(sleep)\n\n                except KeyboardInterrupt:\n                    break\n\n        else:\n            for line in istream:\n                try:\n                    datapoints.append(self.consume_line(line))\n                except ValueError:\n                    pass\n\n            if self.option.sort_by_column:\n                datapoints = sorted(datapoints, key=itemgetter(self.option.sort_by_column - 1))\n\n            self.update([dp[0] for dp in datapoints], [dp[1] for dp in datapoints])\n            self.render(ostream)", "response": "Read points from istream and output to ostream."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef consume_line(self, line):\n        data = RE_VALUE_KEY.split(line.strip(), 1)\n        if len(data) == 1:\n            return float(data[0]), None\n        else:\n            return float(data[0]), data[1].strip()", "response": "Parse a line into a (value, key) tuple, with key set to None when the line holds only a value."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef update(self, points, values=None):\n        self.values = values or [None] * len(points)\n\n        if np is None:\n            if self.option.function:\n                warnings.warn('numpy not available, function ignored')\n            self.points = points\n            self.minimum = min(self.points)\n            self.maximum = max(self.points)\n            self.current = self.points[-1]\n\n        else:\n            self.points = self.apply_function(points)\n            self.minimum = np.min(self.points)\n            self.maximum = np.max(self.points)\n            self.current = self.points[-1]\n\n        if self.maximum == self.minimum:\n            self.extents = 1\n        else:\n            self.extents = (self.maximum - self.minimum)", "response": "Store the points and values and recompute the minimum, maximum, current value, and extents."}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\ngenerates a color ramp for the current screen height.", "response": "def color_ramp(self, size):\n        \"\"\"Generate a color ramp for the current screen height.\"\"\"\n        color = PALETTE.get(self.option.palette, {})\n        color = color.get(self.term.colors, None)\n        color_ramp = []\n        if color is not None:\n            ratio = len(color) / float(size)\n            for i in range(int(size)):\n                color_ramp.append(self.term.color(color[int(ratio * i)]))\n\n        return color_ramp"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef human(self, size, base=1000, units=' kMGTZ'):\n        sign = '+' if size >= 0 else '-'\n        size = abs(size)\n        if size < 1000:\n            return '%s%d' % (sign, size)\n        for i, suffix in enumerate(units):\n            unit = 1000 ** (i + 1)\n            if size < unit:\n                return ('%s%.01f%s' % (\n                    sign,\n                    size / float(unit) * base,\n                    suffix,\n                )).strip()\n        raise OverflowError", "response": "Convert the input size to a signed, human-readable short form with a unit suffix."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef apply_function(self, points):\n        if not self.option.function:\n            return points\n\n        if np is None:\n            raise ImportError('numpy is not available')\n\n        if ':' in self.option.function:\n            function, arguments = self.option.function.split(':', 1)\n            arguments = arguments.split(',')\n        else:\n            function = self.option.function\n            arguments = []\n\n        # Resolve arguments\n        arguments = list(map(self._function_argument, arguments))\n\n        # Resolve function\n        filter_function = FUNCTION.get(function)\n\n        if filter_function is None:\n            raise TypeError('Invalid function \"%s\"' % (function,))\n\n        else:\n            # We wrap in ``list()`` to consume generators and iterators, as\n            # ``np.array`` doesn't do this for us.\n            return filter_function(np.array(list(points)), *arguments)", "response": "Run the filter function on the provided points."}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nresolve the points to make a line between two points.", "response": "def line(self, p1, p2, resolution=1):\n        \"\"\"Resolve the points to make a line between two points.\"\"\"\n        xdiff = max(p1.x, p2.x) - min(p1.x, p2.x)\n        ydiff = max(p1.y, p2.y) - min(p1.y, p2.y)\n        xdir = [-1, 1][int(p1.x <= p2.x)]\n        ydir = [-1, 1][int(p1.y <= p2.y)]\n        r = int(round(max(xdiff, ydiff)))\n        if r == 0:\n            return\n\n        for i in range((r + 1) * resolution):\n            x = p1.x\n            y = p1.y\n\n            if xdiff:\n                x += (float(i) * xdiff) / r * xdir / resolution\n            if ydiff:\n                y += (float(i) * ydiff) / r * ydir / resolution\n\n            yield Point((x, y))"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\nsetting a text value in the screen canvas.", "response": "def set_text(self, point, text):\n        \"\"\"Set a text value in the screen canvas.\"\"\"\n        if not self.option.legend:\n            return\n\n        if not isinstance(point, Point):\n            point = Point(point)\n\n        for offset, char in enumerate(str(text)):\n            self.screen.canvas[point.y][point.x + offset] = char"}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\nrenders the term to the given stream.", "response": "def render(self, stream):\n        \"\"\"Render graph to stream.\"\"\"\n        encoding = self.option.encoding or self.term.encoding or \"utf8\"\n\n        if self.option.color:\n            ramp = self.color_ramp(self.size.y)[::-1]\n        else:\n            ramp = None\n\n        if self.cycle >= 1 and self.lines:\n            stream.write(self.term.csi('cuu', self.lines))\n\n        zero = int(self.null / 4)  # Zero crossing\n        lines = 0\n        for y in range(self.screen.size.y):\n            if y == zero and self.size.y > 1:\n                stream.write(self.term.csi('smul'))\n            if ramp:\n                stream.write(ramp[y])\n\n            for x in range(self.screen.size.x):\n                point = Point((x, y))\n                if point in self.screen:\n                    value = self.screen[point]\n                    if isinstance(value, int):\n                        stream.write(chr(self.base + value).encode(encoding))\n                    else:\n                        stream.write(self.term.csi('sgr0'))\n                        stream.write(self.term.csi_wrap(\n                            value.encode(encoding),\n                            'bold'\n                        ))\n                        if y == zero and self.size.y > 1:\n                            stream.write(self.term.csi('smul'))\n                        if ramp:\n                            stream.write(ramp[y])\n                else:\n                    stream.write(b' ')\n\n            if y == zero and self.size.y > 1:\n                stream.write(self.term.csi('rmul'))\n            if ramp:\n                stream.write(self.term.csi('sgr0'))\n\n            stream.write(b'\\n')\n            lines += 1\n        stream.flush()\n\n        self.cycle = self.cycle + 1\n        self.lines = lines"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef _normalised_numpy(self):\n        dx = (self.screen.width / float(len(self.points)))\n        oy = (self.screen.height)\n        points = np.array(self.points) - self.minimum\n        points = points * 4.0 / self.extents * self.size.y\n        for x, y in enumerate(points):\n            yield Point((\n                dx * x,\n                min(oy, oy - y),\n            ))", "response": "Yield the data points normalised to screen coordinates as Point objects, computed with numpy."}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\nnormalises data points using pure Python.", "response": "def _normalised_python(self):\n        \"\"\"Normalised data points using pure Python.\"\"\"\n        dx = (self.screen.width / float(len(self.points)))\n        oy = (self.screen.height)\n        for x, point in enumerate(self.points):\n            y = (point - self.minimum) * 4.0 / self.extents * self.size.y\n            yield Point((\n                dx * x,\n                min(oy, oy - y),\n            ))"}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\nsets the pixel at the (x, y) point.", "response": "def set(self, point):\n        \"\"\"Set pixel at (x, y) point.\"\"\"\n        if not isinstance(point, Point):\n            point = Point(point)\n\n        rx = self.round(point.x)\n        ry = self.round(point.y)\n\n        item = Point((rx >> 1, min(ry >> 2, self.size.y)))\n        self.screen[item] |= self.pixels[ry & 3][rx & 1]"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef mem_size(self):\n        '''used when allocating memory ingame'''\n\n        data_len = self._data_mem_size\n        node_count = len(list(self.xml_doc.iter(tag=etree.Element)))\n\n        if self.compressed:\n            size = 52 * node_count + data_len + 630\n        else:\n            tags_len = 0\n            for e in self.xml_doc.iter(tag=etree.Element):\n                e_len = max(len(e.tag), 8)\n                e_len = (e_len + 3) & ~3\n                tags_len += e_len\n\n            size = 56 * node_count + data_len + 630 + tags_len\n\n        # debugging\n        #print('nodes:{} ({}) data:{} ({})'.format(node_count,hex(node_count), data_len, hex(data_len)))\n\n        return (size + 8) & ~7", "response": "Return the memory size to allocate in-game for the XML document, computed from the node count, data length, and tag lengths and aligned to a multiple of 8."}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nload the class from the class_path string", "response": "def _load_class(class_path, default):\n    \"\"\" Loads the class from the class_path string \"\"\"\n    if class_path is None:\n        return default\n\n    component = class_path.rsplit('.', 1)\n    result_processor = getattr(\n        importlib.import_module(component[0]),\n        component[1],\n        default\n    ) if len(component) > 1 else default\n\n    return result_processor"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef _is_iterable(item):\n    return isinstance(item, collections.abc.Iterable) and not isinstance(item, six.string_types)", "response": "Check whether an item is iterable but not a string."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\nprocessing pagination requests from request parameter", "response": "def _process_pagination_values(request):\n    \"\"\" process pagination requests from request parameter \"\"\"\n    size = 20\n    page = 0\n    from_ = 0\n    if \"page_size\" in request.POST:\n        size = int(request.POST[\"page_size\"])\n        max_page_size = getattr(settings, \"SEARCH_MAX_PAGE_SIZE\", 100)\n        # The parens below are superfluous, but make it much clearer to the reader what is going on\n        if not (0 < size <= max_page_size):  # pylint: disable=superfluous-parens\n            raise ValueError(_('Invalid page size of {page_size}').format(page_size=size))\n\n        if \"page_index\" in request.POST:\n            page = int(request.POST[\"page_index\"])\n            from_ = page * size\n    return size, from_, page"}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\ncreating separate dictionary of supported filter values provided", "response": "def _process_field_values(request):\n    \"\"\" Create separate dictionary of supported filter values provided \"\"\"\n    return {\n        field_key: request.POST[field_key]\n        for field_key in request.POST\n        if field_key in course_discovery_filter_fields()\n    }"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\nhandling an HTTP search request, optionally restricted to a course, and returning a JSON response.", "response": "def do_search(request, course_id=None):\n    \"\"\"\n    Search view for http requests\n\n    Args:\n        request (required) - django request object\n        course_id (optional) - course_id within which to restrict search\n\n    Returns:\n        http json response with the following fields\n            \"took\" - how many seconds the operation took\n            \"total\" - how many results were found\n            \"max_score\" - maximum score from these results\n            \"results\" - json array of result documents\n\n            or\n\n            \"error\" - displayable information about an error that occurred on the server\n\n    POST Params:\n        \"search_string\" (required) - text upon which to search\n        \"page_size\" (optional) - how many results to return per page (defaults to 20, with maximum cutoff at 100)\n        \"page_index\" (optional) - for which page (zero-indexed) to include results (defaults to 0)\n    \"\"\"\n\n    # Setup search environment\n    SearchInitializer.set_search_enviroment(request=request, course_id=course_id)\n\n    results = {\n        \"error\": _(\"Nothing to search\")\n    }\n    status_code = 500\n\n    search_term = request.POST.get(\"search_string\", None)\n\n    try:\n        if not search_term:\n            raise ValueError(_('No search term provided for search'))\n\n        size, from_, page = _process_pagination_values(request)\n\n        # Analytics - log search request\n        track.emit(\n            'edx.course.search.initiated',\n            {\n                \"search_term\": search_term,\n                \"page_size\": size,\n                \"page_number\": page,\n            }\n        )\n\n        results = perform_search(\n            search_term,\n            user=request.user,\n            size=size,\n            from_=from_,\n            
course_id=course_id\n        )\n\n        status_code = 200\n\n        # Analytics - log search results before sending to browser\n        track.emit(\n            'edx.course.search.results_displayed',\n            {\n                \"search_term\": search_term,\n                \"page_size\": size,\n                \"page_number\": page,\n                \"results_count\": results[\"total\"],\n            }\n        )\n\n    except ValueError as invalid_err:\n        results = {\n            \"error\": six.text_type(invalid_err)\n        }\n        log.debug(six.text_type(invalid_err))\n\n    except QueryParseError:\n        results = {\n            \"error\": _('Your query seems malformed. Check for unmatched quotes.')\n        }\n\n    # Allow for broad exceptions here - this is an entry point from external reference\n    except Exception as err:  # pylint: disable=broad-except\n        results = {\n            \"error\": _('An error occurred when searching for \"{search_string}\"').format(search_string=search_term)\n        }\n        log.exception(\n            'Search view exception when searching for %s for user %s: %r',\n            search_term,\n            request.user.id,\n            err\n        )\n\n    return JsonResponse(results, status=status_code)"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef course_discovery(request):\n    results = {\n        \"error\": _(\"Nothing to search\")\n    }\n    status_code = 500\n\n    search_term = request.POST.get(\"search_string\", None)\n\n    try:\n        size, from_, page = _process_pagination_values(request)\n        field_dictionary = _process_field_values(request)\n\n        # Analytics - log search request\n        track.emit(\n            'edx.course_discovery.search.initiated',\n            {\n                \"search_term\": search_term,\n                \"page_size\": size,\n                \"page_number\": page,\n            }\n        )\n\n        results = course_discovery_search(\n            search_term=search_term,\n            size=size,\n            from_=from_,\n            field_dictionary=field_dictionary,\n        )\n\n        # Analytics - log search results before sending to browser\n        track.emit(\n            'edx.course_discovery.search.results_displayed',\n            {\n                \"search_term\": search_term,\n                \"page_size\": size,\n                \"page_number\": page,\n                \"results_count\": results[\"total\"],\n            }\n        )\n\n        status_code = 200\n\n    except ValueError as invalid_err:\n        results = {\n            \"error\": six.text_type(invalid_err)\n        }\n        log.debug(six.text_type(invalid_err))\n\n    except QueryParseError:\n        results = {\n            \"error\": _('Your query seems malformed. 
Check for unmatched quotes.')\n        }\n\n    # Allow for broad exceptions here - this is an entry point from external reference\n    except Exception as err:  # pylint: disable=broad-except\n        results = {\n            \"error\": _('An error occurred when searching for \"{search_string}\"').format(search_string=search_term)\n        }\n        log.exception(\n            'Search view exception when searching for %s for user %s: %r',\n            search_term,\n            request.user.id,\n            err\n        )\n\n    return JsonResponse(results, status=status_code)", "response": "Search view for course discovery: reads pagination and filter field values from the POST data, runs course_discovery_search, emits analytics events, and returns a JsonResponse with the results or an error message."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\ntranslating an Elasticsearch response into our desired result format", "response": "def _translate_hits(es_response):\n    \"\"\" Provide resultset in our desired format from elasticsearch results \"\"\"\n\n    def translate_result(result):\n        \"\"\" Any conversion from ES result syntax into our search engine syntax \"\"\"\n        translated_result = copy.copy(result)\n        data = translated_result.pop(\"_source\")\n\n        translated_result.update({\n            \"data\": data,\n            \"score\": translated_result[\"_score\"]\n        })\n\n        return translated_result\n\n    def translate_facet(result):\n        \"\"\" Any conversion from ES facet syntax into our search engine syntax \"\"\"\n        terms = {term[\"term\"]: term[\"count\"] for term in result[\"terms\"]}\n        return {\n            \"terms\": terms,\n            \"total\": result[\"total\"],\n            \"other\": result[\"other\"],\n        }\n\n    results = [translate_result(hit) for hit in es_response[\"hits\"][\"hits\"]]\n    response = {\n        \"took\": es_response[\"took\"],\n        \"total\": es_response[\"hits\"][\"total\"],\n        \"max_score\": es_response[\"hits\"][\"max_score\"],\n        \"results\": results,\n    }\n\n    if \"facets\" in es_response:\n        response[\"facets\"] = {facet: translate_facet(es_response[\"facets\"][facet]) for facet in es_response[\"facets\"]}\n\n    return response"}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nreturn the field to apply into filter", "response": "def _get_filter_field(field_name, field_value):\n    \"\"\" Return field to apply into filter, if an array then use a range, otherwise look for a term match \"\"\"\n    filter_field = None\n    if isinstance(field_value, ValueRange):\n        range_values = {}\n        if field_value.lower:\n            range_values.update({\"gte\": field_value.lower_string})\n        if field_value.upper:\n            range_values.update({\"lte\": field_value.upper_string})\n        filter_field = {\n            \"range\": {\n                field_name: range_values\n            }\n        }\n    elif _is_iterable(field_value):\n        filter_field = {\n            \"terms\": {\n                field_name: field_value\n            }\n        }\n    else:\n        filter_field = {\n            \"term\": {\n                field_name: field_value\n            }\n        }\n    return filter_field"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef _process_field_queries(field_dictionary):\n    def field_item(field):\n        \"\"\" format field match as \"match\" item for elasticsearch query \"\"\"\n        return {\n            \"match\": {\n                field: field_dictionary[field]\n            }\n        }\n\n    return [field_item(field) for field in field_dictionary]", "response": "Builds a list of Elasticsearch \"match\" query items, one for each field in field_dictionary."}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nprocess the filter_dictionary into a list of Elasticsearch filters that pass when the field matches or is missing.", "response": "def _process_filters(filter_dictionary):\n    \"\"\"\n    We have a filter_dictionary - this means that if the field is included\n    and matches, then we can include, OR if the field is undefined, then we\n    assume it is safe to include\n    \"\"\"\n    def filter_item(field):\n        \"\"\" format elasticsearch filter to pass if value matches OR field is not included \"\"\"\n        if filter_dictionary[field] is not None:\n            return {\n                \"or\": [\n                    _get_filter_field(field, filter_dictionary[field]),\n                    {\n                        \"missing\": {\n                            \"field\": field\n                        }\n                    }\n                ]\n            }\n\n        return {\n            \"missing\": {\n                \"field\": field\n            }\n        }\n\n    return [filter_item(field) for field in filter_dictionary]"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef _process_exclude_dictionary(exclude_dictionary):\n    # not_properties will hold the generated term queries.\n    not_properties = []\n    for exclude_property in exclude_dictionary:\n        exclude_values = exclude_dictionary[exclude_property]\n        if not isinstance(exclude_values, list):\n            exclude_values = [exclude_values]\n        not_properties.extend([{\"term\": {exclude_property: exclude_value}} for exclude_value in exclude_values])\n\n    # Returning a query segment with an empty list freaks out ElasticSearch,\n    #   so just return an empty segment.\n    if not not_properties:\n        return {}\n\n    return {\n        \"not\": {\n            \"filter\": {\n                \"or\": not_properties\n            }\n        }\n    }", "response": "Processes the exclude dictionary and returns a query segment that will exclude unwanted results."}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nprocess the facet terms.", "response": "def _process_facet_terms(facet_terms):\n    \"\"\" We have a list of terms with which we return facets \"\"\"\n    elastic_facets = {}\n    for facet in facet_terms:\n        facet_term = {\"field\": facet}\n        if facet_terms[facet]:\n            for facet_option in facet_terms[facet]:\n                facet_term[facet_option] = facet_terms[facet][facet_option]\n\n        elastic_facets[facet] = {\n            \"terms\": facet_term\n        }\n\n    return elastic_facets"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\nfetching the mapped-items structure from the cache", "response": "def get_mappings(cls, index_name, doc_type):\n        \"\"\" fetch mapped-items structure from cache \"\"\"\n        return cache.get(cls.get_cache_item_name(index_name, doc_type), {})"}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nsetting mappings into cache", "response": "def set_mappings(cls, index_name, doc_type, mappings):\n        \"\"\" set new mapped-items structure into cache \"\"\"\n        cache.set(cls.get_cache_item_name(index_name, doc_type), mappings)"}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\ncollecting the indexing errors into a message and raising a general Elasticsearch exception", "response": "def log_indexing_error(cls, indexing_errors):\n        \"\"\" Logs indexing errors and raises a general ElasticSearch Exception\"\"\"\n        indexing_errors_log = []\n        for indexing_error in indexing_errors:\n            indexing_errors_log.append(str(indexing_error))\n        raise exceptions.ElasticsearchException(', '.join(indexing_errors_log))"}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\ngets the mappings for the given doc_type from the index.", "response": "def _get_mappings(self, doc_type):\n        \"\"\"\n        Interfaces with the elasticsearch mappings for the index\n        prevents multiple loading of the same mappings from ES when called more than once\n\n        Mappings format in elasticsearch is as follows:\n        {\n           \"doc_type\": {\n              \"properties\": {\n                 \"nested_property\": {\n                    \"properties\": {\n                       \"an_analysed_property\": {\n                          \"type\": \"string\"\n                       },\n                       \"another_analysed_property\": {\n                          \"type\": \"string\"\n                       }\n                    }\n                 },\n                 \"a_not_analysed_property\": {\n                    \"type\": \"string\",\n                    \"index\": \"not_analyzed\"\n                 },\n                 \"a_date_property\": {\n                    \"type\": \"date\"\n                 }\n              }\n           }\n        }\n\n        We cache the properties of each doc_type, if they are not available, we'll load them again from Elasticsearch\n        \"\"\"\n        # Try loading the mapping from the cache.\n        mapping = ElasticSearchEngine.get_mappings(self.index_name, doc_type)\n\n        # Fall back to Elasticsearch\n        if not mapping:\n            mapping = self._es.indices.get_mapping(\n                index=self.index_name,\n                doc_type=doc_type,\n            ).get(self.index_name, {}).get('mappings', {}).get(doc_type, {})\n\n            # Cache the mapping, if one was retrieved\n            if mapping:\n                ElasticSearchEngine.set_mappings(\n                    self.index_name,\n                    doc_type,\n                    mapping\n                )\n\n        return 
mapping"}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\ncheck the mappings for the content of the current document type and body.", "response": "def _check_mappings(self, doc_type, body):\n        \"\"\"\n        We desire to index content so that anything we want to be textually searchable(and therefore needing to be\n        analysed), but the other fields are designed to be filters, and only require an exact match. So, we want to\n        set up the mappings for these fields as \"not_analyzed\" - this will allow our filters to work faster because\n        they only have to work off exact matches\n        \"\"\"\n\n        # Make fields other than content be indexed as unanalyzed terms - content\n        # contains fields that are to be analyzed\n        exclude_fields = [\"content\"]\n        field_properties = getattr(settings, \"ELASTIC_FIELD_MAPPINGS\", {})\n\n        def field_property(field_name, field_value):\n            \"\"\"\n            Prepares field as property syntax for providing correct mapping desired for field\n\n            Mappings format in elasticsearch is as follows:\n            {\n               \"doc_type\": {\n                  \"properties\": {\n                     \"nested_property\": {\n                        \"properties\": {\n                           \"an_analysed_property\": {\n                              \"type\": \"string\"\n                           },\n                           \"another_analysed_property\": {\n                              \"type\": \"string\"\n                           }\n                        }\n                     },\n                     \"a_not_analysed_property\": {\n                        \"type\": \"string\",\n                        \"index\": \"not_analyzed\"\n                     },\n                     \"a_date_property\": {\n                        \"type\": \"date\"\n                     }\n                  }\n               }\n            }\n\n           
 We can only add new ones, but the format is the same\n            \"\"\"\n            prop_val = None\n            if field_name in field_properties:\n                prop_val = field_properties[field_name]\n            elif isinstance(field_value, dict):\n                props = {fn: field_property(fn, field_value[fn]) for fn in field_value}\n                prop_val = {\"properties\": props}\n            else:\n                prop_val = {\n                    \"type\": \"string\",\n                    \"index\": \"not_analyzed\",\n                }\n\n            return prop_val\n\n        new_properties = {\n            field: field_property(field, value)\n            for field, value in body.items()\n            if (field not in exclude_fields) and (field not in self._get_mappings(doc_type).get('properties', {}))\n        }\n\n        if new_properties:\n            self._es.indices.put_mapping(\n                index=self.index_name,\n                doc_type=doc_type,\n                body={\n                    doc_type: {\n                        \"properties\": new_properties,\n                    }\n                }\n            )\n            self._clear_mapping(doc_type)"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nimplement call to index the specified document type and list of sources", "response": "def index(self, doc_type, sources, **kwargs):\n        \"\"\"\n        Implements call to add documents to the ES index\n        Note the call to _check_mappings which will setup fields with the desired mappings\n        \"\"\"\n\n        try:\n            actions = []\n            for source in sources:\n                self._check_mappings(doc_type, source)\n                id_ = source['id'] if 'id' in source else None\n                log.debug(\"indexing %s object with id %s\", doc_type, id_)\n                action = {\n                    \"_index\": self.index_name,\n                    \"_type\": doc_type,\n                    \"_id\": id_,\n                    \"_source\": source\n                }\n                actions.append(action)\n            # bulk() returns a tuple with summary information\n            # number of successfully executed actions and number of errors if stats_only is set to True.\n            _, indexing_errors = bulk(\n                self._es,\n                actions,\n                **kwargs\n            )\n            if indexing_errors:\n                ElasticSearchEngine.log_indexing_error(indexing_errors)\n        # Broad exception handler to protect around bulk call\n        except Exception as ex:\n            # log information and re-raise\n            log.exception(\"error while indexing - %s\", str(ex))\n            raise"}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nimplement call to remove the documents from the index", "response": "def remove(self, doc_type, doc_ids, **kwargs):\n        \"\"\" Implements call to remove the documents from the index \"\"\"\n\n        try:\n            # ignore is flagged as an unexpected-keyword-arg; ES python client documents that it can be used\n            # pylint: disable=unexpected-keyword-arg\n            actions = []\n            for doc_id in doc_ids:\n                log.debug(\"Removing document of type %s and index %s\", doc_type, doc_id)\n                action = {\n                    '_op_type': 'delete',\n                    \"_index\": self.index_name,\n                    \"_type\": doc_type,\n                    \"_id\": doc_id\n                }\n                actions.append(action)\n            bulk(self._es, actions, **kwargs)\n        except BulkIndexError as ex:\n            valid_errors = [error for error in ex.errors if error['delete']['status'] != 404]\n\n            if valid_errors:\n                log.exception(\"An error occurred while removing documents from the index.\")\n                raise"}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nsearching the index for the desired content.", "response": "def search(self,\n               query_string=None,\n               field_dictionary=None,\n               filter_dictionary=None,\n               exclude_dictionary=None,\n               facet_terms=None,\n               exclude_ids=None,\n               use_field_match=False,\n               **kwargs):  # pylint: disable=too-many-arguments, too-many-locals, too-many-branches, arguments-differ\n        \"\"\"\n        Implements call to search the index for the desired content.\n\n        Args:\n            query_string (str): the string of values upon which to search within the\n            content of the objects within the index\n\n            field_dictionary (dict): dictionary of values which _must_ exist and\n            _must_ match in order for the documents to be included in the results\n\n            filter_dictionary (dict): dictionary of values which _must_ match if the\n            field exists in order for the documents to be included in the results;\n            documents for which the field does not exist may be included in the\n            results if they are not otherwise filtered out\n\n            exclude_dictionary (dict): dictionary of values all of which must\n            not match in order for the documents to be included in the results;\n            documents which have any of these fields and for which the value matches\n            one of the specified values shall be filtered out of the result set\n\n            facet_terms (dict): dictionary of terms to include within search\n            facets list - key is the term desired to facet upon, and the value is a\n            dictionary of extended information to include. 
Supported right now is a\n            size specification for a cap upon how many facet results to return (can\n            be an empty dictionary to use default size for underlying engine):\n\n            e.g.\n            {\n                \"org\": {\"size\": 10},  # only show top 10 organizations\n                \"modes\": {}\n            }\n\n            use_field_match (bool): flag to indicate whether to use elastic\n            filtering or elastic matching for field matches - this is nothing but a\n            potential performance tune for certain queries\n\n            (deprecated) exclude_ids (list): list of id values to exclude from the results -\n            useful for finding maches that aren't \"one of these\"\n\n        Returns:\n            dict object with results in the desired format\n            {\n                \"took\": 3,\n                \"total\": 4,\n                \"max_score\": 2.0123,\n                \"results\": [\n                    {\n                        \"score\": 2.0123,\n                        \"data\": {\n                            ...\n                        }\n                    },\n                    {\n                        \"score\": 0.0983,\n                        \"data\": {\n                            ...\n                        }\n                    }\n                ],\n                \"facets\": {\n                    \"org\": {\n                        \"total\": total_count,\n                        \"other\": 1,\n                        \"terms\": {\n                            \"MITx\": 25,\n                            \"HarvardX\": 18\n                        }\n                    },\n                    \"modes\": {\n                        \"total\": modes_count,\n                        \"other\": 15,\n                        \"terms\": {\n                            \"honor\": 58,\n                            \"verified\": 44,\n                        }\n                    }\n          
      }\n            }\n\n        Raises:\n            ElasticsearchException when there is a problem with the response from elasticsearch\n\n        Example usage:\n            .search(\n                \"find the words within this string\",\n                {\n                    \"must_have_field\": \"mast_have_value for must_have_field\"\n                },\n                {\n\n                }\n            )\n        \"\"\"\n\n        log.debug(\"searching index with %s\", query_string)\n\n        elastic_queries = []\n        elastic_filters = []\n\n        # We have a query string, search all fields for matching text within the \"content\" node\n        if query_string:\n            if six.PY2:\n                query_string = query_string.encode('utf-8').translate(None, RESERVED_CHARACTERS)\n            else:\n                query_string = query_string.translate(query_string.maketrans('', '', RESERVED_CHARACTERS))\n            elastic_queries.append({\n                \"query_string\": {\n                    \"fields\": [\"content.*\"],\n                    \"query\": query_string\n                }\n            })\n\n        if field_dictionary:\n            if use_field_match:\n                elastic_queries.extend(_process_field_queries(field_dictionary))\n            else:\n                elastic_filters.extend(_process_field_filters(field_dictionary))\n\n        if filter_dictionary:\n            elastic_filters.extend(_process_filters(filter_dictionary))\n\n        # Support deprecated argument of exclude_ids\n        if exclude_ids:\n            if not exclude_dictionary:\n                exclude_dictionary = {}\n            if \"_id\" not in exclude_dictionary:\n                exclude_dictionary[\"_id\"] = []\n            exclude_dictionary[\"_id\"].extend(exclude_ids)\n\n        if exclude_dictionary:\n            elastic_filters.append(_process_exclude_dictionary(exclude_dictionary))\n\n        query_segment = {\n            \"match_all\": 
{}\n        }\n        if elastic_queries:\n            query_segment = {\n                \"bool\": {\n                    \"must\": elastic_queries\n                }\n            }\n\n        query = query_segment\n        if elastic_filters:\n            filter_segment = {\n                \"bool\": {\n                    \"must\": elastic_filters\n                }\n            }\n            query = {\n                \"filtered\": {\n                    \"query\": query_segment,\n                    \"filter\": filter_segment,\n                }\n            }\n\n        body = {\"query\": query}\n        if facet_terms:\n            facet_query = _process_facet_terms(facet_terms)\n            if facet_query:\n                body[\"facets\"] = facet_query\n\n        try:\n            es_response = self._es.search(\n                index=self.index_name,\n                body=body,\n                **kwargs\n            )\n        except exceptions.ElasticsearchException as ex:\n            message = six.text_type(ex)\n            if 'QueryParsingException' in message:\n                log.exception(\"Malformed search query: %s\", message)\n                raise QueryParseError('Malformed search query.')\n            else:\n                # log information and re-raise\n                log.exception(\"error while searching index - %s\", str(message))\n                raise\n\n        return _translate_hits(es_response)"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef get_search_engine(index=None):\n        search_engine_class = _load_class(getattr(settings, \"SEARCH_ENGINE\", None), None)\n        return search_engine_class(index=index) if search_engine_class else None", "response": "Returns an instance of the search engine class defined in settings.SEARCH_ENGINE for the given index, or None if no search engine is defined."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef perform_search(\n        search_term,\n        user=None,\n        size=10,\n        from_=0,\n        course_id=None):\n    \"\"\" Call the search engine with the appropriate parameters \"\"\"\n    # field_, filter_ and exclude_dictionary(s) can be overridden by calling application\n    # field_dictionary includes course if course_id provided\n    (field_dictionary, filter_dictionary, exclude_dictionary) = SearchFilterGenerator.generate_field_filters(\n        user=user,\n        course_id=course_id\n    )\n\n    searcher = SearchEngine.get_search_engine(getattr(settings, \"COURSEWARE_INDEX_NAME\", \"courseware_index\"))\n    if not searcher:\n        raise NoSearchEngineError(\"No search engine specified in settings.SEARCH_ENGINE\")\n\n    results = searcher.search_string(\n        search_term,\n        field_dictionary=field_dictionary,\n        filter_dictionary=filter_dictionary,\n        exclude_dictionary=exclude_dictionary,\n        size=size,\n        from_=from_,\n        doc_type=\"courseware_content\",\n    )\n\n    # post-process the result\n    for result in results[\"results\"]:\n        result[\"data\"] = SearchResultProcessor.process_result(result[\"data\"], search_term, user)\n\n    results[\"access_denied_count\"] = len([r for r in results[\"results\"] if r[\"data\"] is None])\n    results[\"results\"] = [r for r in results[\"results\"] if r[\"data\"] is not None]\n\n    return results", "response": "Searches the courseware index for the given term using the field, filter and exclude dictionaries generated for the user and course, post-processes each result, and drops results whose processed data is None while counting them in access_denied_count."}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\nruns course discovery searches against the search engine index of course details", "response": "def course_discovery_search(search_term=None, size=20, from_=0, field_dictionary=None):\n    \"\"\"\n    Course Discovery activities against the search engine index of course details\n    \"\"\"\n    # We'll ignore the course-enrollment information in field and filter\n    # dictionary, and use our own logic upon enrollment dates for these\n    use_search_fields = [\"org\"]\n    (search_fields, _, exclude_dictionary) = SearchFilterGenerator.generate_field_filters()\n    use_field_dictionary = {}\n    use_field_dictionary.update({field: search_fields[field] for field in search_fields if field in use_search_fields})\n    if field_dictionary:\n        use_field_dictionary.update(field_dictionary)\n    if not getattr(settings, \"SEARCH_SKIP_ENROLLMENT_START_DATE_FILTERING\", False):\n        use_field_dictionary[\"enrollment_start\"] = DateRange(None, datetime.utcnow())\n\n    searcher = SearchEngine.get_search_engine(getattr(settings, \"COURSEWARE_INDEX_NAME\", \"courseware_index\"))\n    if not searcher:\n        raise NoSearchEngineError(\"No search engine specified in settings.SEARCH_ENGINE\")\n\n    results = searcher.search(\n        query_string=search_term,\n        doc_type=\"course_info\",\n        size=size,\n        from_=from_,\n        # only show when enrollment start IS provided and is before now\n        field_dictionary=use_field_dictionary,\n        # show if no enrollment end is provided and has not yet been reached\n        filter_dictionary={\"enrollment_end\": DateRange(datetime.utcnow(), None)},\n        exclude_dictionary=exclude_dictionary,\n        facet_terms=course_discovery_facets(),\n    )\n\n    return results"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef strings_in_dictionary(dictionary):\n        strings = [value for value in six.itervalues(dictionary) if not isinstance(value, dict)]\n        for child_dict in [dv for dv in six.itervalues(dictionary) if isinstance(dv, dict)]:\n            strings.extend(SearchResultProcessor.strings_in_dictionary(child_dict))\n        return strings", "response": "Recursively collects every non-dict value from a possibly nested dictionary into a flat list; used by the default implementation for finding the excerpt."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\nfinding matches in the list of strings and words.", "response": "def find_matches(strings, words, length_hoped):\n        \"\"\" Used by default property excerpt \"\"\"\n        lower_words = [w.lower() for w in words]\n\n        def has_match(string):\n            \"\"\" Do any of the words match within the string \"\"\"\n            lower_string = string.lower()\n            for test_word in lower_words:\n                if test_word in lower_string:\n                    return True\n            return False\n\n        shortened_strings = [textwrap.wrap(s) for s in strings]\n        short_string_list = list(chain.from_iterable(shortened_strings))\n        matches = [ms for ms in short_string_list if has_match(ms)]\n\n        cumulative_len = 0\n        break_at = None\n        for idx, match in enumerate(matches):\n            cumulative_len += len(match)\n            if cumulative_len >= length_hoped:\n                break_at = idx\n                break\n\n        return matches[0:break_at]"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef decorate_matches(match_in, match_word):\n        matches = re.finditer(match_word, match_in, re.IGNORECASE)\n        for matched_string in set([match.group() for match in matches]):\n            match_in = match_in.replace(\n                matched_string,\n                getattr(settings, \"SEARCH_MATCH_DECORATION\", u\"<b>{}</b>\").format(matched_string)\n            )\n        return match_in", "response": "decorate the matches within the excerpt"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\nadd properties defined in the result to the _results_fields dictionary", "response": "def add_properties(self):\n        \"\"\"\n        Called during post processing of result\n        Any properties defined in your subclass will get exposed as members of the result json from the search\n        \"\"\"\n        for property_name in [p[0] for p in inspect.getmembers(self.__class__) if isinstance(p[1], property)]:\n            self._results_fields[property_name] = getattr(self, property_name, None)"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef process_result(cls, dictionary, match_phrase, user):\n        result_processor = _load_class(getattr(settings, \"SEARCH_RESULT_PROCESSOR\", None), cls)\n        srp = result_processor(dictionary, match_phrase)\n        if srp.should_remove(user):\n            return None\n        try:\n            srp.add_properties()\n        # protect around any problems introduced by subclasses within their properties\n        except Exception as ex:  # pylint: disable=broad-except\n            log.exception(\"error processing properties for %s - %s: will remove from results\",\n                          json.dumps(dictionary, cls=DjangoJSONEncoder), str(ex))\n            return None\n        return dictionary", "response": "Called from within the search handler. Finds the desired subclass, decides whether the result should be removed, and adds properties derived from the result information. Returns None if the result is removed or if adding properties fails."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef excerpt(self):\n        if \"content\" not in self._results_fields:\n            return None\n\n        match_phrases = [self._match_phrase]\n        if six.PY2:\n            separate_phrases = [\n                phrase.decode('utf-8')\n                for phrase in shlex.split(self._match_phrase.encode('utf-8'))\n            ]\n        else:\n            separate_phrases = [\n                phrase\n                for phrase in shlex.split(self._match_phrase)\n            ]\n        if len(separate_phrases) > 1:\n            match_phrases.extend(separate_phrases)\n        else:\n            match_phrases = separate_phrases\n\n        matches = SearchResultProcessor.find_matches(\n            SearchResultProcessor.strings_in_dictionary(self._results_fields[\"content\"]),\n            match_phrases,\n            DESIRED_EXCERPT_LENGTH\n        )\n        excerpt_text = ELLIPSIS.join(matches)\n\n        for match_word in match_phrases:\n            excerpt_text = SearchResultProcessor.decorate_matches(excerpt_text, match_word)\n\n        return excerpt_text", "response": "Return an excerpt of the result's content showing the matched phrases with decoration applied, or None if the result has no content."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef generate_field_filters(cls, **kwargs):\n        generator = _load_class(getattr(settings, \"SEARCH_FILTER_GENERATOR\", None), cls)()\n        return (\n            generator.field_dictionary(**kwargs),\n            generator.filter_dictionary(**kwargs),\n            generator.exclude_dictionary(**kwargs),\n        )", "response": "Loads the configured filter generator subclass and returns a tuple of its field, filter, and exclude dictionaries."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef set_search_enviroment(cls, **kwargs):\n        initializer = _load_class(getattr(settings, \"SEARCH_INITIALIZER\", None), cls)()\n        return initializer.initialize(**kwargs)", "response": "Called from within the search handler. Finds the desired subclass and calls its initialize method."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef _parse(self, filename):\n        self.names = {}\n        with codecs.open(filename, encoding=\"iso8859-1\") as f:\n            for line in f:\n                if any(map(lambda c: 128 < ord(c) < 160, line)):\n                    line = line.encode(\"iso8859-1\").decode(\"windows-1252\")\n                self._eat_name_line(line.strip())", "response": "Parses the data file and sets the names attribute"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef _eat_name_line(self, line):\n        if line[0] not in \"#=\":\n            parts = line.split()\n            country_values = line[30:-1]\n            name = map_name(parts[1])\n            if not self.case_sensitive:\n                name = name.lower()\n\n            if parts[0] == \"M\":\n                self._set(name, u\"male\", country_values)\n            elif parts[0] == \"1M\" or parts[0] == \"?M\":\n                self._set(name, u\"mostly_male\", country_values)\n            elif parts[0] == \"F\":\n                self._set(name, u\"female\", country_values)\n            elif parts[0] == \"1F\" or parts[0] == \"?F\":\n                self._set(name, u\"mostly_female\", country_values)\n            elif parts[0] == \"?\":\n                self._set(name, self.unknown_value, country_values)\n            else:\n                raise \"Not sure what to do with a sex of %s\" % parts[0]", "response": "Parses one line of the data file and records the gender and country values for the name it contains."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef _set(self, name, gender, country_values):\n        if '+' in name:\n            for replacement in ['', ' ', '-']:\n                self._set(name.replace('+', replacement), gender, country_values)\n        else:\n            if name not in self.names:\n                self.names[name] = {}\n            self.names[name][gender] = country_values", "response": "Sets gender and relevant country values for names dictionary of detector"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nfind the most popular gender for the given name counting by given counter", "response": "def _most_popular_gender(self, name, counter):\n        \"\"\"Finds the most popular gender for the given name counting by given counter\"\"\"\n        if name not in self.names:\n            return self.unknown_value\n\n        max_count, max_tie = (0, 0)\n        best = list(self.names[name].keys())[0]\n        for gender, country_values in self.names[name].items():\n            count, tie = counter(country_values)\n            if count > max_count or (count == max_count and tie > max_tie):\n                max_count, max_tie, best = count, tie, gender\n\n        return best if max_count > 0 else self.unknown_value"}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nreturns the best gender for the given name and country pair", "response": "def get_gender(self, name, country=None):\n        \"\"\"Returns best gender for the given name and country pair\"\"\"\n        if not self.case_sensitive:\n            name = name.lower()\n\n        if name not in self.names:\n            return self.unknown_value\n        elif not country:\n            def counter(country_values):\n                country_values = list(map(ord, country_values.replace(\" \", \"\")))\n                return (len(country_values),\n                        sum(map(lambda c: c > 64 and c-55 or c-48, country_values)))\n            return self._most_popular_gender(name, counter)\n        elif country in self.__class__.COUNTRIES:\n            index = self.__class__.COUNTRIES.index(country)\n            counter = lambda e: (ord(e[index])-32, 0)\n            return self._most_popular_gender(name, counter)\n        else:\n            raise NoCountryError(\"No such country: %s\" % country)"}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nwrites the specified string to the output file.", "response": "def output(self, msg, newline=True):\n        \"\"\"\n        Writes the specified string to the output target of the report.\n\n        :param msg: the message to output.\n        :type msg: str\n        :param newline:\n            whether or not to append a newline to the end of the message\n        :type newline: bool\n        \"\"\"\n\n        click.echo(text_type(msg), nl=newline, file=self.output_file)"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nexecute the tools on the TidyPy project and returns the issues that are found.", "response": "def execute_tools(config, path, progress=None):\n    \"\"\"\n    Executes the suite of TidyPy tools upon the project and returns the\n    issues that are found.\n\n    :param config: the TidyPy configuration to use\n    :type config: dict\n    :param path: the path to the project to analyze\n    :type path: str\n    :param progress:\n        the progress reporter object that will receive callbacks during the\n        execution of the tool suite. If not specified, no progress\n        notifications will occur.\n    :type progress: tidypy.Progress\n    :rtype: tidypy.Collector\n    \"\"\"\n\n    progress = progress or QuietProgress()\n    progress.on_start()\n\n    manager = SyncManager()\n    manager.start()\n\n    num_tools = 0\n    tools = manager.Queue()\n    for name, cls in iteritems(get_tools()):\n        if config[name]['use'] and cls.can_be_used():\n            num_tools += 1\n            tools.put({\n                'name': name,\n                'config': config[name],\n            })\n\n    collector = Collector(config)\n    if not num_tools:\n        progress.on_finish()\n        return collector\n\n    notifications = manager.Queue()\n    environment = manager.dict({\n        'finder': Finder(path, config),\n    })\n\n    workers = []\n    for _ in range(config['workers']):\n        worker = Worker(\n            args=(\n                tools,\n                notifications,\n                environment,\n            ),\n        )\n        worker.start()\n        workers.append(worker)\n\n    while num_tools:\n        try:\n            notification = notifications.get(True, 0.25)\n        except Empty:\n            pass\n        else:\n            if notification['type'] == 'start':\n                progress.on_tool_start(notification['tool'])\n            elif notification['type'] == 'complete':\n                collector.add_issues(notification['issues'])\n                progress.on_tool_finish(notification['tool'])\n                num_tools -= 1\n\n    progress.on_finish()\n\n    return collector"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef execute_reports(\n        config,\n        path,\n        collector,\n        on_report_finish=None,\n        output_file=None):\n    \"\"\"\n    Executes the configured suite of issue reports.\n\n    :param config: the TidyPy configuration to use\n    :type config: dict\n    :param path: the path to the project that was analyzed\n    :type path: str\n    :param collector: the issues to report\n    :type collector: tidypy.Collector\n    \"\"\"\n\n    reports = get_reports()\n    for report in config.get('requested_reports', []):\n        if report.get('type') and report['type'] in reports:\n            cfg = config.get('report', {}).get(report['type'], {})\n            cfg.update(report)\n            reporter = reports[report['type']](\n                cfg,\n                path,\n                output_file=output_file,\n            )\n            reporter.produce(collector)\n            if on_report_finish:\n                on_report_finish(report)", "response": "Executes the configured suite of issue reports."}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\ndetermining whether or not the specified file is excluded by the project's configuration.", "response": "def is_excluded(self, path):\n        \"\"\"\n        Determines whether or not the specified file is excluded by the\n        project's configuration.\n\n        :param path: the path to check\n        :type path: pathlib.Path\n        :rtype: bool\n        \"\"\"\n\n        relpath = path.relative_to(self.base_path).as_posix()\n        return matches_masks(relpath, self.excludes)"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef is_excluded_dir(self, path):\n\n        if self.is_excluded(path):\n            return True\n        return matches_masks(path.name, ALWAYS_EXCLUDED_DIRS)", "response": "Determines whether or not the specified directory is excluded, either by the project's configuration or because its name is in the always-excluded list."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef directories(self, filters=None, containing=None):\n\n        filters = compile_masks(filters or [r'.*'])\n        contains = compile_masks(containing)\n\n        for dirname, files in iteritems(self._found):\n            relpath = text_type(Path(dirname).relative_to(self.base_path))\n            if matches_masks(relpath, filters):\n                if not contains or self._contains(files, contains):\n                    yield dirname", "response": "A generator that yields the paths to directories in the project that match the specified regular expressions, optionally limited to directories containing files that match the containing masks."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef sys_paths(self, filters=None):\n\n        paths = set()\n\n        packages = list(self.packages(filters=filters))\n\n        for module in self.modules(filters=filters):\n            parent = text_type(Path(module).parent)\n            if parent not in packages:\n                paths.add(parent)\n\n        paths.update(self.topmost_directories([\n            text_type(Path(package).parent)\n            for package in packages\n        ]))\n\n        return list(paths)", "response": "Returns a list of paths that would be suitable to use in sys.path for the Python modules and packages found in this project."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef add_issues(self, issues):\n\n        if not isinstance(issues, (list, tuple)):\n            issues = [issues]\n        with self._lock:\n            self._all_issues.extend(issues)\n            self._cleaned_issues = None", "response": "Adds one or more issues to the collection and invalidates the cached cleaned issues."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef issue_count(self, include_unclean=False):\n\n        if include_unclean:\n            return len(self._all_issues)\n        self._ensure_cleaned_issues()\n        return len(self._cleaned_issues)", "response": "Returns the number of issues in the collection, counting all issues if include_unclean is set and only the cleaned issues otherwise."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef get_issues(self, sortby=None):\n\n        self._ensure_cleaned_issues()\n        return self._sort_issues(self._cleaned_issues, sortby)", "response": "Returns the issues in the collection."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef get_grouped_issues(self, keyfunc=None, sortby=None):\n\n        if not keyfunc:\n            keyfunc = default_group\n        if not sortby:\n            sortby = self.DEFAULT_SORT\n        self._ensure_cleaned_issues()\n        return self._group_issues(self._cleaned_issues, keyfunc, sortby)", "response": "Returns the issues in the collection grouped by the keyfunc."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\nparse a TidyPy configuration from a TOML-serialized string.", "response": "def parse(cls, content, is_pyproject=False):\n        \"\"\"\n        A convenience method for parsing a TOML-serialized configuration.\n\n        :param content: a TOML string containing a TidyPy configuration\n        :type content: str\n        :param is_pyproject:\n            whether or not the content is (or resembles) a ``pyproject.toml``\n            file, where the TidyPy configuration is located within a key named\n            ``tool``.\n        :type is_pyproject: bool\n        :rtype: dict\n        \"\"\"\n\n        parsed = pytoml.loads(content)\n\n        if is_pyproject:\n            parsed = parsed.get('tool', {})\n        parsed = parsed.get('tidypy', {})\n\n        return parsed"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\nretrieve the TidyPy tools that are available in the current Python environment.", "response": "def get_tools():\n    \"\"\"\n    Retrieves the TidyPy tools that are available in the current Python\n    environment.\n\n    The returned dictionary has keys that are the tool names and values are the\n    tool classes.\n\n    :rtype: dict\n    \"\"\"\n\n    # pylint: disable=protected-access\n\n    if not hasattr(get_tools, '_CACHE'):\n        get_tools._CACHE = dict()\n        for entry in pkg_resources.iter_entry_points('tidypy.tools'):\n            try:\n                get_tools._CACHE[entry.name] = entry.load()\n            except ImportError as exc:  # pragma: no cover\n                output_error(\n                    'Could not load tool \"%s\" defined by \"%s\": %s' % (\n                        entry,\n                        entry.dist,\n                        exc,\n                    ),\n                )\n    return get_tools._CACHE"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef get_reports():\n\n    # pylint: disable=protected-access\n\n    if not hasattr(get_reports, '_CACHE'):\n        get_reports._CACHE = dict()\n        for entry in pkg_resources.iter_entry_points('tidypy.reports'):\n            try:\n                get_reports._CACHE[entry.name] = entry.load()\n            except ImportError as exc:  # pragma: no cover\n                output_error(\n                    'Could not load report \"%s\" defined by \"%s\": %s' % (\n                        entry,\n                        entry.dist,\n                        exc,\n                    ),\n                )\n    return get_reports._CACHE", "response": "Returns the TidyPy issue reports that are available in the current Python\n    environment."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nretrieving the TidyPy configuration extenders that are available in the current Python environment.", "response": "def get_extenders():\n    \"\"\"\n    Retrieves the TidyPy configuration extenders that are available in the\n    current Python environment.\n\n    The returned dictionary has keys that are the extender names and values\n    that are the extender classes.\n\n    :rtype: dict\n    \"\"\"\n\n    # pylint: disable=protected-access\n\n    if not hasattr(get_extenders, '_CACHE'):\n        get_extenders._CACHE = dict()\n        for entry in pkg_resources.iter_entry_points('tidypy.extenders'):\n            try:\n                get_extenders._CACHE[entry.name] = entry.load()\n            except ImportError as exc:  # pragma: no cover\n                output_error(\n                    'Could not load extender \"%s\" defined by \"%s\": %s' % (\n                        entry,\n                        entry.dist,\n                        exc,\n                    ),\n                )\n    return get_extenders._CACHE"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef purge_config_cache(location=None):\n\n    cache_path = get_cache_path(location)\n\n    if location:\n        os.remove(cache_path)\n    else:\n        shutil.rmtree(cache_path)", "response": "Clears out the cache of TidyPy configurations that were retrieved from the specified location."}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nreturning a dict containing the default configuration for TidyPy.", "response": "def get_default_config():\n    \"\"\"\n    Produces a stock/out-of-the-box TidyPy configuration.\n\n    :rtype: dict\n    \"\"\"\n\n    config = {}\n\n    for name, cls in iteritems(get_tools()):\n        config[name] = cls.get_default_config()\n\n    try:\n        workers = multiprocessing.cpu_count() - 1\n    except NotImplementedError:  # pragma: no cover\n        workers = 1\n    workers = max(1, min(4, workers))\n\n    config.update({\n        'exclude': [],\n        'merge-issues': True,\n        'workers': workers,\n        'requested_reports': [\n            {\n                'type': 'console',\n            },\n        ],\n        'disabled': [],\n        'noqa': True,\n        'extends': [],\n        'ignore-missing-extends': False,\n    })\n\n    return config"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef get_user_config(project_path, use_cache=True):\n\n    if sys.platform == 'win32':\n        user_config = os.path.expanduser(r'~\\\\tidypy')\n    else:\n        user_config = os.path.join(\n            os.getenv('XDG_CONFIG_HOME') or os.path.expanduser('~/.config'),\n            'tidypy'\n        )\n\n    if os.path.exists(user_config):\n        with open(user_config, 'r') as config_file:\n            config = pytoml.load(config_file).get('tidypy', {})\n\n        config = merge_dict(get_default_config(), config)\n        config = process_extensions(config, project_path, use_cache=use_cache)\n        return config\n\n    return None", "response": "Returns a TidyPy configuration that incorporates the configuration files of the current user."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef get_local_config(project_path, use_cache=True):\n\n    pyproject_path = os.path.join(project_path, 'pyproject.toml')\n\n    if os.path.exists(pyproject_path):\n        with open(pyproject_path, 'r') as config_file:\n            config = pytoml.load(config_file)\n\n        config = config.get('tool', {}).get('tidypy', {})\n        config = merge_dict(get_default_config(), config)\n        config = process_extensions(config, project_path, use_cache=use_cache)\n        return config\n\n    return None", "response": "Returns a TidyPy configuration for the specified project."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\nreturning the Tidypy configuration for the specified project.", "response": "def get_project_config(project_path, use_cache=True):\n    \"\"\"\n    Produces the Tidypy configuration to use for the specified project.\n\n    If a ``pyproject.toml`` exists, the configuration will be based on that. If\n    not, the TidyPy configuration in the user's home directory will be used. If\n    one does not exist, the default configuration will be used.\n\n    :param project_path: the path to the project that is going to be analyzed\n    :type project_path: str\n    :param use_cache:\n        whether or not to use cached versions of any remote/referenced TidyPy\n        configurations. If not specified, defaults to ``True``.\n    :type use_cache: bool\n    :rtype: dict\n    \"\"\"\n\n    return get_local_config(project_path, use_cache=use_cache) \\\n        or get_user_config(project_path, use_cache=use_cache) \\\n        or get_default_config()"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\nmerging the contents of two lists into a new list.", "response": "def merge_list(list1, list2):\n    \"\"\"\n    Merges the contents of two lists into a new list.\n\n    :param list1: the first list\n    :type list1: list\n    :param list2: the second list\n    :type list2: list\n    :returns: list\n    \"\"\"\n\n    merged = list(list1)\n\n    for value in list2:\n        if value not in merged:\n            merged.append(value)\n\n    return merged"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\nprinting the specified string to stderr.", "response": "def output_error(msg):\n    \"\"\"\n    Prints the specified string to ``stderr``.\n\n    :param msg: the message to print\n    :type msg: str\n    \"\"\"\n\n    click.echo(click.style(msg, fg='red'), err=True)"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef mod_sys_path(paths):\n\n    old_path = sys.path\n    sys.path = paths + sys.path\n    try:\n        yield\n    finally:\n        sys.path = old_path", "response": "A context manager that adds the specified paths to sys.path during the execution of the block."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef compile_masks(masks):\n\n    if not masks:\n        masks = []\n    elif not isinstance(masks, (list, tuple)):\n        masks = [masks]\n\n    return [\n        re.compile(mask)\n        for mask in masks\n    ]", "response": "Compiles a single mask or a list of masks into a list of compiled regular expression objects; an empty or missing input yields an empty list."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef matches_masks(target, masks):\n\n    for mask in masks:\n        if mask.search(target):\n            return True\n    return False", "response": "Determines whether or not the target string matches any of the regular expressions specified."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef read_file(filepath):\n\n    with _FILE_CACHE_LOCK:\n        if filepath not in _FILE_CACHE:\n            _FILE_CACHE[filepath] = _read_file(filepath)\n    return _FILE_CACHE[filepath]", "response": "Reads the contents of the specified file, caching the result under a lock so that the same file is not read more than once."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nparsing the specified file and returns the AST of the file.", "response": "def parse_python_file(filepath):\n    \"\"\"\n    Retrieves the AST of the specified file.\n\n    This function performs simple caching so that the same file isn't read or\n    parsed more than once per process.\n\n    :param filepath: the file to parse\n    :type filepath: str\n    :returns: ast.AST\n    \"\"\"\n\n    with _AST_CACHE_LOCK:\n        if filepath not in _AST_CACHE:\n            source = read_file(filepath)\n            _AST_CACHE[filepath] = ast.parse(source, filename=filepath)\n    return _AST_CACHE[filepath]"}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\ncall when an individual tool completes execution.", "response": "def on_tool_finish(self, tool):\n        \"\"\"\n        Called when an individual tool completes execution.\n\n        :param tool: the name of the tool that completed\n        :type tool: str\n        \"\"\"\n\n        with self._lock:\n            if tool in self.current_tools:\n                self.current_tools.remove(tool)\n                self.completed_tools.append(tool)"}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nexecute an x3270 command and return a Command object.", "response": "def exec_command(self, cmdstr):\n        \"\"\"\n            Execute an x3270 command\n\n            `cmdstr` gets sent directly to the x3270 subprocess on its stdin.\n        \"\"\"\n        if self.is_terminated:\n            raise TerminatedError(\"this TerminalClient instance has been terminated\")\n\n        log.debug(\"sending command: %s\", cmdstr)\n        c = Command(self.app, cmdstr)\n        start = time.time()\n        c.execute()\n        elapsed = time.time() - start\n        log.debug(\"elapsed execution: {0}\".format(elapsed))\n        self.status = Status(c.status_line)\n\n        return c"}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nterminate the underlying x3270 subprocess.", "response": "def terminate(self):\n        \"\"\"\n            terminates the underlying x3270 subprocess. Once called, this\n            Emulator instance must no longer be used.\n        \"\"\"\n        if not self.is_terminated:\n            log.debug(\"terminal client terminated\")\n            try:\n                self.exec_command(b\"Quit\")\n            except BrokenPipeError:  # noqa\n                # x3270 was terminated, since we are just quitting anyway, ignore it.\n                pass\n            except socket.error as e:\n                if e.errno != errno.ECONNRESET:\n                    raise\n                # this can happen because wc3270 closes the socket before\n                # the read() can happen, causing a socket error\n\n            self.app.close()\n\n            self.is_terminated = True"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nreturn bool indicating connection state", "response": "def is_connected(self):\n        \"\"\"\n            Return bool indicating connection state\n        \"\"\"\n        # need to wrap in try/except b/c of wc3270's socket connection dynamics\n        try:\n            # this is basically a no-op, but it results in the current status\n            # getting updated\n            self.exec_command(b\"Query(ConnectionState)\")\n\n            # connected status is like 'C(192.168.1.1)', disconnected is 'N'\n            return self.status.connection_state.startswith(b\"C(\")\n        except NotConnectedException:\n            return False"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef connect(self, host):\n        if not self.app.connect(host):\n            command = \"Connect({0})\".format(host).encode(\"ascii\")\n            self.exec_command(command)\n        self.last_host = host", "response": "Connect to a host"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef wait_for_field(self):\n        self.exec_command(\"Wait({0}, InputField)\".format(self.timeout).encode(\"ascii\"))\n        if self.status.keyboard != b\"U\":\n            raise KeyboardStateError(\n                \"keyboard not unlocked, state was: {0}\".format(\n                    self.status.keyboard.decode(\"ascii\")\n                )\n            )", "response": "Wait until the keyboard is ready and the cursor is positioned on a modifiable field."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\nmove the cursor to the given co-ordinates.", "response": "def move_to(self, ypos, xpos):\n        \"\"\"\n            move the cursor to the given co-ordinates.  Co-ordinates are 1\n            based, as listed in the status area of the terminal.\n        \"\"\"\n        # the screen's co-ordinates are 1 based, but the command is 0 based\n        xpos -= 1\n        ypos -= 1\n        self.exec_command(\"MoveCursor({0}, {1})\".format(ypos, xpos).encode(\"ascii\"))"}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nsend a string to the current screen.", "response": "def send_string(self, tosend, ypos=None, xpos=None):\n        \"\"\"\n            Send a string to the screen at the current cursor location or at\n            screen co-ordinates `ypos`/`xpos` if they are both given.\n\n            Co-ordinates are 1 based, as listed in the status area of the\n            terminal.\n        \"\"\"\n        if xpos is not None and ypos is not None:\n            self.move_to(ypos, xpos)\n\n        # escape double quotes in the data to send\n        tosend = tosend.replace('\"', '\\\\\"')\n\n        self.exec_command('String(\"{0}\")'.format(tosend).encode(\"ascii\"))"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\ngetting a string of a given length at screen co-ordinates ypos/xpos.", "response": "def string_get(self, ypos, xpos, length):\n        \"\"\"\n            Get a string of `length` at screen co-ordinates `ypos`/`xpos`\n\n            Co-ordinates are 1 based, as listed in the status area of the\n            terminal.\n        \"\"\"\n        # the screen's co-ordinates are 1 based, but the command is 0 based\n        xpos -= 1\n        ypos -= 1\n        cmd = self.exec_command(\n            \"Ascii({0},{1},{2})\".format(ypos, xpos, length).encode(\"ascii\")\n        )\n        # this usage of ascii should only return a single line of data\n        assert len(cmd.data) == 1, cmd.data\n        return cmd.data[0].decode(\"ascii\")"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef string_found(self, ypos, xpos, string):\n        found = self.string_get(ypos, xpos, len(string))\n        log.debug('string_found() saw \"{0}\"'.format(found))\n        return found == string", "response": "Return True if `string` is found at the 1-based screen co-ordinates `ypos`/`xpos`, False otherwise."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef fill_field(self, ypos, xpos, tosend, length):\n        if length < len(tosend):\n            raise FieldTruncateError('length limit %d, but got \"%s\"' % (length, tosend))\n        if xpos is not None and ypos is not None:\n            self.move_to(ypos, xpos)\n        self.delete_field()\n        self.send_string(tosend)", "response": "Clear the field at the given co-ordinates (or at the current cursor position if none are given) and fill it with `tosend`, raising FieldTruncateError if `tosend` is longer than `length`."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef from_func(cls, func, variables, vartype, name=None):\n        variables = tuple(variables)\n\n        configurations = frozenset(config\n                                   for config in itertools.product(vartype.value, repeat=len(variables))\n                                   if func(*config))\n\n        return cls(func, configurations, variables, vartype, name)", "response": "Construct a constraint from a validation function."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef from_configurations(cls, configurations, variables, vartype, name=None):\n        def func(*args): return args in configurations\n\n        return cls(func, configurations, variables, vartype, name)", "response": "Construct a constraint from a list of valid configurations."}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nchecking that a solution satisfies the constraint.", "response": "def check(self, solution):\n        \"\"\"Check that a solution satisfies the constraint.\n\n        Args:\n            solution (container):\n                An assignment for the variables in the constraint.\n\n        Returns:\n            bool: True if the solution satisfies the constraint; otherwise False.\n\n        Examples:\n            This example creates a constraint that :math:`a \\\\ne b` on binary variables\n            and tests it for two candidate solutions, with additional unconstrained\n            variable c.\n\n            >>> import dwavebinarycsp\n            >>> const = dwavebinarycsp.Constraint.from_configurations([(0, 1), (1, 0)],\n            ...             ['a', 'b'], dwavebinarycsp.BINARY)\n            >>> solution = {'a': 1, 'b': 1, 'c': 0}\n            >>> const.check(solution)\n            False\n            >>> solution = {'a': 1, 'b': 0, 'c': 0}\n            >>> const.check(solution)\n            True\n\n        \"\"\"\n        return self.func(*(solution[v] for v in self.variables))"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef fix_variable(self, v, value):\n        variables = self.variables\n        try:\n            idx = variables.index(v)\n        except ValueError:\n            raise ValueError(\"given variable {} is not part of the constraint\".format(v))\n\n        if value not in self.vartype.value:\n            raise ValueError(\"expected value to be in {}, received {} instead\".format(self.vartype.value, value))\n\n        configurations = frozenset(config[:idx] + config[idx + 1:]  # exclude the fixed var\n                                   for config in self.configurations\n                                   if config[idx] == value)\n\n        if not configurations:\n            raise UnsatError(\"fixing {} to {} makes this constraint unsatisfiable\".format(v, value))\n\n        variables = variables[:idx] + variables[idx + 1:]\n\n        self.configurations = configurations\n        self.variables = variables\n\n        def func(*args): return args in configurations\n        self.func = func\n\n        self.name = '{} ({} fixed to {})'.format(self.name, v, value)", "response": "Fix the value of a variable and remove it from the constraint."}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nflip a variable in the constraint.", "response": "def flip_variable(self, v):\n        \"\"\"Flip a variable in the constraint.\n\n        Args:\n            v (variable):\n                Variable in the constraint to take the complementary value of its\n                construction value.\n\n        Examples:\n            This example creates a constraint that :math:`a = b` on binary variables\n            and flips variable a.\n\n            >>> import dwavebinarycsp\n            >>> const = dwavebinarycsp.Constraint.from_func(operator.eq,\n            ...             ['a', 'b'], dwavebinarycsp.BINARY)\n            >>> const.check({'a': 0, 'b': 0})\n            True\n            >>> const.flip_variable('a')\n            >>> const.check({'a': 1, 'b': 0})\n            True\n            >>> const.check({'a': 0, 'b': 0})\n            False\n\n        \"\"\"\n        try:\n            idx = self.variables.index(v)\n        except ValueError:\n            raise ValueError(\"variable {} is not a variable in constraint {}\".format(v, self.name))\n\n        if self.vartype is dimod.BINARY:\n\n            original_func = self.func\n\n            def func(*args):\n                new_args = list(args)\n                new_args[idx] = 1 - new_args[idx]  # negate v\n                return original_func(*new_args)\n\n            self.func = func\n\n            self.configurations = frozenset(config[:idx] + (1 - config[idx],) + config[idx + 1:]\n                                            for config in self.configurations)\n\n        else:  # SPIN\n\n            original_func = self.func\n\n            def func(*args):\n                new_args = list(args)\n                new_args[idx] = -new_args[idx]  # negate v\n                return original_func(*new_args)\n\n            self.func = func\n\n            self.configurations = frozenset(config[:idx] + (-config[idx],) + config[idx + 
1:]\n                                            for config in self.configurations)\n\n        self.name = '{} ({} flipped)'.format(self.name, v)"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\ncreate a copy of this object.", "response": "def copy(self):\n        \"\"\"Create a copy.\n\n        Examples:\n            This example copies constraint :math:`a \\\\ne b` and tests a solution\n            on the copied constraint.\n\n            >>> import dwavebinarycsp\n            >>> import operator\n            >>> const = dwavebinarycsp.Constraint.from_func(operator.ne,\n            ...             ['a', 'b'], 'BINARY')\n            >>> const2 = const.copy()\n            >>> const2 is const\n            False\n            >>> const2.check({'a': 1, 'b': 1})\n            False\n\n        \"\"\"\n        # each object is itself immutable (except the function)\n        return self.__class__(self.func, self.configurations, self.variables, self.vartype, name=self.name)"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\ncreating a new constraint that is the projection onto a subset of the variables.", "response": "def projection(self, variables):\n        \"\"\"Create a new constraint that is the projection onto a subset of the variables.\n\n        Args:\n            variables (iterable):\n                Subset of the constraint's variables.\n\n        Returns:\n            :obj:`.Constraint`: A new constraint over a subset of the variables.\n\n        Examples:\n\n            >>> import dwavebinarycsp\n            ...\n            >>> const = dwavebinarycsp.Constraint.from_configurations([(0, 0), (0, 1)],\n            ...                                                       ['a', 'b'],\n            ...                                                       dwavebinarycsp.BINARY)\n            >>> proj = const.projection(['a'])\n            >>> proj.variables\n            ['a']\n            >>> proj.configurations\n            {(0,)}\n\n        \"\"\"\n        # resolve iterables or mutability problems by casting the variables to a set\n        variables = set(variables)\n\n        if not variables.issubset(self.variables):\n            raise ValueError(\"Cannot project to variables not in the constraint.\")\n\n        idxs = [i for i, v in enumerate(self.variables) if v in variables]\n\n        configurations = frozenset(tuple(config[i] for i in idxs) for config in self.configurations)\n        variables = tuple(self.variables[i] for i in idxs)\n\n        return self.from_configurations(configurations, variables, self.vartype)"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef xor_fault(a, b, out, fault):\n    if (a != b) == out:\n        return fault == 0\n    else:\n        return fault == 1", "response": "Returns True if XOR(a, b) == out and fault == 0, or XOR(a, b) != out and fault == 1."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\nreturn True if AND(a, b) == out and fault == 0, or AND(a, b) != out and fault == 1.", "response": "def and_fault(a, b, out, fault):\n    \"\"\"Returns True if AND(a, b) == out and fault == 0 or AND(a, b) != out and fault == 1.\"\"\"\n    if (a and b) == out:\n        return fault == 0\n    else:\n        return fault == 1"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef or_fault(a, b, out, fault):\n    if (a or b) == out:\n        return fault == 0\n    else:\n        return fault == 1", "response": "Returns True if OR(a, b) == out and fault == 0, or OR(a, b) != out and fault == 1."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef assert_penaltymodel_factory_available():\n\n    from pkg_resources import iter_entry_points\n    from penaltymodel.core import FACTORY_ENTRYPOINT\n    from itertools import chain\n\n    supported = ('maxgap', 'mip')\n    factories = chain(*(iter_entry_points(FACTORY_ENTRYPOINT, name) for name in supported))\n\n    try:\n        next(factories)\n    except StopIteration:\n        raise AssertionError(\n            \"To use 'dwavebinarycsp', at least one penaltymodel factory must be installed. \"\n            \"Try {}.\".format(\n                \" or \".join(\"'pip install dwavebinarycsp[{}]'\".format(name) for name in supported)\n            ))", "response": "Assert that at least one supported penaltymodel factory ('maxgap' or 'mip') is installed."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef add_constraint(self, constraint, variables=tuple()):\n        if isinstance(constraint, Constraint):\n            if variables and (tuple(variables) != constraint.variables):\n                raise ValueError(\"mismatched variables and Constraint\")\n        elif isinstance(constraint, Callable):\n            constraint = Constraint.from_func(constraint, variables, self.vartype)\n        elif isinstance(constraint, Iterable):\n            constraint = Constraint.from_configurations(constraint, variables, self.vartype)\n        else:\n            raise TypeError(\"Unknown constraint type given\")\n\n        self.constraints.append(constraint)\n        for v in constraint.variables:\n            self.variables[v].append(constraint)", "response": "Add a constraint to the constraint satisfaction problem. The constraint may be a Constraint, a callable, or an iterable of valid configurations."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef stitch(csp, min_classical_gap=2.0, max_graph_size=8):\n\n    # ensure we have penaltymodel factory available\n    try:\n        dwavebinarycsp.assert_penaltymodel_factory_available()\n    except AssertionError as e:\n        raise RuntimeError(e)\n\n    def aux_factory():\n        for i in count():\n            yield 'aux{}'.format(i)\n\n    aux = aux_factory()\n\n    bqm = dimod.BinaryQuadraticModel.empty(csp.vartype)\n\n    # developer note: we could cache them and relabel, for now though let's do the simple thing\n    # penalty_models = {}\n    for const in csp.constraints:\n        configurations = const.configurations\n\n        if len(const.variables) > max_graph_size:\n            msg = (\"The given csp contains a constraint {const} with {num_var} variables. \"\n                   \"This cannot be mapped to a graph with {max_graph_size} nodes. \"\n                   \"Consider checking whether your constraint is irreducible.\"\n                   \"\").format(const=const, num_var=len(const.variables), max_graph_size=max_graph_size)\n            raise ImpossibleBQM(msg)\n\n        pmodel = None\n\n        if len(const) == 0:\n            # empty constraint\n            continue\n\n        if min_classical_gap <= 2.0:\n            if len(const) == 1 and max_graph_size >= 1:\n                bqm.update(_bqm_from_1sat(const))\n                continue\n            elif len(const) == 2 and max_graph_size >= 2:\n                bqm.update(_bqm_from_2sat(const))\n                continue\n\n        # developer note: we could cache them and relabel, for now though let's do the simple thing\n        # if configurations in penalty_models:\n        #     raise NotImplementedError\n\n        for G in iter_complete_graphs(const.variables, max_graph_size + 1, aux):\n\n            # construct a specification\n            spec = pm.Specification(\n                graph=G,\n        
        decision_variables=const.variables,\n                feasible_configurations=configurations,\n                min_classical_gap=min_classical_gap,\n                vartype=csp.vartype\n            )\n\n            # try to use the penaltymodel ecosystem\n            try:\n                pmodel = pm.get_penalty_model(spec)\n            except pm.ImpossiblePenaltyModel:\n                # hopefully adding more variables will make it possible\n                continue\n\n            if pmodel.classical_gap >= min_classical_gap:\n                break\n\n        # developer note: we could cache them and relabel, for now though let's do the simple thing\n        # penalty_models[configurations] = pmodel\n\n        else:\n            msg = (\"No penalty model can be built for constraint {}\".format(const))\n            raise ImpossibleBQM(msg)\n\n        bqm.update(pmodel.model)\n\n    return bqm", "response": "Builds a binary quadratic model with minimal energy levels at solutions to the specified constraint satisfaction problem."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef _bqm_from_1sat(constraint):\n    configurations = constraint.configurations\n    num_configurations = len(configurations)\n\n    bqm = dimod.BinaryQuadraticModel.empty(constraint.vartype)\n\n    if num_configurations == 1:\n        val, = next(iter(configurations))\n        v, = constraint.variables\n        bqm.add_variable(v, -1 if val > 0 else +1, vartype=dimod.SPIN)\n    else:\n        bqm.add_variables_from((v, 0.0) for v in constraint.variables)\n\n    return bqm", "response": "create a bqm for a constraint with only one variable\n    bqm will have exactly classical gap 2."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef _bqm_from_2sat(constraint):\n    configurations = constraint.configurations\n    variables = constraint.variables\n    vartype = constraint.vartype\n    u, v = constraint.variables\n\n    # if all configurations are present, then nothing is infeasible and the bqm is just all\n    # 0.0s\n    if len(configurations) == 4:\n        return dimod.BinaryQuadraticModel.empty(constraint.vartype)\n\n    # check if the constraint is irreducible, and if so, build the bqm for its two\n    # components\n    components = irreducible_components(constraint)\n    if len(components) > 1:\n        const0 = Constraint.from_configurations(((config[0],) for config in configurations),\n                                                (u,), vartype)\n        const1 = Constraint.from_configurations(((config[1],) for config in configurations),\n                                                (v,), vartype)\n        bqm = _bqm_from_1sat(const0)\n        bqm.update(_bqm_from_1sat(const1))\n        return bqm\n\n    assert len(configurations) > 1, \"single configurations should be irreducible\"\n\n    # if it is not irreducible, and there are infeasible configurations, then it is time to\n    # start building a bqm\n    bqm = dimod.BinaryQuadraticModel.empty(vartype)\n\n    # if the constraint is not irreducible and has two configurations, then it is either eq or ne\n    if all(operator.eq(*config) for config in configurations):\n        bqm.add_interaction(u, v, -1, vartype=dimod.SPIN)  # equality\n    elif all(operator.ne(*config) for config in configurations):\n        bqm.add_interaction(u, v, +1, vartype=dimod.SPIN)  # inequality\n    elif (1, 1) not in configurations:\n        bqm.add_interaction(u, v, 2, vartype=dimod.BINARY)  # penalize (1, 1)\n    elif (-1, +1) not in configurations and (0, 1) not in configurations:\n        bqm.add_interaction(u, v, -2, vartype=dimod.BINARY)\n   
     bqm.add_variable(v, 2, vartype=dimod.BINARY)\n    elif (+1, -1) not in configurations and (1, 0) not in configurations:\n        bqm.add_interaction(u, v, -2, vartype=dimod.BINARY)\n        bqm.add_variable(u, 2, vartype=dimod.BINARY)\n    else:\n        # (0, 0) not in configurations\n        bqm.add_interaction(u, v, 2, vartype=dimod.BINARY)\n        bqm.add_variable(u, -2, vartype=dimod.BINARY)\n        bqm.add_variable(v, -2, vartype=dimod.BINARY)\n\n    return bqm", "response": "create a bqm for a constraint with two variables."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef iter_complete_graphs(start, stop, factory=None):\n    _, nodes = start\n    nodes = list(nodes)  # we'll be appending\n\n    if factory is None:\n        factory = count()\n\n    while len(nodes) < stop:\n        # we need to construct a new graph each time, this is actually faster than copy and add\n        # the new edges in any case\n        G = nx.complete_graph(nodes)\n        yield G\n\n        v = next(factory)\n        while v in G:\n            v = next(factory)\n\n        nodes.append(v)", "response": "Iterate over complete graphs, starting with the nodes in `start` and adding one new node from `factory` each time, for as long as the graph has fewer than `stop` nodes."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef load_cnf(fp):\n\n    fp = iter(fp)  # handle lists/tuples/etc\n\n    csp = ConstraintSatisfactionProblem(dimod.BINARY)\n\n    # first look for the problem\n    num_clauses = num_variables = 0\n    problem_pattern = re.compile(_PROBLEM_REGEX)\n    for line in fp:\n        matches = problem_pattern.findall(line)\n        if matches:\n            if len(matches) > 1:\n                raise ValueError\n            nv, nc = matches[0]\n            num_variables, num_clauses = int(nv), int(nc)\n            break\n\n    # now parse the clauses, picking up where we left off looking for the header\n    clause_pattern = re.compile(_CLAUSE_REGEX)\n    for line in fp:\n        if clause_pattern.match(line) is not None:\n            clause = [int(v) for v in line.split(' ')[:-1]]  # line ends with a trailing 0\n\n            # -1 is the notation for NOT(1)\n            variables = [abs(v) for v in clause]\n\n            f = _cnf_or(clause)\n\n            csp.add_constraint(f, variables)\n\n    for v in range(1, num_variables+1):\n        csp.add_variable(v)\n    for v in csp.variables:\n        if v > num_variables:\n            msg = (\"given .cnf file's header defines variables [1, {}] and {} clauses \"\n                   \"but a constraint references variable {}\").format(num_variables, num_clauses, v)\n            raise ValueError(msg)\n\n    if len(csp) != num_clauses:\n        msg = (\"given .cnf file's header defines {} \"\n               \"clauses but the file contains {}\").format(num_clauses, len(csp))\n        raise ValueError(msg)\n\n    return csp", "response": "Load a constraint satisfaction problem from a .cnf file."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef and_gate(variables, vartype=dimod.BINARY, name='AND'):\n\n    variables = tuple(variables)\n\n    if vartype is dimod.BINARY:\n        configurations = frozenset([(0, 0, 0),\n                                    (0, 1, 0),\n                                    (1, 0, 0),\n                                    (1, 1, 1)])\n\n        def func(in1, in2, out): return (in1 and in2) == out\n\n    else:\n        # SPIN, vartype is checked by the decorator\n        configurations = frozenset([(-1, -1, -1),\n                                    (-1, +1, -1),\n                                    (+1, -1, -1),\n                                    (+1, +1, +1)])\n\n        def func(in1, in2, out): return ((in1 > 0) and (in2 > 0)) == (out > 0)\n\n    return Constraint(func, configurations, variables, vartype=vartype, name=name)", "response": "Returns a constraint that is satisfied when its variables are assigned values that match the valid states of an AND gate."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef xor_gate(variables, vartype=dimod.BINARY, name='XOR'):\n\n    variables = tuple(variables)\n    if vartype is dimod.BINARY:\n        configs = frozenset([(0, 0, 0),\n                             (0, 1, 1),\n                             (1, 0, 1),\n                             (1, 1, 0)])\n\n        def func(in1, in2, out): return (in1 != in2) == out\n\n    else:\n        # SPIN, vartype is checked by the decorator\n        configs = frozenset([(-1, -1, -1),\n                             (-1, +1, +1),\n                             (+1, -1, +1),\n                             (+1, +1, -1)])\n\n        def func(in1, in2, out): return ((in1 > 0) != (in2 > 0)) == (out > 0)\n\n    return Constraint(func, configs, variables, vartype=vartype, name=name)", "response": "Returns a constraint that is satisfied when its variables are assigned values that match the valid states of an XOR gate."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\nfull adder for a set of variables.", "response": "def fulladder_gate(variables, vartype=dimod.BINARY, name='FULL_ADDER'):\n    \"\"\"Full adder.\n\n    Args:\n        variables (list): Variable labels for the and gate as `[in1, in2, in3, sum, carry]`,\n            where `in1, in2, in3` are inputs to be added and `sum` and 'carry' the resultant\n            outputs.\n        vartype (Vartype, optional, default='BINARY'): Variable type. Accepted\n            input values:\n\n            * Vartype.SPIN, 'SPIN', {-1, 1}\n            * Vartype.BINARY, 'BINARY', {0, 1}\n        name (str, optional, default='FULL_ADDER'): Name for the constraint.\n\n    Returns:\n        Constraint(:obj:`.Constraint`): Constraint that is satisfied when its variables are\n        assigned values that match the valid states of a Boolean full adder.\n\n    Examples:\n        >>> import dwavebinarycsp\n        >>> import dwavebinarycsp.factories.constraint.gates as gates\n        >>> csp = dwavebinarycsp.ConstraintSatisfactionProblem(dwavebinarycsp.BINARY)\n        >>> csp.add_constraint(gates.fulladder_gate(['a', 'b', 'c_in', 'total', 'c_out'], name='FA1'))\n        >>> csp.check({'a': 1, 'b': 0, 'c_in': 1, 'total': 0, 'c_out': 1})\n        True\n\n    \"\"\"\n\n    variables = tuple(variables)\n\n    if vartype is dimod.BINARY:\n        configs = frozenset([(0, 0, 0, 0, 0),\n                             (0, 0, 1, 1, 0),\n                             (0, 1, 0, 1, 0),\n                             (0, 1, 1, 0, 1),\n                             (1, 0, 0, 1, 0),\n                             (1, 0, 1, 0, 1),\n                             (1, 1, 0, 0, 1),\n                             (1, 1, 1, 1, 1)])\n\n    else:\n        # SPIN, vartype is checked by the decorator\n        configs = frozenset([(-1, -1, -1, -1, -1),\n                             (-1, -1, +1, +1, -1),\n                             (-1, +1, -1, +1, 
-1),\n                             (-1, +1, +1, -1, +1),\n                             (+1, -1, -1, +1, -1),\n                             (+1, -1, +1, -1, +1),\n                             (+1, +1, -1, -1, +1),\n                             (+1, +1, +1, +1, +1)])\n\n    def func(in1, in2, in3, sum_, carry):\n        total = (in1 > 0) + (in2 > 0) + (in3 > 0)\n        if total == 0:\n            return (sum_ <= 0) and (carry <= 0)\n        elif total == 1:\n            return (sum_ > 0) and (carry <= 0)\n        elif total == 2:\n            return (sum_ <= 0) and (carry > 0)\n        elif total == 3:\n            return (sum_ > 0) and (carry > 0)\n        else:\n            raise ValueError(\"func received unexpected values\")\n\n    return Constraint(func, configs, variables, vartype=vartype, name=name)"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef sat2in4(pos, neg=tuple(), vartype=dimod.BINARY, name='2-in-4'):\n    pos = tuple(pos)\n    neg = tuple(neg)\n\n    variables = pos + neg\n\n    if len(variables) != 4:\n        raise ValueError(\"\")\n\n    if neg and (len(neg) < 4):\n        # because 2-in-4 sat is symmetric, all negated is the same as none negated\n\n        const = sat2in4(pos=variables, vartype=vartype)  # make one that has no negations\n        for v in neg:\n            const.flip_variable(v)\n            const.name = name  # overwrite the name directly\n\n        return const\n\n    # we can just construct them directly for speed\n    if vartype is dimod.BINARY:\n        configurations = frozenset([(0, 0, 1, 1),\n                                    (0, 1, 0, 1),\n                                    (1, 0, 0, 1),\n                                    (0, 1, 1, 0),\n                                    (1, 0, 1, 0),\n                                    (1, 1, 0, 0)])\n    else:\n        # SPIN, vartype is checked by the decorator\n        configurations = frozenset([(-1, -1, +1, +1),\n                                    (-1, +1, -1, +1),\n                                    (+1, -1, -1, +1),\n                                    (-1, +1, +1, -1),\n                                    (+1, -1, +1, -1),\n                                    (+1, +1, -1, -1)])\n\n    def func(a, b, c, d):\n        if a == b:\n            return (b != c) and (c == d)\n        elif a == c:\n            # a != b\n            return b == d\n        else:\n            # a != b, a != c => b == c\n            return a == d\n\n    return Constraint(func, configurations, variables, vartype=vartype, name=name)", "response": "Return a two-in-four (2-in-4) satisfiability constraint, satisfied when exactly two of its four variables are true, with the variables in `neg` negated."}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nreturning a random two - in - four constraint problem.", "response": "def random_2in4sat(num_variables, num_clauses, vartype=dimod.BINARY, satisfiable=True):\n    \"\"\"Random two-in-four (2-in-4) constraint satisfaction problem.\n\n    Args:\n        num_variables (integer): Number of variables (at least four).\n        num_clauses (integer): Number of constraints that together constitute the\n            constraint satisfaction problem.\n        vartype (Vartype, optional, default='BINARY'): Variable type. Accepted\n            input values:\n\n            * Vartype.SPIN, 'SPIN', {-1, 1}\n            * Vartype.BINARY, 'BINARY', {0, 1}\n        satisfiable (bool, optional, default=True): True if the CSP can be satisfied.\n\n    Returns:\n        CSP (:obj:`.ConstraintSatisfactionProblem`): CSP that is satisfied when its variables\n        are assigned values that satisfy a two-in-four satisfiability problem.\n\n    Examples:\n        This example creates a CSP with 6 variables and two random constraints and checks\n        whether a particular assignment of variables satisifies it.\n\n        >>> import dwavebinarycsp\n        >>> import dwavebinarycsp.factories as sat\n        >>> csp = sat.random_2in4sat(6, 2)\n        >>> csp.constraints    # doctest: +SKIP\n        [Constraint.from_configurations(frozenset({(1, 0, 1, 0), (1, 0, 0, 1), (1, 1, 1, 1), (0, 1, 1, 0), (0, 0, 0, 0),\n         (0, 1, 0, 1)}), (2, 4, 0, 1), Vartype.BINARY, name='2-in-4'),\n         Constraint.from_configurations(frozenset({(1, 0, 1, 1), (1, 1, 0, 1), (1, 1, 1, 0), (0, 0, 0, 1),\n         (0, 1, 0, 0), (0, 0, 1, 0)}), (1, 2, 4, 5), Vartype.BINARY, name='2-in-4')]\n        >>> csp.check({0: 1, 1: 0, 2: 1, 3: 1, 4: 0, 5: 0})       # doctest: +SKIP\n        True\n\n\n    \"\"\"\n\n    if num_variables < 4:\n        raise ValueError(\"a 2in4 problem needs at least 4 variables\")\n    if num_clauses > 16 * 
_nchoosek(num_variables, 4):  # 16 different negation patterns\n        raise ValueError(\"too many clauses\")\n\n    # also checks the vartype argument\n    csp = ConstraintSatisfactionProblem(vartype)\n\n    variables = list(range(num_variables))\n\n    constraints = set()\n\n    if satisfiable:\n        values = tuple(vartype.value)\n        planted_solution = {v: choice(values) for v in variables}\n\n        configurations = [(0, 0, 1, 1), (0, 1, 0, 1), (1, 0, 0, 1),\n                          (0, 1, 1, 0), (1, 0, 1, 0), (1, 1, 0, 0)]\n\n        while len(constraints) < num_clauses:\n            # sort the variables because constraints are hashed on configurations/variables\n            # because 2-in-4 sat is symmetric, we would not get a hash conflict for different\n            # variable orders\n            constraint_variables = sorted(sample(variables, 4))\n\n            # pick (uniformly) a configuration and determine which variables we need to negate to\n            # match the chosen configuration\n            config = choice(configurations)\n            pos = tuple(v for idx, v in enumerate(constraint_variables) if config[idx] == (planted_solution[v] > 0))\n            neg = tuple(v for idx, v in enumerate(constraint_variables) if config[idx] != (planted_solution[v] > 0))\n\n            const = sat2in4(pos=pos, neg=neg, vartype=vartype)\n\n            assert const.check(planted_solution)\n\n            constraints.add(const)\n    else:\n        while len(constraints) < num_clauses:\n            # sort the variables because constraints are hashed on configurations/variables\n            # because 2-in-4 sat is symmetric, we would not get a hash conflict for different\n            # variable orders\n            constraint_variables = sorted(sample(variables, 4))\n\n            # randomly determine negations\n            pos = tuple(v for v in constraint_variables if random() > .5)\n            neg = tuple(v for v in constraint_variables if v not in 
pos)\n\n            const = sat2in4(pos=pos, neg=neg, vartype=vartype)\n\n            constraints.add(const)\n\n    for const in constraints:\n        csp.add_constraint(const)\n\n    # in case any variables didn't make it in\n    for v in variables:\n        csp.add_variable(v)\n\n    return csp"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef random_xorsat(num_variables, num_clauses, vartype=dimod.BINARY, satisfiable=True):\n    if num_variables < 3:\n        raise ValueError(\"a xor problem needs at least 3 variables\")\n    if num_clauses > 8 * _nchoosek(num_variables, 3):  # 8 different negation patterns\n        raise ValueError(\"too many clauses\")\n\n    # also checks the vartype argument\n    csp = ConstraintSatisfactionProblem(vartype)\n\n    variables = list(range(num_variables))\n\n    constraints = set()\n\n    if satisfiable:\n        values = tuple(vartype.value)\n        planted_solution = {v: choice(values) for v in variables}\n\n        configurations = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]\n\n        while len(constraints) < num_clauses:\n            # because constraints are hashed on configurations/variables, and because the inputs\n            # to xor can be swapped without loss of generality, we can order them\n            x, y, z = sample(variables, 3)\n            if y > x:\n                x, y = y, x\n\n            # get the constraint\n            const = xor_gate([x, y, z], vartype=vartype)\n\n            # pick (uniformly) a configuration and determine which variables we need to negate to\n            # match the chosen configuration\n            config = choice(configurations)\n\n            for idx, v in enumerate(const.variables):\n                if config[idx] != (planted_solution[v] > 0):\n                    const.flip_variable(v)\n\n            assert const.check(planted_solution)\n\n            constraints.add(const)\n    else:\n        while len(constraints) < num_clauses:\n            # because constraints are hashed on configurations/variables, and because the inputs\n            # to xor can be swapped without loss of generality, we can order them\n            x, y, z = sample(variables, 3)\n            if y > x:\n                x, y = y, 
x\n\n            # get the constraint\n            const = xor_gate([x, y, z], vartype=vartype)\n\n            # randomly flip each variable in the constraint\n            for idx, v in enumerate(const.variables):\n                if random() > .5:\n                    const.flip_variable(v)\n\n            constraints.add(const)\n\n    for const in constraints:\n        csp.add_constraint(const)\n\n    # in case any variables didn't make it in\n    for v in variables:\n        csp.add_variable(v)\n\n    return csp", "response": "Random XOR constraint satisfaction problem."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef kwarg_decorator(func):\n    @wraps(func)\n    def decorator(arg=None, **kwargs):\n        if arg is None:\n            return lambda arg: decorator(arg, **kwargs)\n        return func(arg, **kwargs)\n    return decorator", "response": "A decorator that can be used to decorate functions that accept a single arg and some kwargs in to a\n   "}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef signature_matches(func, args=(), kwargs={}):\n    try:\n        sig = inspect.signature(func)\n        sig.bind(*args, **kwargs)\n    except TypeError:\n        return False\n    else:\n        return True", "response": "Check if a function is callable with some args and kwargs."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\nallowing a function to be used as either a decorator with args, or called as a normal function. @last_arg_decorator def register_a_thing(foo, func, bar=True): .. # Called as a decorator @register_a_thing(\"abc\", bar=False) def my_func(): ... # Called as a normal function call def my_other_func(): ... register_a_thing(\"def\", my_other_func, bar=True)", "response": "def last_arg_decorator(func):\n    \"\"\"\n    Allows a function to be used as either a decorator with args, or called as\n    a normal function.\n\n    @last_arg_decorator\n    def register_a_thing(foo, func, bar=True):\n        ..\n\n    # Called as a decorator\n    @register_a_thing(\"abc\", bar=False)\n    def my_func():\n        ...\n\n    # Called as a normal function call\n    def my_other_func():\n        ...\n\n    register_a_thing(\"def\", my_other_func, bar=True)\n    \"\"\"\n    @wraps(func)\n    def decorator(*args, **kwargs):\n        if signature_matches(func, args, kwargs):\n            return func(*args, **kwargs)\n        else:\n            return lambda last: func(*(args + (last,)), **kwargs)\n    return decorator"}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nadd a model chooser definition to the registry.", "response": "def register_chooser(self, chooser, **kwargs):\n        \"\"\"Adds a model chooser definition to the registry.\"\"\"\n        if not issubclass(chooser, Chooser):\n            return self.register_simple_chooser(chooser, **kwargs)\n\n        self.choosers[chooser.model] = chooser(**kwargs)\n        return chooser"}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\ngenerates a model chooser definition from a model and adds it to the registry.", "response": "def register_simple_chooser(self, model, **kwargs):\n        \"\"\"\n        Generates a model chooser definition from a model, and adds it to the\n        registry.\n        \"\"\"\n        name = '{}Chooser'.format(model._meta.object_name)\n        attrs = {'model': model}\n        attrs.update(kwargs)\n\n        chooser = type(name, (Chooser,), attrs)\n        self.register_chooser(chooser)\n\n        return model"}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\ntake an instance string in the form app.Model:pk and return a tuple of the model and instance, raising ValueError on invalid instance strings or missing instances.", "response": "def instance_from_str(instance_str):\n    \"\"\"\n    Given an instance string in the form \"app.Model:pk\", returns a tuple of\n    ``(model, instance)``. If the pk part is empty, ``instance`` will be\n    ``None``. Raises ``ValueError`` on invalid model strings or missing\n    instances.\n    \"\"\"\n    match = instance_str_re.match(instance_str)\n    if not match:\n        raise ValueError(\"Invalid instance string\")\n\n    model_string = match.group(1)\n    try:\n        model = apps.get_model(model_string)\n    except (LookupError, ValueError):\n        raise ValueError(\"Invalid instance string\")\n\n    pk = match.group(2)\n    if pk:\n        try:\n            return model, model._default_manager.get(pk=pk)\n        except model.DoesNotExist:\n            raise ValueError(\"Invalid instance string\")\n\n    return model, None"}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\ngets the audio - related fields for this client.", "response": "def formatter(self, api_client, data, newval):\n        \"\"\"Get audio-related fields\n\n        Try to find fields for the audio url for specified preferred quality\n        level, or next-lowest available quality url otherwise.\n        \"\"\"\n        url_map = data.get(\"audioUrlMap\")\n        audio_url = data.get(\"audioUrl\")\n\n        # Only an audio URL, not a quality map. This happens for most of the\n        # mobile client tokens and some of the others now. In this case\n        # substitute the empirically determined default values in the format\n        # used by the rest of the function so downstream consumers continue to\n        # work.\n        if audio_url and not url_map:\n            url_map = {\n                BaseAPIClient.HIGH_AUDIO_QUALITY: {\n                    \"audioUrl\": audio_url,\n                    \"bitrate\": 64,\n                    \"encoding\": \"aacplus\",\n                }\n            }\n        elif not url_map:  # No audio url available (e.g. ad tokens)\n            return None\n\n        valid_audio_formats = [BaseAPIClient.HIGH_AUDIO_QUALITY,\n                               BaseAPIClient.MED_AUDIO_QUALITY,\n                               BaseAPIClient.LOW_AUDIO_QUALITY]\n\n        # Only iterate over sublist, starting at preferred audio quality, or\n        # from the beginning of the list if nothing is found. 
Ensures that the\n        # bitrate used will always be the same or lower quality than was\n        # specified to prevent audio from skipping for slow connections.\n        preferred_quality = api_client.default_audio_quality\n        if preferred_quality in valid_audio_formats:\n            i = valid_audio_formats.index(preferred_quality)\n            valid_audio_formats = valid_audio_formats[i:]\n\n        for quality in valid_audio_formats:\n            audio_url = url_map.get(quality)\n\n            if audio_url:\n                return audio_url[self.field]\n\n        return audio_url[self.field] if audio_url else None"}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\nparses additional url fields and maps them to inputs, attempting to create a dictionary with keys being user input and response being the returned URL", "response": "def formatter(self, api_client, data, newval):\n        \"\"\"Parse additional url fields and map them to inputs\n\n        Attempt to create a dictionary with keys being user input, and\n        response being the returned URL\n        \"\"\"\n        if newval is None:\n            return None\n\n        user_param = data['_paramAdditionalUrls']\n        urls = {}\n        if isinstance(newval, str):\n            urls[user_param[0]] = newval\n        else:\n            for key, url in zip(user_param, newval):\n                urls[key] = url\n        return urls"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef from_json_list(cls, api_client, data):\n        return [cls.from_json(api_client, item) for item in data]", "response": "Convert a list of JSON values to a list of models"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef populate_fields(api_client, instance, data):\n        for key, value in instance.__class__._fields.items():\n            default = getattr(value, \"default\", None)\n            newval = data.get(value.field, default)\n\n            if isinstance(value, SyntheticField):\n                newval = value.formatter(api_client, data, newval)\n                setattr(instance, key, newval)\n                continue\n\n            model_class = getattr(value, \"model\", None)\n            if newval and model_class:\n                if isinstance(newval, list):\n                    newval = model_class.from_json_list(api_client, newval)\n                else:\n                    newval = model_class.from_json(api_client, newval)\n\n            if newval and value.formatter:\n                newval = value.formatter(api_client, newval)\n\n            setattr(instance, key, newval)", "response": "Populate all fields of a model with data."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nconvert one JSON value to a model object", "response": "def from_json(cls, api_client, data):\n        \"\"\"Convert one JSON value to a model object\n        \"\"\"\n        self = cls(api_client)\n        PandoraModel.populate_fields(api_client, self, data)\n        return self"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef _send_cmd(self, cmd):\n        self._process.stdin.write(\"{}\\n\".format(cmd).encode(\"utf-8\"))\n        self._process.stdin.flush()", "response": "Write command to remote process"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nensure that the player is started.", "response": "def _ensure_started(self):\n        \"\"\"Ensure player backing process is started\n        \"\"\"\n        if self._process and self._process.poll() is None:\n            return\n\n        # default to None so a missing attribute raises RuntimeError, not AttributeError\n        if not getattr(self, \"_cmd\", None):\n            raise RuntimeError(\"Player command is not configured\")\n\n        log.debug(\"Starting playback command: %r\", self._cmd)\n        self._process = SilentPopen(self._cmd)\n        self._post_start()"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nplay a new song from a Pandora model.", "response": "def play(self, song):\n        \"\"\"Play a new song from a Pandora model\n\n        Returns once the stream starts but does not shut down the remote audio\n        output backend process. Calls the input callback when the user has\n        input.\n        \"\"\"\n        self._callbacks.play(song)\n        self._load_track(song)\n        time.sleep(2)  # Give the backend time to load the track\n\n        while True:\n            try:\n                self._callbacks.pre_poll()\n                self._ensure_started()\n                self._loop_hook()\n\n                readers, _, _ = select.select(\n                    self._get_select_readers(), [], [], 1)\n\n                for handle in readers:\n                    if handle.fileno() == self._control_fd:\n                        self._callbacks.input(handle.readline().strip(), song)\n                    else:\n                        value = self._read_from_process(handle)\n                        if self._player_stopped(value):\n                            return\n            finally:\n                self._callbacks.post_poll()"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef play_station(self, station):\n        for song in iterate_forever(station.get_playlist):\n            try:\n                self.play(song)\n            except StopIteration:\n                self.stop()\n                return", "response": "Play songs from the station's playlist indefinitely, stopping the player and returning when playback raises StopIteration."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef _post_start(self):\n        flags = fcntl.fcntl(self._process.stdout, fcntl.F_GETFL)\n        fcntl.fcntl(self._process.stdout, fcntl.F_SETFL, flags | os.O_NONBLOCK)", "response": "Set the player process's stdout to non-blocking mode."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\nformat a station selection menu and make the user select a station", "response": "def station_selection_menu(self, error=None):\n        \"\"\"Format a station menu and make the user select a station\n        \"\"\"\n        self.screen.clear()\n\n        if error:\n            self.screen.print_error(\"{}\\n\".format(error))\n\n        for i, station in enumerate(self.stations):\n            i = \"{:>3}\".format(i)\n            print(\"{}: {}\".format(Colors.yellow(i), station.name))\n\n        return self.stations[self.screen.get_integer(\"Station: \")]"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\nplay callback for a single song", "response": "def play(self, song):\n        \"\"\"Play callback\n        \"\"\"\n        if song.is_ad:\n            print(\"{} \".format(Colors.cyan(\"Advertisement\")))\n        else:\n            print(\"{} by {}\".format(Colors.cyan(song.song_name),\n                                    Colors.yellow(song.artist_name)))"}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\ninput callback handles key presses", "response": "def input(self, input, song):\n        \"\"\"Input callback, handles key presses\n        \"\"\"\n        try:\n            cmd = getattr(self, self.CMD_MAP[input][1])\n        except (IndexError, KeyError):\n            return self.screen.print_error(\n                \"Invalid command {!r}!\".format(input))\n\n        cmd(song)"}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nfunction decorator implementing retrying logic. exceptions: A tuple of exception classes; default (Exception,) The decorator will call the function up to max_tries times if it raises an exception. By default it catches instances of the Exception class and subclasses. This will recover after all but the most fatal errors. You may specify a custom tuple of exception classes with the 'exceptions' argument; the function will only be retried if it raises one of the specified exceptions.", "response": "def retries(max_tries, exceptions=(Exception,)):\n    \"\"\"Function decorator implementing retrying logic.\n\n    exceptions: A tuple of exception classes; default (Exception,)\n\n    The decorator will call the function up to max_tries times if it raises\n    an exception.\n\n    By default it catches instances of the Exception class and subclasses.\n    This will recover after all but the most fatal errors. 
You may specify a\n    custom tuple of exception classes with the 'exceptions' argument; the\n    function will only be retried if it raises one of the specified\n    exceptions.\n    \"\"\"\n    def decorator(func):\n        def function(*args, **kwargs):\n\n            retries_left = max_tries\n            while retries_left > 0:\n                try:\n                    retries_left -= 1\n                    return func(*args, **kwargs)\n\n                except exceptions as exc:\n                    # Don't retry for PandoraExceptions - unlikely that result\n                    # will change for same set of input parameters.\n                    if isinstance(exc, PandoraException):\n                        raise\n                    if retries_left > 0:\n                        time.sleep(delay_exponential(\n                            0.5, 2, max_tries - retries_left))\n                    else:\n                        raise\n\n        return function\n\n    return decorator"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef delay_exponential(base, growth_factor, attempts):\n    if base == 'rand':\n        base = random.random()\n    elif base <= 0:\n        raise ValueError(\"The 'base' param must be greater than 0, \"\n                         \"got: {}\".format(base))\n    time_to_sleep = base * (growth_factor ** (attempts - 1))\n    return time_to_sleep", "response": "Calculate time to sleep based on exponential function."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef iterate_forever(func, *args, **kwargs):\n    output = func(*args, **kwargs)\n\n    while True:\n        try:\n            playlist_item = next(output)\n            playlist_item.prepare_playback()\n            yield playlist_item\n        except StopIteration:\n            output = func(*args, **kwargs)", "response": "Iterate over a finite iterator forever, recreating it when it is exhausted and preparing each item for playback before yielding it."}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\ngather user input and convert it to an integer", "response": "def get_integer(prompt):\n        \"\"\"Gather user input and convert it to an integer\n\n        Will keep trying till the user enters an integer or until they ^C the\n        program.\n        \"\"\"\n        while True:\n            try:\n                return int(input(prompt).strip())\n            except ValueError:\n                print(Colors.red(\"Invalid Input!\"))"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\ncollect results and return them as a list", "response": "def collect(self, dataset_readers_list):\n        \"\"\"collect results\n\n\n        Returns:\n            a list of results\n\n        \"\"\"\n\n        ret = [ ]\n        for i, collector in enumerate(self.components):\n            report = ProgressReport(name='collecting results', done=(i + 1), total=len(self.components))\n            alphatwirl.progressbar.report_progress(report)\n            ret.append(collector.collect([(dataset, tuple(r.readers[i] for r in readerComposites))\n                                          for dataset, readerComposites in dataset_readers_list]))\n        return ret"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nopening the drop box", "response": "def open(self):\n        \"\"\"open the drop box\n\n        You need to call this method before starting putting packages.\n\n        Returns\n        -------\n        None\n\n        \"\"\"\n\n        self.workingArea.open()\n        self.runid_pkgidx_map = { }\n        self.runid_to_return = deque()"}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nput a task This method places a task in the working area and has the dispatcher execute it. If you need to put multiple tasks, it can be much faster to use `put_multiple()` than to use this method multiple times depending on the dispatcher. Parameters ---------- package : callable A task Returns ------- int A package index assigned by the working area", "response": "def put(self, package):\n        \"\"\"put a task\n\n        This method places a task in the working area and has the\n        dispatcher execute it.\n\n        If you need to put multiple tasks, it can be much faster to\n        use `put_multiple()` than to use this method multiple times\n        depending on the dispatcher.\n\n        Parameters\n        ----------\n        package : callable\n            A task\n\n        Returns\n        -------\n        int\n            A package index assigned by the working area\n\n        \"\"\"\n\n        pkgidx = self.workingArea.put_package(package)\n\n        logger = logging.getLogger(__name__)\n        logger.info('submitting {}'.format(self.workingArea.package_relpath(pkgidx)))\n\n        runid = self.dispatcher.run(self.workingArea, pkgidx)\n        self.runid_pkgidx_map[runid] = pkgidx\n        return pkgidx"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef put_multiple(self, packages):\n\n        pkgidxs = [self.workingArea.put_package(p) for p in packages]\n\n        logger = logging.getLogger(__name__)\n        logger.info('submitting {}'.format(\n            ', '.join(['{}'.format(self.workingArea.package_relpath(i)) for i in pkgidxs])\n        ))\n\n        runids = self.dispatcher.run_multiple(self.workingArea, pkgidxs)\n        self.runid_pkgidx_map.update(zip(runids, pkgidxs))\n        return pkgidxs", "response": "put tasks in the working area and have\n        the dispatcher execute them"}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\nreturns pairs of package indices and results of all tasks This method waits until all tasks finish. Returns ------- list A list of pairs of package indices and results", "response": "def receive(self):\n        \"\"\"return pairs of package indices and results of all tasks\n\n        This method waits until all tasks finish.\n\n        Returns\n        -------\n        list\n            A list of pairs of package indices and results\n\n        \"\"\"\n\n        ret = [ ] # a list of (pkgid, result)\n        while True:\n\n            if self.runid_pkgidx_map:\n                self.runid_to_return.extend(self.dispatcher.poll())\n                ret.extend(self._collect_all_finished_pkgidx_result_pairs())\n\n            if not self.runid_pkgidx_map:\n                break\n            time.sleep(self.sleep)\n\n        ret = sorted(ret, key=itemgetter(0))\n\n        return ret"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef poll(self):\n\n        self.runid_to_return.extend(self.dispatcher.poll())\n        ret = self._collect_all_finished_pkgidx_result_pairs()\n        return ret", "response": "Returns a list of pairs of package indices and results of finished tasks"}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nreturning a pair of a package index and result of a task This method waits until a task finishes. It returns `None` if no task is running. Returns ------- tuple or None A pair of a package index and result. `None` if no task is running.", "response": "def receive_one(self):\n        \"\"\"return a pair of a package index and result of a task\n\n        This method waits until a task finishes. It returns `None` if\n        no task is running.\n\n        Returns\n        -------\n        tuple or None\n            A pair of a package index and result. `None` if no task\n            is running.\n\n        \"\"\"\n\n        if not self.runid_pkgidx_map:\n            return None\n\n        while True:\n\n            if not self.runid_to_return:\n                self.runid_to_return.extend(self.dispatcher.poll())\n\n            ret = self._collect_next_finished_pkgidx_result_pair()\n\n            if ret is not None:\n                break\n\n            if self.runid_pkgidx_map:\n                time.sleep(self.sleep)\n\n        return ret"}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nrunning the event loops in the background.", "response": "def run_multiple(self, eventLoops):\n        \"\"\"run the event loops in the background.\n\n        Args:\n            eventLoops (list): a list of event loops to run\n\n        \"\"\"\n\n        self.nruns += len(eventLoops)\n        return self.communicationChannel.put_multiple(eventLoops)"}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nreturn pairs of run ids and results of finished event loops.", "response": "def poll(self):\n        \"\"\"Return pairs of run ids and results of finished event loops.\n        \"\"\"\n        ret = self.communicationChannel.receive_finished()\n        self.nruns -= len(ret)\n        return ret"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\nreturning a pair of a run id and a result. This method waits until an event loop finishes. This method returns None if no loop is running.", "response": "def receive_one(self):\n        \"\"\"Return a pair of a run id and a result.\n\n        This method waits until an event loop finishes.\n        This method returns None if no loop is running.\n        \"\"\"\n        if self.nruns == 0:\n            return None\n        ret = self.communicationChannel.receive_one()\n        if ret is not None:\n            self.nruns -= 1\n        return ret"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef receive(self):\n        ret = self.communicationChannel.receive_all()\n        self.nruns -= len(ret)\n        if self.nruns > 0:\n            import logging\n            logger = logging.getLogger(__name__)\n            logger.warning(\n                'too few results received: {} results received, {} more expected'.format(\n                    len(ret), self.nruns))\n        elif self.nruns < 0:\n            import logging\n            logger = logging.getLogger(__name__)\n            logger.warning(\n                'too many results received: {} results received, {} too many'.format(\n                    len(ret), -self.nruns))\n        return ret", "response": "This method waits until all event loops finish\n            and returns results."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\nwaiting until all event loops end and returns the results.", "response": "def end(self):\n        \"\"\"wait until all event loops end and returns the results.\n\n        \"\"\"\n\n        results = self.communicationChannel.receive()\n\n        if self.nruns != len(results):\n            import logging\n            logger = logging.getLogger(__name__)\n            # logger.setLevel(logging.DEBUG)\n            logger.warning(\n                'too few results received: {} results received, {} expected'.format(\n                    len(results),\n                    self.nruns\n                ))\n\n        return results"}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\nconverts a dictionary to a list of tuples.", "response": "def key_vals_dict_to_tuple_list(key_vals_dict, fill=float('nan')):\n    \"\"\"Convert ``key_vals_dict`` to ``tuple_list``.\n\n    Args:\n        key_vals_dict (dict): The first parameter.\n        fill: a value to fill missing data\n\n    Returns:\n        A list of tuples\n\n    \"\"\"\n\n    tuple_list = [ ]\n\n    if not key_vals_dict: return tuple_list\n\n    vlen = max([len(vs) for vs in itertools.chain(*key_vals_dict.values())])\n\n    for k, vs in key_vals_dict.items():\n        try:\n            tuple_list.extend([k + tuple(v) + (fill, )*(vlen - len(v)) for v in vs])\n        except TypeError:\n            # assume k is not a tuple\n            tuple_list.extend([(k, ) + tuple(v) + (fill, )*(vlen - len(v)) for v in vs])\n\n\n    return tuple_list"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\nopening the working area .", "response": "def open(self):\n        \"\"\"Open the working area\n\n        Returns\n        -------\n        None\n        \"\"\"\n\n        self.path = self._prepare_dir(self.topdir)\n        self._copy_executable(area_path=self.path)\n        self._save_logging_levels(area_path=self.path)\n        self._put_python_modules(modules=self.python_modules, area_path=self.path)"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef put_package(self, package):\n\n        self.last_package_index += 1\n        package_index = self.last_package_index\n\n        package_fullpath = self.package_fullpath(package_index)\n        # e.g., '{path}/tpd_20161129_122841_HnpcmF/task_00009.p.gz'\n\n        with gzip.open(package_fullpath, 'wb') as f:\n            pickle.dump(package, f, protocol=pickle.HIGHEST_PROTOCOL)\n            f.close()\n\n        result_fullpath = self.result_fullpath(package_index)\n        # e.g., '{path}/tpd_20161129_122841_HnpcmF/results/task_00009/result.p.gz'\n\n        result_dir = os.path.dirname(result_fullpath)\n        # e.g., '{path}/tpd_20161129_122841_HnpcmF/results/task_00009'\n\n        alphatwirl.mkdir_p(result_dir)\n\n        return package_index", "response": "Pickle a package into the working area, create the directory for its result, and return the package index."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef collect_result(self, package_index):\n\n        result_fullpath = self.result_fullpath(package_index)\n        # e.g., '{path}/tpd_20161129_122841_HnpcmF/results/task_00009/result.p.gz'\n\n        try:\n            with gzip.open(result_fullpath, 'rb') as f:\n                result = pickle.load(f)\n        except Exception as e:\n            logger = logging.getLogger(__name__)\n            logger.warning(e)\n            return None\n\n        return result", "response": "Collect the result of a task in the specified package."}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nreturn the full path of the package, without checking whether the package exists.", "response": "def package_fullpath(self, package_index):\n        \"\"\"Returns the full path of the package\n\n        This method returns the full path to the package. This method\n        simply constructs the path based on the convention and doesn't\n        check if the package actually exists.\n\n        Parameters\n        ----------\n        package_index :\n            a package index\n\n        Returns\n        -------\n        str\n            the full path to the package\n\n        \"\"\"\n\n        ret = os.path.join(self.path, self.package_relpath(package_index))\n        # e.g., '{path}/tpd_20161129_122841_HnpcmF/task_00009.p.gz'\n\n        return ret"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\nreturn the relative path of the result.", "response": "def result_relpath(self, package_index):\n        \"\"\"Returns the relative path of the result\n\n        This method returns the path to the result relative to the\n        top dir of the working area. This method simply constructs the\n        path based on the convention and doesn't check if the result\n        actually exists.\n\n        Parameters\n        ----------\n        package_index :\n            a package index\n\n        Returns\n        -------\n        str\n            the relative path to the result\n\n        \"\"\"\n\n        dirname = 'task_{:05d}'.format(package_index)\n        # e.g., 'task_00009'\n\n        ret = os.path.join('results', dirname, 'result.p.gz')\n        # e.g., 'results/task_00009/result.p.gz'\n\n        return ret"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef result_fullpath(self, package_index):\n\n        ret = os.path.join(self.path, self.result_relpath(package_index))\n        # e.g., '{path}/tpd_20161129_122841_HnpcmF/results/task_00009/result.p.gz'\n\n        return ret", "response": "Returns the full path of the result."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nsubmitting multiple jobs to the working area.", "response": "def run_multiple(self, workingArea, package_indices):\n        \"\"\"Submit multiple jobs\n\n        Parameters\n        ----------\n        workingArea :\n            A workingArea\n        package_indices : list(int)\n            A list of package indices\n\n        Returns\n        -------\n        list(str)\n            The list of the run IDs of the jobs\n\n        \"\"\"\n\n        if not package_indices:\n            return [ ]\n\n        job_desc = self._compose_job_desc(workingArea, package_indices)\n\n        clusterprocids = submit_jobs(job_desc, cwd=workingArea.path)\n\n        # TODO: make configurable\n        clusterids = clusterprocids2clusterids(clusterprocids)\n        for clusterid in clusterids:\n            change_job_priority([clusterid], 10)\n\n        self.clusterprocids_outstanding.extend(clusterprocids)\n\n        return clusterprocids"}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\nreturns the run IDs of the finished jobs.", "response": "def poll(self):\n        \"\"\"Return the run IDs of the finished jobs\n\n        Returns\n        -------\n        list(str)\n            The list of the run IDs of the finished jobs\n\n        \"\"\"\n\n        clusterids = clusterprocids2clusterids(self.clusterprocids_outstanding)\n        clusterprocid_status_list = query_status_for(clusterids)\n        # e.g., [['1730126.0', 2], ['1730127.0', 2], ['1730129.1', 1], ['1730130.0', 1]]\n\n        if clusterprocid_status_list:\n            clusterprocids, statuses = zip(*clusterprocid_status_list)\n        else:\n            clusterprocids, statuses = (), ()\n\n        clusterprocids_finished = [i for i in self.clusterprocids_outstanding if i not in clusterprocids]\n        self.clusterprocids_finished.extend(clusterprocids_finished)\n        self.clusterprocids_outstanding[:] = clusterprocids\n\n        # logging\n        counter = collections.Counter(statuses)\n        messages = [ ]\n        if counter:\n            messages.append(', '.join(['{}: {}'.format(HTCONDOR_JOBSTATUS[k], counter[k]) for k in counter.keys()]))\n        if self.clusterprocids_finished:\n            messages.append('Finished {}'.format(len(self.clusterprocids_finished)))\n        logger = logging.getLogger(__name__)\n        logger.info(', '.join(messages))\n\n        return clusterprocids_finished"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef failed_runids(self, runids):\n\n        # remove failed clusterprocids from self.clusterprocids_finished\n        # so that len(self.clusterprocids_finished)) becomes the number\n        # of the successfully finished jobs\n        for i in runids:\n            try:\n                self.clusterprocids_finished.remove(i)\n            except ValueError:\n                pass", "response": "Remove the run IDs of failed jobs from the list of finished jobs, so that the list counts only successfully finished jobs. Returns None."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef atpbar(iterable, name=None):\n    try:\n        len_ = len(iterable)\n    except TypeError:\n        logger = logging.getLogger(__name__)\n        logger.warning('length is unknown: {!r}'.format(iterable))\n        logger.warning('atpbar is turned off')\n        return iterable\n\n    if name is None:\n        name = repr(iterable)\n\n    return Atpbar(iterable, name=name, len_=len_)", "response": "Wrap an iterable in a progress bar; if its length is unknown, log a warning and return the iterable unchanged."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef getArrays(self, tree, branchName):\n        itsArray = self._getArray(tree, branchName)\n        if itsArray is None: return None, None\n        itsCountArray = self._getCounterArray(tree, branchName)\n        return itsArray, itsCountArray", "response": "Return the array and its counter array for the given tree and branch, or (None, None) if the array cannot be obtained."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef begin(self):\n        if self.isopen: return\n        self.dropbox.open()\n        self.isopen = True", "response": "Open the dropbox if it is not already open."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef put(self, task, *args, **kwargs):\n        if not self.isopen:\n            logger = logging.getLogger(__name__)\n            logger.warning('the drop box is not open')\n            return\n        package = TaskPackage(task=task, args=args, kwargs=kwargs)\n        return self.dropbox.put(package)", "response": "put a task and its arguments into the dropbox"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\nputting a list of tasks and their arguments This method can be used to put multiple tasks at once. Calling this method once with multiple tasks can be much faster than calling `put()` multiple times. Parameters ---------- task_args_kwargs_list : list A list of lists with three items that can be parameters of `put()`, i.e., `task`, `args`, `kwargs`. Returns ------- list A list of task IDs.", "response": "def put_multiple(self, task_args_kwargs_list):\n        \"\"\"put a list of tasks and their arguments\n\n        This method can be used to put multiple tasks at once. Calling\n        this method once with multiple tasks can be much faster than\n        calling `put()` multiple times.\n\n        Parameters\n        ----------\n        task_args_kwargs_list : list\n\n            A list of lists with three items that can be parameters of\n            `put()`, i.e., `task`, `args`, `kwargs`.\n\n        Returns\n        -------\n        list\n            A list of task IDs.\n\n        \"\"\"\n        if not self.isopen:\n            logger = logging.getLogger(__name__)\n            logger.warning('the drop box is not open')\n            return\n\n        packages = [ ]\n        for t in task_args_kwargs_list:\n            try:\n                task = t['task']\n                args = t.get('args', ())\n                kwargs = t.get('kwargs', {})\n                package = TaskPackage(task=task, args=args, kwargs=kwargs)\n            except TypeError:\n                package = TaskPackage(task=t, args=(), kwargs={})\n            packages.append(package)\n        return self.dropbox.put_multiple(packages)"}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nreturning a list of pairs of IDs and results of finished tasks.", "response": "def receive_finished(self):\n        \"\"\"return a list of pairs of IDs and results of finished tasks.\n\n        This method doesn't wait for tasks to finish. It returns IDs\n        and results which have already finished.\n\n        Returns\n        -------\n        list\n            A list of pairs of IDs and results\n\n        \"\"\"\n        if not self.isopen:\n            logger = logging.getLogger(__name__)\n            logger.warning('the drop box is not open')\n            return\n        return self.dropbox.poll()"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef receive_one(self):\n        if not self.isopen:\n            logger = logging.getLogger(__name__)\n            logger.warning('the drop box is not open')\n            return\n        return self.dropbox.receive_one()", "response": "return a pair of an ID and a result of a task."}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nreturn a list of pairs of IDs and results of all tasks.", "response": "def receive_all(self):\n        \"\"\"return a list of pairs of IDs and results of all tasks.\n\n        This method waits for all tasks to finish.\n\n        Returns\n        -------\n        list\n            A list of pairs of IDs and results\n\n        \"\"\"\n        if not self.isopen:\n            logger = logging.getLogger(__name__)\n            logger.warning('the drop box is not open')\n            return\n        return self.dropbox.receive()"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\nreturning a list of all results of all tasks. This method waits for all tasks to finish and returns a list of results of all tasks.", "response": "def receive(self):\n        \"\"\"return a list of results of all tasks.\n\n        This method waits for all tasks to finish.\n\n        Returns\n        -------\n        list\n            A list of results of the tasks. The results are sorted in\n            the order in which the tasks are put.\n\n        \"\"\"\n        pkgidx_result_pairs = self.receive_all()\n        if pkgidx_result_pairs is None:\n            return\n        results = [r for _, r in pkgidx_result_pairs]\n        return results"}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\nends - close dropbox", "response": "def end(self):\n        \"\"\"end\n\n        \"\"\"\n        if not self.isopen: return\n        self.dropbox.close()\n        self.isopen = False"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef expand_path_cfg(path_cfg, alias_dict={ }, overriding_kargs={ }):\n\n    if isinstance(path_cfg, str):\n        return _expand_str(path_cfg, alias_dict, overriding_kargs)\n\n    if isinstance(path_cfg, dict):\n        return _expand_dict(path_cfg, alias_dict)\n\n    # assume tuple or list\n    return _expand_tuple(path_cfg, alias_dict, overriding_kargs)", "response": "expand a path config into a tuple or list of dicts"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef _expand_str(path_cfg, alias_dict, overriding_kargs):\n\n    if path_cfg in alias_dict:\n        # e.g., path_cfg = 'var_cut'\n        return _expand_str_alias(path_cfg, alias_dict, overriding_kargs)\n\n    # e.g., path_cfg = 'ev : {low} <= ev.var[0] < {high}'\n    return _expand_for_lambda_str(path_cfg, alias_dict, overriding_kargs)", "response": "expand a path config given as a string"}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nexpanding a path config given as a string", "response": "def _expand_str_alias(path_cfg, alias_dict, overriding_kargs):\n    \"\"\"expand a path config given as a string\n\n    Args:\n        path_cfg (str): an alias\n        alias_dict (dict):\n        overriding_kargs (dict):\n    \"\"\"\n\n    # e.g.,\n    # path_cfg = 'var_cut'\n\n    new_path_cfg = alias_dict[path_cfg]\n    # e.g., ('ev : {low} <= ev.var[0] < {high}', {'low': 10, 'high': 200})\n\n    new_overriding_kargs = dict(alias=path_cfg)\n    # e.g., {'alias': 'var_cut'}\n\n    new_overriding_kargs.update(overriding_kargs)\n    # e.g., {'alias': 'var_cut', 'name': 'var_cut25', 'low': 25}\n\n    return expand_path_cfg(new_path_cfg, alias_dict,new_overriding_kargs)"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\nexpand a path config given as a tuple", "response": "def _expand_tuple(path_cfg, alias_dict, overriding_kargs):\n    \"\"\"expand a path config given as a tuple\n\n    \"\"\"\n\n    # e.g.,\n    # path_cfg = ('ev : {low} <= ev.var[0] < {high}', {'low': 10, 'high': 200})\n    # overriding_kargs = {'alias': 'var_cut', 'name': 'var_cut25', 'low': 25}\n\n    new_path_cfg = path_cfg[0]\n    # e.g., 'ev : {low} <= ev.var[0] < {high}'\n\n    new_overriding_kargs = path_cfg[1].copy()\n    # e.g., {'low': 10, 'high': 200}\n\n    new_overriding_kargs.update(overriding_kargs)\n    # e.g., {'low': 25, 'high': 200, 'alias': 'var_cut', 'name': 'var_cut25'}\n\n    return expand_path_cfg(\n        new_path_cfg,\n        overriding_kargs=new_overriding_kargs,\n        alias_dict=alias_dict\n    )"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef poll(self):\n        finished_procs = [p for p in self.running_procs if p.poll() is not None]\n        self.running_procs = collections.deque([p for p in self.running_procs if p not in finished_procs])\n\n        for proc in finished_procs:\n            stdout, stderr = proc.communicate()\n            ## proc.communicate() returns (stdout, stderr) when\n            ## self.pipe = True. Otherwise they are (None, None)\n\n        finished_pids = [p.pid for p in finished_procs]\n        self.finished_pids.extend(finished_pids)\n\n        logger = logging.getLogger(__name__)\n        messages = 'Running: {}, Finished: {}'.format(len(self.running_procs), len(self.finished_pids))\n        logger.info(messages)\n\n        return finished_pids", "response": "Check whether the jobs are still running and return a list of pids of the jobs that have finished."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef wait(self):\n        finished_pids = [ ]\n        while self.running_procs:\n            finished_pids.extend(self.poll())\n        return finished_pids", "response": "Wait until all jobs finish and return a list of their pids."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef getVector(self, tree, branchName):\n\n        if (tree, branchName) in self.__class__.addressDict:\n            return self.__class__.addressDict[(tree, branchName)]\n\n        itsVector = self._getVector(tree, branchName)\n        self.__class__.addressDict[(tree, branchName)] = itsVector\n\n        return itsVector", "response": "Return the vector object for the branch, caching it in the class-level addressDict keyed by (tree, branchName)."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\nbuilding a new Parallel object.", "response": "def build_parallel(parallel_mode, quiet=True, processes=4,\n                   user_modules=None, dispatcher_options=None):\n    \"\"\"initializes `Parallel`\n\n    Parameters\n    ----------\n    parallel_mode : str\n        \"multiprocessing\" (default), \"htcondor\" or \"subprocess\"\n    quiet : bool, optional\n        if True, progress bars will not be shown in the \"multiprocessing\" mode.\n    processes : int, optional\n        The number of processes when ``parallel_mode`` is\n        \"multiprocessing\"\n    user_modules : list, optional\n        The names of modules to be sent to worker nodes when\n        parallel_mode is \"htcondor\"\n    dispatcher_options : dict, optional\n        Options to dispatcher\n\n    Returns\n    -------\n    parallel\n        an instance of the class `Parallel`\n\n    \"\"\"\n\n    if user_modules is None:\n        user_modules = [ ]\n\n    if dispatcher_options is None:\n        dispatcher_options = dict()\n\n    dispatchers = ('subprocess', 'htcondor')\n    parallel_modes = ('multiprocessing', ) + dispatchers\n    default_parallel_mode = 'multiprocessing'\n\n    if parallel_mode not in parallel_modes:\n        logger = logging.getLogger(__name__)\n        logger.warning('unknown parallel_mode \"{}\", use default \"{}\"'.format(\n            parallel_mode, default_parallel_mode\n        ))\n        parallel_mode = default_parallel_mode\n\n    if parallel_mode == 'multiprocessing':\n        if quiet:\n            atpbar.disable()\n        return _build_parallel_multiprocessing(processes=processes)\n\n    return _build_parallel_dropbox(\n        parallel_mode=parallel_mode,\n        user_modules=user_modules,\n        dispatcher_options=dispatcher_options\n    )"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef configure(self, component, all_dependencies):\n        ''' Ensure all config-time files have been generated. Return a\n            dictionary of generated items.\n        '''\n        r = {}\n\n        builddir = self.buildroot\n\n        # only dependencies which are actually valid can contribute to the\n        # config data (which includes the versions of all dependencies in its\n        # build info) if the dependencies aren't available we can't tell what\n        # version they are. Anything missing here should always be a test\n        # dependency that isn't going to be used, otherwise the yotta build\n        # command will fail before we get here\n        available_dependencies = OrderedDict((k, v) for k, v in all_dependencies.items() if v)\n\n        self.set_toplevel_definitions = ''\n        if self.build_info_include_file is None:\n            self.build_info_include_file, build_info_definitions = self.getBuildInfo(component.path, builddir)\n            self.set_toplevel_definitions += build_info_definitions\n\n        if self.config_include_file is None:\n            self.config_include_file, config_definitions, self.config_json_file = self._getConfigData(available_dependencies, component, builddir, self.build_info_include_file)\n            self.set_toplevel_definitions += config_definitions\n\n        self.configured = True\n        return {\n            'merged_config_include': self.config_include_file,\n               'merged_config_json': self.config_json_file,\n               'build_info_include': self.build_info_include_file\n        }", "response": "Ensure all config-time files have been generated and return a dictionary of the generated items."}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\ngenerate all the top - level CMakeLists for this component and its dependencies", "response": "def generateRecursive(self, component, all_components, builddir=None, modbuilddir=None, processed_components=None, application=None):\n        ''' generate top-level CMakeLists for this component and its\n            dependencies: the CMakeLists are all generated in self.buildroot,\n            which MUST be out-of-source\n\n            !!! NOTE: experimenting with a slightly different way of doing\n            things here, this function is a generator that yields any errors\n            produced, so the correct use is:\n\n            for error in gen.generateRecursive(...):\n                print(error)\n        '''\n        assert(self.configured)\n\n        if builddir is None:\n            builddir = self.buildroot\n        if modbuilddir is None:\n            modbuilddir = os.path.join(builddir, 'ym')\n\n        if processed_components is None:\n            processed_components = dict()\n        if not self.target:\n            yield 'Target \"%s\" is not a valid build target' % self.target\n\n        toplevel = not len(processed_components)\n\n        logger.debug('generate build files: %s (target=%s)' % (component, self.target))\n        # because of the way c-family language includes work we need to put the\n        # public header directories of all components that this component\n        # depends on (directly OR indirectly) into the search path, which means\n        # we need to first enumerate all the direct and indirect dependencies\n        recursive_deps = component.getDependenciesRecursive(\n            available_components = all_components,\n                          target = self.target,\n                  available_only = True,\n                            test = True\n        )\n        dependencies = component.getDependencies(\n                  all_components,\n 
                         target = self.target,\n                  available_only = True,\n                            test = True\n        )\n\n        for name, dep in dependencies.items():\n            # if dep is a test dependency, then it might not be required (if\n            # we're not building tests). We don't actually know at this point\n            if not dep:\n                if dep.isTestDependency():\n                    logger.debug('Test dependency \"%s\" of \"%s\" is not installed.' % (name, component))\n                else:\n                    yield 'Required dependency \"%s\" of \"%s\" is not installed.' % (name, component)\n        # ensure this component is assumed to have been installed before we\n        # check for its dependencies, in case it has a circular dependency on\n        # itself\n        processed_components[component.getName()] = component\n        new_dependencies = OrderedDict([(name,c) for name,c in dependencies.items() if c and not name in processed_components])\n        self.generate(builddir, modbuilddir, component, new_dependencies, dependencies, recursive_deps, application, toplevel)\n\n        logger.debug('recursive deps of %s:' % component)\n        for d in recursive_deps.values():\n            logger.debug('    %s' % d)\n\n        processed_components.update(new_dependencies)\n        for name, c in new_dependencies.items():\n            for error in self.generateRecursive(\n                c, all_components, os.path.join(modbuilddir, name), modbuilddir, processed_components, application=application\n            ):\n                yield error"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef _validateListedSubdirsExist(self, component):\n        ''' Return true if all the subdirectories which this component lists in\n            its module.json file exist (although their validity is otherwise\n            not checked).\n\n            If they don't, warning messages are printed.\n        '''\n        lib_subdirs = component.getLibs(explicit_only=True)\n        bin_subdirs = component.getBinaries()\n\n        ok = True\n        for d in lib_subdirs:\n            if not os.path.exists(os.path.join(component.path, d)):\n                logger.warning(\n                    \"lib directory \\\"%s\\\" doesn't exist but is listed in the module.json file of %s\", d, component\n                )\n                ok = False\n\n        for d in bin_subdirs:\n            if not os.path.exists(os.path.join(component.path, d)):\n                logger.warning(\n                    \"bin directory \\\"%s\\\" doesn't exist but is listed in the module.json file of %s\", d, component\n                )\n                ok = False\n\n        return ok", "response": "Return True if all the subdirectories which this component lists in its module.json file exist (their validity is not otherwise checked); log a warning for each one that is missing."}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nlisting all sub - directories of the component.", "response": "def _listSubDirectories(self, component, toplevel):\n        ''' return: {\n                manual: [list of subdirectories with manual CMakeLists],\n                  auto: [list of pairs: (subdirectories name to autogenerate, a list of source files in that dir)],\n                   bin: {dictionary of subdirectory name to binary name},\n                   lib: {dictionary of subdirectory name to binary name},\n                  test: [list of directories that build tests],\n              resource: [list of directories that contain resources]\n            }\n        '''\n        manual_subdirs = []\n        auto_subdirs = []\n        header_subdirs = []\n        lib_subdirs = component.getLibs()\n        bin_subdirs = component.getBinaries()\n        test_subdirs = []\n        resource_subdirs = []\n        # if the application or library is set to get the sources from top level (\".\"),\n        # they'll be acumulated into a single array (top_sources below).\n        top_sources = []\n        start_on_top = \".\" in [os.path.normpath(x) for x in list(lib_subdirs.keys()) + list(bin_subdirs.keys())]\n        for f in sorted(os.listdir(component.path)):\n            if f in Ignore_Subdirs or f.startswith('.') or f.startswith('_'):\n                continue\n            check_cmakefile_path = os.path.join(f, 'CMakeLists.txt')\n            if os.path.isfile(os.path.join(component.path, check_cmakefile_path)) and not \\\n                    component.ignores(check_cmakefile_path):\n                self.checkStandardSourceDir(f, component)\n                # if the subdirectory has a CMakeLists.txt in it (and it isn't\n                # ignored), then delegate to that:\n                manual_subdirs.append(f)\n                # tests only supported in the `test` directory for now\n                if f in ('test',):\n            
        test_subdirs.append(f)\n            else:\n                if os.path.isfile(os.path.join(component.path, f)):\n                    # top level source: check if it should be included\n                    if not component.ignores(f) and start_on_top:\n                        sf = self.createSourceFile(f, os.path.join(component.path, f), \".\")\n                        if sf is not None:\n                            top_sources.append(sf)\n                else:\n                    # otherwise, if the directory has source files, and is listed\n                    # as a source/test directory, generate a CMakeLists in the\n                    # corresponding temporary directory, and add that.\n                    sources = self.containsSourceFiles(os.path.join(component.path, f), component)\n                    if sources:\n                        if f in ('test',):\n                            auto_subdirs.append((f, sources))\n                            test_subdirs.append(f)\n                        elif start_on_top:\n                            # include the sources in this directory only if it's not\n                            # a potential test directory\n                            from yotta.lib import validate\n                            if not validate.isPotentialTestDir(f):\n                                top_sources.extend(sources)\n                                if f == component.getName():\n                                    header_subdirs.append((f, sources))\n                        elif os.path.normpath(f) in [fsutils.fullySplitPath(x)[0] for x in lib_subdirs] or \\\n                             os.path.normpath(f) in [fsutils.fullySplitPath(x)[0] for x in bin_subdirs]:\n                            for full_subpath in list(lib_subdirs.keys()) + list(bin_subdirs.keys()):\n                                if fsutils.fullySplitPath(full_subpath)[0] == os.path.normpath(f):\n                                    # this might be a sub-sub 
directory, in which\n                                    # case we need to re-calculate the sources just\n                                    # for the part we care about:\n                                    sources = self.containsSourceFiles(os.path.join(component.path, full_subpath), component)\n                                    auto_subdirs.append((full_subpath, sources))\n                        elif f == component.getName():\n                            header_subdirs.append((f, sources))\n                    elif toplevel and \\\n                         ((f in ('test',)) or \\\n                          (os.path.normpath(f) in lib_subdirs or start_on_top) or \\\n                          (os.path.normpath(f) in bin_subdirs or start_on_top) and not \\\n                          component.ignores(f)):\n                        # (if there aren't any source files then do nothing)\n                        # !!! FIXME: ensure this warning is covered in tests\n                        logger.warning(\"subdirectory \\\"%s\\\" of %s was ignored because it doesn't appear to contain any source files\", f, component)\n\n            # 'resource' directory also has special meaning, but there's no\n            # pattern for the files which might be in here:\n            if f in ('resource',):\n                resource_subdirs.append(os.path.join(component.path, f))\n\n            # issue a warning if a differently cased or common misspelling of a\n            # standard directory name was encountered:\n            check_directory_name_cases = list(lib_subdirs.keys()) + list(bin_subdirs.keys()) + ['test', 'resource']\n            if f.lower() in check_directory_name_cases + ['src'] and not \\\n               f in check_directory_name_cases and not \\\n               component.ignores(f):\n                self.checkStandardSourceDir(f, component)\n        if top_sources:\n            # all the top level sources are grouped into a single cmake-generated directory\n          
  # which is given the same name as the component\n            auto_subdirs.append((component.getName(), top_sources))\n\n        return {\n            \"manual\": manual_subdirs,\n              \"auto\": auto_subdirs,\n           \"headers\": header_subdirs,\n               \"bin\": {component.getName(): component.getName()} if (start_on_top and component.isApplication()) else bin_subdirs,\n               \"lib\": {component.getName(): component.getName()} if (start_on_top and not component.isApplication()) else lib_subdirs,\n              \"test\": test_subdirs,\n          \"resource\": resource_subdirs\n        }"}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\nreturns the path_to_config_header cmake_set_definitions", "response": "def _getConfigData(self, all_dependencies, component, builddir, build_info_header_path):\n        ''' returns (path_to_config_header, cmake_set_definitions) '''\n        # ordered_json, , read/write ordered json, internal\n        from yotta.lib import ordered_json\n        add_defs_header = ''\n        set_definitions = ''\n        # !!! backwards-compatible \"TARGET_LIKE\" definitions for the top-level\n        # of the config. NB: THESE WILL GO AWAY\n        definitions = []\n        definitions.append(('TARGET', sanitizePreprocessorSymbol(self.target.getName())))\n        definitions.append(('TARGET_LIKE_%s' % sanitizePreprocessorSymbol(self.target.getName()),None))\n\n        # make the path to the build-info header available both to CMake and\n        # in the preprocessor:\n        full_build_info_header_path = replaceBackslashes(os.path.abspath(build_info_header_path))\n        logger.debug('build info header include path: \"%s\"', full_build_info_header_path)\n        definitions.append(('YOTTA_BUILD_INFO_HEADER', '\"'+full_build_info_header_path+'\"'))\n\n        for target in self.target.getSimilarTo_Deprecated():\n            if '*' not in target:\n                definitions.append(('TARGET_LIKE_%s' % sanitizePreprocessorSymbol(target),None))\n\n        merged_config = self.target.getMergedConfig()\n        logger.debug('target configuration data: %s', merged_config)\n        definitions += self._definitionsForConfig(merged_config, ['YOTTA', 'CFG'])\n\n        add_defs_header += '// yotta config data (including backwards-compatible definitions)\\n'\n\n        for k, v in definitions:\n            if v is not None:\n                add_defs_header += '#define %s %s\\n' % (k, v)\n                set_definitions += 'set(%s %s)\\n' % (k, v)\n            else:\n                add_defs_header 
+= '#define %s\\n' % k\n                set_definitions += 'set(%s TRUE)\\n' % k\n\n        add_defs_header += '\\n// version definitions\\n'\n\n        for dep in list(all_dependencies.values()) + [component]:\n            add_defs_header += \"#define YOTTA_%s_VERSION_STRING \\\"%s\\\"\\n\" % (sanitizePreprocessorSymbol(dep.getName()), str(dep.getVersion()))\n            add_defs_header += \"#define YOTTA_%s_VERSION_MAJOR %d\\n\" % (sanitizePreprocessorSymbol(dep.getName()), dep.getVersion().major())\n            add_defs_header += \"#define YOTTA_%s_VERSION_MINOR %d\\n\" % (sanitizePreprocessorSymbol(dep.getName()), dep.getVersion().minor())\n            add_defs_header += \"#define YOTTA_%s_VERSION_PATCH %d\\n\" % (sanitizePreprocessorSymbol(dep.getName()), dep.getVersion().patch())\n\n        # add the component's definitions\n        defines = component.getDefines()\n        if defines:\n            add_defs_header += \"\\n// direct definitions (defines.json)\\n\"\n            for name, value in defines.items():\n                add_defs_header += \"#define %s %s\\n\" % (name, value)\n            add_defs_header += '\\n'\n\n        # use -include <definitions header> instead of lots of separate\n        # defines... 
this is compiler specific, but currently testing it\n        # out for gcc-compatible compilers only:\n        config_include_file = os.path.join(builddir, 'yotta_config.h')\n        config_json_file    = os.path.join(builddir, 'yotta_config.json')\n        set_definitions += 'set(YOTTA_CONFIG_MERGED_JSON_FILE \\\"%s\\\")\\n' % replaceBackslashes(os.path.abspath(config_json_file))\n\n        self._writeFile(\n            config_include_file,\n            '#ifndef __YOTTA_CONFIG_H__\\n'+\n            '#define __YOTTA_CONFIG_H__\\n'+\n            add_defs_header+\n            '#endif // ndef __YOTTA_CONFIG_H__\\n'\n        )\n        self._writeFile(\n            config_json_file,\n            ordered_json.dumps(merged_config)\n        )\n        return (config_include_file, set_definitions, config_json_file)"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef getBuildInfo(self, sourcedir, builddir):\n        ''' Write the build info header file, and return (path_to_written_header, set_cmake_definitions) '''\n        cmake_defs = ''\n        preproc_defs = '// yotta build info, #include YOTTA_BUILD_INFO_HEADER to access\\n'\n        # standard library modules\n        import datetime\n        # vcs, , represent version controlled directories, internal\n        from yotta.lib import vcs\n\n        now = datetime.datetime.utcnow()\n        vcs_instance = vcs.getVCS(sourcedir)\n        if self.build_uuid is None:\n            import uuid\n            self.build_uuid = uuid.uuid4()\n\n        definitions = [\n            ('YOTTA_BUILD_YEAR',   now.year,        'UTC year'),\n            ('YOTTA_BUILD_MONTH',  now.month,       'UTC month 1-12'),\n            ('YOTTA_BUILD_DAY',    now.day,         'UTC day 1-31'),\n            ('YOTTA_BUILD_HOUR',   now.hour,        'UTC hour 0-24'),\n            ('YOTTA_BUILD_MINUTE', now.minute,      'UTC minute 0-59'),\n            ('YOTTA_BUILD_SECOND', now.second,      'UTC second 0-61'),\n            ('YOTTA_BUILD_UUID',   self.build_uuid, 'unique random UUID for each build'),\n        ]\n        if vcs_instance is not None:\n            commit_id = None\n            repotype = vcs_instance.__class__.__name__\n            try:\n                commit_id = vcs_instance.getCommitId()\n            except vcs.VCSNotInstalled as e:\n                logger.warning('%s is not installed, VCS status build info is not available', repotype)\n                commit_id = None\n            except vcs.VCSError as e:\n                logger.debug('%s', e)\n                logger.warning(\n                    'error detecting build info: \"%s\", build info is not available to the build. 
Please check that this is a valid %s repository!',\n                    str(e).split('\\n')[0],\n                    repotype\n                )\n            if commit_id is not None:\n                clean_state = int(vcs_instance.isClean())\n                description = vcs_instance.getDescription()\n                definitions += [\n                    ('YOTTA_BUILD_VCS_ID',    commit_id,   'git or mercurial hash'),\n                    ('YOTTA_BUILD_VCS_CLEAN', clean_state, 'evaluates true if the version control system was clean, otherwise false'),\n                    ('YOTTA_BUILD_VCS_DESCRIPTION', description, 'git describe or mercurial equivalent')\n                ]\n\n        for d in definitions:\n            preproc_defs += '#define %s %s // %s\\n' % d\n            cmake_defs   += 'set(%s \"%s\") # %s\\n' % d\n\n        buildinfo_include_file = os.path.join(builddir, 'yotta_build_info.h')\n        self._writeFile(\n            buildinfo_include_file,\n            '#ifndef __YOTTA_BUILD_INFO_H__\\n'+\n            '#define __YOTTA_BUILD_INFO_H__\\n'+\n            preproc_defs+\n            '#endif // ndef __YOTTA_BUILD_INFO_H__\\n'\n        )\n        return (buildinfo_include_file, cmake_defs)", "response": "Write the build info header file and return the path to the build info header and set_cmake_definitions"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef generate(\n            self, builddir, modbuilddir, component, active_dependencies, immediate_dependencies, all_dependencies, application, toplevel\n        ):\n        ''' active_dependencies is the dictionary of components that need to be\n            built for this component, but will not already have been built for\n            another component.\n        '''\n\n        include_root_dirs = ''\n        if application is not None and component is not application:\n            include_root_dirs += 'include_directories(\"%s\")\\n' % replaceBackslashes(application.path)\n\n        include_sys_dirs = ''\n        include_other_dirs = ''\n        for name, c in itertools.chain(((component.getName(), component),), all_dependencies.items()):\n            if c is not component and c.isTestDependency():\n                continue\n            include_root_dirs += 'include_directories(\"%s\")\\n' % replaceBackslashes(c.path)\n            dep_sys_include_dirs = c.getExtraSysIncludes()\n            for d in dep_sys_include_dirs:\n                include_sys_dirs += 'include_directories(SYSTEM \"%s\")\\n' % replaceBackslashes(os.path.join(c.path, d))\n            dep_extra_include_dirs = c.getExtraIncludes()\n            for d in dep_extra_include_dirs:\n                include_other_dirs += 'include_directories(\"%s\")\\n' % replaceBackslashes(os.path.join(c.path, d))\n\n        add_depend_subdirs = ''\n        for name, c in active_dependencies.items():\n            depend_subdir = replaceBackslashes(os.path.join(modbuilddir, name))\n            relpath = replaceBackslashes(os.path.relpath(depend_subdir, self.buildroot))\n            add_depend_subdirs += \\\n                'add_subdirectory(\\n' \\\n                '   \"%s\"\\n' \\\n                '   \"${CMAKE_BINARY_DIR}/%s\"\\n' \\\n                ')\\n' \\\n                % (depend_subdir, relpath)\n\n        
delegate_to_existing = None\n        delegate_build_dir = None\n\n        module_is_empty = False\n        if os.path.isfile(os.path.join(component.path, 'CMakeLists.txt')) and not component.ignores('CMakeLists.txt'):\n            # adding custom CMake is a promise to generate a library: so the\n            # module is never empty in this case.\n            delegate_to_existing = component.path\n            add_own_subdirs = []\n            logger.debug(\"delegate to build dir: %s\", builddir)\n            delegate_build_dir = os.path.join(builddir, 'existing')\n        else:\n            # !!! TODO: if they don't exist, that should possibly be a fatal\n            # error, not just a warning\n            self._validateListedSubdirsExist(component)\n\n            subdirs = self._listSubDirectories(component, toplevel)\n            manual_subdirs      = subdirs['manual']\n            autogen_subdirs     = subdirs['auto']\n            binary_subdirs      = subdirs['bin']\n            lib_subdirs         = subdirs['lib']\n            test_subdirs        = subdirs['test']\n            resource_subdirs    = subdirs['resource']\n            header_subdirs      = subdirs['headers']\n            logger.debug(\"%s lib subdirs: %s, bin subdirs: %s\", component, lib_subdirs, binary_subdirs)\n\n            add_own_subdirs = []\n            for f in manual_subdirs:\n                if os.path.isfile(os.path.join(component.path, f, 'CMakeLists.txt')):\n                    # if this module is a test dependency, then don't recurse\n                    # to building its own tests.\n                    if f in test_subdirs and component.isTestDependency():\n                        continue\n                    add_own_subdirs.append(\n                        (os.path.join(component.path, f), f)\n                    )\n            # names of all directories at this level with stuff in: used to figure\n            # out what to link automatically\n            all_subdirs = 
manual_subdirs + [x[0] for x in autogen_subdirs]\n\n            # first check if this module is empty:\n            if component.isTestDependency():\n                if len(autogen_subdirs) + len(add_own_subdirs) == 0:\n                    module_is_empty = True\n            else:\n                if len(autogen_subdirs) + len(add_own_subdirs) <= len(test_subdirs):\n                    module_is_empty = True\n\n            # autogenerate CMakeLists for subdirectories as appropriate:\n            for f, source_files in autogen_subdirs:\n                if f in test_subdirs:\n                    # if this module is a test dependency, then don't recurse\n                    # to building its own tests.\n                    if component.isTestDependency():\n                        continue\n                    self.generateTestDirList(\n                        builddir, f, source_files, component, immediate_dependencies, toplevel=toplevel, module_is_empty=module_is_empty\n                    )\n                else:\n                    if f in binary_subdirs:\n                        is_executable = True\n                        object_name = binary_subdirs[f]\n                    else:\n                        # not a test subdir or binary subdir: it must be a lib\n                        # subdir\n                        assert(f in lib_subdirs)\n                        object_name = lib_subdirs[f]\n\n                    for header_dir, header_files in header_subdirs:\n                        source_files.extend(header_files)\n\n                    self.generateSubDirList(\n                                      builddir = builddir,\n                                       dirname = f,\n                                  source_files = source_files,\n                                     component = component,\n                                   all_subdirs = all_subdirs,\n                        immediate_dependencies = immediate_dependencies,\n                         
          object_name = object_name,\n                              resource_subdirs = resource_subdirs,\n                                 is_executable = (f in binary_subdirs)\n                    )\n                add_own_subdirs.append(\n                    (os.path.join(builddir, f), f)\n                )\n\n            # from now on, completely forget that this component had any tests\n            # if it is itself a test dependency:\n            if component.isTestDependency():\n                test_subdirs = []\n\n            # if we're not building anything other than tests, and this is a\n            # library module (not a binary) then we need to generate a dummy\n            # library so that this component can still be linked against\n            if module_is_empty:\n                if len(binary_subdirs):\n                    logger.warning('nothing to build!')\n                else:\n                    add_own_subdirs.append(self.createDummyLib(\n                        component, builddir, [x[0] for x in immediate_dependencies.items() if not x[1].isTestDependency()]\n                    ))\n\n        toolchain_file_path = os.path.join(builddir, 'toolchain.cmake')\n        if toplevel:\n            # generate the top-level toolchain file:\n            template = jinja_environment.get_template('toolchain.cmake')\n            file_contents = template.render({  #pylint: disable=no-member\n                                   # toolchain files are provided in hierarchy\n                                   # order, but the template needs them in reverse\n                                   # order (base-first):\n                \"toolchain_files\": self.target.getToolchainFiles()\n            })\n            self._writeFile(toolchain_file_path, file_contents)\n\n        # generate the top-level CMakeLists.txt\n        template = jinja_environment.get_template('base_CMakeLists.txt')\n\n        relpath = os.path.relpath(builddir, self.buildroot)\n\n        
file_contents = template.render({ #pylint: disable=no-member\n                            \"toplevel\": toplevel,\n                         \"target_name\": self.target.getName(),\n                     \"set_definitions\": self.set_toplevel_definitions,\n                      \"toolchain_file\": toolchain_file_path,\n                           \"component\": component,\n                             \"relpath\": relpath,\n                   \"include_root_dirs\": include_root_dirs,\n                    \"include_sys_dirs\": include_sys_dirs,\n                  \"include_other_dirs\": include_other_dirs,\n                  \"add_depend_subdirs\": add_depend_subdirs,\n                     \"add_own_subdirs\": add_own_subdirs,\n                 \"config_include_file\": self.config_include_file,\n                         \"delegate_to\": delegate_to_existing,\n                  \"delegate_build_dir\": delegate_build_dir,\n                 \"active_dependencies\": active_dependencies,\n                     \"module_is_empty\": module_is_empty,\n                      \"cmake_includes\": self.target.getAdditionalIncludes()\n        })\n        self._writeFile(os.path.join(builddir, 'CMakeLists.txt'), file_contents)", "response": "Generate the CMakeLists.txt for a component in its build directory (and the toolchain file when toplevel is set), wiring up include directories, dependency subdirectories, and the component's own subdirectories."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef _getTags(repo):\n    ''' return a dictionary of {tag: tarball_url}'''\n    logger.debug('get tags for %s', repo)\n    g = Github(settings.getProperty('github', 'authtoken'))\n    repo = g.get_repo(repo)\n    tags = repo.get_tags()\n    logger.debug('tags for %s: %s', repo, [t.name for t in tags])\n    return {t.name: _ensureDomainPrefixed(t.tarball_url) for t in tags}", "response": "Return a dictionary mapping each tag name of the given GitHub repository to its tarball URL."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef _getTipArchiveURL(repo):\n    ''' return a string containing a tarball url '''\n    g = Github(settings.getProperty('github', 'authtoken'))\n    repo = g.get_repo(repo)\n    return repo.get_archive_link('tarball')", "response": "return a string containing a tarball url"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef _getCommitArchiveURL(repo, commit):\n    ''' return a string containing a tarball url '''\n    g = Github(settings.getProperty('github', 'authtoken'))\n    repo = g.get_repo(repo)\n    return repo.get_archive_link('tarball', commit)", "response": "return a string containing a tarball url"}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\nunpacks the specified tarball into the specified directory", "response": "def _getTarball(url, into_directory, cache_key, origin_info=None):\n    '''unpack the specified tarball url into the specified directory'''\n\n    try:\n        access_common.unpackFromCache(cache_key, into_directory)\n    except KeyError as e:\n        tok = settings.getProperty('github', 'authtoken')\n        headers = {}\n        if tok is not None:\n            headers['Authorization'] = 'token ' + str(tok)\n\n        logger.debug('GET %s', url)\n        response = requests.get(url, allow_redirects=True, stream=True, headers=headers)\n        response.raise_for_status()\n\n        logger.debug('getting file: %s', url)\n        logger.debug('headers: %s', response.headers)\n\n        # github doesn't expose hashes of the archives being downloaded, as\n        # far as I can tell :(\n        access_common.unpackTarballStream(\n                    stream = response,\n            into_directory = into_directory,\n                      hash = {},\n                 cache_key = cache_key,\n               origin_info = origin_info\n        )"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nreturning a GitHub component for any GitHub URL (including git+ssh://, git+http://, etc.), or None if this is not a GitHub URL.", "response": "def createFromSource(cls, vs, name=None):\n        ''' returns a github component for any github url (including\n            git+ssh:// git+http:// etc. or None if this is not a Github URL.\n            For all of these we use the github api to grab a tarball, because\n            that's faster.\n\n            Normally version will be empty, unless the original url was of the\n            form: 'owner/repo @version' or 'url://...#version', which can be used\n            to grab a particular tagged version.\n\n            (Note that for github components we ignore the component name - it\n             doesn't have to match the github module name)\n        '''\n        return GithubComponent(vs.location, vs.spec, vs.semantic_spec, name)"}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nreturn a list of Version objects each with a tarball URL set", "response": "def availableVersions(self):\n        ''' return a list of Version objects, each with a tarball URL set '''\n        r = []\n        for t in self._getTags():\n            logger.debug(\"available version tag: %s\", t)\n            # ignore empty tags:\n            if not len(t[0].strip()):\n                continue\n            try:\n                r.append(GithubComponentVersion(t[0], t[0], url=t[1], name=self.name, cache_key=None))\n            except ValueError:\n                logger.debug('invalid version tag: %s', t)\n\n        return r"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef availableTags(self):\n        ''' return a list of GithubComponentVersion objects for all tags\n        '''\n        return [\n            GithubComponentVersion(\n                '', t[0], t[1], self.name, cache_key=_createCacheKey('tag', t[0], t[1], self.name)\n            ) for t in self._getTags()\n        ]", "response": "Return a list of GithubComponentVersion objects, one for each tag."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef availableBranches(self):\n        ''' return a list of GithubComponentVersion objects for the tip of each branch\n        '''\n        return [\n            GithubComponentVersion(\n                '', b[0], b[1], self.name, cache_key=None\n            ) for b in _getBranchHeads(self.repo).items()\n        ]", "response": "Return a list of GithubComponentVersion objects, one for the tip of each branch."}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nreturn a GithubComponentVersion object for a specific commit if valid", "response": "def commitVersion(self):\n        ''' return a GithubComponentVersion object for a specific commit if valid\n        '''\n        import re\n\n        commit_match = re.match('^[a-f0-9]{7,40}$', self.tagOrBranchSpec(), re.I)\n        if commit_match:\n            return GithubComponentVersion(\n                '', '', _getCommitArchiveURL(self.repo, self.tagOrBranchSpec()), self.name, cache_key=None\n            )\n\n        return None"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef createFromSource(cls, vs, name=None):\n        ''' returns a hg component for any hg:// url, or None if this is not\n            a hg component.\n\n            Normally version will be empty, unless the original url was of the\n            form 'hg+ssh://...#version', which can be used to grab a particular\n            tagged version.\n        '''\n        # strip hg of the url scheme:\n        if vs.location.startswith('hg+'):\n            location = vs.location[3:]\n        else:\n            location = vs.location\n        return HGComponent(location, vs.spec)", "response": "Return an hg component for any hg:// URL, or None if this is not an hg component. A leading 'hg+' is stripped from the URL scheme before the component is created."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef dropRootPrivs(fn):\n    ''' decorator to drop su/sudo privileges before running a function on\n        unix/linux.\n        The *real* uid is modified, so privileges are permanently dropped for\n        the process. (i.e. make sure you don't need to do\n\n        If there is a SUDO_UID environment variable, then we drop to that,\n        otherwise we drop to nobody.\n    '''\n\n    def wrapped_fn(*args, **kwargs):\n        q = multiprocessing.Queue()\n        p = multiprocessing.Process(target=_dropPrivsReturnViaQueue, args=(q, fn, args, kwargs))\n        p.start()\n\n        r = None\n        e = None\n        while True:\n            msg = q.get()\n            if msg[0] == 'return':\n                r = msg[1]\n            if msg[0] == 'exception':\n                e = msg[1](msg[2])\n            if msg[0] == 'finish':\n                # if the command raised an exception, propagate this:\n                if e is not None:\n                    raise e #pylint: disable=raising-bad-type\n                return r\n\n    return wrapped_fn", "response": "Decorator to drop su/sudo privileges before running a function on Unix/Linux. The function runs in a separate process, and its return value or exception is passed back to the caller through a queue."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef installAndBuild(args, following_args):\n    ''' Perform the build command, but provide detailed error information.\n        Returns {status:0, build_status:0, generate_status:0, install_status:0} on success.\n        If status: is nonzero there was some sort of error. Other properties\n        are optional, and may not be set if that step was not attempted.\n    '''\n    build_status = generate_status = install_status = 0\n\n    if not hasattr(args, 'build_targets'):\n        vars(args)['build_targets'] = []\n\n    if 'test' in args.build_targets:\n        logging.error('Cannot build \"test\". Use \"yotta test\" to run tests.')\n        return {'status':1}\n\n    cwd = os.getcwd()\n    c = validate.currentDirectoryModule()\n    if not c:\n        return {'status':1}\n\n    try:\n        target, errors = c.satisfyTarget(args.target, additional_config=args.config)\n    except access_common.AccessException as e:\n        logging.error(e)\n        return {'status':1}\n    if errors:\n        for error in errors:\n            logging.error(error)\n        return {'status':1}\n\n    # run the install command before building, we need to add some options the\n    # install command expects to be present to do this:\n    vars(args)['component'] = None\n    vars(args)['act_globally'] = False\n    if not hasattr(args, 'install_test_deps'):\n        if 'all_tests' in args.build_targets:\n            vars(args)['install_test_deps'] = 'all'\n        elif not len(args.build_targets):\n            vars(args)['install_test_deps'] = 'own'\n        else:\n            # If the named build targets include tests from other modules, we\n            # need to install the deps for those modules. 
To do this we need to\n            # be able to tell which module a library belongs to, which is not\n            # straightforward (especially if there is custom cmake involved).\n            # That's why this is 'all', and not 'none'.\n            vars(args)['install_test_deps'] = 'all'\n\n    # install may exit non-zero for non-fatal errors (such as incompatible\n    # version specs), which it will display\n    install_status = install.execCommand(args, [])\n\n    builddir = os.path.join(cwd, 'build', target.getName())\n\n    all_deps = c.getDependenciesRecursive(\n                      target = target,\n        available_components = [(c.getName(), c)],\n                        test = True\n    )\n\n    # if a dependency is missing the build will almost certainly fail, so don't try\n    missing = 0\n    for d in all_deps.values():\n        if not d and not (d.isTestDependency() and args.install_test_deps != 'all'):\n            logging.error('%s not available' % os.path.split(d.path)[1])\n            missing += 1\n    if missing:\n        logging.error('Missing dependencies prevent build. 
Use `yotta ls` to list them.')\n        return {'status': 1, 'install_status':install_status, 'missing_status':missing}\n\n    generator = cmakegen.CMakeGen(builddir, target)\n    # only pass available dependencies to\n    config = generator.configure(c, all_deps)\n    logging.debug(\"config done, merged config: %s\", config['merged_config_json'])\n\n    script_environment = {\n        'YOTTA_MERGED_CONFIG_FILE': config['merged_config_json']\n    }\n    # run pre-generate scripts for all components:\n    runScriptWithModules(c, all_deps.values(), 'preGenerate', script_environment)\n\n    app = c if len(c.getBinaries()) else None\n    for error in generator.generateRecursive(c, all_deps, builddir, application=app):\n        logging.error(error)\n        generate_status = 1\n\n    logging.debug(\"generate done.\")\n    # run pre-build scripts for all components:\n    runScriptWithModules(c, all_deps.values(), 'preBuild', script_environment)\n\n    if (not hasattr(args, 'generate_only')) or (not args.generate_only):\n        error = target.build(\n                builddir, c, args, release_build=args.release_build,\n                build_args=following_args, targets=args.build_targets,\n                release_no_debug_info_build=args.release_no_debug_info_build\n        )\n\n        if error:\n            logging.error(error)\n            build_status = 1\n        else:\n            # post-build scripts only get run if we were successful:\n            runScriptWithModules(c, all_deps.values(), 'postBuild', script_environment)\n\n        if install_status:\n            logging.warning(\n                \"There were also errors installing and resolving dependencies, \"+\n                \"which may have caused the build failure: see above, or run \"+\n                \"`yotta install` for details.\"\n            )\n\n    return {\n                'status': build_status or generate_status or install_status,\n        'missing_status': missing,\n          'build_status': 
build_status,\n       'generate_status': generate_status,\n        'install_status': install_status\n    }", "response": "Perform the install and build command."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef _friendlyAuthError(fn):\n    ''' Decorator to print a friendly you-are-not-authorised message. Use\n        **outside** the _handleAuth decorator to only print the message after\n        the user has been given a chance to login. '''\n    @functools.wraps(fn)\n    def wrapped(*args, **kwargs):\n        try:\n            return fn(*args, **kwargs)\n        except requests.exceptions.HTTPError as e:\n            if e.response.status_code == requests.codes.unauthorized: #pylint: disable=no-member\n                logger.error('insufficient permission')\n            elif e.response.status_code == requests.codes.bad and 'jwt has expired' in e.response.text.lower(): #pylint: disable=no-member\n                logger.error('server returned status %s: %s', e.response.status_code, e.response.text)\n                logger.error('Check that your system clock is set accurately!')\n            else:\n                logger.error('server returned status %s: %s', e.response.status_code, e.response.text)\n            raise\n    return wrapped", "response": "Decorator to print a friendly you-are-not-authorised message. Use it outside the _handleAuth decorator so that the message is only printed after the user has been given a chance to log in."}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\nreturns a decorator to swallow a requests exception for modules that are not accessible without logging in and turn it into an Unavailable exception.", "response": "def _raiseUnavailableFor401(message):\n    ''' Returns a decorator to swallow a requests exception for modules that\n        are not accessible without logging in, and turn it into an Unavailable\n        exception.\n    '''\n    def __raiseUnavailableFor401(fn):\n        def wrapped(*args, **kwargs):\n            try:\n                return fn(*args, **kwargs)\n            except requests.exceptions.HTTPError as e:\n                if e.response.status_code == requests.codes.unauthorized:\n                    raise access_common.Unavailable(message)\n                else:\n                    raise\n        return wrapped\n    return __raiseUnavailableFor401"}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\npublishes a tarblob to the registry.", "response": "def publish(namespace, name, version, description_file, tar_file, readme_file,\n            readme_file_ext, registry=None):\n    ''' Publish a tarblob to the registry, if the request fails, an exception\n        is raised, which either triggers re-authentication, or is turned into a\n        return value by the decorators. (If successful, the decorated function\n        returns None)\n    '''\n    registry = registry or Registry_Base_URL\n\n    url = '%s/%s/%s/versions/%s' % (\n        registry,\n        namespace,\n        name,\n        version\n    )\n\n    if readme_file_ext == '.md':\n        readme_section_name = 'readme.md'\n    elif readme_file_ext == '':\n        readme_section_name = 'readme'\n    else:\n        raise ValueError('unsupported readme type: \"%s\"' % readme_file_ext)\n\n    # description file is in place as text (so read it), tar file is a file\n    body = OrderedDict([('metadata', (None, description_file.read(),'application/json')),\n                        ('tarball',('tarball', tar_file)),\n                        (readme_section_name, (readme_section_name, readme_file))])\n\n    headers = _headersForRegistry(registry)\n\n    response = requests.put(url, headers=headers, files=body)\n    response.raise_for_status()\n\n    return None"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef unpublish(namespace, name, version, registry=None):\n    ''' Try to unpublish a recently published version. Return any errors that\n        occur.\n    '''\n    registry = registry or Registry_Base_URL\n\n    url = '%s/%s/%s/versions/%s' % (\n        registry,\n        namespace,\n        name,\n        version\n    )\n\n    headers = _headersForRegistry(registry)\n    response = requests.delete(url, headers=headers)\n    response.raise_for_status()\n\n    return None", "response": "Try to unpublish a recently published version by sending a DELETE request to the registry. The function body raises an HTTPError if the request fails and returns None on success."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef listOwners(namespace, name, registry=None):\n    ''' List the owners of a module or target (owners are the people with\n        permission to publish versions and add/remove the owners).\n    '''\n    registry = registry or Registry_Base_URL\n\n    url = '%s/%s/%s/owners' % (\n        registry,\n        namespace,\n        name\n    )\n\n    request_headers = _headersForRegistry(registry)\n\n    response = requests.get(url, headers=request_headers)\n\n    if response.status_code == 404:\n        logger.error('no such %s, \"%s\"' % (namespace[:-1], name))\n        return None\n\n    # raise exceptions for other errors - the auth decorators handle these and\n    # re-try if appropriate\n    response.raise_for_status()\n\n    return ordered_json.loads(response.text)", "response": "List the owners of a module or target."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef removeOwner(namespace, name, owner, registry=None):\n    ''' Remove an owner for a module or target (owners are the people with\n        permission to publish versions and add/remove the owners).\n    '''\n    registry = registry or Registry_Base_URL\n\n    url = '%s/%s/%s/owners/%s' % (\n        registry,\n        namespace,\n        name,\n        owner\n    )\n\n    request_headers = _headersForRegistry(registry)\n\n    response = requests.delete(url, headers=request_headers)\n\n    if response.status_code == 404:\n        logger.error('no such %s, \"%s\"' % (namespace[:-1], name))\n        return\n\n    # raise exceptions for other errors - the auth decorators handle these and\n    # re-try if appropriate\n    response.raise_for_status()\n\n    return True", "response": "Remove an owner for a module or target."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\nset the api key for accessing a specific registry.", "response": "def setAPIKey(registry, api_key):\n    ''' Set the api key for accessing a registry. This is only necessary for\n        development/test registries.\n    '''\n    if (registry is None) or (registry == Registry_Base_URL):\n        return\n    sources = _getSources()\n    source = None\n    for s in sources:\n        if _sourceMatches(s, registry):\n            source = s\n    if source is None:\n        source = {\n           'type':'registry',\n            'url':registry,\n        }\n        sources.append(source)\n    source['apikey'] = api_key\n    settings.set('sources', sources)"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef getPublicKey(registry=None):\n    ''' Return the user's public key (generating and saving a new key pair if necessary) '''\n    registry = registry or Registry_Base_URL\n    pubkey_pem = None\n    if _isPublicRegistry(registry):\n        pubkey_pem = settings.getProperty('keys', 'public')\n    else:\n        for s in _getSources():\n            if _sourceMatches(s, registry):\n                if 'keys' in s and s['keys'] and 'public' in s['keys']:\n                    pubkey_pem = s['keys']['public']\n                    break\n    if not pubkey_pem:\n        pubkey_pem, privatekey_pem = _generateAndSaveKeys()\n    else:\n        # settings are unicode, we should be able to safely decode to ascii for\n        # the key though, as it will either be hex or PEM encoded:\n        pubkey_pem = pubkey_pem.encode('ascii')\n    # if the key doesn't look like PEM, it might be hex-encoded-DER (which we\n    # used historically), so try loading that:\n    if b'-----BEGIN PUBLIC KEY-----' in pubkey_pem:\n        pubkey = serialization.load_pem_public_key(pubkey_pem, default_backend())\n    else:\n        pubkey_der = binascii.unhexlify(pubkey_pem)\n        pubkey = serialization.load_der_public_key(pubkey_der, default_backend())\n    return _pubkeyWireFormat(pubkey)", "response": "Return the user's public key, generating and saving a new key pair if necessary."}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\ngets the authentication data for a specific user in the registry.", "response": "def getAuthData(registry=None):\n    ''' Poll the registry to get the result of a completed authentication\n        (which, depending on the authentication the user chose or was directed\n        to, will include a github or other access token)\n    '''\n    registry = registry or Registry_Base_URL\n    url = '%s/tokens' % (\n        registry\n    )\n\n    request_headers = _headersForRegistry(registry)\n\n    logger.debug('poll for tokens... %s', request_headers)\n\n    try:\n        response = requests.get(url, headers=request_headers)\n    except requests.RequestException as e:\n        logger.debug(str(e))\n        return None\n\n    if response.status_code == requests.codes.unauthorized: #pylint: disable=no-member\n        logger.debug('Unauthorised')\n        return None\n    elif response.status_code == requests.codes.not_found: #pylint: disable=no-member\n        logger.debug('Not Found')\n        return None\n\n    body = response.text\n    logger.debug('auth data response: %s' % body);\n    r = {}\n\n    parsed_response = ordered_json.loads(body)\n\n    if 'error' in parsed_response:\n        raise AuthError(parsed_response['error'])\n\n    for token in parsed_response:\n        if 'provider' in token and token['provider'] and 'accessToken' in token:\n            r[token['provider']] = token['accessToken']\n            break\n\n    logger.debug('parsed auth tokens %s' % r);\n    return r"}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nreturns a registry component for anything that's a valid package name", "response": "def createFromSource(cls, vs, name, registry):\n        ''' returns a registry component for anything that's a valid package\n            name (this does not guarantee that the component actually exists in\n            the registry: use availableVersions() for that).\n        '''\n        # we deliberately allow only lowercase, hyphen, and (unfortunately)\n        # numbers in package names, to reduce the possibility of confusingly\n        # similar names: if the name doesn't match this then escalate to make\n        # the user fix it. Targets also allow +\n        if registry == 'targets':\n            name_match = re.match('^[a-z]+[a-z0-9+-]*$', name)\n            if not name_match:\n                raise access_common.AccessException(\n                    'Target name \"%s\" is not valid (must contain only lowercase letters, hyphen, plus, and numbers)' % name\n                )\n        else:\n            name_match = re.match('^[a-z]+[a-z0-9-]*$', name)\n            if not name_match:\n                raise access_common.AccessException(\n                    'Module name \"%s\" is not valid (must contain only lowercase letters, hyphen, and numbers)' % name\n                )\n        assert(vs.semantic_spec)\n        return RegistryThing(name, vs.semantic_spec, registry)"}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\nreads a list of files and merge them with the current settings.", "response": "def read(self, filenames):\n        '''' Read a list of files. Their configuration values are merged, with\n             preference to values from files earlier in the list.\n        '''\n        for fn in filenames:\n            try:\n                self.configs[fn] = ordered_json.load(fn)\n            except IOError:\n                self.configs[fn] = OrderedDict()\n            except Exception as e:\n                self.configs[fn] = OrderedDict()\n                logging.warning(\n                    \"Failed to read settings file %s, it will be ignored. The error was: %s\",\n                    fn, e\n                )"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef get(self, path):\n        ''' return a configuration value\n\n            usage:\n                get('section.property')\n\n            Note that currently array indexes are not supported. You must\n            get the whole array.\n\n            returns None if any path element or the property is missing\n        '''\n        path = _splitPath(path)\n        for config in self.configs.values():\n            cur = config\n            for el in path:\n                if el in cur:\n                    cur = cur[el]\n                else:\n                    cur = None\n                    break\n            if cur is not None:\n                return cur\n        return None", "response": "Return the configuration value at the given dotted path, taken from the first config that defines it, or None if any path element or the property is missing."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef set(self, path, value=None, filename=None):\n        ''' Set a configuration value. If no filename is specified, the\n            property is set in the first configuration file. Note that if a\n            filename is specified and the property path is present in an\n            earlier filename then set property will be hidden.\n\n            usage:\n                set('section.property', value='somevalue')\n\n            Note that currently array indexes are not supported. You must\n            set the whole array.\n        '''\n        if filename is None:\n            config = self._firstConfig()[1]\n        else:\n            config = self.configs[filename]\n\n        path = _splitPath(path)\n        for el in path[:-1]:\n            if el in config:\n                config = config[el]\n            else:\n                config[el] = OrderedDict()\n                config = config[el]\n        config[path[-1]] = value", "response": "Set a configuration value."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef islast(generator):\n    ''' indicate whether the current item is the last one in a generator\n    '''\n    next_x = None\n    first = True\n    for x in generator:\n        if not first:\n            yield (next_x, False)\n        next_x = x\n        first = False\n    if not first:\n        yield (next_x, True)", "response": "Yield (item, is_last) pairs, indicating whether the current item is the last one in a generator."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef remoteComponentFor(name, version_required, registry='modules'):\n    ''' Return a RemoteComponent subclass for the specified component name and\n        source url (or version specification)\n        Raises an exception if any arguments are invalid.\n    '''\n\n    try:\n        vs = sourceparse.parseSourceURL(version_required)\n    except ValueError as e:\n        raise access_common.Unavailable(\n            '%s' % (e)\n        )\n\n    if vs.source_type == 'registry':\n        if registry not in ('modules', 'targets'):\n            raise Exception('no known registry namespace \"%s\"' % registry)\n        return registry_access.RegistryThing.createFromSource(\n            vs, name, registry=registry\n        )\n    elif vs.source_type == 'github':\n        return github_access.GithubComponent.createFromSource(vs, name)\n    elif vs.source_type == 'git':\n        return git_access.GitComponent.createFromSource(vs, name)\n    elif vs.source_type == 'hg':\n        return hg_access.HGComponent.createFromSource(vs, name)\n    else:\n        raise Exception('unsupported module source: \"%s\"' % vs.source_type)", "response": "Return a RemoteComponent subclass for the specified component name and source URL (or version specification)."}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nreturning a Component or Target for the specified version, if found in the specified list of search paths. If update is True, also check for newer versions of the found component and update it in place, unless it was installed via a symlink. Return None if nothing suitable is found.", "response": "def satisfyVersionFromSearchPaths(name, version_required, search_paths, update=False, type='module', inherit_shrinkwrap=None):\n    ''' returns a Component/Target for the specified version, if found in the\n        list of search paths. If `update' is True, then also check for newer\n        versions of the found component, and update it in-place (unless it was\n        installed via a symlink).\n    '''\n    # Pack, , base class for targets and components, internal\n    from yotta.lib import pack\n\n    v = None\n    try:\n        sv = sourceparse.parseSourceURL(version_required)\n    except ValueError as e:\n        logging.error(e)\n        return None\n\n    try:\n        local_version = searchPathsFor(\n            name,\n            sv.semanticSpec(),\n            search_paths,\n            type,\n            inherit_shrinkwrap = inherit_shrinkwrap\n        )\n    except pack.InvalidDescription as e:\n        logger.error(e)\n        return None\n\n    logger.debug(\"%s %s locally\" % (('found', 'not found')[not local_version], name))\n    if local_version:\n        if update and not local_version.installedLinked():\n            #logger.debug('attempt to check latest version of %s @%s...' 
% (name, version_required))\n            v = latestSuitableVersion(name, version_required, registry=_registryNamespaceForType(type))\n            if local_version:\n                local_version.setLatestAvailable(v)\n\n        # if we don't need to update, then we're done\n        if local_version.installedLinked() or not local_version.outdated():\n            logger.debug(\"satisfy component from directory: %s\" % local_version.path)\n            # if a component exists (has a valid description file), and either is\n            # not outdated, or we are not updating\n            if name != local_version.getName():\n                raise Exception('Component %s found in incorrectly named directory %s (%s)' % (\n                    local_version.getName(), name, local_version.path\n                ))\n            return local_version\n\n        # otherwise, we need to update the installed component\n        logger.info('update outdated: %s@%s -> %s' % (\n            name,\n            local_version.getVersion(),\n            v\n        ))\n        # must rm the old component before continuing\n        fsutils.rmRf(local_version.path)\n        return _satisfyVersionByInstallingVersion(\n            name, version_required, local_version.path, v, type=type, inherit_shrinkwrap=inherit_shrinkwrap\n        )\n    return None"}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nreturns a Component or Target for the specified name + version requirement into a subdirectory of working_directory", "response": "def satisfyVersionByInstalling(name, version_required, working_directory, type='module', inherit_shrinkwrap=None):\n    ''' installs and returns a Component/Target for the specified name+version\n        requirement, into a subdirectory of `working_directory'\n    '''\n    v = latestSuitableVersion(name, version_required, _registryNamespaceForType(type))\n    install_into = os.path.join(working_directory, name)\n    return _satisfyVersionByInstallingVersion(\n        name, version_required, install_into, v, type=type, inherit_shrinkwrap = inherit_shrinkwrap\n    )"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef _satisfyVersionByInstallingVersion(name, version_required, working_directory, version, type='module', inherit_shrinkwrap=None):\n    ''' installs and returns a Component/Target for the specified version requirement into\n        'working_directory' using the provided remote version object.\n        This function is not normally called via `satisfyVersionByInstalling',\n        which looks up a suitable remote version object.\n    '''\n    assert(version)\n    logger.info('download %s', version)\n    version.unpackInto(working_directory)\n    r = _clsForType(type)(working_directory, inherit_shrinkwrap = inherit_shrinkwrap)\n    if not r:\n        raise Exception(\n            'Dependency \"%s\":\"%s\" is not a valid %s.' % (name, version_required, type)\n        )\n    if name != r.getName():\n        raise Exception('%s %s (specification %s) has incorrect name %s' % (\n            type, name, version_required, r.getName()\n        ))\n    # error code deliberately ignored here for now, it isn't clear what the\n    # behaviour should be (abort? remove the unpacked state then abort?\n    # continue?)\n    r.runScript('postInstall')\n    return r", "response": "Install and return a Component or Target for the specified version requirement into working_directory, using the provided remote version object."}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nreturn a Component or Target for the specified version", "response": "def satisfyVersion(\n        name,\n        version_required,\n        available,\n        search_paths,\n        working_directory,\n        update_installed=None,\n        type='module',  # or 'target'\n        inherit_shrinkwrap=None\n    ):\n    ''' returns a Component/Target for the specified version (either to an already\n        installed copy (from the available list, or from disk), or to a newly\n        downloaded one), or None if the version could not be satisfied.\n\n        update_installed = None / 'Update'\n            None:   prevent any attempt to look for new versions if the\n                    component/target already exists\n            Update: replace any existing version with the newest available, if\n                    the newest available has a higher version\n    '''\n\n    r = satisfyFromAvailable(name, available, type=type)\n    if r is not None:\n        if not sourceparse.parseSourceURL(version_required).semanticSpecMatches(r.getVersion()):\n            raise access_common.SpecificationNotMet(\n                \"Installed %s %s doesn't match specification %s\" % (type, name, version_required)\n            )\n        return r\n\n    r = satisfyVersionFromSearchPaths(\n        name,\n        version_required,\n        search_paths,\n        (update_installed == 'Update'),\n        type = type,\n        inherit_shrinkwrap = inherit_shrinkwrap\n    )\n    if r is not None:\n        return r\n\n    return satisfyVersionByInstalling(\n        name, version_required, working_directory, type=type, inherit_shrinkwrap = inherit_shrinkwrap\n    )"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef sourceDirValidationError(dirname, component_name):\n    ''' validate source directory names in components '''\n    if dirname == component_name:\n        return 'Module %s public include directory %s should not contain source files' % (component_name, dirname)\n    elif dirname.lower() in ('source', 'src') and dirname != 'source':\n        return 'Module %s has non-standard source directory name: \"%s\" should be \"source\"' % (component_name, dirname)\n    elif isPotentialTestDir(dirname) and dirname != 'test':\n        return 'Module %s has non-standard test directory name: \"%s\" should be \"test\"' % (component_name, dirname)\n    elif not Source_Dir_Regex.match(dirname):\n        corrected = Source_Dir_Invalid_Regex.sub('', dirname.lower())\n        if not corrected:\n            corrected = 'source'\n        return 'Module %s has non-standard source directory name: \"%s\" should be \"%s\"' % (component_name, dirname, corrected)\n    else:\n        return None", "response": "validate source directory names in components"}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nprint information about outdated modules, return 0 if there is nothing to be done and nonzero otherwise", "response": "def displayOutdated(modules, dependency_specs, use_colours):\n    ''' print information about outdated modules,\n        return 0 if there is nothing to be done and nonzero otherwise\n    '''\n    if use_colours:\n        DIM    = colorama.Style.DIM       #pylint: disable=no-member\n        NORMAL = colorama.Style.NORMAL    #pylint: disable=no-member\n        BRIGHT = colorama.Style.BRIGHT    #pylint: disable=no-member\n        YELLOW = colorama.Fore.YELLOW     #pylint: disable=no-member\n        RED    = colorama.Fore.RED        #pylint: disable=no-member\n        GREEN  = colorama.Fore.GREEN      #pylint: disable=no-member\n        RESET  = colorama.Style.RESET_ALL #pylint: disable=no-member\n    else:\n        DIM = BRIGHT = YELLOW = RED = GREEN = RESET = u''\n\n    status = 0\n\n    # access, , get components, internal\n    from yotta.lib import access\n    from yotta.lib import access_common\n    # sourceparse, , parse version source urls, internal\n    from yotta.lib import sourceparse\n\n    for name, m in modules.items():\n        if m.isTestDependency():\n            continue\n        try:\n            latest_v = access.latestSuitableVersion(name, '*', registry='modules', quiet=True)\n        except access_common.Unavailable as e:\n            latest_v = None\n\n        if not m:\n            m_version = u' ' + RESET + BRIGHT + RED + u\"missing\" + RESET\n        else:\n            m_version = DIM + u'@%s' % (m.version)\n        if not latest_v:\n            print(u'%s%s%s%s not available from the registry%s' % (RED, name, m_version, NORMAL, RESET))\n            status = 2\n            continue\n        elif not m or m.version < latest_v:\n            update_prevented_by = ''\n            if m:\n                specs_preventing_update = [\n                    
x for x in dependency_specs\n                    if x.name == name and not\n                       sourceparse.parseSourceURL(x.nonShrinkwrappedVersionReq()).semanticSpecMatches(latest_v)\n                ]\n                shrinkwrap_prevents_update = [\n                    x for x in dependency_specs\n                    if x.name == name and x.isShrinkwrapped() and not\n                       sourceparse.parseSourceURL(x.versionReq()).semanticSpecMatches(latest_v)\n                ]\n                if len(specs_preventing_update):\n                    update_prevented_by = ' (update prevented by specifications: %s)' % (\n                        ', '.join(['%s from %s' % (x.version_req, x.specifying_module) for x in specs_preventing_update])\n                    )\n                if len(shrinkwrap_prevents_update):\n                    update_prevented_by += ' yotta-shrinkwrap.json prevents update'\n                if m.version.major() < latest_v.major():\n                    # major versions being outdated might be deliberate, so not\n                    # that bad:\n                    colour = GREEN\n                elif m.version.minor() < latest_v.minor():\n                    # minor outdated versions is moderately bad\n                    colour = YELLOW\n                else:\n                    # patch-outdated versions is really bad, because there should\n                    # be no reason not to update:\n                    colour = RED\n            else:\n                colour = RED\n            print(u'%s%s%s latest: %s%s%s%s' % (name, m_version, RESET, colour, latest_v.version, update_prevented_by, RESET))\n            if not status:\n                status = 1\n    return status"}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\nreads the .yotta_origin.json file and returns the value of the url property.", "response": "def origin(self):\n        ''' Read the .yotta_origin.json file (if present), and return the value\n            of the 'url' property '''\n        if self.origin_info is None:\n            self.origin_info = {}\n            try:\n                self.origin_info = ordered_json.load(os.path.join(self.path, Origin_Info_Fname))\n            except IOError:\n                pass\n        return self.origin_info.get('url', None)"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef outdated(self):\n        ''' Return a truthy object if a newer suitable version is available,\n            otherwise return None.\n            (in fact the object returned is a ComponentVersion that can be used\n             to get the newer version)\n        '''\n        if self.latest_suitable_version and self.latest_suitable_version > self.version:\n            return self.latest_suitable_version\n        else:\n            return None", "response": "Return a truthy object (the newer ComponentVersion) if a newer suitable version is available; otherwise return None."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef commitVCS(self, tag=None):\n        ''' Commit the current working directory state (or do nothing if the\n            working directory is not version controlled)\n        '''\n        if not self.vcs:\n            return\n        self.vcs.commit(message='version %s' % tag, tag=tag)", "response": "Commit the current working directory state"}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\ntests if this module ignores the file at path.", "response": "def ignores(self, path):\n        ''' Test if this module ignores the file at \"path\", which must be a\n            path relative to the root of the module.\n\n            If a file is within a directory that is ignored, the file is also\n            ignored.\n        '''\n        test_path = PurePath('/', path)\n\n        # also check any parent directories of this path against the ignore\n        # patterns:\n        test_paths = tuple([test_path] + list(test_path.parents))\n\n        for exp in self.ignore_patterns:\n            for tp in test_paths:\n                if tp.match(exp):\n                    logger.debug('\"%s\" ignored (\"%s\" matched \"%s\")', path, tp, exp)\n                    return True\n        return False"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\nwriting the current component description to a package description file in the component directory.", "response": "def writeDescription(self):\n        ''' Write the current (possibly modified) component description to a\n            package description file in the component directory.\n        '''\n        ordered_json.dump(os.path.join(self.path, self.description_filename), self.description)\n        if self.vcs:\n            self.vcs.markForCommit(self.description_filename)"}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nwrite a tarball of the current component and target to the file object file_object which must already be opened at position 0", "response": "def generateTarball(self, file_object):\n        ''' Write a tarball of the current component/target to the file object\n            \"file_object\", which must already be open for writing at position 0\n        '''\n        archive_name = '%s-%s' % (self.getName(), self.getVersion())\n        def filterArchive(tarinfo):\n            if tarinfo.name.find(archive_name) == 0 :\n                unprefixed_name = tarinfo.name[len(archive_name)+1:]\n                tarinfo.mode &= 0o775\n            else:\n                unprefixed_name = tarinfo.name\n            if self.ignores(unprefixed_name):\n                return None\n            else:\n                return tarinfo\n        with tarfile.open(fileobj=file_object, mode='w:gz') as tf:\n            logger.info('generate archive extracting to \"%s\"' % archive_name)\n            tf.add(self.path, arcname=archive_name, filter=filterArchive)"}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\npublish the object to the appropriate registry. Returns a description of any errors that occurred, or None if successful.", "response": "def publish(self, registry=None):\n        ''' Publish to the appropriate registry, return a description of any\n            errors that occurred, or None if successful.\n            No VCS tagging is performed.\n        '''\n        if (registry is None) or (registry == registry_access.Registry_Base_URL):\n            if 'private' in self.description and self.description['private']:\n                return \"this %s is private and cannot be published\" % (self.description_filename.split('.')[0])\n        upload_archive = os.path.join(self.path, 'upload.tar.gz')\n        fsutils.rmF(upload_archive)\n        fd = os.open(upload_archive, os.O_CREAT | os.O_EXCL | os.O_RDWR | getattr(os, \"O_BINARY\", 0))\n        with os.fdopen(fd, 'rb+') as tar_file:\n            tar_file.truncate()\n            self.generateTarball(tar_file)\n            logger.debug('generated tar file of length %s', tar_file.tell())\n            tar_file.seek(0)\n            # calculate the hash of the file before we upload it:\n            shasum = hashlib.sha256()\n            while True:\n                chunk = tar_file.read(1000)\n                if not chunk:\n                    break\n                shasum.update(chunk)\n            logger.debug('generated tar file has hash %s', shasum.hexdigest())\n            tar_file.seek(0)\n            with self.findAndOpenReadme() as readme_file_wrapper:\n                if not readme_file_wrapper:\n                    logger.warning(\"no readme.md file detected\")\n                with open(self.getDescriptionFile(), 'r') as description_file:\n                    return registry_access.publish(\n                        self.getRegistryNamespace(),\n                        self.getName(),\n                        self.getVersion(),\n   
                     description_file,\n                        tar_file,\n                        readme_file_wrapper.file,\n                        readme_file_wrapper.extension().lower(),\n                        registry=registry\n                    )"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef unpublish(self, registry=None):\n        ''' Try to un-publish the current version. Return a description of any\n            errors that occurred, or None if successful.\n        '''\n        return registry_access.unpublish(\n            self.getRegistryNamespace(),\n            self.getName(),\n            self.getVersion(),\n            registry=registry\n        )", "response": "Try to un-publish the current version. Return a description of any errors that occurred, or None if successful."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef getScript(self, scriptname):\n        ''' Return the specified script command. If the first part of the\n            command is a .py file, then the current python interpreter is\n            prepended.\n\n            If the script is a single string, rather than an array, it is\n            shlex-split.\n        '''\n        script = self.description.get('scripts', {}).get(scriptname, None)\n        if script is not None:\n            if isinstance(script, str) or isinstance(script, type(u'unicode string')):\n                import shlex\n                script = shlex.split(script)\n            # if the command is a python script, run it with the python\n            # interpreter being used to run yotta, also fetch the absolute path\n            # to the script relative to this module (so that the script can be\n            # distributed with the module, no matter what current working\n            # directory it will be executed in):\n            if len(script) and script[0].lower().endswith('.py'):\n                if not os.path.isabs(script[0]):\n                    absscript = os.path.abspath(os.path.join(self.path, script[0]))\n                    logger.debug('rewriting script %s to be absolute path %s', script[0], absscript)\n                    script[0] = absscript\n                import sys\n                script = [sys.executable] + script\n\n        return script", "response": "Return the specified script command."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef runScript(self, scriptname, additional_environment=None):\n        ''' Run the specified script from the scripts section of the\n            module.json file in the directory of this module.\n        '''\n        import subprocess\n        import shlex\n\n        command = self.getScript(scriptname)\n        if command is None:\n            logger.debug('%s has no script %s', self, scriptname)\n            return 0\n\n        if not len(command):\n            logger.error(\"script %s of %s is empty\", scriptname, self.getName())\n            return 1\n\n        # define additional environment variables for scripts:\n        env = os.environ.copy()\n        if additional_environment is not None:\n            env.update(additional_environment)\n\n        errcode = 0\n        child = None\n        try:\n            logger.debug('running script: %s', command)\n            child = subprocess.Popen(\n                command, cwd = self.path, env = env\n            )\n            child.wait()\n            if child.returncode:\n                logger.error(\n                    \"script %s (from %s) exited with non-zero status %s\",\n                    scriptname,\n                    self.getName(),\n                    child.returncode\n                )\n                errcode = child.returncode\n            child = None\n        finally:\n            if child is not None:\n                tryTerminate(child)\n        return errcode", "response": "Run the specified script from the scripts section of the module.json file, in the directory of this module, and return its exit status (0 if the script is not defined)."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef _truthyConfValue(v):\n    ''' Determine yotta-config truthiness. In yotta config land truthiness is\n        different to python or json truthiness (in order to map nicely only\n        preprocessor and CMake definediness):\n\n          json      -> python -> truthy/falsey\n          false     -> False  -> Falsey\n          null      -> None   -> Falsey\n          undefined -> None   -> Falsey\n          0         -> 0      -> Falsey\n          \"\"        -> \"\"     -> Truthy (different from python)\n          \"0\"       -> \"0\"    -> Truthy\n          {}        -> {}     -> Truthy (different from python)\n          []        -> []     -> Truthy (different from python)\n          everything else is truthy\n    '''\n    if v is False:\n        return False\n    elif v is None:\n        return False\n    elif v == 0:\n        return False\n    else:\n        # everything else is truthy!\n        return True", "response": "Determine whether the given value is truthy or not."}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nreturn a list of dependency specs for this component.", "response": "def getDependencySpecs(self, target=None):\n        ''' Returns [DependencySpec]\n\n            These are returned in the order that they are listed in the\n            component description file: this is so that dependency resolution\n            proceeds in a predictable way.\n        '''\n        deps = []\n\n        def specForDependency(name, version_spec, istest):\n            shrinkwrap = self.getShrinkwrapMapping()\n            shrinkwrap_version_req = None\n            if name in shrinkwrap:\n                # exact version, and pull from registry:\n                shrinkwrap_version_req = shrinkwrap[name]\n                logger.debug(\n                    'respecting %s shrinkwrap version %s for %s', self.getName(), shrinkwrap_version_req, name\n                )\n            return pack.DependencySpec(\n                                         name,\n                                         version_spec,\n                                         istest,\n                shrinkwrap_version_req = shrinkwrap_version_req,\n                     specifying_module = self.getName()\n            )\n\n        deps += [specForDependency(x[0], x[1], False) for x in self.description.get('dependencies', {}).items()]\n        target_deps = self.description.get('targetDependencies', {})\n        if target is not None:\n            for conf_key, target_conf_deps in target_deps.items():\n                if _truthyConfValue(target.getConfigValue(conf_key)) or conf_key in target.getSimilarTo_Deprecated():\n                    logger.debug(\n                        'Adding target-dependent dependency specs for target config %s to component %s' %\n                        (conf_key, self.getName())\n                    )\n                    deps += [specForDependency(x[0], x[1], False) for x in 
target_conf_deps.items()]\n\n\n        deps += [specForDependency(x[0], x[1], True) for x in self.description.get('testDependencies', {}).items()]\n        target_deps = self.description.get('testTargetDependencies', {})\n        if target is not None:\n            for conf_key, target_conf_deps in target_deps.items():\n                if _truthyConfValue(target.getConfigValue(conf_key)) or conf_key in target.getSimilarTo_Deprecated():\n                    logger.debug(\n                        'Adding test-target-dependent dependency specs for target config %s to component %s' %\n                        (conf_key, self.getName())\n                    )\n                    deps += [specForDependency(x[0], x[1], True) for x in target_conf_deps.items()]\n\n        # remove duplicates (use the first occurrence)\n        seen = set()\n        r = []\n        for dep in deps:\n            if not dep.name in seen:\n                r.append(dep)\n                seen.add(dep.name)\n\n        return r"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef hasDependency(self, name, target=None, test_dependencies=False):\n        ''' Check if this module has any dependencies with the specified name\n            in its dependencies list, or in target dependencies for the\n            specified target\n        '''\n        if name in self.description.get('dependencies', {}).keys():\n            return True\n\n        target_deps = self.description.get('targetDependencies', {})\n        if target is not None:\n            for conf_key, target_conf_deps in target_deps.items():\n                if _truthyConfValue(target.getConfigValue(conf_key)) or conf_key in target.getSimilarTo_Deprecated():\n                    if name in target_conf_deps:\n                        return True\n\n        if test_dependencies:\n            if name in self.description.get('testDependencies', {}).keys():\n                return True\n\n            if target is not None:\n                test_target_deps = self.description.get('testTargetDependencies', {})\n                for conf_key, target_conf_deps in test_target_deps.items():\n                    if _truthyConfValue(target.getConfigValue(conf_key)) or conf_key in target.getSimilarTo_Deprecated():\n                        if name in target_conf_deps:\n                            return True\n        return False", "response": "Check whether this module has a dependency with the specified name in its dependencies list, or in the target dependencies for the specified target. Test dependencies are also checked when test_dependencies is True."}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nchecks if this module or any of its dependencies have the specified name in their targetDependencies corresponding to the specified target.", "response": "def hasDependencyRecursively(self, name, target=None, test_dependencies=False):\n        ''' Check if this module, or any of its dependencies, have a\n            dependencies with the specified name in their dependencies, or in\n            their targetDependencies corresponding to the specified target.\n\n            Note that if recursive dependencies are not installed, this test\n            may return a false-negative.\n        '''\n        # checking dependencies recursively isn't entirely straightforward, so\n        # use the existing method to resolve them all before checking:\n        dependencies = self.getDependenciesRecursive(\n                               target = target,\n                                 test = test_dependencies\n        )\n        return (name in dependencies)"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef getDependencies(self,\n        available_components = None,\n                 search_dirs = None,\n                      target = None,\n              available_only = False,\n                        test = False,\n                    warnings = True\n        ):\n        ''' Returns {component_name:component}\n        '''\n        if search_dirs is None:\n            search_dirs = [self.modulesPath()]\n        available_components = self.ensureOrderedDict(available_components)\n\n        components, errors = self.__getDependenciesWithProvider(\n            available_components = available_components,\n                     search_dirs = search_dirs,\n                          target = target,\n                update_installed = False,\n                        provider = self.provideInstalled,\n                            test = test\n        )\n        if warnings:\n            for error in errors:\n                logger.warning(error)\n        if available_only:\n            components = OrderedDict((k, v) for k, v in components.items() if v)\n        return components", "response": "Return an ordered dictionary mapping component_name to component for the installed dependencies of this module."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef __getDependenciesWithProvider(self,\n                      available_components = None,\n                               search_dirs = None,\n                                    target = None,\n                          update_installed = False,\n                                  provider = None,\n                                      test = False\n   ):\n        ''' Get installed components using \"provider\" to find (and possibly\n            install) components.\n\n            See documentation for __getDependenciesRecursiveWithProvider\n\n            returns (components, errors)\n        '''\n        # sourceparse, , parse version source urls, internal\n        from yotta.lib import sourceparse\n        errors = []\n        modules_path = self.modulesPath()\n        def satisfyDep(dspec):\n            try:\n                r = provider(\n                  dspec,\n                  available_components,\n                  search_dirs,\n                  modules_path,\n                  update_installed,\n                  self\n                )\n                if r and not sourceparse.parseSourceURL(dspec.versionReq()).semanticSpecMatches(r.getVersion()):\n                    shrinkwrap_msg = ''\n                    if dspec.isShrinkwrapped():\n                        shrinkwrap_msg = 'shrinkwrap on '\n                    msg = 'does not meet specification %s required by %s%s' % (\n                        dspec.versionReq(), shrinkwrap_msg, self.getName()\n                    )\n                    logger.debug('%s %s', r.getName(), msg)\n                    r.setError(msg)\n                return r\n            except access_common.Unavailable as e:\n                errors.append(e)\n                self.dependencies_failed = True\n            except vcs.VCSError as e:\n                errors.append(e)\n                self.dependencies_failed = 
True\n        specs = self.getDependencySpecs(target=target)\n        if not test:\n            # filter out things that aren't test dependencies if necessary:\n            specs = [x for x in specs if not x.is_test_dependency]\n        #dependencies = pool.map(\n        dependencies = map(\n            satisfyDep, specs\n        )\n        self.installed_dependencies = True\n        # stable order is important!\n        return (OrderedDict([((d and d.getName()) or specs[i].name, d) for i, d in enumerate(dependencies)]), errors)", "response": "Get installed components using the specified provider."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef __getDependenciesRecursiveWithProvider(self,\n                               available_components = None,\n                                        search_dirs = None,\n                                             target = None,\n                                     traverse_links = False,\n                                   update_installed = False,\n                                           provider = None,\n                                               test = False,\n                                         _processed = None\n    ):\n        ''' Get installed components using \"provider\" to find (and possibly\n            install) components.\n\n            This function is called with different provider functions in order\n            to retrieve a list of all of the dependencies, or install all\n            dependencies.\n\n            Returns\n            =======\n                (components, errors)\n\n                components: dictionary of name:Component\n                errors: sequence of errors\n\n            Parameters\n            ==========\n                available_components:\n                    None (default) or a dictionary of name:component. This is\n                    searched before searching directories or fetching remote\n                    components\n\n                search_dirs:\n                    None (default), or sequence of directories to search for\n                    already installed, (but not yet loaded) components. Used so\n                    that manually installed or linked components higher up the\n                    dependency tree are found by their users lower down.\n\n                    These directories are searched in order, and finally the\n                    current directory is checked.\n\n                target:\n                    None (default), or a Target object. 
If specified the target\n                    name and it's similarTo list will be used in resolving\n                    dependencies. If None, then only target-independent\n                    dependencies will be installed\n\n                traverse_links:\n                    False (default) or True: whether to recurse into linked\n                    dependencies. You normally want to set this to \"True\" when\n                    getting a list of dependencies, and False when installing\n                    them (unless the user has explicitly asked dependencies to\n                    be installed in linked components).\n\n                provider: None (default) or function:\n                          provider(\n                            dependency_spec,\n                            available_components,\n                            search_dirs,\n                            working_directory,\n                            update_if_installed\n                          )\n                test:\n                    True, False, 'toplevel': should test-only dependencies be\n                    included (yes, no, or only at this level, not recursively)\n        '''\n        def recursionFilter(c):\n            if not c:\n                logger.debug('do not recurse into failed component')\n                # don't recurse into failed components\n                return False\n            if c.getName() in _processed:\n                logger.debug('do not recurse into already processed component: %s' % c)\n                return False\n            if c.installedLinked() and not traverse_links:\n                return False\n            return True\n        available_components = self.ensureOrderedDict(available_components)\n        if search_dirs is None:\n            search_dirs = []\n        if _processed is None:\n            _processed = set()\n        assert(test in [True, False, 'toplevel'])\n        search_dirs.append(self.modulesPath())\n        
logger.debug('process %s\\nsearch dirs:%s' % (self.getName(), search_dirs))\n        if self.isTestDependency():\n            logger.debug(\"won't provide test dependencies recursively for test dependency %s\", self.getName())\n            test = False\n        components, errors = self.__getDependenciesWithProvider(\n            available_components = available_components,\n                     search_dirs = search_dirs,\n                update_installed = update_installed,\n                          target = target,\n                        provider = provider,\n                            test = test\n        )\n        _processed.add(self.getName())\n        if errors:\n            errors = ['Failed to satisfy dependencies of %s:' % self.path] + errors\n        need_recursion = [x for x in filter(recursionFilter, components.values())]\n        available_components.update(components)\n        logger.debug('processed %s\\nneed recursion: %s\\navailable:%s\\nsearch dirs:%s' % (self.getName(), need_recursion, available_components, search_dirs))\n        if test == 'toplevel':\n            test = False\n        # NB: can't perform this step in parallel, since the available\n        # components list must be updated in order\n        for c in need_recursion:\n            dep_components, dep_errors = c.__getDependenciesRecursiveWithProvider(\n                available_components = available_components,\n                         search_dirs = search_dirs,\n                              target = target,\n                      traverse_links = traverse_links,\n                    update_installed = update_installed,\n                            provider = provider,\n                                test = test,\n                          _processed = _processed\n            )\n            available_components.update(dep_components)\n            components.update(dep_components)\n            errors += dep_errors\n        return (components, errors)", "response": "This 
function uses the given provider to find (and possibly install) the dependencies of this component recursively, and returns a (components, errors) tuple."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nretrieve and install all the dependencies of this component and its dependents recursively or from disk.", "response": "def satisfyDependenciesRecursive(\n                            self,\n            available_components = None,\n                     search_dirs = None,\n                update_installed = False,\n                  traverse_links = False,\n                          target = None,\n                            test = False\n        ):\n        ''' Retrieve and install all the dependencies of this component and its\n            dependencies, recursively, or satisfy them from a collection of\n            available_components or from disk.\n\n            Returns\n            =======\n                (components, errors)\n\n                components: dictionary of name:Component\n                errors: sequence of errors\n\n            Parameters\n            ==========\n\n                available_components:\n                    None (default) or a dictionary of name:component. This is\n                    searched before searching directories or fetching remote\n                    components\n\n                search_dirs:\n                    None (default), or sequence of directories to search for\n                    already installed, (but not yet loaded) components. Used so\n                    that manually installed or linked components higher up the\n                    dependency tree are found by their users lower down.\n\n                    These directories are searched in order, and finally the\n                    current directory is checked.\n\n                update_installed:\n                    False (default), True, or set(): whether to check the\n                    available versions of installed components, and update if a\n                    newer version is available. 
If this is a set(), only update\n                    things in the specified set.\n\n                traverse_links:\n                    False (default) or True: whether to recurse into linked\n                    dependencies when updating/installing.\n\n                target:\n                    None (default), or a Target object. If specified the target\n                    name and it's similarTo list will be used in resolving\n                    dependencies. If None, then only target-independent\n                    dependencies will be installed\n\n                test:\n                    True, False, or 'toplevel: should test-only dependencies be\n                    installed? (yes, no, or only for this module, not its\n                    dependencies).\n\n        '''\n        def provider(\n            dspec,\n            available_components,\n            search_dirs,\n            working_directory,\n            update_installed,\n            dep_of=None\n        ):\n            r = access.satisfyFromAvailable(dspec.name, available_components)\n            if r:\n                if r.isTestDependency() and not dspec.is_test_dependency:\n                    logger.debug('test dependency subsequently occurred as real dependency: %s', r.getName())\n                    r.setTestDependency(False)\n                return r\n            update_if_installed = False\n            if update_installed is True:\n                update_if_installed = True\n            elif update_installed:\n                update_if_installed = dspec.name in update_installed\n            r = access.satisfyVersionFromSearchPaths(\n                dspec.name,\n                dspec.versionReq(),\n                search_dirs,\n                update_if_installed,\n                inherit_shrinkwrap = dep_of.getShrinkwrap()\n            )\n            if r:\n                r.setTestDependency(dspec.is_test_dependency)\n                return r\n            # before resorting to 
install this module, check if we have an\n            # existing linked module (which wasn't picked up because it didn't\n            # match the version specification) - if we do, then we shouldn't\n            # try to install, but should return that anyway:\n            default_path = os.path.join(self.modulesPath(), dspec.name)\n            if fsutils.isLink(default_path):\n                r = Component(\n                                       default_path,\n                     test_dependency = dspec.is_test_dependency,\n                    installed_linked = fsutils.isLink(default_path),\n                  inherit_shrinkwrap = dep_of.getShrinkwrap()\n                )\n                if r:\n                    assert(r.installedLinked())\n                    return r\n                else:\n                    logger.error('linked module %s is invalid: %s', dspec.name, r.getError())\n                    return r\n\n            r = access.satisfyVersionByInstalling(\n                dspec.name,\n                dspec.versionReq(),\n                self.modulesPath(),\n                inherit_shrinkwrap = dep_of.getShrinkwrap()\n            )\n            if not r:\n                logger.error('could not install %s' % dspec.name)\n            if r is not None:\n                r.setTestDependency(dspec.is_test_dependency)\n            return r\n\n        return self.__getDependenciesRecursiveWithProvider(\n           available_components = available_components,\n                    search_dirs = search_dirs,\n                         target = target,\n                 traverse_links = traverse_links,\n               update_installed = update_installed,\n                       provider = provider,\n                           test = test\n        )"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef satisfyTarget(self, target_name_and_version, update_installed=False, additional_config=None, install_missing=True):\n        ''' Ensure that the specified target name (and optionally version,\n            github ref or URL) is installed in the targets directory of the\n            current component\n\n            returns (derived_target, errors)\n        '''\n        # Target, , represent an installed target, internal\n        from yotta.lib import target\n        application_dir = None\n        if self.isApplication():\n            application_dir = self.path\n        return target.getDerivedTarget(\n                                target_name_and_version,\n                                self.targetsPath(),\n              install_missing = install_missing,\n              application_dir = application_dir,\n             update_installed = update_installed,\n            additional_config = additional_config,\n                   shrinkwrap = self.getShrinkwrap()\n        )", "response": "Ensure that the specified target name (and optionally version, github ref or URL) is installed in the targets directory of the current component. Returns (derived_target, errors)."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef getTarget(self, target_name_and_version, additional_config=None):\n        ''' Return a derived target object representing the selected target: if\n            the target is not installed, or is invalid then the returned object\n            will test false in a boolean context.\n\n            Returns derived_target\n\n            Errors are not displayed.\n        '''\n        derived_target, errors = self.satisfyTarget(\n                               target_name_and_version,\n           additional_config = additional_config,\n             install_missing = False\n        )\n        if len(errors):\n            return None\n        else:\n            return derived_target", "response": "Returns a derived target object representing the selected target."}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\nreturns a dictionary of binaries to compile into the current module.", "response": "def getBinaries(self):\n        ''' Return a dictionary of binaries to compile: {\"dirname\":\"exename\"},\n            this is used when automatically generating CMakeLists\n\n            Note that currently modules may define only a single executable\n            binary or library to be built by the automatic build system, by\n            specifying `\"bin\": \"dir-to-be-built-into-binary\"`, or `\"lib\":\n            \"dir-to-be-built-into-library\"`, and the bin/lib will always have\n            the same name as the module. The default behaviour if nothing is\n            specified is for the 'source' directory to be built into a library.\n\n            The module.json syntax may allow for other combinations in the\n            future (and callers of this function should not rely on it\n            returning only a single item). For example, a \"bin\": {\"dirname\":\n            \"exename\"} syntax might be supported, however currently more\n            complex builds must be controlled by custom CMakeLists.\n        '''\n        # the module.json syntax is a subset of the package.json syntax: a\n        # single string that defines the source directory to use to build an\n        # executable with the same name as the component. This may be extended\n        # to include the rest of the npm syntax in future (map of source-dir to\n        # exe name).\n        if 'bin' in self.description:\n            return {os.path.normpath(self.description['bin']): self.getName()}\n        else:\n            return {}"}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nreturn a dictionary of libraries to compile.", "response": "def getLibs(self, explicit_only=False):\n        ''' Return a dictionary of libraries to compile: {\"dirname\":\"libname\"},\n            this is used when automatically generating CMakeLists.\n\n            If explicit_only is not set, then in the absence of both 'lib' and\n            'bin' sections in the module.json file, the \"source\" directory\n            will be returned.\n\n            Note that currently modules may define only a single executable\n            binary or library to be built by the automatic build system, by\n            specifying `\"bin\": \"dir-to-be-built-into-binary\"`, or `\"lib\":\n            \"dir-to-be-built-into-library\"`, and the bin/lib will always have\n            the same name as the module. The default behaviour if nothing is\n            specified is for the 'source' directory to be built into a library.\n\n            The module.json syntax may allow for other combinations in the\n            future (and callers of this function should not rely on it\n            returning only a single item). For example, a \"bin\": {\"dirname\":\n            \"exename\"} syntax might be supported, however currently more\n            complex builds must be controlled by custom CMakeLists.\n        '''\n        if 'lib' in self.description:\n            return {os.path.normpath(self.description['lib']): self.getName()}\n        elif 'bin' not in self.description and not explicit_only:\n            return {'source': self.getName()}\n        else:\n            return {}"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef licenses(self):\n        ''' Return a list of licenses that apply to this module. (Strings,\n            which may be SPDX identifiers)\n        '''\n        if 'license' in self.description:\n            return [self.description['license']]\n        else:\n            return [x['type'] for x in self.description['licenses']]", "response": "Return a list of licenses that apply to this module."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\nreturning a list of extraIncludes for this entry.", "response": "def getExtraIncludes(self):\n        ''' Some components must export whole directories full of headers into\n            the search path. This is really really bad, and they shouldn't do\n            it, but support is provided as a concession to compatibility.\n        '''\n        if 'extraIncludes' in self.description:\n            return [os.path.normpath(x) for x in self.description['extraIncludes']]\n        else:\n            return []"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\nreturn the list of system include paths that are needed to include the header files in the system.", "response": "def getExtraSysIncludes(self):\n        ''' Some components (e.g. libc) must export directories of header files\n            into the system include search path. They do this by adding a\n            'extraSysIncludes' : [ array of directories ] field in their\n            package description. This function returns the list of directories\n            (or an empty list), if it doesn't exist.\n        '''\n        if 'extraSysIncludes' in self.description:\n            return [os.path.normpath(x) for x in self.description['extraSysIncludes']]\n        else:\n            return []"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef checkDependenciesForShrinkwrap(dependency_list):\n    ''' return a list of errors encountered (e.g. dependency missing or\n        specification not met)\n    '''\n    # sourceparse, , parse version specifications, internal\n    from yotta.lib import sourceparse\n    errors = []\n    # first gather the available versions of things:\n    available_versions = {}\n    for mod in dependency_list.get('modules', []):\n        available_versions[mod['name']] = mod['version']\n    # now check that the available versions satisfy all of the specifications\n    # from other modules:\n    for mod in dependency_list.get('modules', []):\n        for spec_info in mod.get('specifications', []):\n            name = spec_info['name']\n            spec = spec_info['version']\n            if spec_info.get('testOnly', False):\n                # test-only specifications are ignored for shrinkwrap\n                continue\n            if not name in available_versions:\n                errors.append('dependency %s (required by %s) is missing' % (\n                    name, mod['name']\n                ))\n            else:\n                available_version = available_versions[name]\n                parsed_spec = sourceparse.parseSourceURL(spec)\n                if not parsed_spec.semanticSpecMatches(available_version):\n                    errors.append('%s@%s does not meet specification %s required by %s' % (\n                        name, available_version, parsed_spec.semanticSpec(), mod['name']\n                    ))\n\n    return errors", "response": "Check that the available module versions satisfy every non-test dependency specification, and return a list of the errors encountered (missing dependencies or unmet specifications)."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef availableVersions(self):\n        ''' return a list of GitCloneVersion objects for tags which are valid\n            semantic version identifiers.\n        '''\n        r = []\n        for t in self.vcs.tags():\n            logger.debug(\"available version tag: %s\", t)\n            # ignore empty tags:\n            if not len(t.strip()):\n                continue\n            try:\n                r.append(GitCloneVersion(t, t, self))\n            except ValueError:\n                logger.debug('invalid version tag: %s', t)\n        return r", "response": "Return a list of GitCloneVersion objects for tags which are valid semantic version identifiers."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nreturning a GithubComponentVersion object for a specific commit if valid", "response": "def commitVersion(self, spec):\n        ''' return a GithubComponentVersion object for a specific commit if valid\n        '''\n        import re\n\n        commit_match = re.match('^[a-f0-9]{7,40}$', spec, re.I)\n        if commit_match:\n            return GitCloneVersion('', spec, self)\n\n        return None"}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\nreturns a git component for any git:// url or None if this is not", "response": "def createFromSource(cls, vs, name=None):\n        ''' returns a git component for any git:// url, or None if this is not\n            a git component.\n\n            Normally version will be empty, unless the original url was of the\n            form 'git://...#version', which can be used to grab a particular\n            tag or branch, or ...#>=1.2.3, which can be used to specify\n            semantic version specifications on tags.\n        '''\n        return GitComponent(vs.location, vs.spec, vs.semantic_spec)"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef _mergeDictionaries(*args):\n    ''' merge dictionaries of dictionaries recursively, with elements from\n        dictionaries earlier in the argument sequence taking precedence\n    '''\n    # to support merging of OrderedDicts, copy the result type from the first\n    # argument:\n    result = type(args[0])()\n    for k, v in itertools.chain(*[x.items() for x in args]):\n        if not k in result:\n            result[k] = v\n        elif isinstance(result[k], dict) and isinstance(v, dict):\n            result[k] = _mergeDictionaries(result[k], v)\n    return result", "response": "merge dictionaries of dictionaries recursively, with elements from dictionaries earlier in the argument sequence taking precedence"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef _mirrorStructure(dictionary, value):\n    ''' create a new nested dictionary object with the same structure as\n        'dictionary', but with all scalar values replaced with 'value'\n    '''\n    result = type(dictionary)()\n    for k in dictionary.keys():\n        if isinstance(dictionary[k], dict):\n            result[k] = _mirrorStructure(dictionary[k], value)\n        else:\n            result[k] = value\n    return result", "response": "create a new nested dictionary object with the same structure as the dictionary but with all scalar values replaced with value"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\nload additional config from a file path or a literal JSON string", "response": "def loadAdditionalConfig(config_path):\n    ''' returns (error, config)\n    '''\n    error = None\n    config = {}\n    if not config_path:\n        return (error, config)\n    if os.path.isfile(config_path):\n        try:\n            config = ordered_json.load(config_path)\n        except Exception as e:\n            error = \"Invalid syntax in file %s: %s\" % (config_path, e)\n    else:\n        # try to interpret the argument as literal JSON\n        try:\n            config = ordered_json.loads(config_path)\n        except Exception as e:\n            # if this fails too, guess whether it was intended to be JSON or\n            # not, and display an appropriate error message\n            if '{' in config_path or '}' in config_path:\n                error = \"Invalid syntax in literal JSON: %s\" % e\n            else:\n                error = \"File \\\"%s\\\" does not exist\" % config_path\n    logger.debug('read additional config: %s', config)\n    return (error, config)"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef getDerivedTarget(\n        target_name_and_version,\n                   targets_path,\n                application_dir = None,\n                install_missing = True,\n               update_installed = False,\n              additional_config = None,\n                     shrinkwrap = None\n    ):\n    # access, , get components, internal\n    from yotta.lib import access\n    from yotta.lib import access_common\n    ''' Get the specified target description, optionally ensuring that it (and\n        all dependencies) are installed in targets_path.\n\n        Returns (DerivedTarget, errors), or (None, errors) if the leaf target\n        could not be found/installed.\n    '''\n    logger.debug('satisfy target: %s' % target_name_and_version);\n    if ',' in target_name_and_version:\n        name, version_req = target_name_and_version.split(',')\n    else:\n        name = target_name_and_version\n        version_req = '*'\n\n    # shrinkwrap is the raw json form, not mapping form here, so rearrange it\n    # before indexing:\n    if shrinkwrap is not None:\n        shrinkwrap_version_req = {\n            x['name']: x['version'] for x in shrinkwrap.get('targets', [])\n        }.get(name, None)\n    else:\n        shrinkwrap_version_req = None\n    if shrinkwrap_version_req is not None:\n        logger.debug(\n            'respecting shrinkwrap version %s for %s', shrinkwrap_version_req, name\n        )\n\n    dspec = pack.DependencySpec(\n                                 name,\n                                 version_req,\n        shrinkwrap_version_req = shrinkwrap_version_req\n    )\n\n    leaf_target      = None\n    previous_name    = dspec.name\n    search_dirs      = [targets_path]\n    target_hierarchy = []\n    errors           = []\n    while True:\n        t = None\n        try:\n            if install_missing:\n                t = 
access.satisfyVersion(\n                                 name = dspec.name,\n                     version_required = dspec.versionReq(),\n                            available = target_hierarchy,\n                         search_paths = search_dirs,\n                    working_directory = targets_path,\n                     update_installed = ('Update' if update_installed else None),\n                                 type = 'target',\n                   inherit_shrinkwrap = shrinkwrap\n                )\n            else:\n                t = access.satisfyVersionFromSearchPaths(\n                                 name = dspec.name,\n                     version_required = dspec.versionReq(),\n                         search_paths = search_dirs,\n                                 type = 'target',\n                   inherit_shrinkwrap = shrinkwrap\n                )\n        except access_common.AccessException as e:\n            errors.append(e)\n        if not t:\n            if install_missing:\n                logger.error(\n                    'could not install target %s for %s' %\n                    (dspec, previous_name)\n                )\n            break\n        else:\n            target_hierarchy.append(t)\n            previous_name = dspec.name\n            assert(isinstance(t, Target))\n            dspec = t.baseTargetSpec() #pylint: disable=no-member\n            if not leaf_target:\n                leaf_target = t\n            if dspec is None:\n                break\n    if leaf_target is None:\n        return (None, errors)\n    # if we have a valid target, try to load the app-specific config data (if\n    # any):\n    app_config = {}\n    if application_dir is not None:\n        app_config_fname = os.path.join(application_dir, App_Config_File)\n        if os.path.exists(app_config_fname):\n            try:\n                app_config = ordered_json.load(app_config_fname)\n            except Exception as e:\n                
errors.append(Exception(\"Invalid application config.json: %s\" % (e)))\n    return (DerivedTarget(leaf_target, target_hierarchy[1:], app_config, additional_config), errors)", "response": "Returns a (DerivedTarget, errors) tuple for the specified target name and version, or (None, errors) if the leaf target could not be found or installed."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef baseTargetSpec(self):\n        ''' returns pack.DependencySpec for the base target of this target (or\n            None if this target does not inherit from another target.\n        '''\n        inherits = self.description.get('inherits', {})\n        if len(inherits) == 1:\n            name, version_req = list(inherits.items())[0]\n            shrinkwrap_version_req = self.getShrinkwrapMapping('targets').get(name, None)\n            if shrinkwrap_version_req is not None:\n                logger.debug(\n                    'respecting shrinkwrap version %s for %s', shrinkwrap_version_req, name\n                )\n            return pack.DependencySpec(\n                name,\n                version_req,\n                shrinkwrap_version_req = shrinkwrap_version_req\n            )\n        elif len(inherits) > 1:\n            logger.error('target %s specifies multiple base targets, but only one is allowed', self.getName())\n        return None", "response": "returns the base target spec for this target"}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\nreturns the specified script if one exists", "response": "def getScript(self, scriptname):\n        ''' return the specified script if one exists (possibly inherited from\n            a base target)\n        '''\n        for t in self.hierarchy:\n            s = t.getScript(scriptname)\n            if s:\n                return s\n        return None"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\nload the configuration information from the target hierarchy", "response": "def _loadConfig(self):\n        ''' load the configuration information from the target hierarchy '''\n        config_dicts = [self.additional_config, self.app_config] + [t.getConfig() for t in self.hierarchy]\n        # create an identical set of dictionaries, but with the names of the\n        # sources in place of the values. When these are merged they will show\n        # where each merged property came from:\n        config_blame = [\n            _mirrorStructure(self.additional_config, 'command-line config'),\n            _mirrorStructure(self.app_config, 'application\\'s config.json'),\n        ] + [\n            _mirrorStructure(t.getConfig(), t.getName()) for t in self.hierarchy\n        ]\n\n        self.config = _mergeDictionaries(*config_dicts)\n        self.config_blame = _mergeDictionaries(*config_blame)"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\nreturn a list of toolchain file paths in override order", "response": "def getToolchainFiles(self):\n        ''' return a list of toolchain file paths in override order (starting\n            at the bottom/leaf of the hierarchy and ending at the base).\n            The list is returned in the order they should be included\n            (most-derived last).\n        '''\n        return reversed([\n            os.path.join(x.path, x.description['toolchain']) for x in self.hierarchy if 'toolchain' in x.description\n        ])"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef getAdditionalIncludes(self):\n        ''' Return the list of cmake files which are to be included by yotta in\n            every module built. The list is returned in the order they should\n            be included (most-derived last).\n        '''\n        return reversed([\n            os.path.join(t.path, include_file)\n                for t in self.hierarchy\n                for include_file in t.description.get('cmakeIncludes', [])\n        ])", "response": "Return the list of cmake files which are to be included by yotta in every module built, in the order they should be included (most-derived last)."}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nreturn true if this target inherits from the named target.", "response": "def inheritsFrom(self, target_name):\n        ''' Return true if this target inherits from the named target (directly\n            or indirectly). Also returns true if this target is the named\n            target. Otherwise return false.\n        '''\n        for t in self.hierarchy:\n            if t and (t.getName() == target_name or target_name in t.description.get('inherits', {})):\n                return True\n        return False"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef exec_helper(self, cmd, builddir):\n        ''' Execute the given command, returning an error message if an error occurred\n            or None if the command was successful.'''\n        try:\n            child = subprocess.Popen(cmd, cwd=builddir)\n            child.wait()\n        except OSError as e:\n            if e.errno == errno.ENOENT:\n                if cmd[0] == 'cmake':\n                    return 'CMake is not installed, please follow the installation instructions at http://docs.yottabuild.org/#installing'\n                else:\n                    return '%s is not installed' % (cmd[0])\n            else:\n                return 'command %s failed' % (cmd)\n        if child.returncode:\n            return 'command %s failed' % (cmd)", "response": "Execute the given command in builddir and return an error message if the command failed, or None if it succeeded."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef build(self, builddir, component, args, release_build=False, build_args=None, targets=None,\n              release_no_debug_info_build=False):\n        ''' Execute the commands necessary to build this component, and all of\n            its dependencies. '''\n        if build_args is None:\n            build_args = []\n        if targets is None:\n            targets = []\n        # in the future this may be specified in the target description, but\n        # for now we only support cmake, so everything is simple:\n        if release_no_debug_info_build:\n            build_type = 'Release'\n        elif release_build:\n            build_type = 'RelWithDebInfo'\n        else:\n            build_type = 'Debug'\n        cmd = ['cmake', '-D', 'CMAKE_BUILD_TYPE=%s' % build_type, '-G', args.cmake_generator, '.']\n        res = self.exec_helper(cmd, builddir)\n        if res is not None:\n            return res\n\n        # work-around various yotta-specific issues with the generated\n        # Ninja/project files:\n        from yotta.lib import cmake_fixups\n        cmake_fixups.applyFixupsForFenerator(args.cmake_generator, builddir, component)\n\n        build_command = self.overrideBuildCommand(args.cmake_generator, targets=targets)\n        if build_command:\n            cmd = build_command + build_args\n        else:\n            cmd = ['cmake', '--build', builddir]\n            if len(targets):\n                # !!! 
FIXME: support multiple targets with the default CMake\n                # build command\n                cmd += ['--target', targets[0]]\n            cmd += build_args\n        res = self.exec_helper(cmd, builddir)\n        if res is not None:\n            return res\n        hint = self.hintForCMakeGenerator(args.cmake_generator, component)\n        if hint:\n            logger.info(hint)", "response": "Execute the commands necessary to build this component and all of its dependencies, returning an error message if a command fails or None on success."}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nfind the full path of a program in a given builddir. Returns None if no path is specified. Returns None if no path is found.", "response": "def findProgram(self, builddir, program):\n        ''' Return the builddir-relative path of program, if only a partial\n            path is specified. Returns None and logs an error message if the\n            program is ambiguous or not found\n\t'''\n        # if this is an exact match, do no further checking:\n        if os.path.isfile(os.path.join(builddir, program)):\n            logging.info('found %s' % program)\n            return program\n        exact_matches = []\n        insensitive_matches = []\n        approx_matches = []\n        for path, dirs, files in os.walk(builddir):\n            if program in files:\n                exact_matches.append(os.path.relpath(os.path.join(path, program), builddir))\n                continue\n            files_lower = [f.lower() for f in files]\n            if program.lower() in files_lower:\n                insensitive_matches.append(\n                    os.path.relpath(\n                        os.path.join(path, files[files_lower.index(program.lower())]),\n                        builddir\n                    )\n                )\n                continue\n            # !!! 
TODO: in the future add approximate string matching (typos,\n            # etc.), for now we just test stripping any paths off program, and\n            # looking for substring matches:\n            pg_basen_lower_noext = os.path.splitext(os.path.basename(program).lower())[0]\n            for f in files_lower:\n                if pg_basen_lower_noext in f:\n                    approx_matches.append(\n                        os.path.relpath(\n                            os.path.join(path, files[files_lower.index(f)]),\n                            builddir\n                        )\n                    )\n\n        if len(exact_matches) == 1:\n            logging.info('found %s at %s', program, exact_matches[0])\n            return exact_matches[0]\n        elif len(exact_matches) > 1:\n            logging.error(\n                '%s matches multiple executables, please use a full path (one of %s)' % (\n                    program,\n                    ', or '.join(['\"'+os.path.join(m, program)+'\"' for m in exact_matches])\n                )\n            )\n            return None\n        # if we have matches with and without a file extension, prefer the\n        # no-file extension version, and discard the others (so we avoid\n        # picking up post-processed files):\n        reduced_approx_matches = []\n        for m in approx_matches:\n            root = os.path.splitext(m)[0]\n            if (m == root) or (root not in approx_matches):\n                reduced_approx_matches.append(m)\n        approx_matches = reduced_approx_matches\n\n        for matches in (insensitive_matches, approx_matches):\n            if len(matches) == 1:\n                logging.info('found %s at %s' % (\n                    program, matches[0]\n                ))\n                return matches[0]\n            elif len(matches) > 1:\n                logging.error(\n                    '%s is similar to several executables found. 
Please use an exact name:\\n%s' % (\n                        program,\n                        '\\n'.join(matches)\n                    )\n                )\n                return None\n        logging.error('could not find program \"%s\" to debug' %  program)\n        return None"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef start(self, builddir, program, forward_args):\n        ''' Launch the specified program. Uses the `start` script if specified\n            by the target, attempts to run it natively if that script is not\n            defined.\n        '''\n        child = None\n        try:\n            prog_path = self.findProgram(builddir, program)\n            if prog_path is None:\n                return\n\n            start_env, start_vars = self.buildProgEnvAndVars(prog_path, builddir)\n            if self.getScript('start'):\n                cmd = [\n                    os.path.expandvars(string.Template(x).safe_substitute(**start_vars))\n                    for x in self.getScript('start')\n                ] + forward_args\n            else:\n                cmd = shlex.split('./' + prog_path) + forward_args\n\n            logger.debug('starting program: %s', cmd)\n            child = subprocess.Popen(\n                cmd, cwd = builddir, env = start_env\n            )\n            child.wait()\n            if child.returncode:\n                return \"process exited with status %s\" % child.returncode\n            child = None\n        except OSError as e:\n            import errno\n            if e.errno == errno.ENOEXEC:\n                return (\"the program %s cannot be run (perhaps your target \"+\n                        \"needs to define a 'start' script to start it on its \"\n                        \"intended execution target?)\") % prog_path\n        finally:\n            if child is not None:\n                _tryTerminate(child)", "response": "Launch the specified program from builddir, using the target's `start` script if one is defined and running the program natively otherwise. Returns an error message if the program exits with a non-zero status or cannot be executed."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef debug(self, builddir, program):\n        ''' Launch a debugger for the specified program. Uses the `debug`\n            script if specified by the target, falls back to the `debug` and\n            `debugServer` commands if not. `program` is inserted into the\n            $program variable in commands.\n        '''\n        try:\n            signal.signal(signal.SIGINT, _ignoreSignal);\n            if self.getScript('debug') is not None:\n                return self._debugWithScript(builddir, program)\n            elif 'debug' in self.description:\n                logger.warning(\n                    'target %s provides deprecated debug property. It should '+\n                    'provide script.debug instead.', self.getName()\n\n                )\n                return self._debugDeprecated(builddir, program)\n            else:\n                return \"Target %s does not specify debug commands\" % self\n        finally:\n            # clear the sigint handler\n            signal.signal(signal.SIGINT, signal.SIG_DFL);", "response": "Launch a debugger for the specified program."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef which(program):\n    ''' look for \"program\" in PATH (respecting PATHEXT), and return the path to\n        it, or None if it was not found\n    '''\n    # current directory / absolute paths:\n    if os.path.exists(program) and os.access(program, os.X_OK):\n        return program\n    # PATH:\n    for path in os.environ['PATH'].split(os.pathsep):\n        # path variables may be quoted:\n        path = path.strip('\"')\n        for ext in os.environ.get('PATHEXT', '').split(os.pathsep):\n            progpath = os.path.join(path, program + ext)\n            if os.path.exists(progpath) and os.access(progpath, os.X_OK):\n                return progpath\n    # not found\n    return None", "response": "look for program in PATH (respecting PATHEXT) and return the path to it, or None if it was not found"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef sometimesPruneCache(p):\n    ''' return decorator to prune cache after calling fn with a probability of p'''\n    def decorator(fn):\n        @functools.wraps(fn)\n        def wrapped(*args, **kwargs):\n            r = fn(*args, **kwargs)\n            if random.random() < p:\n                pruneCache()\n            return r\n        return wrapped\n    return decorator", "response": "return decorator to prune cache after calling fn with a probability of p"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef unpackFromCache(cache_key, to_directory):\n    ''' If the specified cache key exists, unpack the tarball into the\n        specified directory, otherwise raise NotInCache (a KeyError subclass).\n    '''\n    if cache_key is None:\n        raise NotInCache('\"None\" is never in cache')\n\n    cache_key = _encodeCacheKey(cache_key)\n\n    cache_dir = folders.cacheDirectory()\n    fsutils.mkDirP(cache_dir)\n    path = os.path.join(cache_dir, cache_key)\n    logger.debug('attempt to unpack from cache %s -> %s', path, to_directory)\n    try:\n        unpackFrom(path, to_directory)\n        try:\n            shutil.copy(path + '.json', os.path.join(to_directory, '.yotta_origin.json'))\n        except IOError as e:\n            if e.errno == errno.ENOENT:\n                pass\n            else:\n                raise\n        cache_logger.debug('unpacked %s from cache into %s', cache_key, to_directory)\n        return\n    except IOError as e:\n        if e.errno == errno.ENOENT:\n            cache_logger.debug('%s not in cache', cache_key)\n            raise NotInCache('not in cache')\n    except OSError as e:\n        if e.errno == errno.ENOTEMPTY:\n            logger.error('directory %s was not empty: probably simultaneous invocation of yotta! It is likely that downloaded sources are corrupted.')\n        else:\n            raise", "response": "If the specified cache key exists, unpack the cached tarball into the specified directory; otherwise raise NotInCache."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef _downloadToCache(stream, hashinfo={}, origin_info=dict()):\n    ''' Download the specified stream to a temporary cache directory, and\n        returns a cache key that can be used to access/remove the file.\n        You should use either removeFromCache(cache_key) or _moveCachedFile to\n        move the downloaded file to a known key after downloading.\n    '''\n    hash_name  = None\n    hash_value = None\n    m = None\n\n    if len(hashinfo):\n        # check for hashes in preferred order. Currently this is just sha256\n        # (which the registry uses). Initial investigations suggest that github\n        # doesn't return a header with the hash of the file being downloaded.\n        for h in ('sha256',):\n            if h in hashinfo:\n                hash_name  = h\n                hash_value = hashinfo[h]\n                m = getattr(hashlib, h)()\n                break\n        if not hash_name:\n            logger.warning('could not find supported hash type in %s', hashinfo)\n\n    cache_dir = folders.cacheDirectory()\n    fsutils.mkDirP(cache_dir)\n    file_size = 0\n\n    (download_file, download_fname) = tempfile.mkstemp(dir=cache_dir, suffix='.locked')\n\n    with os.fdopen(download_file, 'wb') as f:\n        f.seek(0)\n        for chunk in stream.iter_content(4096):\n            f.write(chunk)\n            if hash_name:\n                m.update(chunk)\n\n        if hash_name:\n            calculated_hash = m.hexdigest()\n            logger.debug(\n                'calculated %s hash: %s check against: %s' % (\n                    hash_name, calculated_hash, hash_value\n                )\n            )\n            if hash_value and (hash_value != calculated_hash):\n                raise Exception('Hash verification failed.')\n        file_size = f.tell()\n        logger.debug('wrote tarfile of size: %s to %s', file_size, download_fname)\n   
     f.truncate()\n\n    extended_origin_info = {\n        'hash': hashinfo,\n        'size': file_size\n    }\n    extended_origin_info.update(origin_info)\n    ordered_json.dump(download_fname + '.json', extended_origin_info)\n    return os.path.basename(download_fname)", "response": "Download the specified stream to a temporary cache directory and return a cache key that can be used to access the file."}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\nmoves a file atomically within the cache.", "response": "def _moveCachedFile(from_key, to_key):\n    ''' Move a file atomically within the cache: used to make cached files\n        available at known keys, so they can be used by other processes.\n    '''\n    cache_dir = folders.cacheDirectory()\n    from_path = os.path.join(cache_dir, from_key)\n    to_path   = os.path.join(cache_dir, to_key)\n    try:\n        os.rename(from_path, to_path)\n        # if moving the actual file was successful, then try to move the\n        # metadata:\n        os.rename(from_path+'.json', to_path+'.json')\n    except Exception as e:\n        # if the source doesn't exist, or the destination doesn't exist, remove\n        # the file instead.\n        # windows error 183 == file already exists\n        # (be careful not to use WindowsError on non-windows platforms as it\n        # isn't defined)\n        if (isinstance(e, OSError) and e.errno == errno.ENOENT) or \\\n           (isinstance(e, getattr(__builtins__, \"WindowsError\", type(None))) and e.errno == 183):\n            fsutils.rmF(from_path)\n        else:\n            raise"}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nunpack a responses stream that contains a tarball into a directory.", "response": "def unpackTarballStream(stream, into_directory, hash={}, cache_key=None, origin_info=dict()):\n    ''' Unpack a responses stream that contains a tarball into a directory. If\n        a hash is provided, then it will be used as a cache key (for future\n        requests you can try to retrieve the key value from the cache first,\n        before making the request)\n    '''\n    cache_key = _encodeCacheKey(cache_key)\n\n    # if the cache is disabled, then use a random cache key even if one was\n    # provided, so that the module is not persisted in the cache and its\n    # temporary download location is a random key:\n    if getMaxCachedModules() == 0:\n        cache_key = None\n\n    new_cache_key = _downloadToCache(stream, hash, origin_info)\n    unpackFromCache(new_cache_key, into_directory)\n\n    if cache_key is None:\n        # if we didn't provide a cache key, there's no point in storing the cache\n        removeFromCache(new_cache_key)\n    else:\n        # otherwise make this file available at the known cache key\n        _moveCachedFile(new_cache_key, cache_key)"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\nparse the specified version source URL or version spec and return an an instance of VersionSource", "response": "def parseSourceURL(source_url):\n    ''' Parse the specified version source URL (or version spec), and return an\n        instance of VersionSource\n    '''\n    name, spec = _getNonRegistryRef(source_url)\n    if spec:\n        return spec\n\n    try:\n        url_is_spec = version.Spec(source_url)\n    except ValueError:\n        url_is_spec = None\n\n    if url_is_spec is not None:\n        # if the url is an unadorned version specification (including an empty\n        # string) then the source is the module registry:\n        return VersionSource('registry', '', source_url)\n\n    raise InvalidVersionSpec(\"Invalid version specification: \\\"%s\\\"\" % (source_url))"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\nparse a targetname and versionspec and return a tuple containing the name and versionspec.", "response": "def parseTargetNameAndSpec(target_name_and_spec):\n    ''' Parse targetname[@versionspec] and return a tuple\n        (target_name_string, version_spec_string).\n\n        targetname[,versionspec] is also supported (this is how target names\n        and specifications are stored internally, and was the documented way of\n        setting the spec on the commandline)\n\n        Also accepts raw github version specs (Owner/reponame#whatever), as the\n        name can be deduced from these.\n\n        Note that the specification split from the name is not validated. If\n        there is no specification (just a target name) passed in, then '*' will\n        be returned as the specification.\n    '''\n    import re\n    # first check if this is a raw github specification that we can get the\n    # target name from:\n    name, spec = _getNonRegistryRef(target_name_and_spec)\n    if name:\n        return name, target_name_and_spec\n\n    # next split at the first @ or , if any\n    split_at = '@'\n    if target_name_and_spec.find('@') > target_name_and_spec.find(',') and \\\n            ',' in target_name_and_spec:\n        split_at = ','\n    name = target_name_and_spec.split(split_at)[0]\n    spec = target_name_and_spec[len(name)+1:]\n\n    name = name.strip()\n\n    # if there's no specification, return the explicit any-version\n    # specification:\n    if not spec:\n        spec = '*'\n\n    return name, spec"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\nparsing modulename and versionspec and return a tuple", "response": "def parseModuleNameAndSpec(module_name_and_spec):\n    ''' Parse modulename[@versionspec] and return a tuple\n        (module_name_string, version_spec_string).\n\n        Also accepts raw github version specs (Owner/reponame#whatever), as the\n        name can be deduced from these.\n\n        Note that the specification split from the name is not validated. If\n        there is no specification (just a module name) passed in, then '*' will\n        be returned as the specification.\n    '''\n    import re\n    # first check if this is a raw github specification that we can get the\n    # module name from:\n    name, spec = _getNonRegistryRef(module_name_and_spec)\n    if name:\n        return name, module_name_and_spec\n\n    # next split at the first @, if any\n    name = module_name_and_spec.split('@')[0]\n    spec = module_name_and_spec[len(name)+1:]\n\n    name = name.strip()\n\n    # if there's no specification, return the explicit any-version\n    # specification:\n    if not spec:\n        spec = '*'\n\n    return name, spec"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef gfit(X, sigma, p=5, nbin=200, unif_fraction=0.1):\n    min_x = min(min(X) - 2 * np.std(X, ddof=1), 0)\n    max_x = max(max(X) + 2 * np.std(X, ddof=1),\n                np.std(X, ddof=1))\n    xvals = np.linspace(min_x, max_x, nbin)\n    binw = (max_x - min_x) / (nbin - 1)\n\n    zero_idx = max(np.where(xvals <= 0)[0])\n    noise_kernel = norm().pdf(xvals / sigma) * binw / sigma\n\n    if zero_idx > 0:\n        noise_rotate = noise_kernel[list(np.arange(zero_idx, len(xvals))) +\n                                    list(np.arange(0, zero_idx))]\n    else:\n        noise_rotate = noise_kernel\n\n    XX = np.zeros((p, len(xvals)), dtype=np.float)\n    for ind, exp in enumerate(range(1, p+1)):\n        mask = np.ones_like(xvals)\n        mask[np.where(xvals <= 0)[0]] = 0\n        XX[ind, :] = pow(xvals, exp) * mask\n    XX = XX.T\n\n    def neg_loglik(eta):\n        mask = np.ones_like(xvals)\n        mask[np.where(xvals <= 0)[0]] = 0\n        g_eta_raw = np.exp(np.dot(XX, eta)) * mask\n        if ((np.sum(g_eta_raw) == np.inf) |\n            (np.sum(g_eta_raw) <=\n                100 * np.finfo(np.double).tiny)):\n                return (1000 * (len(X) + sum(eta ** 2)))\n\n        g_eta_main = g_eta_raw / sum(g_eta_raw)\n        g_eta = ((1 - unif_fraction) * g_eta_main +\n                 unif_fraction * mask / sum(mask))\n        f_eta = fftconvolve(g_eta, noise_rotate, mode='same')\n        return np.sum(np.interp(X, xvals,\n                      -np.log(np.maximum(f_eta, 0.0000001))))\n\n    eta_hat = minimize(neg_loglik,\n                       list(itertools.repeat(-1, p))).x\n    g_eta_raw = np.exp(np.dot(XX, eta_hat)) * mask\n    g_eta_main = g_eta_raw / sum(g_eta_raw)\n    g_eta = ((1 - unif_fraction) * g_eta_main +\n             unif_fraction * mask) / sum(mask)\n\n    return xvals, g_eta", "response": "Fit empirical Bayes prior in the 
hierarchical model."}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nestimating Bayes posterior with Gaussian noise", "response": "def gbayes(x0, g_est, sigma):\n    \"\"\"\n    Estimate Bayes posterior with Gaussian noise [Efron2014]_.\n\n    Parameters\n    ----------\n    x0: ndarray\n        an observation\n    g_est: float\n        a prior density, as returned by gfit\n    sigma: int\n        noise estimate\n\n    Returns\n    -------\n    An array of the posterior estimate E[mu | x0]\n    \"\"\"\n\n    Kx = norm().pdf((g_est[0] - x0) / sigma)\n    post = Kx * g_est[1]\n    post /= sum(post)\n    return sum(post * g_est[0])"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef calibrateEB(variances, sigma2):\n    if (sigma2 <= 0 or min(variances) == max(variances)):\n        return(np.maximum(variances, 0))\n    sigma = np.sqrt(sigma2)\n    eb_prior = gfit(variances, sigma)\n    # Set up a partial execution of the function\n    part = functools.partial(gbayes, g_est=eb_prior,\n                             sigma=sigma)\n    if len(variances) >= 200:\n        # Interpolate to speed up computations:\n        calib_x = np.percentile(variances,\n                                np.arange(0, 102, 2))\n        calib_y = list(map(part, calib_x))\n        calib_all = np.interp(variances, calib_x, calib_y)\n    else:\n        calib_all = list(map(part, variances))\n\n    return np.asarray(calib_all)", "response": "Calibrate noisy variance estimates with empirical Bayes."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef calc_inbag(n_samples, forest):\n\n    if not forest.bootstrap:\n        e_s = \"Cannot calculate the inbag from a forest that has \"\n        e_s += \"bootstrap=False\"\n        raise ValueError(e_s)\n\n    n_trees = forest.n_estimators\n    inbag = np.zeros((n_samples, n_trees))\n    sample_idx = []\n    for t_idx in range(n_trees):\n        sample_idx.append(\n            _generate_sample_indices(forest.estimators_[t_idx].random_state,\n                                     n_samples))\n        inbag[:, t_idx] = np.bincount(sample_idx[-1], minlength=n_samples)\n    return inbag", "response": "Compute the in-bag matrix of a bootstrapped random forest: how many times each training sample was drawn for each tree."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef _core_computation(X_train, X_test, inbag, pred_centered, n_trees,\n                      memory_constrained=False, memory_limit=None,\n                      test_mode=False):\n    \"\"\"\n    Helper function, that performs the core computation\n\n    Parameters\n    ----------\n    X_train : ndarray\n        An array with shape (n_train_sample, n_features).\n\n    X_test : ndarray\n        An array with shape (n_test_sample, n_features).\n\n    inbag : ndarray\n        The inbag matrix that fit the data. If set to `None` (default) it\n        will be inferred from the forest. However, this only works for trees\n        for which bootstrapping was set to `True`. That is, if sampling was\n        done with replacement. Otherwise, users need to provide their own\n        inbag matrix.\n\n    pred_centered : ndarray\n        Centered predictions that are an intermediate result in the\n        computation.\n\n    memory_constrained: boolean (optional)\n        Whether or not there is a restriction on memory. If False, it is\n        assumed that a ndarry of shape (n_train_sample,n_test_sample) fits\n        in main memory. Setting to True can actually provide a speed up if\n        memory_limit is tuned to the optimal range.\n\n    memory_limit: int (optional)\n        An upper bound for how much memory the itermediate matrices will take\n        up in Megabytes. 
This must be provided if memory_constrained=True.\n\n\n    \"\"\"\n    if not memory_constrained:\n        return np.sum((np.dot(inbag - 1, pred_centered.T) / n_trees) ** 2, 0)\n\n    if not memory_limit:\n        raise ValueError('If memory_constrained=True, must provide',\n                         'memory_limit.')\n\n    # Assumes double precision float\n    chunk_size = int((memory_limit * 1e6) / (8.0 * X_train.shape[0]))\n\n    if chunk_size == 0:\n        min_limit = 8.0 * X_train.shape[0] / 1e6\n        raise ValueError('memory_limit provided is too small.' +\n                         'For these dimensions, memory_limit must ' +\n                         'be greater than or equal to %.3e' % min_limit)\n\n    chunk_edges = np.arange(0, X_test.shape[0] + chunk_size, chunk_size)\n    inds = range(X_test.shape[0])\n    chunks = [inds[chunk_edges[i]:chunk_edges[i+1]]\n              for i in range(len(chunk_edges)-1)]\n    if test_mode:\n        print('Number of chunks: %d' % (len(chunks),))\n    V_IJ = np.concatenate([\n                np.sum((np.dot(inbag-1, pred_centered[chunk].T)/n_trees)**2, 0)\n                for chunk in chunks])\n    return V_IJ", "response": "Helper function that performs the core computation of the forests."}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\ncalculates the error bars for a RandomForest object.", "response": "def random_forest_error(forest, X_train, X_test, inbag=None,\n                        calibrate=True, memory_constrained=False,\n                        memory_limit=None):\n    \"\"\"\n    Calculate error bars from scikit-learn RandomForest estimators.\n\n    RandomForest is a regressor or classifier object\n    this variance can be used to plot error bars for RandomForest objects\n\n    Parameters\n    ----------\n    forest : RandomForest\n        Regressor or Classifier object.\n\n    X_train : ndarray\n        An array with shape (n_train_sample, n_features). The design matrix for\n        training data.\n\n    X_test : ndarray\n        An array with shape (n_test_sample, n_features). The design matrix\n        for testing data\n\n    inbag : ndarray, optional\n        The inbag matrix that fit the data. If set to `None` (default) it\n        will be inferred from the forest. However, this only works for trees\n        for which bootstrapping was set to `True`. That is, if sampling was\n        done with replacement. Otherwise, users need to provide their own\n        inbag matrix.\n\n    calibrate: boolean, optional\n        Whether to apply calibration to mitigate Monte Carlo noise.\n        Some variance estimates may be negative due to Monte Carlo effects if\n        the number of trees in the forest is too small. To use calibration,\n        Default: True\n\n    memory_constrained: boolean, optional\n        Whether or not there is a restriction on memory. If False, it is\n        assumed that a ndarry of shape (n_train_sample,n_test_sample) fits\n        in main memory. 
Setting to True can actually provide a speed up if\n        memory_limit is tuned to the optimal range.\n\n    memory_limit: int, optional.\n        An upper bound for how much memory the itermediate matrices will take\n        up in Megabytes. This must be provided if memory_constrained=True.\n\n    Returns\n    -------\n    An array with the unbiased sampling variance (V_IJ_unbiased)\n    for a RandomForest object.\n\n    See Also\n    ----------\n    :func:`calc_inbag`\n\n    Notes\n    -----\n    The calculation of error is based on the infinitesimal jackknife variance,\n    as described in [Wager2014]_ and is a Python implementation of the R code\n    provided at: https://github.com/swager/randomForestCI\n\n    .. [Wager2014] S. Wager, T. Hastie, B. Efron. \"Confidence Intervals for\n       Random Forests: The Jackknife and the Infinitesimal Jackknife\", Journal\n       of Machine Learning Research vol. 15, pp. 1625-1651, 2014.\n    \"\"\"\n    if inbag is None:\n        inbag = calc_inbag(X_train.shape[0], forest)\n\n    pred = np.array([tree.predict(X_test) for tree in forest]).T\n    pred_mean = np.mean(pred, 0)\n    pred_centered = pred - pred_mean\n    n_trees = forest.n_estimators\n    V_IJ = _core_computation(X_train, X_test, inbag, pred_centered, n_trees,\n                             memory_constrained, memory_limit)\n    V_IJ_unbiased = _bias_correction(V_IJ, inbag, pred_centered, n_trees)\n\n    # Correct for cases where resampling is done without replacement:\n    if np.max(inbag) == 1:\n        variance_inflation = 1 / (1 - np.mean(inbag)) ** 2\n        V_IJ_unbiased *= variance_inflation\n\n    if not calibrate:\n        return V_IJ_unbiased\n\n    if V_IJ_unbiased.shape[0] <= 20:\n        print(\"No calibration with n_samples <= 20\")\n        return V_IJ_unbiased\n    if calibrate:\n\n        calibration_ratio = 2\n        n_sample = np.ceil(n_trees / calibration_ratio)\n        new_forest = copy.deepcopy(forest)\n        new_forest.estimators_ 
=\\\n            np.random.permutation(new_forest.estimators_)[:int(n_sample)]\n        new_forest.n_estimators = int(n_sample)\n\n        results_ss = random_forest_error(new_forest, X_train, X_test,\n                                         calibrate=False,\n                                         memory_constrained=memory_constrained,\n                                         memory_limit=memory_limit)\n        # Use this second set of variance estimates\n        # to estimate scale of Monte Carlo noise\n        sigma2_ss = np.mean((results_ss - V_IJ_unbiased)**2)\n        delta = n_sample / n_trees\n        sigma2 = (delta**2 + (1 - delta)**2) / (2 * (1 - delta)**2) * sigma2_ss\n\n        # Use Monte Carlo noise scale estimate for empirical Bayes calibration\n        V_IJ_calibrated = calibrateEB(V_IJ_unbiased, sigma2)\n\n        return V_IJ_calibrated"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef generate_self_signed_certificate(self, domain='', r=None):\n        r = self.local_renderer\n        r.env.domain = domain or r.env.domain\n        assert r.env.domain, 'No SSL domain defined.'\n        role = r or self.genv.ROLE or ALL\n        ssl_dst = 'roles/%s/ssl' % (role,)\n        if not os.path.isdir(ssl_dst):\n            os.makedirs(ssl_dst)\n        r.env.base_dst = '%s/%s' % (ssl_dst, r.env.domain)\n        r.local('openssl req -new -newkey rsa:{ssl_length} '\n            '-days {ssl_days} -nodes -x509 '\n            '-subj \"/C={ssl_country}/ST={ssl_state}/L={ssl_city}/O={ssl_organization}/CN={ssl_domain}\" '\n            '-keyout {ssl_base_dst}.key -out {ssl_base_dst}.crt')", "response": "Generate a self-signed SSL certificate and private key for the given domain using openssl."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef generate_csr(self, domain='', r=None):\n        r = r or self.local_renderer\n        r.env.domain = domain or r.env.domain\n        role = self.genv.ROLE or ALL\n        site = self.genv.SITE or self.genv.default_site\n        print('self.genv.default_site:', self.genv.default_site, file=sys.stderr)\n        print('site.csr0:', site, file=sys.stderr)\n        ssl_dst = 'roles/%s/ssl' % (role,)\n        print('ssl_dst:', ssl_dst)\n        if not os.path.isdir(ssl_dst):\n            os.makedirs(ssl_dst)\n        for site, site_data in self.iter_sites():\n            print('site.csr1:', site, file=sys.stderr)\n            assert r.env.domain, 'No SSL domain defined.'\n            r.env.ssl_base_dst = '%s/%s' % (ssl_dst, r.env.domain.replace('*.', ''))\n            r.env.ssl_csr_year = date.today().year\n            r.local('openssl req -nodes -newkey rsa:{ssl_length} '\n                '-subj \"/C={ssl_country}/ST={ssl_state}/L={ssl_city}/O={ssl_organization}/CN={ssl_domain}\" '\n                '-keyout {ssl_base_dst}.{ssl_csr_year}.key -out {ssl_base_dst}.{ssl_csr_year}.csr')", "response": "Generate a certificate signing request for the current server."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef get_expiration_date(self, fn):\n        r = self.local_renderer\n        r.env.crt_fn = fn\n        with hide('running'):\n            ret = r.local('openssl x509 -noout -in {ssl_crt_fn} -dates', capture=True)\n        matches = re.findall('notAfter=(.*?)$', ret, flags=re.IGNORECASE)\n        if matches:\n            return dateutil.parser.parse(matches[0])", "response": "Reads the expiration date of a local crt file."}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\nprints the expiration dates of all local .crt files.", "response": "def list_expiration_dates(self, base='roles/all/ssl'):\n        \"\"\"\n        Scans through all local .crt files and displays the expiration dates.\n        \"\"\"\n        max_fn_len = 0\n        max_date_len = 0\n        data = []\n        for fn in os.listdir(base):\n            fqfn = os.path.join(base, fn)\n            if not os.path.isfile(fqfn):\n                continue\n            if not fn.endswith('.crt'):\n                continue\n            expiration_date = self.get_expiration_date(fqfn)\n            max_fn_len = max(max_fn_len, len(fn))\n            max_date_len = max(max_date_len, len(str(expiration_date)))\n            data.append((fn, expiration_date))\n        print('%s %s %s' % ('Filename'.ljust(max_fn_len), 'Expiration Date'.ljust(max_date_len), 'Expired'))\n        now = datetime.now().replace(tzinfo=pytz.UTC)\n        for fn, dt in sorted(data):\n\n            if dt is None:\n                expired = '?'\n            elif dt < now:\n                expired = 'YES'\n            else:\n                expired = 'NO'\n            print('%s %s %s' % (fn.ljust(max_fn_len), str(dt).ljust(max_date_len), expired))"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\nverifying the certificate chain.", "response": "def verify_certificate_chain(self, base=None, crt=None, csr=None, key=None):\n        \"\"\"\n        Confirms the key, CSR, and certificate files all match.\n        \"\"\"\n        from burlap.common import get_verbose, print_fail, print_success\n\n        r = self.local_renderer\n\n        if base:\n            crt = base + '.crt'\n            csr = base + '.csr'\n            key = base + '.key'\n        else:\n            assert crt and csr and key, 'If base not provided, crt and csr and key must be given.'\n\n        assert os.path.isfile(crt)\n        assert os.path.isfile(csr)\n        assert os.path.isfile(key)\n\n        csr_md5 = r.local('openssl req -noout -modulus -in %s | openssl md5' % csr, capture=True)\n        key_md5 = r.local('openssl rsa -noout -modulus -in %s | openssl md5' % key, capture=True)\n        crt_md5 = r.local('openssl x509 -noout -modulus -in %s | openssl md5' % crt, capture=True)\n\n        match = crt_md5 == csr_md5 == key_md5\n\n        if self.verbose or not match:\n            print('crt:', crt_md5)\n            print('csr:', csr_md5)\n            print('key:', key_md5)\n\n        if match:\n            print_success('Files look good!')\n        else:\n            print_fail('Files do not match!')\n            raise Exception('Files do not match!')"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef _get_environ_handler(name, d):\n\n    def func(site=None, **kwargs):\n        from fabric import state\n\n        # We can't auto-set default_site, because that break tasks that have\n        # to operate over multiple sites.\n        # If a task requires a site, it can pull from default_site as needed.\n        #site = site or d.get('default_site') or env.SITE\n\n        BURLAP_SHELL_PREFIX = int(os.environ.get('BURLAP_SHELL_PREFIX', '0'))\n        if BURLAP_SHELL_PREFIX:\n            print('#!/bin/bash')\n            print('# Generated with:')\n            print('#')\n            print('#     export BURLAP_SHELL_PREFIX=1; export BURLAP_COMMAND_PREFIX=0; fab %s' % (' '.join(sys.argv[1:]),))\n            print('#')\n\n        BURLAP_COMMAND_PREFIX = int(os.environ.get('BURLAP_COMMAND_PREFIX', '1'))\n        with_args = []\n        if not BURLAP_COMMAND_PREFIX:\n            for k in state.output:\n                state.output[k] = False\n\n        hostname = kwargs.get('hostname')\n        hostname = hostname or kwargs.get('name')\n        hostname = hostname or kwargs.get('hn')\n        hostname = hostname or kwargs.get('h')\n\n        verbose = int(kwargs.get('verbose', '0'))\n        common.set_verbose(verbose)\n\n        # Load environment for current role.\n        env.update(env_default)\n        env[common.ROLE] = os.environ[common.ROLE] = name\n        if site:\n            env[common.SITE] = os.environ[common.SITE] = site\n        env.update(d)\n\n        # Load host retriever.\n        retriever = None\n        if env.hosts_retriever:\n            # Dynamically retrieve hosts.\n#             module_name = '.'.join(env.hosts_retriever.split('.')[:-1])\n#             func_name = env.hosts_retriever.split('.')[-1]\n#             retriever = getattr(importlib.import_module(module_name), func_name)\n            retriever = common.get_hosts_retriever()\n   
         if verbose:\n                print('Using retriever:', env.hosts_retriever, retriever)\n\n        # Load host translator.\n        translator = None\n        if hostname:\n            # Filter hosts list by a specific host name.\n            module_name = '.'.join(env.hostname_translator.split('.')[:-1])\n            func_name = env.hostname_translator.split('.')[-1]\n            translator = getattr(importlib.import_module(module_name), func_name)\n\n        # Re-load environment for current role, incase loading\n        # the retriever/translator reset some environment values.\n        env.update(env_default)\n        env[common.ROLE] = os.environ[common.ROLE] = name\n        if site:\n            env[common.SITE] = os.environ[common.SITE] = site\n        env.update(d)\n\n        # Dynamically retrieve hosts.\n        if env.hosts_retriever:\n            if verbose:\n                print('Building host list with retriever %s...' % env.hosts_retriever)\n            env.hosts = list(retriever(site=site))\n            if verbose:\n                print('Found hosts:')\n                print(env.hosts)\n\n        # Filter hosts list by a specific host name.\n        if hostname:\n            _hostname = hostname\n            hostname = translator(hostname=hostname)\n            _hosts = env.hosts\n            env.hosts = [_ for _ in env.hosts if _ == hostname]\n            assert env.hosts, 'Hostname %s does not match any known hosts.' 
% (_hostname,)\n\n        if env.is_local is None:\n            if env.hosts:\n                env.is_local = 'localhost' in env.hosts or '127.0.0.1' in env.hosts\n            elif env.host_string:\n                env.is_local = 'localhost' in env.host_string or '127.0.0.1' in env.host_string\n\n        for cb in common.post_role_load_callbacks:\n            cb()\n\n        # Ensure satchels don't cache values from previously loaded roles.\n        common.reset_all_satchels()\n\n        if env.hosts and not env.host_string:\n            env.host_string = env.hosts[0]\n\n        if verbose:\n            print('Loaded role %s.' % (name,), file=sys.stderr)\n\n    func.__doc__ = 'Sets enivronment variables for the \"%s\" role.' % (name,)\n    return func", "response": "Dynamically creates a Fabric task for each configuration role."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef update_merge(d, u):\n    import collections.abc\n    for k, v in u.items():\n        if isinstance(v, collections.abc.Mapping):\n            r = update_merge(d.get(k, dict()), v)\n            d[k] = r\n        else:\n            d[k] = u[k]\n    return d", "response": "Recursively merges the nested dictionary u into the dictionary d and returns d."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef check_version():\n    global CHECK_VERSION\n    if not CHECK_VERSION:\n        return\n    # Ensure we only check once in this process.\n    CHECK_VERSION = 0\n    # Lookup most recent remote version.\n    from six.moves.urllib.request import urlopen\n    try:\n        response = urlopen(\"https://pypi.org/pypi/burlap/json\")\n        data = json.loads(response.read().decode())\n        remote_release = sorted(tuple(map(int, _.split('.'))) for _ in data['releases'].keys())[-1]\n        remote_release_str = '.'.join(map(str, remote_release))\n        local_release = VERSION\n        local_release_str = '.'.join(map(str, local_release))\n        # Display warning.\n        if remote_release > local_release:\n            print('\\033[93m')\n            print(\"You are using burlap version %s, however version %s is available.\" % (local_release_str, remote_release_str))\n            print(\"You should consider upgrading via the 'pip install --upgrade burlap' command.\")\n            print('\\033[0m')\n    except Exception as exc:\n        print('\\033[93m')\n        print(\"Unable to check for updated burlap version: %s\" % exc)\n        print('\\033[0m')", "response": "Checks the installed burlap version against the latest release on PyPI, at most once per process, and prints a warning if a newer version is available."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef task_or_dryrun(*args, **kwargs):\n    invoked = bool(not args or kwargs)\n    task_class = kwargs.pop(\"task_class\", WrappedCallableTask)\n#     if invoked:\n#         func, args = args[0], ()\n#     else:\n    func, args = args[0], ()\n\n    def wrapper(func):\n        return task_class(func, *args, **kwargs)\n    wrapper.is_task_or_dryrun = True\n    wrapper.wrapped = func\n\n    return wrapper if invoked else wrapper(func)", "response": "Decorator for creating a new-style task."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef runs_once(meth):\n    from burlap.common import get_dryrun, runs_once_methods\n    if get_dryrun():\n        pass\n    else:\n        runs_once_methods.append(meth)\n        _runs_once(meth)\n    return meth", "response": "A wrapper around Fabric's runs_once decorator that supports the dryrun feature."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef is_file(self, path, use_sudo=False):\n        if self.is_local and not use_sudo:\n            return os.path.isfile(path)\n        else:\n            func = use_sudo and _sudo or _run\n            with self.settings(hide('running', 'warnings'), warn_only=True):\n                return func('[ -f \"%(path)s\" ]' % locals()).succeeded", "response": "Check if a path exists and is a file."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\ncheck if a path exists and is a directory.", "response": "def is_dir(self, path, use_sudo=False):\n        \"\"\"\n        Check if a path exists, and is a directory.\n        \"\"\"\n        if self.is_local and not use_sudo:\n            return os.path.isdir(path)\n        else:\n            func = use_sudo and _sudo or _run\n            with self.settings(hide('running', 'warnings'), warn_only=True):\n                return func('[ -d \"%(path)s\" ]' % locals()).succeeded"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef is_link(self, path, use_sudo=False):\n        func = use_sudo and _sudo or _run\n        with self.settings(hide('running', 'warnings'), warn_only=True):\n            return func('[ -L \"%(path)s\" ]' % locals()).succeeded", "response": "Check if a path exists and is a symbolic link."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\nget the owner name of a file or directory.", "response": "def get_owner(self, path, use_sudo=False):\n        \"\"\"\n        Get the owner name of a file or directory.\n        \"\"\"\n        func = use_sudo and run_as_root or self.run\n        # I'd prefer to use quiet=True, but that's not supported with older\n        # versions of Fabric.\n        with self.settings(hide('running', 'stdout'), warn_only=True):\n            result = func('stat -c %%U \"%(path)s\"' % locals())\n            if result.failed and 'stat: illegal option' in result:\n                # Try the BSD version of stat\n                return func('stat -f %%Su \"%(path)s\"' % locals())\n            return result"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef umask(self, use_sudo=False):\n        func = use_sudo and run_as_root or self.run\n        return func('umask')", "response": "Get the user's umask."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nupload a template file to the specified destination.", "response": "def upload_template(self, filename, destination, context=None, use_jinja=False,\n                        template_dir=None, use_sudo=False, backup=True,\n                        mirror_local_mode=False, mode=None,\n                        mkdir=False, chown=False, user=None):\n        \"\"\"\n        Upload a template file.\n\n        This is a wrapper around :func:`fabric.contrib.files.upload_template`\n        that adds some extra parameters.\n\n        If ``mkdir`` is True, then the remote directory will be created, as\n        the current user or as ``user`` if specified.\n\n        If ``chown`` is True, then it will ensure that the current user (or\n        ``user`` if specified) is the owner of the remote file.\n        \"\"\"\n\n        if mkdir:\n            remote_dir = os.path.dirname(destination)\n            if use_sudo:\n                self.sudo('mkdir -p %s' % quote(remote_dir), user=user)\n            else:\n                self.run('mkdir -p %s' % quote(remote_dir))\n\n        if not self.dryrun:\n            _upload_template(\n                filename=filename,\n                destination=destination,\n                context=context,\n                use_jinja=use_jinja,\n                template_dir=template_dir,\n                use_sudo=use_sudo,\n                backup=backup,\n                mirror_local_mode=mirror_local_mode,\n                mode=mode,\n            )\n\n        if chown:\n            if user is None:\n                user = self.genv.user\n            run_as_root('chown %s: %s' % (user, quote(destination)))"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef md5sum(self, filename, use_sudo=False):\n        func = use_sudo and run_as_root or self.run\n        with self.settings(hide('running', 'stdout', 'stderr', 'warnings'), warn_only=True):\n            # Linux (LSB)\n            if exists(u'/usr/bin/md5sum'):\n                res = func(u'/usr/bin/md5sum %(filename)s' % locals())\n            # BSD / OS X\n            elif exists(u'/sbin/md5'):\n                res = func(u'/sbin/md5 -r %(filename)s' % locals())\n            # SmartOS Joyent build\n            elif exists(u'/opt/local/gnu/bin/md5sum'):\n                res = func(u'/opt/local/gnu/bin/md5sum %(filename)s' % locals())\n            # SmartOS Joyent build\n            # (the former doesn't exist, at least on joyent_20130222T000747Z)\n            elif exists(u'/opt/local/bin/md5sum'):\n                res = func(u'/opt/local/bin/md5sum %(filename)s' % locals())\n            # Try to find ``md5sum`` or ``md5`` on ``$PATH`` or abort\n            else:\n                md5sum = func(u'which md5sum')\n                md5 = func(u'which md5')\n                if exists(md5sum):\n                    res = func('%(md5sum)s %(filename)s' % locals())\n                elif exists(md5):\n                    res = func('%(md5)s %(filename)s' % locals())\n                else:\n                    abort('No MD5 utility was found on this system.')\n\n        if res.succeeded:\n            _md5sum = res\n        else:\n            warn(res)\n            _md5sum = None\n\n        if isinstance(_md5sum, six.string_types):\n            _md5sum = _md5sum.strip().split('\\n')[-1].split()[0]\n\n        return _md5sum", "response": "Compute the MD5 sum of a file."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef uncommented_lines(self, filename, use_sudo=False):\n        func = run_as_root if use_sudo else self.run\n        res = func('cat %s' % quote(filename), quiet=True)\n        if res.succeeded:\n            return [line for line in res.splitlines() if line and not line.startswith('#')]\n        return []", "response": "Get the lines of a remote file, ignoring empty or commented ones."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef getmtime(self, path, use_sudo=False):\n        func = use_sudo and run_as_root or self.run\n        with self.settings(hide('running', 'stdout')):\n            return int(func('stat -c %%Y \"%(path)s\" ' % locals()).strip())", "response": "Get the time of last modification of a file or directory."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef copy(self, source, destination, recursive=False, use_sudo=False):\n        func = use_sudo and run_as_root or self.run\n        options = '-r ' if recursive else ''\n        func('/bin/cp {0}{1} {2}'.format(options, quote(source), quote(destination)))", "response": "Copy a file or directory to a new location"}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nmoves a file or directory to another location", "response": "def move(self, source, destination, use_sudo=False):\n        \"\"\"\n        Move a file or directory\n        \"\"\"\n        func = use_sudo and run_as_root or self.run\n        func('/bin/mv {0} {1}'.format(quote(source), quote(destination)))"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef remove(self, path, recursive=False, use_sudo=False):\n        func = use_sudo and run_as_root or self.run\n        options = '-r ' if recursive else ''\n        func('/bin/rm {0}{1}'.format(options, quote(path)))", "response": "Remove a file or directory, recursively if *recursive* is true."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef require(self, path=None, contents=None, source=None, url=None, md5=None,\n         use_sudo=False, owner=None, group='', mode=None, verify_remote=True,\n         temp_dir='/tmp'):\n        \"\"\"\n        Require a file to exist and have specific contents and properties.\n\n        You can provide either:\n\n        - *contents*: the required contents of the file::\n\n            from fabtools import require\n\n            require.file('/tmp/hello.txt', contents='Hello, world')\n\n        - *source*: the local path of a file to upload::\n\n            from fabtools import require\n\n            require.file('/tmp/hello.txt', source='files/hello.txt')\n\n        - *url*: the URL of a file to download (*path* is then optional)::\n\n            from fabric.api import cd\n            from fabtools import require\n\n            with cd('tmp'):\n                require.file(url='http://example.com/files/hello.txt')\n\n        If *verify_remote* is ``True`` (the default), then an MD5 comparison\n        will be used to check whether the remote file is the same as the\n        source. If this is ``False``, the file will be assumed to be the\n        same if it is present. This is useful for very large files, where\n        generating an MD5 sum may take a while.\n\n        When providing either the *contents* or the *source* parameter, Fabric's\n        ``put`` function will be used to upload the file to the remote host.\n        When ``use_sudo`` is ``True``, the file will first be uploaded to a temporary\n        directory, then moved to its final location. 
The default temporary\n        directory is ``/tmp``, but can be overridden with the *temp_dir* parameter.\n        If *temp_dir* is an empty string, then the user's home directory will\n        be used.\n\n        If `use_sudo` is `True`, then the remote file will be owned by root,\n        and its mode will reflect root's default *umask*. The optional *owner*,\n        *group* and *mode* parameters can be used to override these properties.\n\n        .. note:: This function can be accessed directly from the\n                  ``fabtools.require`` module for convenience.\n\n        \"\"\"\n        func = use_sudo and run_as_root or self.run\n\n        # 1) Only a path is given\n        if path and not (contents or source or url):\n            assert path\n            if not self.is_file(path):\n                func('touch \"%(path)s\"' % locals())\n\n        # 2) A URL is specified (path is optional)\n        elif url:\n            if not path:\n                path = os.path.basename(urlparse(url).path)\n\n            if not self.is_file(path) or md5 and self.md5sum(path) != md5:\n                func('wget --progress=dot:mega \"%(url)s\" -O \"%(path)s\"' % locals())\n\n        # 3) A local filename, or a content string, is specified\n        else:\n            if source:\n                assert not contents\n                t = None\n            else:\n                fd, source = mkstemp()\n                t = os.fdopen(fd, 'w')\n                t.write(contents)\n                t.close()\n\n            if verify_remote:\n                # Avoid reading the whole file into memory at once\n                digest = hashlib.md5()\n                f = open(source, 'rb')\n                try:\n                    while True:\n                        d = f.read(BLOCKSIZE)\n                        if not d:\n                            break\n                        digest.update(d)\n                finally:\n                    f.close()\n            else:\n         
       digest = None\n\n            if (not self.is_file(path, use_sudo=use_sudo) or\n                    (verify_remote and\n                        self.md5sum(path, use_sudo=use_sudo) != digest.hexdigest())):\n                with self.settings(hide('running')):\n                    self.put(local_path=source, remote_path=path, use_sudo=use_sudo, temp_dir=temp_dir)\n\n            if t is not None:\n                os.unlink(source)\n\n        # Ensure correct owner\n        if use_sudo and owner is None:\n            owner = 'root'\n        if (owner and self.get_owner(path, use_sudo) != owner) or \\\n           (group and self.get_group(path, use_sudo) != group):\n            func('chown %(owner)s:%(group)s \"%(path)s\"' % locals())\n\n        # Ensure correct mode\n        if use_sudo and mode is None:\n            mode = oct(0o666 & ~int(self.umask(use_sudo=True), base=8))\n        if mode and self.get_mode(path, use_sudo) != mode:\n            func('chmod %(mode)s \"%(path)s\"' % locals())", "response": "Require a file to exist and to have specific contents and properties."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef check_for_change(self):\n        r = self.local_renderer\n        lm = self.last_manifest\n        last_fingerprint = lm.fingerprint\n        current_fingerprint = self.get_target_geckodriver_version_number()\n        self.vprint('last_fingerprint:', last_fingerprint)\n        self.vprint('current_fingerprint:', current_fingerprint)\n        if last_fingerprint != current_fingerprint:\n            print('A new release is available. %s' % self.get_most_recent_version())\n            return True\n        print('No updates found.')\n        return False", "response": "Determines if a new release has been made."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nreturn a dict that can be used to record any data necessary to detect changes to the current version of the geckodriver.", "response": "def record_manifest(self):\n        \"\"\"\n        Called after a deployment to record any data necessary to detect changes\n        for a future deployment.\n        \"\"\"\n        manifest = super(SeleniumSatchel, self).record_manifest()\n        manifest['fingerprint'] = str(self.get_target_geckodriver_version_number())\n        return manifest"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef update(kernel=False):\n    manager = MANAGER\n    cmds = {'yum -y --color=never': {False: '--exclude=kernel* update', True: 'update'}}\n    cmd = cmds[manager][kernel]\n    run_as_root(\"%(manager)s %(cmd)s\" % locals())", "response": "Upgrade all installed packages with yum, excluding kernel packages unless *kernel* is True."}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\nchecks if an RPM package is installed.", "response": "def is_installed(pkg_name):\n    \"\"\"\n    Check if an RPM package is installed.\n    \"\"\"\n    manager = MANAGER\n    with settings(hide('running', 'stdout', 'stderr', 'warnings'), warn_only=True):\n        res = run(\"rpm --query %(pkg_name)s\" % locals())\n        if res.succeeded:\n            return True\n        return False"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\ninstall one or more RPM packages.", "response": "def install(packages, repos=None, yes=None, options=None):\n    \"\"\"\n    Install one or more RPM packages.\n\n    Extra *repos* may be passed to ``yum`` to enable extra repositories at install time.\n\n    Extra *yes* may be passed to ``yum`` to validate license if necessary.\n\n    Extra *options* may be passed to ``yum`` if necessary\n    (e.g. ``['--nogpgcheck', '--exclude=package']``).\n\n    ::\n\n        import burlap\n\n        # Install a single package, in an alternative install root\n        burlap.rpm.install('emacs', options='--installroot=/my/new/location')\n\n        # Install multiple packages silently\n        burlap.rpm.install([\n            'unzip',\n            'nano'\n        ], options='--quiet')\n\n    \"\"\"\n    manager = MANAGER\n    if options is None:\n        options = []\n    elif isinstance(options, six.string_types):\n        options = [options]\n    if not isinstance(packages, six.string_types):\n        packages = \" \".join(packages)\n    if repos:\n        for repo in repos:\n            options.append('--enablerepo=%(repo)s' % locals())\n    options = \" \".join(options)\n    if isinstance(yes, str):\n        run_as_root('yes %(yes)s | %(manager)s %(options)s install %(packages)s' % locals())\n    else:\n        run_as_root('%(manager)s %(options)s install %(packages)s' % locals())"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\ninstalling a group of packages.", "response": "def groupinstall(group, options=None):\n    \"\"\"\n    Install a group of packages.\n\n    You can use ``yum grouplist`` to get the list of groups.\n\n    Extra *options* may be passed to ``yum`` if necessary like\n    (e.g. ``['--nogpgcheck', '--exclude=package']``).\n\n    ::\n\n        import burlap\n\n        # Install development packages\n        burlap.rpm.groupinstall('Development tools')\n\n    \"\"\"\n    manager = MANAGER\n    if options is None:\n        options = []\n    elif isinstance(options, str):\n        options = [options]\n    options = \" \".join(options)\n    run_as_root('%(manager)s %(options)s groupinstall \"%(group)s\"' % locals(), pty=False)"}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\nremoves one or more packages. Extra *options* may be passed to ``yum`` if necessary.", "response": "def uninstall(packages, options=None):\n    \"\"\"\n    Remove one or more packages.\n\n    Extra *options* may be passed to ``yum`` if necessary.\n\n    \"\"\"\n    manager = MANAGER\n    if options is None:\n        options = []\n    elif isinstance(options, six.string_types):\n        options = [options]\n    if not isinstance(packages, six.string_types):\n        packages = \" \".join(packages)\n    options = \" \".join(options)\n    run_as_root('%(manager)s %(options)s remove %(packages)s' % locals())"}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\nremoves an existing software group. Extra *options* may be passed to ``yum`` if necessary.", "response": "def groupuninstall(group, options=None):\n    \"\"\"\n    Remove an existing software group.\n\n    Extra *options* may be passed to ``yum`` if necessary.\n\n    \"\"\"\n    manager = MANAGER\n    if options is None:\n        options = []\n    elif isinstance(options, str):\n        options = [options]\n    options = \" \".join(options)\n    run_as_root('%(manager)s %(options)s groupremove \"%(group)s\"' % locals())"}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\ngetting the list of yum repositories.", "response": "def repolist(status='', media=None):\n    \"\"\"\n    Get the list of ``yum`` repositories.\n\n    Returns enabled repositories by default. Extra *status* may be passed\n    to list disabled repositories if necessary.\n\n    Media and debug repositories are kept disabled, except if you pass *media*.\n\n    ::\n\n        import burlap\n\n        # Install a package that may be included in disabled repositories\n        burlap.rpm.install('vim', burlap.rpm.repolist('disabled'))\n\n    \"\"\"\n    manager = MANAGER\n    with settings(hide('running', 'stdout')):\n        if media:\n            repos = run_as_root(\"%(manager)s repolist %(status)s | sed '$d' | sed -n '/repo id/,$p'\" % locals())\n        else:\n            repos = run_as_root(\"%(manager)s repolist %(status)s | sed '/Media\\\\|Debug/d' | sed '$d' | sed -n '/repo id/,$p'\" % locals())\n        return [line.split(' ')[0] for line in repos.splitlines()[1:]]"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef sync(self, sync_set, force=0, site=None, role=None):\n        from burlap.dj import dj\n        force = int(force)\n\n        r = self.local_renderer\n\n        r.env.sync_force_flag = ' --force ' if force else ''\n\n        _settings = dj.get_settings(site=site, role=role)\n        assert _settings, 'Unable to import settings.'\n        for k in _settings.__dict__.iterkeys():\n            if k.startswith('AWS_'):\n                r.genv[k] = _settings.__dict__[k]\n\n        site_data = r.genv.sites[r.genv.SITE]\n        r.env.update(site_data)\n\n        r.env.virtualenv_bin_dir = os.path.split(sys.executable)[0]\n\n        rets = []\n        for paths in r.env.sync_sets[sync_set]:\n            is_local = paths.get('is_local', True)\n            local_path = paths['local_path'] % r.genv\n            remote_path = paths['remote_path']\n            remote_path = remote_path.replace(':/', '/')\n            if not remote_path.startswith('s3://'):\n                remote_path = 's3://' + remote_path\n            local_path = local_path % r.genv\n\n            if is_local:\n                #local_or_dryrun('which s3sync')#, capture=True)\n                r.env.local_path = os.path.abspath(local_path)\n            else:\n                #run('which s3sync')\n                r.env.local_path = local_path\n\n            if local_path.endswith('/') and not r.env.local_path.endswith('/'):\n                r.env.local_path = r.env.local_path + '/'\n\n            r.env.remote_path = remote_path % r.genv\n\n            print('Syncing %s to %s...' 
% (r.env.local_path, r.env.remote_path))\n\n            # Superior Python version.\n            if force:\n                r.env.sync_cmd = 'put'\n            else:\n                r.env.sync_cmd = 'sync'\n            r.local(\n                'export AWS_ACCESS_KEY_ID={aws_access_key_id}; '\\\n                'export AWS_SECRET_ACCESS_KEY={aws_secret_access_key}; '\\\n                '{s3cmd_path} {sync_cmd} --progress --acl-public --guess-mime-type --no-mime-magic '\\\n                '--delete-removed --cf-invalidate --recursive {sync_force_flag} '\\\n                '{local_path} {remote_path}')", "response": "Uploads the paths in a sync set to an Amazon S3 bucket using s3cmd."}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nissues invalidation requests to a Cloudfront distribution for the current static media bucket.", "response": "def invalidate(self, *paths):\n        \"\"\"\n        Issues invalidation requests to a Cloudfront distribution\n        for the current static media bucket, triggering it to reload the specified\n        paths from the origin.\n\n        Note, only 1000 paths can be issued in a request at any one time.\n        \"\"\"\n        dj = self.get_satchel('dj')\n        if not paths:\n            return\n        # http://boto.readthedocs.org/en/latest/cloudfront_tut.html\n        _settings = dj.get_settings()\n        if not _settings.AWS_STATIC_BUCKET_NAME:\n            print('No static media bucket set.')\n            return\n        if isinstance(paths, six.string_types):\n            paths = paths.split(',')\n        # Materialize the map so it can be sliced under Python 3.\n        all_paths = list(map(str.strip, paths))\n        i = 0\n        while 1:\n            paths = all_paths[i:i+1000]\n            if not paths:\n                break\n\n            c = boto.connect_cloudfront()\n            rs = c.get_all_distributions()\n            target_dist = None\n            for dist in rs:\n                print(dist.domain_name, dir(dist), dist.__dict__)\n                bucket_name = dist.origin.dns_name.replace('.s3.amazonaws.com', '')\n                if bucket_name == _settings.AWS_STATIC_BUCKET_NAME:\n                    target_dist = dist\n                    break\n            if not target_dist:\n                raise Exception(('Target distribution %s could not be found in the AWS account.') % (_settings.AWS_STATIC_BUCKET_NAME,))\n            print('Using distribution %s associated with origin %s.' % (target_dist.id, _settings.AWS_STATIC_BUCKET_NAME))\n            inval_req = c.create_invalidation_request(target_dist.id, paths)\n            print('Issue invalidation request %s.' % (inval_req,))\n            i += 1000"}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nget an S3 bucket of the given name, creating one if it doesn't already exist.", "response": "def get_or_create_bucket(self, name):\n        \"\"\"\n        Gets an S3 bucket of the given name, creating one if it doesn't already exist.\n\n        Should be called with a role, if AWS credentials are stored in role settings. e.g.\n\n            fab local s3.get_or_create_bucket:mybucket\n        \"\"\"\n        from boto.s3 import connection\n        if self.dryrun:\n            print('boto.connect_s3().create_bucket(%s)' % repr(name))\n        else:\n            conn = connection.S3Connection(\n                self.genv.aws_access_key_id,\n                self.genv.aws_secret_access_key\n            )\n            bucket = conn.create_bucket(name)\n            return bucket"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef static(self):\n        fn = self.render_to_file('ip/ip_interfaces_static.template')\n        r = self.local_renderer\n        r.put(local_path=fn, remote_path=r.env.interfaces_fn, use_sudo=True)", "response": "Configures the server to use a static IP."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef record_manifest(self):\n        manifest = super(TarballSatchel, self).record_manifest()\n        manifest['timestamp'] = self.timestamp\n        return manifest", "response": "Returns the parent manifest dict with the satchel's current timestamp added."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef get_thumbprint(self):\n        extensions = self.extensions.split(' ')\n        name_str = ' -or '.join('-name \"%s\"' % ext for ext in extensions)\n        cmd = 'find ' + self.base_dir + r' -type f \\( ' + name_str + r' \\) -exec md5sum {} \\; | sort -k 2 | md5sum'\n        return getoutput(cmd)", "response": "Calculates the current thumbprint of the item being tracked."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\ncalculating the current thumbprint of the item being tracked.", "response": "def get_thumbprint(self):\n        \"\"\"\n        Calculates the current thumbprint of the item being tracked.\n        \"\"\"\n        d = {}\n        if self.names:\n            names = self.names\n        else:\n            names = list(self.satchel.lenv)\n        # Iterate over the resolved names so the fallback above takes effect.\n        for name in names:\n            d[name] = deepcopy(self.satchel.env[name])\n        return d"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\ncalculating the current thumbprint of the item being tracked.", "response": "def get_thumbprint(self):\n        \"\"\"\n        Calculates the current thumbprint of the item being tracked.\n        \"\"\"\n        d = {}\n        for tracker in self.trackers:\n            d[type(tracker).__name__] = tracker.get_thumbprint()\n        return d"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nchecking if a package is installed.", "response": "def is_installed(pkg_name):\n    \"\"\"\n    Check if a package is installed.\n    \"\"\"\n    with settings(hide('running', 'stdout', 'stderr', 'warnings'), warn_only=True):\n        res = run(\"dpkg -s %(pkg_name)s\" % locals())\n        for line in res.splitlines():\n            if line.startswith(\"Status: \"):\n                status = line[8:]\n                if \"installed\" in status.split(' '):\n                    return True\n        return False"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef install(packages, update=False, options=None, version=None):\n    manager = MANAGER\n    if update:\n        update_index()\n    if options is None:\n        options = []\n    if version is None:\n        version = ''\n    if version and not isinstance(packages, list):\n        version = '=' + version\n    if not isinstance(packages, six.string_types):\n        packages = \" \".join(packages)\n    options.append(\"--quiet\")\n    options.append(\"--assume-yes\")\n    options = \" \".join(options)\n    cmd = '%(manager)s install %(options)s %(packages)s%(version)s' % locals()\n    run_as_root(cmd, pty=False)", "response": "Install one or more packages."}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nremoving one or more packages. If *purge* is ``True``, the package configuration files will be removed from the system. Extra *options* may be passed to ``apt-get`` if necessary.", "response": "def uninstall(packages, purge=False, options=None):\n    \"\"\"\n    Remove one or more packages.\n\n    If *purge* is ``True``, the package configuration files will be\n    removed from the system.\n\n    Extra *options* may be passed to ``apt-get`` if necessary.\n    \"\"\"\n    manager = MANAGER\n    command = \"purge\" if purge else \"remove\"\n    if options is None:\n        options = []\n    if not isinstance(packages, six.string_types):\n        packages = \" \".join(packages)\n    options.append(\"--assume-yes\")\n    options = \" \".join(options)\n    cmd = '%(manager)s %(command)s %(options)s %(packages)s' % locals()\n    run_as_root(cmd, pty=False)"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nenable unattended package installation by preseeding ``debconf`` parameters. Example:: import burlap # Unattended install of Postfix mail server burlap.deb.preseed_package('postfix', { 'postfix/main_mailer_type': ('select', 'Internet Site'), 'postfix/mailname': ('string', 'example.com'), 'postfix/destinations': ('string', 'example.com, localhost.localdomain, localhost'), }) burlap.deb.install('postfix')", "response": "def preseed_package(pkg_name, preseed):\n    \"\"\"\n    Enable unattended package installation by preseeding ``debconf``\n    parameters.\n\n    Example::\n\n        import burlap\n\n        # Unattended install of Postfix mail server\n        burlap.deb.preseed_package('postfix', {\n            'postfix/main_mailer_type': ('select', 'Internet Site'),\n            'postfix/mailname': ('string', 'example.com'),\n            'postfix/destinations': ('string', 'example.com, localhost.localdomain, localhost'),\n        })\n        burlap.deb.install('postfix')\n\n    \"\"\"\n    for q_name, _ in preseed.items():\n        q_type, q_answer = _\n        run_as_root('echo \"%(pkg_name)s %(q_name)s %(q_type)s %(q_answer)s\" | debconf-set-selections' % locals())"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\ngetting the state of dpkg selections.", "response": "def get_selections():\n    \"\"\"\n    Get the state of ``dpkg`` selections.\n\n    Returns a dict with state => [packages].\n    \"\"\"\n    with settings(hide('stdout')):\n        res = run_as_root('dpkg --get-selections')\n    selections = dict()\n    for line in res.splitlines():\n        package, status = line.split()\n        selections.setdefault(status, list()).append(package)\n    return selections"}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\nchecks if the given key id exists in apt keyring.", "response": "def apt_key_exists(keyid):\n    \"\"\"\n    Check if the given key id exists in apt keyring.\n    \"\"\"\n\n    # Command extracted from apt-key source\n    gpg_cmd = 'gpg --ignore-time-conflict --no-options --no-default-keyring --keyring /etc/apt/trusted.gpg'\n\n    with settings(hide('everything'), warn_only=True):\n        res = run('%(gpg_cmd)s --fingerprint %(keyid)s' % locals())\n\n    return res.succeeded"}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\ntrust packages signed with a given public key.", "response": "def add_apt_key(filename=None, url=None, keyid=None, keyserver='subkeys.pgp.net', update=False):\n    \"\"\"\n    Trust packages signed with this public key.\n\n    Example::\n\n        import burlap\n\n        # Varnish signing key from URL (and verify fingerprint)\n        burlap.deb.add_apt_key(keyid='C4DEFFEB', url='http://repo.varnish-cache.org/debian/GPG-key.txt')\n\n        # Nginx signing key from default key server (subkeys.pgp.net)\n        burlap.deb.add_apt_key(keyid='7BD9BF62')\n\n        # From custom key server\n        burlap.deb.add_apt_key(keyid='7BD9BF62', keyserver='keyserver.ubuntu.com')\n\n        # From a file\n        burlap.deb.add_apt_key(keyid='7BD9BF62', filename='nginx.asc')\n    \"\"\"\n\n    if keyid is None:\n        if filename is not None:\n            run_as_root('apt-key add %(filename)s' % locals())\n        elif url is not None:\n            run_as_root('wget %(url)s -O - | apt-key add -' % locals())\n        else:\n            raise ValueError('Either filename, url or keyid must be provided as argument')\n    else:\n        if filename is not None:\n            _check_pgp_key(filename, keyid)\n            run_as_root('apt-key add %(filename)s' % locals())\n        elif url is not None:\n            tmp_key = '/tmp/tmp.burlap.key.%(keyid)s.key' % locals()\n            run_as_root('wget %(url)s -O %(tmp_key)s' % locals())\n            _check_pgp_key(tmp_key, keyid)\n            run_as_root('apt-key add %(tmp_key)s' % locals())\n        else:\n            keyserver_opt = '--keyserver %(keyserver)s' % locals() if keyserver is not None else ''\n            run_as_root('apt-key adv %(keyserver_opt)s --recv-keys %(keyid)s' % locals())\n\n    if update:\n        update_index()"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\nconfiguring the local renderer to use the most current version of the Debian system.", "response": "def configure(self):\n        \"\"\"\n        Enables the repository for a most current version on Debian systems.\n\n            https://www.rabbitmq.com/install-debian.html\n        \"\"\"\n\n        os_version = self.os_version\n        if not self.dryrun and os_version.distro != UBUNTU:\n            raise NotImplementedError(\"OS %s is not supported.\" % os_version)\n\n        r = self.local_renderer\n\n        r.env.codename = (r.run('lsb_release -c -s') or \"`lsb_release -c -s`\").strip()\n        r.sudo('apt-add-repository \"deb http://download.virtualbox.org/virtualbox/debian {codename} contrib\"')\n        r.sudo('cd /tmp; wget -q https://www.virtualbox.org/download/oracle_vbox.asc -O- | sudo apt-key add -')\n        r.sudo('apt-get update')"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef exists(self, name):\n        with self.settings(hide('running', 'stdout', 'warnings'), warn_only=True):\n            return self.run('getent group %(name)s' % locals()).succeeded", "response": "Check if a group exists."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\ncreate a new group.", "response": "def create(self, name, gid=None):\n        \"\"\"\n        Create a new group.\n\n        Example::\n\n            import burlap\n\n            if not burlap.group.exists('admin'):\n                burlap.group.create('admin')\n\n        \"\"\"\n        args = []\n        if gid:\n            args.append('-g %s' % gid)\n        args.append(name)\n        args = ' '.join(args)\n        self.sudo('groupadd --force %s || true' % args)"}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nentering a password change for a user.", "response": "def enter_password_change(self, username=None, old_password=None):\n        \"\"\"\n        Responds to a forced password change via `passwd` prompts due to password expiration.\n        \"\"\"\n        from fabric.state import connections\n        from fabric.network import disconnect_all\n        r = self.local_renderer\n#         print('self.genv.user:', self.genv.user)\n#         print('self.env.passwords:', self.env.passwords)\n        r.genv.user = r.genv.user or username\n        r.pc('Changing password for user {user} via interactive prompts.')\n        r.env.old_password = r.env.default_passwords[self.genv.user]\n#         print('self.genv.user:', self.genv.user)\n#         print('self.env.passwords:', self.env.passwords)\n        r.env.new_password = self.env.passwords[self.genv.user]\n        if old_password:\n            r.env.old_password = old_password\n        prompts = {\n            '(current) UNIX password: ': r.env.old_password,\n            'Enter new UNIX password: ': r.env.new_password,\n            'Retype new UNIX password: ': r.env.new_password,\n            #\"Login password for '%s': \" % r.genv.user: r.env.new_password,\n#             \"Login password for '%s': \" % r.genv.user: r.env.old_password,\n        }\n        print('prompts:', prompts)\n\n        r.env.password = r.env.old_password\n        with self.settings(warn_only=True):\n            ret = r._local(\"sshpass -p '{password}' ssh -o StrictHostKeyChecking=no {user}@{host_string} echo hello\", capture=True)\n            #code 1 = good password, but prompts needed\n            #code 5 = bad password\n            #code 6 = good password, but host public key is unknown\n        if ret.return_code in (1, 6) or 'hello' in ret:\n            # Login succeeded, so we haven't yet changed the password, so use the default password.\n            
self.genv.password = r.env.old_password\n        elif self.genv.user in self.genv.user_passwords:\n            # Otherwise, use the password or key set in the config.\n            self.genv.password = r.env.new_password\n        else:\n            # Default password fails and there's no current password, so clear.\n            self.genv.password = None\n        print('using password:', self.genv.password)\n\n        # Note, the correct current password should be set in host.initrole(), not here.\n        #r.genv.password = r.env.new_password\n        #r.genv.password = r.env.new_password\n        with self.settings(prompts=prompts):\n            ret = r._run('echo checking for expired password')\n            print('ret:[%s]' % ret)\n            do_disconnect = 'passwd: password updated successfully' in ret\n            print('do_disconnect:', do_disconnect)\n            if do_disconnect:\n                # We need to disconnect to reset the session or else Linux will again prompt\n                # us to change our password.\n                disconnect_all()\n\n                # Further logins should require the new password.\n                self.genv.password = r.env.new_password"}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nadd the user to the given list of groups.", "response": "def togroups(self, user, groups):\n        \"\"\"\n        Adds the user to the given list of groups.\n        \"\"\"\n\n        r = self.local_renderer\n\n        if isinstance(groups, six.string_types):\n            groups = [_.strip() for _ in groups.split(',') if _.strip()]\n        for group in groups:\n            r.env.username = user\n            r.env.group = group\n            r.sudo('groupadd --force {group}')\n            r.sudo('adduser {username} {group}')"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\nconfigure the user to use an SSL key without a password.", "response": "def passwordless(self, username, pubkey):\n        \"\"\"\n        Configures the user to use an SSL key without a password.\n        Assumes you've run generate_keys() first.\n        \"\"\"\n\n        r = self.local_renderer\n\n        r.env.username = username\n        r.env.pubkey = pubkey\n        if not self.dryrun:\n            assert os.path.isfile(r.env.pubkey), \\\n                'Public key file \"%s\" does not exist.' % (str(r.env.pubkey),)\n\n        first = os.path.splitext(r.env.pubkey)[0]\n        r.env.pubkey = first+'.pub'\n        r.env.pemkey = first+'.pem'\n        r.env.home = r.env.home_template.format(username=username)\n\n        # Upload the SSH key.\n        r.sudo('mkdir -p {home}/.ssh')\n        r.sudo('chown -R {user}:{user} {home}/.ssh')\n\n        if r.env.passwordless_method == UPLOAD_KEY:\n            put_remote_paths = self.put(local_path=r.env.pubkey)\n            r.env.put_remote_path = put_remote_paths[0]\n            r.sudo('cat {put_remote_path} >> {home}/.ssh/authorized_keys')\n            r.sudo('rm -f {put_remote_path}')\n        elif r.env.passwordless_method == CAT_KEY:\n            r.env.password = r.env.default_passwords.get(r.env.username, r.genv.password)\n            if r.env.password:\n                r.local(\"cat {pubkey} | sshpass -p '{password}' ssh {user}@{host_string} 'cat >> {home}/.ssh/authorized_keys'\")\n            else:\n                r.local(\"cat {pubkey} | ssh {user}@{host_string} 'cat >> {home}/.ssh/authorized_keys'\")\n        else:\n            raise NotImplementedError\n\n        # Disable password.\n        r.sudo('cp /etc/sudoers {tmp_sudoers_fn}')\n        r.sudo('echo \"{username} ALL=(ALL) NOPASSWD: ALL\" >> {tmp_sudoers_fn}')\n        r.sudo('sudo EDITOR=\"cp {tmp_sudoers_fn}\" visudo')\n\n        r.sudo('service ssh reload')\n\n        
print('You should now be able to login with:')\n        r.env.host_string = self.genv.host_string or (self.genv.hosts and self.genv.hosts[0])#self.genv.hostname_hostname\n        r.comment('\\tssh -i {pemkey} {username}@{host_string}')"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef generate_keys(self, username, hostname):\n\n        r = self.local_renderer\n\n        #r.env.key_filename = r.env.key_filename or env.key_filename\n        #assert r.env.key_filename, 'r.env.key_filename or env.key_filename must be set. e.g. roles/role/app_name-role.pem'\n        r.env.key_filename = self.env.key_filename_template.format(\n            ROLE=self.genv.ROLE,\n            host=hostname,\n            username=username,\n        )\n        if os.path.isfile(r.env.key_filename):\n            r.pc('Key file {key_filename} already exists. Skipping generation.'.format(**r.env))\n        else:\n            r.local(\"ssh-keygen -t {key_type} -b {key_bits} -f {key_filename} -N ''\")\n            r.local('chmod {key_perms} {key_filename}')\n            if r.env.key_filename.endswith('.pem'):\n                src = r.env.key_filename+'.pub'\n                dst = (r.env.key_filename+'.pub').replace('.pem', '')\n#                 print('generate_keys:', src, dst)\n                r.env.src = src\n                r.env.dst = dst\n                r.local('mv {src} {dst}')\n        return r.env.key_filename", "response": "Generate a new key file for the user."}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\ncreating a user with the given username.", "response": "def create(self, username, groups=None, uid=None, create_home=None, system=False, password=None, home_dir=None):\n        \"\"\"\n        Creates a user with the given username.\n        \"\"\"\n        r = self.local_renderer\n        r.env.username = username\n\n        args = []\n\n        if uid:\n            args.append('-u %s' % uid)\n\n        if create_home is None:\n            create_home = not system\n\n        if create_home is True:\n            if home_dir:\n                args.append('--home %s' % home_dir)\n        elif create_home is False:\n            args.append('--no-create-home')\n\n        if password is None:\n            pass\n        elif password:\n            crypted_password = _crypt_password(password)\n            args.append('-p %s' % quote(crypted_password))\n        else:\n            args.append('--disabled-password')\n\n        args.append('--gecos \"\"')\n\n        if system:\n            args.append('--system')\n\n        r.env.args = ' '.join(args)\n        r.env.groups = (groups or '').strip()\n        r.sudo('adduser {args} {username} || true')\n        if groups:\n            for group in groups.split(' '):\n                group = group.strip()\n                if not group:\n                    continue\n                r.sudo('adduser %s %s || true' % (username, group))"}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\nexpires the user's password", "response": "def expire_password(self, username):\n        \"\"\"\n        Forces the user to change their password the next time they log in.\n        \"\"\"\n        r = self.local_renderer\n        r.env.username = username\n        r.sudo('chage -d 0 {username}')"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef run_as_root(command, *args, **kwargs):\n    from burlap.common import run_or_dryrun, sudo_or_dryrun\n    if env.user == 'root':\n        func = run_or_dryrun\n    else:\n        func = sudo_or_dryrun\n    return func(command, *args, **kwargs)", "response": "Run a remote command as the root user."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\nreturn a hash of the contents of a file.", "response": "def get_file_hash(fin, block_size=2**20):\n    \"\"\"\n    Iteratively builds a file hash without loading the entire file into memory.\n    Designed to process an arbitrary binary file.\n    \"\"\"\n    if isinstance(fin, six.string_types):\n        # Open in binary mode so arbitrary binary files can be read.\n        fin = open(fin, 'rb')\n    h = hashlib.sha512()\n    while True:\n        data = fin.read(block_size)\n        if not data:\n            break\n        try:\n            h.update(data)\n        except TypeError:\n            # Fixes Python3 error \"TypeError: Unicode-objects must be encoded before hashing\".\n            h.update(data.encode('utf-8'))\n    return h.hexdigest()"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef check(self):\n        self._validate_settings()\n        r = self.local_renderer\n        r.env.alias = r.env.aliases[0]\n        r.sudo(r.env.check_command_template)", "response": "Run inadyn from the commandline to test the configuration."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef list_env(self, key=None):\n        for k, v in sorted(self.genv.items(), key=lambda o: o[0]):\n            if key and k != key:\n                continue\n            print('%s ' % (k,))\n            pprint(v, indent=4)", "response": "Displays a list of environment key value pairs."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef list_server_specs(self, cpu=1, memory=1, hdd=1):\n        r = self.local_renderer\n\n        cpu = int(cpu)\n        memory = int(memory)\n        hdd = int(hdd)\n\n        # CPU\n        if cpu:\n            cmd = 'cat /proc/cpuinfo | grep -i \"model name\"'\n            ret = r.run(cmd)\n            matches = map(str.strip, re.findall(r'model name\\s+:\\s*([^\\n]+)', ret, re.DOTALL|re.I))\n            cores = {}\n            for match in matches:\n                cores.setdefault(match, 0)\n                cores[match] += 1\n\n        # Memory\n        if memory:\n            cmd = 'dmidecode --type 17'\n            ret = r.sudo(cmd)\n            #print repr(ret)\n            matches = re.findall(r'Memory\\s+Device\\r\\n(.*?)(?:\\r\\n\\r\\n|$)', ret, flags=re.DOTALL|re.I)\n            #print len(matches)\n            #print matches[0]\n            memory_slot_dicts = []\n            for match in matches:\n                attrs = dict([(_a.strip(), _b.strip()) for _a, _b in re.findall(r'^([^:]+):\\s+(.*)$', match, flags=re.MULTILINE)])\n                #print attrs\n                memory_slot_dicts.append(attrs)\n            total_memory_gb = 0\n            total_slots_filled = 0\n            total_slots = len(memory_slot_dicts)\n            memory_types = set()\n            memory_forms = set()\n            memory_speeds = set()\n            for memory_dict in memory_slot_dicts:\n                try:\n                    size = int(round(float(re.findall(r'([0-9]+)\\s+MB', memory_dict['Size'])[0])/1024.))\n                    #print size\n                    total_memory_gb += size\n                    total_slots_filled += 1\n                except IndexError:\n                    pass\n                _v = memory_dict['Type']\n                if _v != 'Unknown':\n                    memory_types.add(_v)\n                _v = memory_dict['Form Factor']\n  
              if _v != 'Unknown':\n                    memory_forms.add(_v)\n                #_v = memory_dict['Speed']\n                #if _v != 'Unknown':\n                    #memory_speeds.add(_v)\n\n        # Storage\n        if hdd:\n            #cmd = 'ls /dev/*d* | grep \"/dev/[a-z]+d[a-z]$\"'\n            cmd = 'find /dev -maxdepth 1 | grep -E \"/dev/[a-z]+d[a-z]$\"'\n            devices = map(str.strip, r.run(cmd).split('\\n'))\n            total_drives = len(devices)\n            total_physical_storage_gb = 0\n            total_logical_storage_gb = 0\n            drive_transports = set()\n            for device in devices:\n                #cmd = 'udisks --show-info %s |grep -i \"  size:\"' % (device)\n                cmd = 'udisksctl info -b %s |grep -i \"  size:\"' % (device)\n                ret = r.run(cmd)\n                size_bytes = float(re.findall(r'size:\\s*([0-9]+)', ret, flags=re.I)[0].strip())\n                size_gb = int(round(size_bytes/1024/1024/1024))\n                #print device, size_gb\n                total_physical_storage_gb += size_gb\n\n                with self.settings(warn_only=True):\n                    cmd = 'hdparm -I %s|grep -i \"Transport:\"' % device\n                    ret = self.sudo(cmd)\n                    if ret and not ret.return_code:\n                        drive_transports.add(ret.split('Transport:')[-1].strip())\n\n            cmd = \"df | grep '^/dev/[mhs]d*' | awk '{{s+=$2}} END {{print s/1048576}}'\"\n            ret = r.run(cmd)\n            total_logical_storage_gb = float(ret)\n\n        if cpu:\n            print('-'*80)\n            print('CPU')\n            print('-'*80)\n            type_str = ', '.join(['%s x %i' % (_type, _count) for _type, _count in cores.items()])\n            print('Cores: %i' % sum(cores.values()))\n            print('Types: %s' % type_str)\n\n        if memory:\n            print('-'*80)\n            print('MEMORY')\n            print('-'*80)\n            
print('Total: %s GB' % total_memory_gb)\n            print('Type: %s' % list_to_str_or_unknown(memory_types))\n            print('Form: %s' % list_to_str_or_unknown(memory_forms))\n            print('Speed: %s' % list_to_str_or_unknown(memory_speeds))\n            print('Slots: %i (%i filled, %i empty)' % (total_slots, total_slots_filled, total_slots - total_slots_filled))\n\n        if hdd:\n            print('-'*80)\n            print('STORAGE')\n            print('-'*80)\n            print('Total physical drives: %i' % total_drives)\n            print('Total physical storage: %s GB' % total_physical_storage_gb)\n            print('Total logical storage: %s GB' % total_logical_storage_gb)\n            print('Types: %s' % list_to_str_or_unknown(drive_transports))", "response": "Displays common server characteristics, such as the number of CPU cores, the amount of memory, and hard drive capacity."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef shell(self, gui=0, command='', dryrun=None, shell_interactive_cmd_str=None):\n        from burlap.common import get_hosts_for_site\n\n        if dryrun is not None:\n            self.dryrun = dryrun\n\n        r = self.local_renderer\n\n        if r.genv.SITE != r.genv.default_site:\n            shell_hosts = get_hosts_for_site()\n            if shell_hosts:\n                r.genv.host_string = shell_hosts[0]\n\n        r.env.SITE = r.genv.SITE or r.genv.default_site\n\n        if int(gui):\n            r.env.shell_default_options.append('-X')\n\n        if 'host_string' not in self.genv or not self.genv.host_string:\n            if 'available_sites' in self.genv and r.env.SITE not in r.genv.available_sites:\n                raise Exception('No host_string set. Unknown site %s.' % r.env.SITE)\n            else:\n                raise Exception('No host_string set.')\n\n        if '@' in r.genv.host_string:\n            r.env.shell_host_string = r.genv.host_string\n        else:\n            r.env.shell_host_string = '{user}@{host_string}'\n\n        if command:\n            r.env.shell_interactive_cmd_str = command\n        else:\n            r.env.shell_interactive_cmd_str = r.format(shell_interactive_cmd_str or r.env.shell_interactive_cmd)\n\n        r.env.shell_default_options_str = ' '.join(r.env.shell_default_options)\n        if self.is_local:\n            self.vprint('Using direct local.')\n            cmd = '{shell_interactive_cmd_str}'\n        elif r.genv.key_filename:\n            self.vprint('Using key filename.')\n            # If host_string contains the port, then strip it off and pass separately.\n            port = r.env.shell_host_string.split(':')[-1]\n            if port.isdigit():\n                r.env.shell_host_string = r.env.shell_host_string.split(':')[0] + (' -p %s' % port)\n            cmd = 'ssh -t {shell_default_options_str} -i 
{key_filename} {shell_host_string} \"{shell_interactive_cmd_str}\"'\n        elif r.genv.password:\n            self.vprint('Using password.')\n            cmd = 'ssh -t {shell_default_options_str} {shell_host_string} \"{shell_interactive_cmd_str}\"'\n        else:\n            # No explicit password or key file needed?\n            self.vprint('Using nothing.')\n            cmd = 'ssh -t {shell_default_options_str} {shell_host_string} \"{shell_interactive_cmd_str}\"'\n        r.local(cmd)", "response": "Opens an interactive SSH shell on the target host, or runs the shell command directly when the host is local."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\ndisplay percent of disk usage.", "response": "def disk(self):\n        \"\"\"\n        Display percent of disk usage.\n        \"\"\"\n        r = self.local_renderer\n        r.run(r.env.disk_usage_command)"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef tunnel(self, local_port, remote_port):\n        r = self.local_renderer\n        r.env.tunnel_local_port = local_port\n        r.env.tunnel_remote_port = remote_port\n        r.local(' ssh -i {key_filename} -L {tunnel_local_port}:localhost:{tunnel_remote_port} {user}@{host_string} -N')", "response": "Creates an SSH tunnel."}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nsets a key/value pair in a satchel's local renderer.", "response": "def set_satchel_value(self, satchel, key, value):\n        \"\"\"\n        Sets a key/value pair in a satchel's local renderer.\n        \"\"\"\n        satchel = self.get_satchel(satchel)\n        r = satchel.local_renderer\n        setattr(r.env, key, value)\n        print('Set %s=%s in satchel %s.' % (key, value, satchel.name))"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\ngetting the installed version of a package.", "response": "def package_version(name, python_cmd='python'):\n    \"\"\"\n    Get the installed version of a package\n\n    Returns ``None`` if it can't be found.\n    \"\"\"\n    cmd = '''%(python_cmd)s -c \\\n        \"import pkg_resources;\\\n        dist = pkg_resources.get_distribution('%(name)s');\\\n        print(dist.version)\"\n        ''' % locals()\n    res = run(cmd, quiet=True)\n    if res.succeeded:\n        return res\n    return None"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\ninstall the latest version of setuptools.", "response": "def install_setuptools(python_cmd='python', use_sudo=True):\n    \"\"\"\n    Install the latest version of `setuptools`_.\n\n    ::\n\n        import burlap\n\n        burlap.python_setuptools.install_setuptools()\n\n    \"\"\"\n\n    setuptools_version = package_version('setuptools', python_cmd)\n    distribute_version = package_version('distribute', python_cmd)\n\n    if setuptools_version is None:\n        _install_from_scratch(python_cmd, use_sudo)\n    else:\n        if distribute_version is None:\n            _upgrade_from_setuptools(python_cmd, use_sudo)\n        else:\n            _upgrade_from_distribute(python_cmd, use_sudo)"}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\ninstalls setuptools from scratch using installer", "response": "def _install_from_scratch(python_cmd, use_sudo):\n    \"\"\"\n    Install setuptools from scratch using installer\n    \"\"\"\n\n    with cd(\"/tmp\"):\n        download(EZ_SETUP_URL)\n\n        command = '%(python_cmd)s ez_setup.py' % locals()\n        if use_sudo:\n            run_as_root(command)\n        else:\n            run(command)\n\n        run('rm -f ez_setup.py')"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\ninstalling Python packages with easy_install.", "response": "def install(packages, upgrade=False, use_sudo=False, python_cmd='python'):\n    \"\"\"\n    Install Python packages with ``easy_install``.\n\n    Examples::\n\n        import burlap\n\n        # Install a single package\n        burlap.python_setuptools.install('package', use_sudo=True)\n\n        # Install a list of packages\n        burlap.python_setuptools.install(['pkg1', 'pkg2'], use_sudo=True)\n\n    .. note:: most of the time, you'll want to use\n              :py:func:`burlap.python.install()` instead,\n              which uses ``pip`` to install packages.\n\n    \"\"\"\n    argv = []\n    if upgrade:\n        argv.append(\"-U\")\n    if isinstance(packages, six.string_types):\n        argv.append(packages)\n    else:\n        argv.extend(packages)\n    _easy_install(argv, python_cmd, use_sudo)"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\ninstalling packages using easy_install", "response": "def _easy_install(argv, python_cmd, use_sudo):\n    \"\"\"\n    Install packages using easy_install\n\n    We don't know if the easy_install command in the path will be the\n    right one, so we use the setuptools entry point to call the script's\n    main function ourselves.\n    \"\"\"\n    command = \"\"\"python -c \"\\\n        from pkg_resources import load_entry_point;\\\n        ez = load_entry_point('setuptools', 'console_scripts', 'easy_install');\\\n        ez(argv=%(argv)r)\\\n    \"\"\" % locals()\n    if use_sudo:\n        run_as_root(command)\n    else:\n        run(command)"}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\ninstalling all the packages necessary for managing virtual environments with pip.", "response": "def bootstrap(self, force=0):\n        \"\"\"\n        Installs all the packages necessary for managing virtual\n        environments with pip.\n        \"\"\"\n        force = int(force)\n        if self.has_pip() and not force:\n            return\n\n        r = self.local_renderer\n\n        if r.env.bootstrap_method == GET_PIP:\n            r.sudo('curl --silent --show-error --retry 5 https://bootstrap.pypa.io/get-pip.py | python')\n        elif r.env.bootstrap_method == EZ_SETUP:\n            r.run('wget http://peak.telecommunity.com/dist/ez_setup.py -O /tmp/ez_setup.py')\n            with self.settings(warn_only=True):\n                r.sudo('python /tmp/ez_setup.py -U setuptools')\n            r.sudo('easy_install -U pip')\n        elif r.env.bootstrap_method == PYTHON_PIP:\n            r.sudo('apt-get install -y python-pip')\n        else:\n            raise NotImplementedError('Unknown pip bootstrap method: %s' % r.env.bootstrap_method)\n\n        r.sudo('pip {quiet_flag} install --upgrade pip')\n        r.sudo('pip {quiet_flag} install --upgrade virtualenv')"}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\nreturns True if the virtualenv tool is installed.", "response": "def has_virtualenv(self):\n        \"\"\"\n        Returns true if the virtualenv tool is installed.\n        \"\"\"\n        with self.settings(warn_only=True):\n            ret = self.run_or_local('which virtualenv').strip()\n            return bool(ret)"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef virtualenv_exists(self, virtualenv_dir=None):\n        r = self.local_renderer\n        ret = True\n        with self.settings(warn_only=True):\n            ret = r.run_or_local('ls {virtualenv_dir}') or ''\n            ret = 'cannot access' not in ret.strip().lower()\n\n        if self.verbose:\n            if ret:\n                print('Yes')\n            else:\n                print('No')\n\n        return ret", "response": "Returns True if the virtual environment exists."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nlist the packages that require the given package.", "response": "def what_requires(self, name):\n        \"\"\"\n        Lists the packages that require the given package.\n        \"\"\"\n        r = self.local_renderer\n        r.env.name = name\n        r.local('pipdeptree -p {name} --reverse')"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef init(self):\n        r = self.local_renderer\n\n#         if self.virtualenv_exists():\n#             print('virtualenv exists')\n#             return\n\n        print('Creating new virtual environment...')\n        with self.settings(warn_only=True):\n            cmd = '[ ! -d {virtualenv_dir} ] && virtualenv --no-site-packages {virtualenv_dir} || true'\n            if self.is_local:\n                r.run_or_local(cmd)\n            else:\n                r.sudo(cmd)", "response": "Creates the virtual environment."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef get_combined_requirements(self, requirements=None):\n\n        requirements = requirements or self.env.requirements\n\n        def iter_lines(fn):\n            with open(fn, 'r') as fin:\n                for line in fin.readlines():\n                    line = line.strip()\n                    if not line or line.startswith('#'):\n                        continue\n                    yield line\n\n        content = []\n        if isinstance(requirements, (tuple, list)):\n            for f in requirements:\n                f = self.find_template(f)\n                content.extend(list(iter_lines(f)))\n        else:\n            assert isinstance(requirements, six.string_types)\n            f = self.find_template(requirements)\n            content.extend(list(iter_lines(f)))\n\n        return '\\n'.join(content)", "response": "Returns all requirements files combined into one string."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef record_manifest(self):\n        manifest = super(PIPSatchel, self).record_manifest()\n        manifest['all-requirements'] = self.get_combined_requirements()\n        if self.verbose:\n            pprint(manifest, indent=4)\n        return manifest", "response": "Called after a deployment to record any data necessary to detect changes, including the combined requirements stored under 'all-requirements'."}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\nlists virtual machines in the current environment.", "response": "def list_instances(show=1, name=None, group=None, release=None, except_release=None):\n    \"\"\"\n    Retrieves all virtual machines instances in the current environment.\n    \"\"\"\n    from burlap.common import shelf, OrderedDict, get_verbose\n\n    verbose = get_verbose()\n    require('vm_type', 'vm_group')\n    assert env.vm_type, 'No VM type specified.'\n    env.vm_type = (env.vm_type or '').lower()\n    _name = name\n    _group = group\n    _release = release\n    if verbose:\n        print('name=%s, group=%s, release=%s' % (_name, _group, _release))\n\n    env.vm_elastic_ip_mappings = shelf.get('vm_elastic_ip_mappings')\n\n    data = type(env)()\n    if env.vm_type == EC2:\n        if verbose:\n            print('Checking EC2...')\n        for instance in get_all_running_ec2_instances():\n            name = instance.tags.get(env.vm_name_tag)\n            group = instance.tags.get(env.vm_group_tag)\n            release = instance.tags.get(env.vm_release_tag)\n            if env.vm_group and env.vm_group != group:\n                if verbose:\n                    print(('Skipping instance %s because its group \"%s\" '\n                        'does not match env.vm_group \"%s\".') \\\n                            % (instance.public_dns_name, group, env.vm_group))\n                continue\n            if _group and group != _group:\n                if verbose:\n                    print(('Skipping instance %s because its group \"%s\" '\n                        'does not match local group \"%s\".') \\\n                            % (instance.public_dns_name, group, _group))\n                continue\n            if _name and name != _name:\n                if verbose:\n                    print(('Skipping instance %s because its name \"%s\" '\n                        'does not match name \"%s\".') 
\\\n                            % (instance.public_dns_name, name, _name))\n                continue\n            if _release and release != _release:\n                if verbose:\n                    print(('Skipping instance %s because its release \"%s\" '\n                        'does not match release \"%s\".') \\\n                            % (instance.public_dns_name, release, _release))\n                continue\n            if except_release and release == except_release:\n                continue\n            if verbose:\n                print('Adding instance %s (%s).' \\\n                    % (name, instance.public_dns_name))\n            data.setdefault(name, type(env)())\n            data[name]['id'] = instance.id\n            data[name]['public_dns_name'] = instance.public_dns_name\n            if verbose:\n                print('Public DNS: %s' % instance.public_dns_name)\n\n            if env.vm_elastic_ip_mappings and name in env.vm_elastic_ip_mappings:\n                data[name]['ip'] = env.vm_elastic_ip_mappings[name]\n            else:\n                data[name]['ip'] = socket.gethostbyname(instance.public_dns_name)\n\n        if int(show):\n            pprint(data, indent=4)\n        return data\n    elif env.vm_type == KVM:\n        #virsh list\n        pass\n    else:\n        raise NotImplementedError"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\nget or create EC2 security groups.", "response": "def get_or_create_ec2_security_groups(names=None, verbose=1):\n    \"\"\"\n    Creates a security group opening 22, 80 and 443\n    \"\"\"\n    verbose = int(verbose)\n\n    if verbose:\n        print('Creating EC2 security groups...')\n\n    conn = get_ec2_connection()\n\n    if isinstance(names, six.string_types):\n        names = names.split(',')\n    names = names or env.vm_ec2_selected_security_groups\n    if verbose:\n        print('Group names:', names)\n\n    ret = []\n    for name in names:\n        try:\n            group_id = get_ec2_security_group_id(name)\n            if verbose:\n                print('group_id:', group_id)\n            #group = conn.get_all_security_groups(groupnames=[name])[0]\n            # Note, groups in a VPC can't be referred to by name?\n            group = conn.get_all_security_groups(group_ids=[group_id])[0]\n        except boto.exception.EC2ResponseError as e:\n            if verbose:\n                print(e)\n            group = get_ec2_connection().create_security_group(\n                name,\n                name,\n                vpc_id=env.vm_ec2_vpc_id,\n            )\n            print('group_id:', group.id)\n            set_ec2_security_group_id(name, group.id)\n        ret.append(group)\n\n        # Find existing rules.\n        actual_sets = set()\n        for rule in list(group.rules):\n            ip_protocol = rule.ip_protocol\n            from_port = rule.from_port\n            to_port = rule.to_port\n            for cidr_ip in rule.grants:\n                #print('Revoking:', ip_protocol, from_port, to_port, cidr_ip)\n                #group.revoke(ip_protocol, from_port, to_port, cidr_ip)\n                rule_groups = ((rule.groups and rule.groups.split(',')) or [None])\n                for src_group in rule_groups:\n                    src_group = (src_group or '').strip()\n    
                if src_group:\n                        actual_sets.add((ip_protocol, from_port, to_port, str(cidr_ip), src_group))\n                    else:\n                        actual_sets.add((ip_protocol, from_port, to_port, str(cidr_ip)))\n\n        # Find actual rules.\n        expected_sets = set()\n        for authorization in env.vm_ec2_available_security_groups.get(name, []):\n            if verbose:\n                print('authorization:', authorization)\n            if len(authorization) == 4 or (len(authorization) == 5 and not (authorization[-1] or '').strip()):\n                src_group = None\n                ip_protocol, from_port, to_port, cidr_ip = authorization[:4]\n                if cidr_ip:\n                    expected_sets.add((ip_protocol, str(from_port), str(to_port), cidr_ip))\n            else:\n                ip_protocol, from_port, to_port, cidr_ip, src_group = authorization\n                if cidr_ip:\n                    expected_sets.add((ip_protocol, str(from_port), str(to_port), cidr_ip, src_group))\n\n        # Calculate differences and update rules if we own the group.\n        if env.vm_ec2_security_group_owner:\n            if verbose:\n                print('expected_sets:')\n                print(expected_sets)\n                print('actual_sets:')\n                print(actual_sets)\n            del_sets = actual_sets.difference(expected_sets)\n            if verbose:\n                print('del_sets:')\n                print(del_sets)\n            add_sets = expected_sets.difference(actual_sets)\n            if verbose:\n                print('add_sets:')\n                print(add_sets)\n\n            # Revoke deleted.\n            for auth in del_sets:\n                print(len(auth))\n                print('revoking:', auth)\n                group.revoke(*auth)\n\n            # Create fresh rules.\n            for auth in add_sets:\n                print('authorizing:', auth)\n                
group.authorize(*auth)\n\n    return ret"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\ncreate a new EC2 key pair if it doesn't already exist.", "response": "def get_or_create_ec2_key_pair(name=None, verbose=1):\n    \"\"\"\n    Creates and saves an EC2 key pair to a local PEM file.\n    \"\"\"\n    verbose = int(verbose)\n    name = name or env.vm_ec2_keypair_name\n    pem_path = 'roles/%s/%s.pem' % (env.ROLE, name)\n    conn = get_ec2_connection()\n    kp = conn.get_key_pair(name)\n    if kp:\n        print('Key pair %s already exists.' % name)\n    else:\n        # Note, we only get the private key during creation.\n        # If we don't save it here, it's lost forever.\n        kp = conn.create_key_pair(name)\n        open(pem_path, 'wb').write(kp.material)\n        os.system('chmod 600 %s' % pem_path)\n        print('Key pair %s created.' % name)\n    #return kp\n    return pem_path"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef get_or_create_ec2_instance(name=None, group=None, release=None, verbose=0, backend_opts=None):\n    from burlap.common import shelf, OrderedDict\n    from boto.exception import EC2ResponseError\n\n    assert name, \"A name must be specified.\"\n\n    backend_opts = backend_opts or {}\n\n    verbose = int(verbose)\n\n    conn = get_ec2_connection()\n\n    security_groups = get_or_create_ec2_security_groups()\n    security_group_ids = [_.id for _ in security_groups]\n    if verbose:\n        print('security_groups:', security_group_ids)\n\n    pem_path = get_or_create_ec2_key_pair()\n\n    assert env.vm_ec2_ami, 'No AMI specified.'\n    print('Creating EC2 instance from %s...' % (env.vm_ec2_ami,))\n    print(env.vm_ec2_zone)\n    opts = backend_opts.get('run_instances', {})\n    reservation = conn.run_instances(\n        env.vm_ec2_ami,\n        key_name=env.vm_ec2_keypair_name,\n        #security_groups=env.vm_ec2_selected_security_groups,#conflicts with subnet_id?!\n        security_group_ids=security_group_ids,\n        placement=env.vm_ec2_zone,\n        instance_type=env.vm_ec2_instance_type,\n        subnet_id=env.vm_ec2_subnet_id,\n        **opts\n    )\n    instance = reservation.instances[0]\n\n    # Name new instance.\n    # Note, creation is not instantious, so we may have to wait for a moment\n    # before we can access it.\n    while 1:\n        try:\n            if name:\n                instance.add_tag(env.vm_name_tag, name)\n            if group:\n                instance.add_tag(env.vm_group_tag, group)\n            if release:\n                instance.add_tag(env.vm_release_tag, release)\n            break\n        except EC2ResponseError as e:\n            #print('Unable to set tag: %s' % e)\n            print('Waiting for the instance to be created...')\n            if verbose:\n                print(e)\n            time.sleep(3)\n\n   
 # Assign IP.\n    allocation_id = None\n    if env.vm_ec2_use_elastic_ip:\n        # Initialize name/ip mapping since we can't tag elastic IPs.\n        shelf.setdefault('vm_elastic_ip_mappings', OrderedDict())\n        vm_elastic_ip_mappings = shelf.get('vm_elastic_ip_mappings')\n        elastic_ip = vm_elastic_ip_mappings.get(name)\n        if not elastic_ip:\n            print('Allocating new elastic IP address...')\n            addr = conn.allocate_address(domain=env.vm_ec2_allocate_address_domain)\n            #allocation_id = addr.allocation_id\n            #print('allocation_id:',allocation_id)\n            elastic_ip = addr.public_ip\n            print('Allocated address %s.' % elastic_ip)\n            vm_elastic_ip_mappings[name] = str(elastic_ip)\n            shelf.set('vm_elastic_ip_mappings', vm_elastic_ip_mappings)\n            #conn.get_all_addresses()\n\n        # Lookup allocation_id.\n        all_eips = conn.get_all_addresses()\n        for eip in all_eips:\n            if elastic_ip == eip.public_ip:\n                allocation_id = eip.allocation_id\n                break\n        print('allocation_id:', allocation_id)\n\n        while 1:\n            try:\n                conn.associate_address(\n                    instance_id=instance.id,\n                    #public_ip=elastic_ip,\n                    allocation_id=allocation_id, # needed for VPC instances\n                    )\n                print('IP address associated!')\n                break\n            except EC2ResponseError as e:\n                #print('Unable to assign IP: %s' % e)\n                print('Waiting to associate IP address...')\n                if verbose:\n                    print(e)\n                time.sleep(3)\n\n    # Confirm public DNS name was assigned.\n    while 1:\n        try:\n            instance = get_all_ec2_instances(instance_ids=[instance.id])[0]\n            #assert instance.public_dns_name, 'No public DNS name found!'\n            if 
instance.public_dns_name:\n                break\n        except Exception as e:\n            print('error:', e)\n        except SystemExit as e:\n            print('systemexit:', e)\n        print('Waiting for public DNS name to be assigned...')\n        time.sleep(3)\n\n    # Confirm we can SSH into the server.\n    #TODO:better handle timeouts? try/except doesn't really work?\n    env.connection_attempts = 10\n    while 1:\n        try:\n            with settings(warn_only=True):\n                env.host_string = instance.public_dns_name\n                ret = run_or_dryrun('who -b')\n                #print 'ret.return_code:',ret.return_code\n                if not ret.return_code:\n                    break\n        except Exception as e:\n            print('error:', e)\n        except SystemExit as e:\n            print('systemexit:', e)\n        print('Waiting for sshd to accept connections...')\n        time.sleep(3)\n\n    print(\"\")\n    print(\"Login with: ssh -o StrictHostKeyChecking=no -i %s %s@%s\" \\\n        % (pem_path, env.user, instance.public_dns_name))\n    print(\"OR\")\n    print(\"fab %(ROLE)s:hostname=%(name)s shell\" % dict(name=name, ROLE=env.ROLE))\n\n    ip = socket.gethostbyname(instance.public_dns_name)\n    print(\"\")\n    print(\"\"\"Example hosts entry:)\n%(ip)s    www.mydomain.com # %(name)s\"\"\" % dict(ip=ip, name=name))\n    return instance", "response": "Get or create an EC2 instance."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef exists(name=None, group=None, release=None, except_release=None, verbose=1):\n    verbose = int(verbose)\n    instances = list_instances(\n        name=name,\n        group=group,\n        release=release,\n        except_release=except_release,\n        verbose=verbose,\n        show=verbose)\n    ret = bool(instances)\n    if verbose:\n        print('\\ninstance %s exist' % ('DOES' if ret else 'does NOT'))\n    #return ret\n    return instances", "response": "Determines if a virtual machine instance exists."}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\nreturns a virtual machine instance.", "response": "def get_or_create(name=None, group=None, config=None, extra=0, verbose=0, backend_opts=None):\n    \"\"\"\n    Creates a virtual machine instance.\n    \"\"\"\n    require('vm_type', 'vm_group')\n\n    backend_opts = backend_opts or {}\n\n    verbose = int(verbose)\n    extra = int(extra)\n\n    if config:\n        config_fn = common.find_template(config)\n        config = yaml.load(open(config_fn))\n        env.update(config)\n\n    env.vm_type = (env.vm_type or '').lower()\n    assert env.vm_type, 'No VM type specified.'\n\n    group = group or env.vm_group\n    assert group, 'No VM group specified.'\n\n    ret = exists(name=name, group=group)\n    if not extra and ret:\n        if verbose:\n            print('VM %s:%s exists.' % (name, group))\n        return ret\n\n    today = datetime.date.today()\n    release = int('%i%02i%02i' % (today.year, today.month, today.day))\n\n    if not name:\n        existing_instances = list_instances(\n            group=group,\n            release=release,\n            verbose=verbose)\n        name = env.vm_name_template.format(index=len(existing_instances)+1)\n\n    if env.vm_type == EC2:\n        return get_or_create_ec2_instance(\n            name=name,\n            group=group,\n            release=release,\n            verbose=verbose,\n            backend_opts=backend_opts)\n    else:\n        raise NotImplementedError"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\ndelete one or more VM instances from existence.", "response": "def delete(name=None, group=None, release=None, except_release=None,\n    dryrun=1, verbose=1):\n    \"\"\"\n    Permanently erase one or more VM instances from existence.\n    \"\"\"\n\n    verbose = int(verbose)\n\n    if env.vm_type == EC2:\n        conn = get_ec2_connection()\n\n        instances = list_instances(\n            name=name,\n            group=group,\n            release=release,\n            except_release=except_release,\n        )\n\n        for instance_name, instance_data in instances.items():\n            public_dns_name = instance_data['public_dns_name']\n            print('\\nDeleting %s (%s)...' \\\n                % (instance_name, instance_data['id']))\n            if not get_dryrun():\n                conn.terminate_instances(instance_ids=[instance_data['id']])\n\n            # Clear host key on localhost.\n            known_hosts = os.path.expanduser('~/.ssh/known_hosts')\n            cmd = 'ssh-keygen -f \"%s\" -R %s' % (known_hosts, public_dns_name)\n            local_or_dryrun(cmd)\n\n    else:\n        raise NotImplementedError"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef get_name():\n    if env.vm_type == EC2:\n        for instance in get_all_running_ec2_instances():\n            if env.host_string == instance.public_dns_name:\n                name = instance.tags.get(env.vm_name_tag)\n                return name\n    else:\n        raise NotImplementedError", "response": "Retrieves the name associated with the current host string."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef respawn(name=None, group=None):\n\n    if name is None:\n        name = get_name()\n\n    delete(name=name, group=group)\n    instance = get_or_create(name=name, group=group)\n    env.host_string = instance.public_dns_name", "response": "Deletes and recreates one or more VM instances."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef deploy_code(self):\n\n        assert self.genv.SITE, 'Site unspecified.'\n        assert self.genv.ROLE, 'Role unspecified.'\n\n        r = self.local_renderer\n\n        if self.env.exclusions:\n            r.env.exclusions_str = ' '.join(\n                \"--exclude='%s'\" % _ for _ in self.env.exclusions)\n\n        r.local(r.env.rsync_command)\n        r.sudo('chown -R {rsync_chown_user}:{rsync_chown_group} {rsync_dst_dir}')", "response": "Rsyncs all deployable code to the destination directory, then recursively sets its ownership."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef init_env():\n    env.ROLES_DIR = ROLE_DIR\n    env.services = []\n    env.confirm_deployment = False\n    env.is_local = None\n    env.base_config_dir = '.'\n    env.src_dir = 'src' # The path relative to fab where the code resides.\n    env.sites = {} # {site:site_settings}\n    env[SITE] = None\n    env[ROLE] = None\n\n    env.hosts_retriever = None\n    env.hosts_retrievers = type(env)() #'default':lambda hostname: hostname,\n\n    env.hostname_translator = 'default'\n    env.hostname_translators = type(env)()\n    env.hostname_translators.default = lambda hostname: hostname\n\n    env.default_site = None\n\n    # A list of all site names that should be available on the current host.\n    env.available_sites = []\n\n    # A list of all site names per host.\n    # {hostname: [sites]}\n    # If no entry found, will use available_sites.\n    env.available_sites_by_host = {}\n\n    # The command run to determine the percent of disk usage.\n    env.disk_usage_command = \"df -H | grep -vE '^Filesystem|tmpfs|cdrom|none' | awk '{print $5 \" \" $1}'\"\n\n    env.burlap_data_dir = '.burlap'\n\n    env.setdefault('roledefs', {})\n    env.setdefault('roles', [])\n    env.setdefault('hosts', [])\n    env.setdefault('exclude_hosts', [])", "response": "Initializes the Fabric env with default values for roles, sites, host retrievers, and hostname translators."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef add_class_methods_as_module_level_functions_for_fabric(instance, module_name, method_name, module_alias=None):\n    '''\n    Utility to take the methods of the instance of a class, instance,\n    and add them as functions to a module, module_name, so that Fabric\n    can find and call them. Call this at the bottom of a module after\n    the class definition.\n    '''\n    import imp\n    from .decorators import task_or_dryrun\n\n    # get the module as an object\n    module_obj = sys.modules[module_name]\n\n    module_alias = re.sub('[^a-zA-Z0-9]+', '', module_alias or '')\n\n    # Iterate over the methods of the class and dynamically create a function\n    # for each method that calls the method and add it to the current module\n    # NOTE: inspect.ismethod actually executes the methods?!\n    #for method in inspect.getmembers(instance, predicate=inspect.ismethod):\n\n    method_obj = getattr(instance, method_name)\n\n    if not method_name.startswith('_'):\n\n        # get the bound method\n        func = getattr(instance, method_name)\n\n#         if module_name == 'buildbot' or module_alias == 'buildbot':\n#             print('-'*80)\n#             print('module_name:', module_name)\n#             print('method_name:', method_name)\n#             print('module_alias:', module_alias)\n#             print('module_obj:', module_obj)\n#             print('func.module:', func.__module__)\n\n        # Convert executable to a Fabric task, if not done so already.\n        if not hasattr(func, 'is_task_or_dryrun'):\n            func = task_or_dryrun(func)\n\n        if module_name == module_alias \\\n        or (module_name.startswith('satchels.') and module_name.endswith(module_alias)):\n\n            # add the function to the current module\n            setattr(module_obj, method_name, func)\n\n        else:\n\n            # Dynamically create a module for the virtual 
satchel.\n            _module_obj = module_obj\n            module_obj = create_module(module_alias)\n            setattr(module_obj, method_name, func)\n            post_import_modules.add(module_alias)\n\n        fabric_name = '%s.%s' % (module_alias or module_name, method_name)\n        func.wrapped.__func__.fabric_name = fabric_name\n\n        return func", "response": "Takes a public method of a class instance and adds it as a module-level function so that Fabric can find and call it as a task."}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nconverting a string of comma-delimited values into a list.", "response": "def str_to_list(s):\n    \"\"\"\n    Converts a string of comma-delimited values into a list of stripped, lowercased items.\n    \"\"\"\n    if s is None:\n        return []\n    elif isinstance(s, (tuple, list)):\n        return s\n    elif not isinstance(s, six.string_types):\n        raise NotImplementedError('Unknown type: %s' % type(s))\n    return [_.strip().lower() for _ in (s or '').split(',') if _.strip()]"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef get_hosts_retriever(s=None):\n    s = s or env.hosts_retriever\n#     #assert s, 'No hosts retriever specified.'\n    if not s:\n        return env_hosts_retriever\n#     module_name = '.'.join(s.split('.')[:-1])\n#     func_name = s.split('.')[-1]\n#     retriever = getattr(importlib.import_module(module_name), func_name)\n#     return retriever\n    return str_to_callable(s) or env_hosts_retriever", "response": "Given a function name, looks up the callable used to dynamically retrieve host data, falling back to env_hosts_retriever."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef append_or_dryrun(*args, **kwargs):\n    from fabric.contrib.files import append\n\n    dryrun = get_dryrun(kwargs.get('dryrun'))\n\n    if 'dryrun' in kwargs:\n        del kwargs['dryrun']\n\n    use_sudo = kwargs.pop('use_sudo', False)\n\n    text = args[0] if len(args) >= 1 else kwargs.pop('text')\n\n    filename = args[1] if len(args) >= 2 else kwargs.pop('filename')\n\n    if dryrun:\n        text = text.replace('\\n', '\\\\n')\n        cmd = 'echo -e \"%s\" >> %s' % (text, filename)\n        cmd_run = 'sudo' if use_sudo else 'run'\n        if BURLAP_COMMAND_PREFIX:\n            print('%s %s: %s' % (render_command_prefix(), cmd_run, cmd))\n        else:\n            print(cmd)\n    else:\n        append(filename=filename, text=text.replace(r'\\n', '\\n'), use_sudo=use_sudo, **kwargs)", "response": "Wrapper around Fabric's contrib.files.append that adds a dryrun option."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef disable_attribute_or_dryrun(*args, **kwargs):\n    dryrun = get_dryrun(kwargs.get('dryrun'))\n\n    if 'dryrun' in kwargs:\n        del kwargs['dryrun']\n\n    use_sudo = kwargs.pop('use_sudo', False)\n    run_cmd = sudo_or_dryrun if use_sudo else run_or_dryrun\n    run_cmd_str = 'sudo' if use_sudo else 'run'\n\n    key = args[0] if len(args) >= 1 else kwargs.pop('key')\n\n    filename = args[1] if len(args) >= 2 else kwargs.pop('filename')\n\n    comment_pattern = args[2] if len(args) >= 3 else kwargs.pop('comment_pattern', r'#\\s*')\n\n    equals_pattern = args[3] if len(args) >= 4 else kwargs.pop('equals_pattern', r'\\s*=\\s*')\n\n    equals_literal = args[4] if len(args) >= 5 else kwargs.pop('equals_pattern', '=')\n\n    context = dict(\n        key=key,\n        uncommented_literal='%s%s' % (key, equals_literal), # key=value\n        uncommented_pattern='%s%s' % (key, equals_pattern), # key = value\n        uncommented_pattern_partial='^%s%s[^\\\\n]*' % (key, equals_pattern), # key=\n        commented_pattern='%s%s%s' % (comment_pattern, key, equals_pattern), # #key=value\n        commented_pattern_partial='^%s%s%s[^\\\\n]*' % (comment_pattern, key, equals_pattern), # #key=\n        filename=filename,\n        backup=filename+'.bak',\n        comment_pattern=comment_pattern,\n        equals_pattern=equals_pattern,\n    )\n\n    cmds = [\n        # Replace partial un-commented text with full commented text.\n        'sed -i -r -e \"s/{uncommented_pattern_partial}//g\" {filename}'.format(**context),\n    ]\n\n    if dryrun:\n        for cmd in cmds:\n            if BURLAP_COMMAND_PREFIX:\n                print('%s %s: %s' % (render_command_prefix(), run_cmd_str, cmd))\n            else:\n                print(cmd)\n    else:\n        for cmd in cmds:\n#             print('enable attr:', cmd)\n            run_cmd(cmd)", "response": "Disable an attribute or 
dryrun."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef write_temp_file_or_dryrun(content, *args, **kwargs):\n    dryrun = get_dryrun(kwargs.get('dryrun'))\n    if dryrun:\n        fd, tmp_fn = tempfile.mkstemp()\n        os.remove(tmp_fn)\n        cmd_run = 'local'\n        cmd = 'cat <<EOT >> %s\\n%s\\nEOT' % (tmp_fn, content)\n        if BURLAP_COMMAND_PREFIX:\n            print('%s %s: %s' % (render_command_prefix(), cmd_run, cmd))\n        else:\n            print(cmd)\n    else:\n        fd, tmp_fn = tempfile.mkstemp()\n        fout = open(tmp_fn, 'w')\n        fout.write(content)\n        fout.close()\n    return tmp_fn", "response": "Writes the given content to a local temporary file."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef sed_or_dryrun(*args, **kwargs):\n    dryrun = get_dryrun(kwargs.get('dryrun'))\n    if 'dryrun' in kwargs:\n        del kwargs['dryrun']\n\n    use_sudo = kwargs.get('use_sudo', False)\n\n    if dryrun:\n        context = dict(\n            filename=args[0] if len(args) >= 1 else kwargs['filename'],\n            before=args[1] if len(args) >= 2 else kwargs['before'],\n            after=args[2] if len(args) >= 3 else kwargs['after'],\n            backup=args[3] if len(args) >= 4 else kwargs.get('backup', '.bak'),\n            limit=kwargs.get('limit', ''),\n        )\n        cmd = 'sed -i{backup} -r -e \"/{limit}/ s/{before}/{after}/g {filename}\"'.format(**context)\n        cmd_run = 'sudo' if use_sudo else 'run'\n        if BURLAP_COMMAND_PREFIX:\n            print('%s %s: %s' % (render_command_prefix(), cmd_run, cmd))\n        else:\n            print(cmd)\n    else:\n        from fabric.contrib.files import sed\n        sed(*args, **kwargs)", "response": "Wrapper around Fabric's contrib.files.sed that adds a dryrun option."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef pretty_bytes(bytes): # pylint: disable=redefined-builtin\n    if not bytes:\n        return bytes, 'bytes'\n    sign = bytes/float(bytes)\n    bytes = abs(bytes)\n    for x in ['bytes', 'KB', 'MB', 'GB', 'TB']:\n        if bytes < 1024.0:\n            #return \"%3.1f %s\" % (bytes, x)\n            return sign*bytes, x\n        bytes /= 1024.0", "response": "Scales a byte count to the largest fitting unit and returns a (scaled_value, unit_string) tuple."}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\nreturns a subset of the env dictionary containing only those keys with the name prefix.", "response": "def get_component_settings(prefixes=None):\n    \"\"\"\n    Returns a subset of the env dictionary containing\n    only those keys with the name prefix.\n    \"\"\"\n    prefixes = prefixes or []\n    assert isinstance(prefixes, (tuple, list)), 'Prefixes must be a sequence type, not %s.' % type(prefixes)\n    data = {}\n    for name in prefixes:\n        name = name.lower().strip()\n        for k in sorted(env):\n            if k.startswith('%s_' % name):\n                new_k = k[len(name)+1:]\n                data[new_k] = env[k]\n    return data"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef get_last_modified_timestamp(path, ignore=None):\n    ignore = ignore or []\n    if not isinstance(path, six.string_types):\n        return\n    ignore_str = ''\n    if ignore:\n        assert isinstance(ignore, (tuple, list))\n        ignore_str = ' '.join(\"! -name '%s'\" % _ for _ in ignore)\n    cmd = 'find \"'+path+'\" ' + ignore_str + ' -type f -printf \"%T@ %p\\n\" | sort -n | tail -1 | cut -f 1 -d \" \"'\n         #'find '+path+' -type f -printf \"%T@ %p\\n\" | sort -n | tail -1 | cut -d \" \" -f1\n    ret = subprocess.check_output(cmd, shell=True)\n    # Note, we round now to avoid rounding errors later on where some formatters\n    # use different decimal contexts.\n    try:\n        ret = round(float(ret), 2)\n    except ValueError:\n        return\n    return ret", "response": "Returns the last modified timestamp of the given directory."}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nreturn a subset of the env dictionary keys that differ between old and new.", "response": "def check_settings_for_differences(old, new, as_bool=False, as_tri=False):\n    \"\"\"\n    Returns a subset of the env dictionary keys that differ,\n    either being added, deleted or changed between old and new.\n    \"\"\"\n\n    assert not as_bool or not as_tri\n\n    old = old or {}\n    new = new or {}\n\n    changes = set(k for k in set(new).intersection(old) if new[k] != old[k])\n    if changes and as_bool:\n        return True\n\n    added_keys = set(new).difference(old)\n    if added_keys and as_bool:\n        return True\n    if not as_tri:\n        changes.update(added_keys)\n\n    deled_keys = set(old).difference(new)\n    if deled_keys and as_bool:\n        return True\n    if as_bool:\n        return False\n    if not as_tri:\n        changes.update(deled_keys)\n\n    if as_tri:\n        return added_keys, changes, deled_keys\n\n    return changes"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef get_packager():\n\n    # TODO: remove once fabric stops using contextlib.nested.\n    # https://github.com/fabric/fabric/issues/1364\n    import warnings\n    warnings.filterwarnings(\"ignore\", category=DeprecationWarning)\n\n    common_packager = get_rc('common_packager')\n    if common_packager:\n        return common_packager\n    #TODO:cache result by current env.host_string so we can handle multiple hosts with different OSes\n    with settings(warn_only=True):\n        with hide('running', 'stdout', 'stderr', 'warnings'):\n            ret = _run('cat /etc/fedora-release')\n            if ret.succeeded:\n                common_packager = YUM\n            else:\n                ret = _run('cat /etc/lsb-release')\n                if ret.succeeded:\n                    common_packager = APT\n                else:\n                    for pn in PACKAGERS:\n                        ret = _run('which %s' % pn)\n                        if ret.succeeded:\n                            common_packager = pn\n                            break\n    if not common_packager:\n        raise Exception('Unable to determine packager.')\n    set_rc('common_packager', common_packager)\n    return common_packager", "response": "Returns the packager detected on the remote system."}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\nreturns a named tuple describing the operating system on the remote host.", "response": "def get_os_version():\n    \"\"\"\n    Returns a named tuple describing the operating system on the remote host.\n    \"\"\"\n\n    # TODO: remove once fabric stops using contextlib.nested.\n    # https://github.com/fabric/fabric/issues/1364\n    import warnings\n    warnings.filterwarnings(\"ignore\", category=DeprecationWarning)\n\n    common_os_version = get_rc('common_os_version')\n    if common_os_version:\n        return common_os_version\n    with settings(warn_only=True):\n        with hide('running', 'stdout', 'stderr', 'warnings'):\n\n            ret = _run_or_local('cat /etc/lsb-release')\n            if ret.succeeded:\n                return OS(\n                    type=LINUX,\n                    distro=UBUNTU,\n                    release=re.findall(r'DISTRIB_RELEASE=([0-9\\.]+)', ret)[0])\n\n            ret = _run_or_local('cat /etc/debian_version')\n            if ret.succeeded:\n                return OS(\n                    type=LINUX,\n                    distro=DEBIAN,\n                    release=re.findall(r'([0-9\\.]+)', ret)[0])\n\n            ret = _run_or_local('cat /etc/fedora-release')\n            if ret.succeeded:\n                return OS(\n                    type=LINUX,\n                    distro=FEDORA,\n                    release=re.findall(r'release ([0-9]+)', ret)[0])\n\n            raise Exception('Unable to determine OS version.')"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef render_to_string(template, extra=None):\n    from jinja2 import Template\n    extra = extra or {}\n    final_fqfn = find_template(template)\n    assert final_fqfn, 'Template not found: %s' % template\n    template_content = open(final_fqfn, 'r').read()\n    t = Template(template_content)\n    if extra:\n        context = env.copy()\n        context.update(extra)\n    else:\n        context = env\n    rendered_content = t.render(**context)\n    rendered_content = rendered_content.replace('&quot;', '\"')\n    return rendered_content", "response": "Renders the given template to a string."}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\nrenders a template to a local file.", "response": "def render_to_file(template, fn=None, extra=None, **kwargs):\n    \"\"\"\n    Renders a template to a local file.\n    If no filename is given, a temporary file is created and its name returned.\n    \"\"\"\n    import tempfile\n    dryrun = get_dryrun(kwargs.get('dryrun'))\n    append_newline = kwargs.pop('append_newline', True)\n    style = kwargs.pop('style', 'cat') # |echo\n    formatter = kwargs.pop('formatter', None)\n    content = render_to_string(template, extra=extra)\n    if append_newline and not content.endswith('\\n'):\n        content += '\\n'\n\n    if formatter and callable(formatter):\n        content = formatter(content)\n\n    if dryrun:\n        if not fn:\n            fd, fn = tempfile.mkstemp()\n            fout = os.fdopen(fd, 'wt')\n            fout.close()\n    else:\n        if fn:\n            fout = open(fn, 'w')\n        else:\n            fd, fn = tempfile.mkstemp()\n            fout = os.fdopen(fd, 'wt')\n        fout.write(content)\n        fout.close()\n    assert fn\n\n    if style == 'cat':\n        cmd = 'cat <<EOF > %s\\n%s\\nEOF' % (fn, content)\n    elif style == 'echo':\n        cmd = 'echo -e %s > %s' % (shellquote(content), fn)\n    else:\n        raise NotImplementedError\n\n    if BURLAP_COMMAND_PREFIX:\n        print('%s run: %s' % (render_command_prefix(), cmd))\n    else:\n        print(cmd)\n\n    return fn"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef install_config(local_path=None, remote_path=None, render=True, extra=None, formatter=None):\n    local_path = find_template(local_path)\n    if render:\n        extra = extra or {}\n        local_path = render_to_file(template=local_path, extra=extra, formatter=formatter)\n    put_or_dryrun(local_path=local_path, remote_path=remote_path, use_sudo=True)", "response": "Locates a config template, optionally renders it to a local file, and uploads the result to the remote path using sudo."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef iter_sites(sites=None, site=None, renderer=None, setter=None, no_secure=False, verbose=None):\n    if verbose is None:\n        verbose = get_verbose()\n\n    hostname = get_current_hostname()\n\n    target_sites = env.available_sites_by_host.get(hostname, None)\n\n    if sites is None:\n        site = site or env.SITE or ALL\n        if site == ALL:\n            sites = list(six.iteritems(env.sites))\n        else:\n            sys.stderr.flush()\n            sites = [(site, env.sites.get(site))]\n\n    renderer = renderer #or render_remote_paths\n    env_default = save_env()\n    for _site, site_data in sorted(sites):\n        if no_secure and _site.endswith('_secure'):\n            continue\n\n        # Only load site configurations that are allowed for this host.\n        if target_sites is None:\n            pass\n        else:\n            assert isinstance(target_sites, (tuple, list))\n            if _site not in target_sites:\n                if verbose:\n                    print('Skipping site %s because not in among target sites.' % _site)\n                continue\n\n        env.update(env_default)\n        env.update(env.sites.get(_site, {}))\n        env.SITE = _site\n        if callable(renderer):\n            renderer()\n        if setter:\n            setter(_site)\n        yield _site, site_data\n\n    # Revert modified keys.\n    env.update(env_default)\n\n    # Remove keys that were added, not simply updated.\n    added_keys = set(env).difference(env_default)\n    for key in added_keys:\n        # Don't remove internally maintained variables, because these are used to cache hostnames\n        # used by iter_sites().\n        if key.startswith('_'):\n            continue\n        del env[key]", "response": "Iterate over sites and set environment variables for each site."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef topological_sort(source):\n    if isinstance(source, dict):\n        source = source.items()\n    pending = sorted([(name, set(deps)) for name, deps in source]) # copy deps so we can modify set in-place\n    emitted = []\n    while pending:\n        next_pending = []\n        next_emitted = []\n        for entry in pending:\n            name, deps = entry\n            deps.difference_update(emitted) # remove deps we emitted last pass\n            if deps: # still has deps? recheck during next pass\n                next_pending.append(entry)\n            else: # no more deps? time to emit\n                yield name\n                emitted.append(name) # <-- not required, but helps preserve original ordering\n                next_emitted.append(name) # remember what we emitted for difference_update() in next pass\n        if not next_emitted: # all entries have unmet deps, one of two things is wrong...\n            raise ValueError(\"cyclic or missing dependancy detected: %r\" % (next_pending,))\n        pending = next_pending\n        emitted = next_emitted", "response": "Performs a topological sort, yielding names in dependency order and raising ValueError on a cyclic or missing dependency."}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nreturns a list of hosts that have been configured to support the given site.", "response": "def get_hosts_for_site(site=None):\n    \"\"\"\n    Returns a list of hosts that have been configured to support the given site.\n    \"\"\"\n    site = site or env.SITE\n    hosts = set()\n    for hostname, _sites in six.iteritems(env.available_sites_by_host):\n#         print('checking hostname:',hostname, _sites)\n        for _site in _sites:\n            if _site == site:\n#                 print( '_site:',_site)\n                host_ip = get_host_ip(hostname)\n#                 print( 'host_ip:',host_ip)\n                if host_ip:\n                    hosts.add(host_ip)\n                    break\n    return list(hosts)"}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nreturns a copy of the global environment with all the local variables copied back into it.", "response": "def collect_genv(self, include_local=True, include_global=True):\n        \"\"\"\n        Returns a copy of the global environment with all the local variables copied back into it.\n        \"\"\"\n        e = type(self.genv)()\n        if include_global:\n            e.update(self.genv)\n        if include_local:\n            for k, v in self.lenv.items():\n                e['%s_%s' % (self.obj.name.lower(), k)] = v\n        return e"}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nsetting the default values for the satchel.", "response": "def _set_defaults(self):\n        \"\"\"\n        Wrapper around the overrideable set_defaults().\n        \"\"\"\n\n        # Register an \"enabled\" flag on all satchels.\n        # How this flag is interpreted depends on the individual satchel.\n        _prefix = '%s_enabled' % self.name\n        if _prefix not in env:\n            env[_prefix] = True\n\n        # Do a one-time init of custom defaults.\n        _key = '_%s' % self.name\n        if _key not in env:\n            env[_key] = True\n            self.set_defaults()"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef register(self):\n\n        self._set_defaults()\n\n        all_satchels[self.name.upper()] = self\n\n        manifest_recorder[self.name] = self.record_manifest\n\n        # Register service commands.\n        if self.required_system_packages:\n            required_system_packages[self.name.upper()] = self.required_system_packages", "response": "Registers this satchel in the global registries for fast lookup from other satchels."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef unregister(self):\n\n        for k in list(env.keys()):\n            if k.startswith(self.env_prefix):\n                del env[k]\n\n        try:\n            del all_satchels[self.name.upper()]\n        except KeyError:\n            pass\n\n        try:\n            del manifest_recorder[self.name]\n        except KeyError:\n            pass\n\n        try:\n            del manifest_deployers[self.name.upper()]\n        except KeyError:\n            pass\n\n        try:\n            del manifest_deployers_befores[self.name.upper()]\n        except KeyError:\n            pass\n\n        try:\n            del required_system_packages[self.name.upper()]\n        except KeyError:\n            pass", "response": "Removes this satchel's env keys and its entries from the global registries."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef get_tasks(self):\n        tasks = set(self.tasks)#DEPRECATED\n        for _name in dir(self):\n            # Skip properties so we don't accidentally execute any methods.\n            if isinstance(getattr(type(self), _name, None), property):\n                continue\n            attr = getattr(self, _name)\n            if hasattr(attr, '__call__') and getattr(attr, 'is_task', False):\n                tasks.add(_name)\n        return sorted(tasks)", "response": "Returns an ordered list of all task names."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef local_renderer(self):\n        if not self._local_renderer:\n            r = self.create_local_renderer()\n            self._local_renderer = r\n        return self._local_renderer", "response": "Returns the cached local renderer."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef all_other_enabled_satchels(self):\n        return dict(\n            (name, satchel)\n            for name, satchel in self.all_satchels.items()\n            if name != self.name.upper() and name.lower() in map(str.lower, self.genv.services)\n        )", "response": "Returns a dictionary of all satchels used in the current configuration excluding ourselves."}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\nreturns a version of env filtered to only include the variables in our namespace.", "response": "def lenv(self):\n        \"\"\"\n        Returns a version of env filtered to only include the variables in our namespace.\n        \"\"\"\n        _env = type(env)()\n        for _k, _v in six.iteritems(env):\n            if _k.startswith(self.name+'_'):\n                _env[_k[len(self.name)+1:]] = _v\n        return _env"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef param_changed_to(self, key, to_value, from_value=None):\n        last_value = getattr(self.last_manifest, key)\n        current_value = self.current_manifest.get(key)\n        if from_value is not None:\n            return last_value == from_value and current_value == to_value\n        return last_value != to_value and current_value == to_value", "response": "Returns true if the given parameter with name key has transitioned to the given value."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\nset the site specific settings for the current site.", "response": "def set_site_specifics(self, site):\n        \"\"\"\n        Loads settings for the target site.\n        \"\"\"\n        r = self.local_renderer\n        site_data = self.genv.sites[site].copy()\n        r.env.site = site\n        if self.verbose:\n            print('set_site_specifics.data:')\n            pprint(site_data, indent=4)\n\n        # Remove local namespace settings from the global namespace\n        # by converting <satchel_name>_<variable_name> to <variable_name>.\n        local_ns = {}\n        for k, v in list(site_data.items()):\n            if k.startswith(self.name + '_'):\n                _k = k[len(self.name + '_'):]\n                local_ns[_k] = v\n                del site_data[k]\n\n        r.env.update(local_ns)\n        r.env.update(site_data)"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef vprint(self, *args, **kwargs):\n        if self.verbose:\n            curframe = inspect.currentframe()\n            calframe = inspect.getouterframes(curframe, 2)\n            caller_name = calframe[1][3]\n            prefix = '%s.%s:' % (self.name.lower(), caller_name)\n            print(prefix, *args, **kwargs)", "response": "Prints the given arguments prefixed with the satchel name and the calling function's name, but only when verbose is enabled."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef get_package_list(self):\n        os_version = self.os_version # OS(type=LINUX, distro=UBUNTU, release='14.04')\n        self.vprint('os_version:', os_version)\n\n        # Lookup legacy package list.\n        # OS: [package1, package2, ...],\n        req_packages1 = self.required_system_packages\n        if req_packages1:\n            deprecation('The required_system_packages attribute is deprecated, '\n                'use the packager_system_packages property instead.')\n\n        # Lookup new package list.\n        # OS: [package1, package2, ...],\n        req_packages2 = self.packager_system_packages\n\n        patterns = [\n            (os_version.type, os_version.distro, os_version.release),\n            (os_version.distro, os_version.release),\n            (os_version.type, os_version.distro),\n            (os_version.distro,),\n            os_version.distro,\n        ]\n        self.vprint('req_packages1:', req_packages1)\n        self.vprint('req_packages2:', req_packages2)\n        package_list = None\n        found = False\n        for pattern in patterns:\n            self.vprint('pattern:', pattern)\n            for req_packages in (req_packages1, req_packages2):\n                if pattern in req_packages:\n                    package_list = req_packages[pattern]\n                    found = True\n                    break\n        if not found:\n            print('Warning: No operating system pattern found for %s' % (os_version,))\n        self.vprint('package_list:', package_list)\n        return package_list", "response": "Returns a list of all required packages."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef install_packages(self):\n        os_version = self.os_version\n        package_list = self.get_package_list()\n        if package_list:\n            package_list_str = ' '.join(package_list)\n            if os_version.distro == UBUNTU:\n                self.sudo('apt-get update --fix-missing; DEBIAN_FRONTEND=noninteractive apt-get install --yes %s' % package_list_str)\n            elif os_version.distro == DEBIAN:\n                self.sudo('apt-get update --fix-missing; DEBIAN_FRONTEND=noninteractive apt-get install --yes %s' % package_list_str)\n            elif os_version.distro == FEDORA:\n                self.sudo('yum install --assumeyes %s' % package_list_str)\n            else:\n                raise NotImplementedError('Unknown distro: %s' % os_version.distro)", "response": "Installs all required packages listed for this satchel."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function that\nis like run() but re-runs the command for each site in the current role.", "response": "def run_on_all_sites(self, cmd, *args, **kwargs):\n        \"\"\"\n        Like run(), but re-runs the command for each site in the current role.\n        \"\"\"\n        r = self.local_renderer\n        for _site, _data in iter_sites():\n            r.env.SITE = _site\n            with self.settings(warn_only=True):\n                r.run('export SITE={SITE}; export ROLE={ROLE}; '+cmd)"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef file_contains(self, *args, **kwargs):\n        from fabric.contrib.files import contains\n        return contains(*args, **kwargs)", "response": "Returns True if the file contains the given text; a thin wrapper around fabric.contrib.files.contains()."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nreturning a dictionary representing a serialized state of the service.", "response": "def record_manifest(self):\n        \"\"\"\n        Returns a dictionary representing a serialized state of the service.\n        \"\"\"\n        manifest = get_component_settings(prefixes=[self.name])\n\n        # Record a signature of each template so we know to redeploy when they change.\n        for template in self.get_templates():\n            # Dereference brace notation. e.g. convert '{var}' to `env[var]`.\n            if template and template.startswith('{') and template.endswith('}'):\n                template = self.env[template[1:-1]]\n\n            if not template:\n                continue\n\n            if template.startswith('%s/' % self.name):\n                fqfn = self.find_template(template)\n            else:\n                fqfn = self.find_template('%s/%s' % (self.name, template))\n            assert fqfn, 'Unable to find template: %s/%s' % (self.name, template)\n            manifest['_%s' % template] = get_file_hash(fqfn)\n\n        for tracker in self.get_trackers():\n            manifest['_tracker_%s' % tracker.get_natural_key_hash()] = tracker.get_thumbprint()\n\n        if self.verbose:\n            pprint(manifest, indent=4)\n\n        return manifest"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nreturn true if at least one tracker detects a change.", "response": "def has_changes(self):\n        \"\"\"\n        Returns true if at least one tracker detects a change.\n        \"\"\"\n        lm = self.last_manifest\n        for tracker in self.get_trackers():\n            last_thumbprint = lm['_tracker_%s' % tracker.get_natural_key_hash()]\n            if tracker.is_changed(last_thumbprint):\n                return True\n        return False"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nconfiguring the manifest based on the last manifest entry.", "response": "def configure(self):\n        \"\"\"\n        The standard method called to apply functionality when the manifest changes.\n        \"\"\"\n        lm = self.last_manifest\n        for tracker in self.get_trackers():\n            self.vprint('Checking tracker:', tracker)\n            last_thumbprint = lm['_tracker_%s' % tracker.get_natural_key_hash()]\n            self.vprint('last thumbprint:', last_thumbprint)\n            has_changed = tracker.is_changed(last_thumbprint)\n            self.vprint('Tracker changed:', has_changed)\n            if has_changed:\n                self.vprint('Change detected!')\n                tracker.act()"}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nchecking if a PostgreSQL user exists.", "response": "def user_exists(name):\n    \"\"\"\n    Check if a PostgreSQL user exists.\n    \"\"\"\n    with settings(hide('running', 'stdout', 'stderr', 'warnings'),\n                  warn_only=True):\n        res = _run_as_pg('''psql -t -A -c \"SELECT COUNT(*) FROM pg_user WHERE usename = '%(name)s';\"''' % locals())\n    return (res == \"1\")"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef create_user(name, password, superuser=False, createdb=False,\n                createrole=False, inherit=True, login=True,\n                connection_limit=None, encrypted_password=False):\n    \"\"\"\n    Create a PostgreSQL user.\n\n    Example::\n\n        import burlap\n\n        # Create DB user if it does not exist\n        if not burlap.postgres.user_exists('dbuser'):\n            burlap.postgres.create_user('dbuser', password='somerandomstring')\n\n        # Create DB user with custom options\n        burlap.postgres.create_user('dbuser2', password='s3cr3t',\n            createdb=True, createrole=True, connection_limit=20)\n\n    \"\"\"\n    options = [\n        'SUPERUSER' if superuser else 'NOSUPERUSER',\n        'CREATEDB' if createdb else 'NOCREATEDB',\n        'CREATEROLE' if createrole else 'NOCREATEROLE',\n        'INHERIT' if inherit else 'NOINHERIT',\n        'LOGIN' if login else 'NOLOGIN',\n    ]\n    if connection_limit is not None:\n        options.append('CONNECTION LIMIT %d' % connection_limit)\n    password_type = 'ENCRYPTED' if encrypted_password else 'UNENCRYPTED'\n    options.append(\"%s PASSWORD '%s'\" % (password_type, password))\n    options = ' '.join(options)\n    _run_as_pg('''psql -c \"CREATE USER %(name)s %(options)s;\"''' % locals())", "response": "Create a PostgreSQL user."}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nwrite the file used to store login credentials for PostgreSQL.", "response": "def write_pgpass(self, name=None, site=None, use_sudo=0, root=0):\n        \"\"\"\n        Write the file used to store login credentials for PostgreSQL.\n        \"\"\"\n\n        r = self.database_renderer(name=name, site=site)\n\n        root = int(root)\n\n        use_sudo = int(use_sudo)\n\n        r.run('touch {pgpass_path}')\n        if '~' in r.env.pgpass_path:\n            r.run('chmod {pgpass_chmod} {pgpass_path}')\n        else:\n            r.sudo('chmod {pgpass_chmod} {pgpass_path}')\n\n        if root:\n            r.env.shell_username = r.env.get('db_root_username', 'postgres')\n            r.env.shell_password = r.env.get('db_root_password', 'password')\n        else:\n            r.env.shell_username = r.env.db_user\n            r.env.shell_password = r.env.db_password\n\n        r.append(\n            '{db_host}:{port}:*:{shell_username}:{shell_password}',\n            r.env.pgpass_path,\n            use_sudo=use_sudo)"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef dumpload(self, site=None, role=None):\n        r = self.database_renderer(site=site, role=role)\n        r.run('pg_dump -c --host={host_string} --username={db_user} '\n            '--blobs --format=c {db_name} -n public | '\n            'pg_restore -U {db_postgresql_postgres_user} --create '\n            '--dbname={db_name}')", "response": "Dumps and loads a database snapshot simultaneously."}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nreturn True if a database with the given name exists. False otherwise.", "response": "def exists(self, name='default', site=None, use_root=False):\n        \"\"\"\n        Returns true if a database with the given name exists. False otherwise.\n        \"\"\"\n\n        r = self.database_renderer(name=name, site=site)\n\n        if int(use_root):\n            kwargs = dict(\n                db_user=r.env.get('db_root_username', 'postgres'),\n                db_password=r.env.get('db_root_password', 'password'),\n                db_host=r.env.db_host,\n                db_name=r.env.db_name,\n            )\n            r.env.update(kwargs)\n\n        # Set pgpass file.\n        if r.env.db_password:\n            self.write_pgpass(name=name, root=use_root)\n\n#        cmd = ('psql --username={db_user} --no-password -l '\\\n#            '--host={db_host} --dbname={db_name}'\\\n#            '| grep {db_name} | wc -l').format(**env)\n\n        ret = None\n        with settings(warn_only=True):\n            ret = r.run('psql --username={db_user} --host={db_host} -l '\\\n            '| grep {db_name} | wc -l')\n            if ret is not None:\n                if 'password authentication failed' in ret:\n                    ret = False\n                else:\n                    ret = int(ret) >= 1\n\n        if ret is not None:\n            print('%s database on site %s %s exist' % (name, self.genv.SITE, 'DOES' if ret else 'DOES NOT'))\n            return ret"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nloading a database snapshot into the target database server.", "response": "def load(self, dump_fn='', prep_only=0, force_upload=0, from_local=0, name=None, site=None, dest_dir=None, force_host=None):\n        \"\"\"\n        Restores a database snapshot onto the target database server.\n\n        If prep_only=1, commands for preparing the load will be generated,\n        but not the command to finally load the snapshot.\n        \"\"\"\n\n        r = self.database_renderer(name=name, site=site)\n\n        # Render the snapshot filename.\n        r.env.dump_fn = self.get_default_db_fn(fn_template=dump_fn, dest_dir=dest_dir)\n\n        from_local = int(from_local)\n\n        prep_only = int(prep_only)\n\n        missing_local_dump_error = r.format(\n            \"Database dump file {dump_fn} does not exist.\"\n        )\n\n        # Copy snapshot file to target.\n        if self.is_local:\n            r.env.remote_dump_fn = dump_fn\n        else:\n            r.env.remote_dump_fn = '/tmp/' + os.path.split(r.env.dump_fn)[-1]\n\n        if not prep_only and not self.is_local:\n            if not self.dryrun:\n                assert os.path.isfile(r.env.dump_fn), missing_local_dump_error\n            #r.pc('Uploading PostgreSQL database snapshot...')\n#                 r.put(\n#                     local_path=r.env.dump_fn,\n#                     remote_path=r.env.remote_dump_fn)\n            #r.local('rsync -rvz --progress --no-p --no-g '\n                #'--rsh \"ssh -o StrictHostKeyChecking=no -i {key_filename}\" '\n                #'{dump_fn} {user}@{host_string}:{remote_dump_fn}')\n            self.upload_snapshot(name=name, site=site, local_dump_fn=r.env.dump_fn, remote_dump_fn=r.env.remote_dump_fn)\n\n        if self.is_local and not prep_only and not self.dryrun:\n            assert os.path.isfile(r.env.dump_fn), missing_local_dump_error\n\n        if force_host:\n            
r.env.db_host = force_host\n\n        with settings(warn_only=True):\n            r.sudo('dropdb --if-exists --no-password --user={db_root_username} --host={db_host} {db_name}', user=r.env.postgres_user)\n\n        r.sudo('psql --no-password --user={db_root_username} --host={db_host} -c \"CREATE DATABASE {db_name};\"', user=r.env.postgres_user)\n\n        with settings(warn_only=True):\n\n            if r.env.engine == POSTGIS:\n                r.sudo('psql --user={db_root_username} --no-password --dbname={db_name} --host={db_host} --command=\"CREATE EXTENSION postgis;\"',\n                    user=r.env.postgres_user)\n                r.sudo('psql --user={db_root_username} --no-password --dbname={db_name} --host={db_host} --command=\"CREATE EXTENSION postgis_topology;\"',\n                    user=r.env.postgres_user)\n\n        with settings(warn_only=True):\n            r.sudo('psql --user={db_root_username} --host={db_host} -c \"REASSIGN OWNED BY {db_user} TO {db_root_username};\"', user=r.env.postgres_user)\n\n        with settings(warn_only=True):\n            r.sudo('psql --user={db_root_username} --host={db_host} -c \"DROP OWNED BY {db_user} CASCADE;\"', user=r.env.postgres_user)\n\n        r.sudo('psql --user={db_root_username} --host={db_host} -c \"DROP USER IF EXISTS {db_user}; '\n            'CREATE USER {db_user} WITH PASSWORD \\'{db_password}\\'; '\n            'GRANT ALL PRIVILEGES ON DATABASE {db_name} to {db_user};\"', user=r.env.postgres_user)\n        for createlang in r.env.createlangs:\n            r.env.createlang = createlang\n            r.sudo('createlang -U {db_root_username} --host={db_host} {createlang} {db_name} || true', user=r.env.postgres_user)\n\n        if not prep_only:\n            # Ignore errors needed to work around bug \"ERROR:  schema \"public\" already exists\", which is thrown in 9.6 even if we use --clean and --if-exists?\n            with settings(warn_only=True):\n                r.sudo(r.env.load_command, 
user=r.env.postgres_user)"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef shell(self, name='default', site=None, **kwargs):\n        r = self.database_renderer(name=name, site=site)\n        self.write_pgpass(name=name, site=site, root=True)\n\n        db_name = kwargs.get('db_name')\n        if db_name:\n            r.env.db_name = db_name\n            r.run('/bin/bash -i -c \"psql --username={db_root_username} --host={db_host} --dbname={db_name}\"')\n        else:\n            r.run('/bin/bash -i -c \"psql --username={db_root_username} --host={db_host}\"')", "response": "Opens a SQL shell to the given database."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef drop_database(self, name):\n        with settings(warn_only=True):\n            self.sudo('dropdb %s' % (name,), user='postgres')", "response": "Drop a PostgreSQL database."}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\nloads a table from one database to another.", "response": "def load_table(self, table_name, src, dst='localhost', name=None, site=None):\n        \"\"\"\n        Directly transfers a table between two databases.\n        \"\"\"\n        #TODO: incomplete\n        r = self.database_renderer(name=name, site=site)\n        r.env.table_name = table_name\n        r.run('psql --user={dst_db_user} --host={dst_db_host} --command=\"DROP TABLE IF EXISTS {table_name} CASCADE;\"')\n        r.run('pg_dump -t {table_name} --user={dst_db_user} --host={dst_db_host} | psql --user={src_db_user} --host={src_db_host}')"}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\ngets the list of network interfaces on SmartOS.", "response": "def interfaces():\n    \"\"\"\n    Get the list of network interfaces. Will return all datalinks on SmartOS.\n    \"\"\"\n    with settings(hide('running', 'stdout')):\n        if is_file('/usr/sbin/dladm'):\n            res = run('/usr/sbin/dladm show-link')\n        else:\n            res = sudo('/sbin/ifconfig -s')\n    return [line.split(' ')[0] for line in res.splitlines()[1:]]"}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\ngets the IPv4 address assigned to an interface.", "response": "def address(interface):\n    \"\"\"\n    Get the IPv4 address assigned to an interface.\n\n    Example::\n\n        import burlap\n\n        # Print all configured IP addresses\n        for interface in burlap.network.interfaces():\n            print(burlap.network.address(interface))\n\n    \"\"\"\n    with settings(hide('running', 'stdout')):\n        res = (sudo(\"/sbin/ifconfig %(interface)s | grep 'inet '\" % locals()) or '').split('\\n')[-1].strip()\n    if 'addr' in res:\n        return res.split()[1].split(':')[1]\n    return res.split()[1]"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\nreturn a dictionary representing a serialized state of the service manifest.", "response": "def record_manifest(self):\n        \"\"\"\n        Returns a dictionary representing a serialized state of the service.\n        \"\"\"\n        data = {}\n        data['required_packages'] = self.install_required(type=SYSTEM, verbose=False, list_only=True)\n        data['required_packages'].sort()\n        data['custom_packages'] = self.install_custom(list_only=True)\n        data['custom_packages'].sort()\n        data['repositories'] = self.get_repositories()\n        return data"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef update(self):\n        packager = self.packager\n        if packager == APT:\n            self.sudo('DEBIAN_FRONTEND=noninteractive apt-get -yq update')\n        elif packager == YUM:\n            self.sudo('yum update')\n        else:\n            raise Exception('Unknown packager: %s' % (packager,))", "response": "Refreshes the package index via apt-get update, or runs yum update, depending on the configured packager; raises an exception for an unknown packager."}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\ninstalls system packages listed in apt - requirements. txt.", "response": "def install_apt(self, fn=None, package_name=None, update=0, list_only=0):\n        \"\"\"\n        Installs system packages listed in apt-requirements.txt.\n        \"\"\"\n        r = self.local_renderer\n        assert self.genv[ROLE]\n        apt_req_fqfn = fn or (self.env.apt_requirments_fn and self.find_template(self.env.apt_requirments_fn))\n        if not apt_req_fqfn:\n            return []\n        assert os.path.isfile(apt_req_fqfn)\n\n        lines = list(self.env.apt_packages or [])\n        for _ in open(apt_req_fqfn).readlines():\n            if _.strip() and not _.strip().startswith('#') \\\n            and (not package_name or _.strip() == package_name):\n                lines.extend(_pkg.strip() for _pkg in _.split(' ') if _pkg.strip())\n\n        if list_only:\n            return lines\n\n        tmp_fn = r.write_temp_file('\\n'.join(lines))\n        apt_req_fqfn = tmp_fn\n\n        if not self.genv.is_local:\n            r.put(local_path=tmp_fn, remote_path=tmp_fn)\n            apt_req_fqfn = self.genv.put_remote_path\n        r.sudo('DEBIAN_FRONTEND=noninteractive apt-get -yq update --fix-missing')\n        r.sudo('DEBIAN_FRONTEND=noninteractive apt-get -yq install `cat \"%s\" | tr \"\\\\n\" \" \"`' % apt_req_fqfn)"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\ninstall system packages listed in yum-requirements.txt.", "response": "def install_yum(self, fn=None, package_name=None, update=0, list_only=0):\n        \"\"\"\n        Installs system packages listed in yum-requirements.txt.\n        \"\"\"\n        assert self.genv[ROLE]\n        yum_req_fn = fn or self.find_template(self.genv.yum_requirments_fn)\n        if not yum_req_fn:\n            return []\n        assert os.path.isfile(yum_req_fn)\n        update = int(update)\n        if list_only:\n            return [\n                _.strip() for _ in open(yum_req_fn).readlines()\n                if _.strip() and not _.strip().startswith('#')\n                and (not package_name or _.strip() == package_name)\n            ]\n        if update:\n            self.sudo_or_dryrun('yum update --assumeyes')\n        if package_name:\n            self.sudo_or_dryrun('yum install --assumeyes %s' % package_name)\n        else:\n            if self.genv.is_local:\n                self.put_or_dryrun(local_path=yum_req_fn)\n                yum_req_fn = self.genv.put_remote_fn\n            self.sudo_or_dryrun('yum install --assumeyes $(cat %s)' % yum_req_fn)"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef install_custom(self, *args, **kwargs):\n        if not self.env.manage_custom:\n            return\n        packager = self.packager\n        if packager == APT:\n            return self.install_apt(*args, **kwargs)\n        elif packager == YUM:\n            return self.install_yum(*args, **kwargs)\n        else:\n            raise NotImplementedError('Unknown packager: %s' % (packager,))", "response": "Installs the custom system packages via apt or yum, depending on the packager, unless manage_custom is disabled."}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\nupdates or upgrades all system packages.", "response": "def refresh(self, *args, **kwargs):\n        \"\"\"\n        Updates/upgrades all system packages.\n        \"\"\"\n        r = self.local_renderer\n        packager = self.packager\n        if packager == APT:\n            r.sudo('DEBIAN_FRONTEND=noninteractive apt-get -yq update --fix-missing')\n        elif packager == YUM:\n            raise NotImplementedError\n            #return upgrade_yum(*args, **kwargs)\n        else:\n            raise Exception('Unknown packager: %s' % (packager,))"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef upgrade(self, full=0):\n        full = int(full)\n        r = self.local_renderer\n        packager = self.packager\n        if packager == APT:\n            r.sudo('DEBIAN_FRONTEND=noninteractive apt-get -yq upgrade')\n            if full:\n                r.sudo('DEBIAN_FRONTEND=noninteractive apt-get dist-upgrade -yq')\n        elif packager == YUM:\n            raise NotImplementedError\n        else:\n            raise Exception('Unknown packager: %s' % (packager,))", "response": "Updates or upgrades all system packages."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef list_required(self, type=None, service=None): # pylint: disable=redefined-builtin\n        from burlap.common import (\n            required_system_packages,\n            required_python_packages,\n            required_ruby_packages,\n        )\n        service = (service or '').strip().upper()\n        type = (type or '').lower().strip()\n        assert not type or type in PACKAGE_TYPES, 'Unknown package type: %s' % (type,)\n        packages_set = set()\n        packages = []\n        version = self.os_version\n\n        for _service, satchel in self.all_other_enabled_satchels.items():\n\n            _service = _service.strip().upper()\n            if service and service != _service:\n                continue\n\n            _new = []\n\n            if not type or type == SYSTEM:\n\n                #TODO:deprecated, remove\n                _new.extend(required_system_packages.get(\n                    _service, {}).get((version.distro, version.release), []))\n\n                try:\n                    _pkgs = satchel.packager_system_packages\n                    if self.verbose:\n                        print('pkgs:')\n                        pprint(_pkgs, indent=4)\n                    for _key in [(version.distro, version.release), version.distro]:\n                        if self.verbose:\n                            print('checking key:', _key)\n                        if _key in _pkgs:\n                            if self.verbose:\n                                print('satchel %s requires:' % satchel, _pkgs[_key])\n                            _new.extend(_pkgs[_key])\n                            break\n                except AttributeError:\n                    pass\n\n            if not type or type == PYTHON:\n\n                #TODO:deprecated, remove\n                _new.extend(required_python_packages.get(\n                    _service, 
{}).get((version.distro, version.release), []))\n\n                try:\n                    _pkgs = satchel.packager_python_packages\n                    for _key in [(version.distro, version.release), version.distro]:\n                        if _key in _pkgs:\n                            _new.extend(_pkgs[_key])\n                except AttributeError:\n                    pass\n                print('_new:', _new)\n\n            if not type or type == RUBY:\n\n                #TODO:deprecated, remove\n                _new.extend(required_ruby_packages.get(\n                    _service, {}).get((version.distro, version.release), []))\n\n            for _ in _new:\n                if _ in packages_set:\n                    continue\n                packages_set.add(_)\n                packages.append(_)\n        if self.verbose:\n            for package in sorted(packages):\n                print('package:', package)\n        return packages", "response": "Displays all packages required by the current role."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef install_required(self, type=None, service=None, list_only=0, **kwargs): # pylint: disable=redefined-builtin\n        r = self.local_renderer\n        list_only = int(list_only)\n        type = (type or '').lower().strip()\n        assert not type or type in PACKAGE_TYPES, 'Unknown package type: %s' % (type,)\n        lst = []\n        if type:\n            types = [type]\n        else:\n            types = PACKAGE_TYPES\n        for _type in types:\n            if _type == SYSTEM:\n                content = '\\n'.join(self.list_required(type=_type, service=service))\n                if list_only:\n                    lst.extend(_ for _ in content.split('\\n') if _.strip())\n                    if self.verbose:\n                        print('content:', content)\n                    break\n                fd, fn = tempfile.mkstemp()\n                fout = open(fn, 'w')\n                fout.write(content)\n                fout.close()\n                self.install_custom(fn=fn)\n            else:\n                raise NotImplementedError\n        return lst", "response": "Installs system packages listed as required by services this host uses."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef uninstall_blacklisted(self):\n        from burlap.system import distrib_family\n        blacklisted_packages = self.env.blacklisted_packages\n        if not blacklisted_packages:\n            print('No blacklisted packages.')\n            return\n        else:\n            family = distrib_family()\n            if family == DEBIAN:\n                self.sudo('DEBIAN_FRONTEND=noninteractive apt-get -yq purge %s' % ' '.join(blacklisted_packages))\n            else:\n                raise NotImplementedError('Unknown family: %s' % family)", "response": "Uninstalls all blacklisted packages."}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\ndeploy all crontabs to the host.", "response": "def deploy(self, site=None):\n        \"\"\"\n        Writes entire crontab to the host.\n        \"\"\"\n        r = self.local_renderer\n\n        self.deploy_logrotate()\n\n        cron_crontabs = []\n#         if self.verbose:\n#             print('hostname: \"%s\"' % (hostname,), file=sys.stderr)\n        for _site, site_data in self.iter_sites(site=site):\n            r.env.cron_stdout_log = r.format(r.env.stdout_log_template)\n            r.env.cron_stderr_log = r.format(r.env.stderr_log_template)\n            r.sudo('touch {cron_stdout_log}')\n            r.sudo('touch {cron_stderr_log}')\n            r.sudo('sudo chown {user}:{user} {cron_stdout_log}')\n            r.sudo('sudo chown {user}:{user} {cron_stderr_log}')\n\n            if self.verbose:\n                print('site:', site, file=sys.stderr)\n                print('env.crontabs_selected:', self.env.crontabs_selected, file=sys.stderr)\n\n            for selected_crontab in self.env.crontabs_selected:\n                lines = self.env.crontabs_available.get(selected_crontab, [])\n                if self.verbose:\n                    print('lines:', lines, file=sys.stderr)\n                for line in lines:\n                    cron_crontabs.append(r.format(line))\n\n        if not cron_crontabs:\n            return\n\n        cron_crontabs = self.env.crontab_headers + cron_crontabs\n        cron_crontabs.append('\\n')\n        r.env.crontabs_rendered = '\\n'.join(cron_crontabs)\n        fn = self.write_to_file(content=r.env.crontabs_rendered)\n        print('fn:', fn)\n        r.env.put_remote_path = r.put(local_path=fn)\n        if isinstance(r.env.put_remote_path, (tuple, list)):\n            r.env.put_remote_path = r.env.put_remote_path[0]\n        r.sudo('crontab -u {cron_user} {put_remote_path}')"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef load(self, dump_fn='', prep_only=0, force_upload=0, from_local=0, name=None, site=None, dest_dir=None):\n\n        r = self.database_renderer(name=name, site=site)\n\n        # Render the snapshot filename.\n        r.env.dump_fn = self.get_default_db_fn(fn_template=dump_fn, dest_dir=dest_dir)\n\n        from_local = int(from_local)\n\n        prep_only = int(prep_only)\n\n        missing_local_dump_error = r.format('Database dump file {dump_fn} does not exist.')\n\n        # Copy snapshot file to target.\n        if self.is_local:\n            r.env.remote_dump_fn = dump_fn\n        else:\n            r.env.remote_dump_fn = '/tmp/' + os.path.split(r.env.dump_fn)[-1]\n\n        if not prep_only and not self.is_local:\n            if not self.dryrun:\n                assert os.path.isfile(r.env.dump_fn), missing_local_dump_error\n            r.pc('Uploading MongoDB database snapshot...')\n#                 r.put(\n#                     local_path=r.env.dump_fn,\n#                     remote_path=r.env.remote_dump_fn)\n            r.local('rsync -rvz --progress --no-p --no-g '\n                '--rsh \"ssh -o StrictHostKeyChecking=no -i {key_filename}\" '\n                '{dump_fn} {user}@{host_string}:{remote_dump_fn}')\n\n        if self.is_local and not prep_only and not self.dryrun:\n            assert os.path.isfile(r.env.dump_fn), missing_local_dump_error\n\n        r.run_or_local(r.env.load_command)", "response": "Load a database snapshot into the target database server."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nopen a SQL shell to the given database.", "response": "def shell(self, name='default', user=None, password=None, root=0, verbose=1, write_password=1, no_db=0, no_pw=0):\n        \"\"\"\n        Opens a SQL shell to the given database, assuming the configured database\n        and user supports this feature.\n        \"\"\"\n        raise NotImplementedError"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef update_settings(self, d, role, path='roles/{role}/settings.yaml'):\n        try:\n            import ruamel.yaml\n            load_func = ruamel.yaml.round_trip_load\n            dump_func = ruamel.yaml.round_trip_dump\n        except ImportError:\n            print('Warning: ruamel.yaml not available, reverting to yaml package, possible lost of formatting may occur.')\n            import yaml\n            load_func = yaml.load\n            dump_func = yaml.dump\n        settings_fn = path.format(role=role)\n        data = load_func(open(settings_fn))\n        data.update(d)\n        settings_str = dump_func(data)\n        open(settings_fn, 'w').write(settings_str)", "response": "Updates key-value pairs in a role's settings YAML file."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef configure_bleeding(self):\n        lm = self.last_manifest\n        r = self.local_renderer\n        if self.env.bleeding and not lm.bleeding:\n            # Install.\n            r.append(\n                text='deb http://www.rabbitmq.com/debian/ testing main',\n                filename='/etc/apt/sources.list.d/rabbitmq.list',\n                use_sudo=True)\n            r.sudo('cd /tmp; wget -O- https://www.rabbitmq.com/rabbitmq-release-signing-key.asc | sudo apt-key add -')\n            r.sudo('apt-get update')\n\n        elif not self.env.bleeding and lm.bleeding:\n            # Uninstall.\n            r.sudo('rm -f /etc/apt/sources.list.d/rabbitmq.list')\n            r.sudo('apt-get update')", "response": "Adds the RabbitMQ testing apt repository when the bleeding setting is newly enabled, and removes it when the setting is newly disabled."}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\ninstalls and configures RabbitMQ.", "response": "def _configure_users(self, site=None, full=0, only_data=0):\n        \"\"\"\n        Installs and configures RabbitMQ.\n        \"\"\"\n\n        site = site or ALL\n\n        full = int(full)\n\n        if full and not only_data:\n            packager = self.get_satchel('packager')\n            packager.install_required(type=SYSTEM, service=self.name)\n\n        r = self.local_renderer\n\n        params = self.get_user_vhosts(site=site) # [(user, password, vhost)]\n\n        with settings(warn_only=True):\n            self.add_admin_user()\n\n        params = sorted(list(params))\n        if not only_data:\n            for user, password, vhost in params:\n                r.env.broker_user = user\n                r.env.broker_password = password\n                r.env.broker_vhost = vhost\n                with settings(warn_only=True):\n                    r.sudo('rabbitmqctl add_user {broker_user} {broker_password}')\n                    r.sudo('rabbitmqctl add_vhost {broker_vhost}')\n                    r.sudo('rabbitmqctl set_permissions -p {broker_vhost} {broker_user} \".*\" \".*\" \".*\"')\n                    r.sudo('rabbitmqctl set_permissions -p {broker_vhost} {admin_username} \".*\" \".*\" \".*\"')\n\n        return params"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef record_manifest(self):\n        data = super(RabbitMQSatchel, self).record_manifest()\n        params = sorted(list(self.get_user_vhosts())) # [(user, password, vhost)]\n        data['rabbitmq_all_site_vhosts'] = params\n        data['sites'] = list(self.genv.sites or [])\n        return data", "response": "Returns a dictionary representing a serialized state of the service."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nreturning a generator yielding all the keys that have values that differ between each dictionary.", "response": "def iter_dict_differences(a, b):\n    \"\"\"\n    Returns a generator yielding all the keys that have values that differ between each dictionary.\n    \"\"\"\n    common_keys = set(a).union(b)\n    for k in common_keys:\n        a_value = a.get(k)\n        b_value = b.get(k)\n        if a_value != b_value:\n            yield k, (a_value, b_value)"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef get_component_order(component_names):\n    assert isinstance(component_names, (tuple, list))\n    component_dependences = {}\n    for _name in component_names:\n        deps = set(manifest_deployers_befores.get(_name, []))\n        deps = deps.intersection(component_names)\n        component_dependences[_name] = deps\n    component_order = list(topological_sort(component_dependences.items()))\n    return component_order", "response": "Given a list of components, reorders them according to inter-component dependencies."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef get_deploy_funcs(components, current_thumbprint, previous_thumbprint, preview=False):\n    for component in components:\n        funcs = manifest_deployers.get(component, [])\n        for func_name in funcs:\n\n            #TODO:remove this after burlap.* naming prefix bug fixed\n            if func_name.startswith('burlap.'):\n                print('skipping %s' % func_name)\n                continue\n\n            takes_diff = manifest_deployers_takes_diff.get(func_name, False)\n\n            func = resolve_deployer(func_name)\n            current = current_thumbprint.get(component)\n            last = previous_thumbprint.get(component)\n            if takes_diff:\n                yield func_name, partial(func, last=last, current=current)\n            else:\n                yield func_name, partial(func)", "response": "Returns a generator yielding the named functions needed for a deployment."}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nreturns the path to the manifest file.", "response": "def manifest_filename(self):\n        \"\"\"\n        Returns the path to the manifest file.\n        \"\"\"\n        r = self.local_renderer\n        tp_fn = r.format(r.env.data_dir + '/manifest.yaml')\n        return tp_fn"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\nreturning a dictionary representing the current configuration state.", "response": "def get_current_thumbprint(self, components=None):\n        \"\"\"\n        Returns a dictionary representing the current configuration state.\n\n        Thumbprint is of the form:\n\n            {\n                component_name1: {key: value},\n                component_name2: {key: value},\n                ...\n            }\n\n        \"\"\"\n        components = str_to_component_list(components)\n        if self.verbose:\n            print('deploy.get_current_thumbprint.components:', components)\n        manifest_data = {} # {component:data}\n        for component_name, func in sorted(manifest_recorder.items()):\n            self.vprint('Checking thumbprint for component %s...' % component_name)\n            manifest_key = assert_valid_satchel(component_name)\n            service_name = clean_service_name(component_name)\n            if service_name not in self.genv.services:\n                self.vprint('Skipping unused component:', component_name)\n                continue\n            elif components and service_name not in components:\n                self.vprint('Skipping non-matching component:', component_name)\n                continue\n            try:\n                self.vprint('Retrieving manifest for %s...' % component_name)\n                manifest_data[manifest_key] = func()\n                if self.verbose:\n                    pprint(manifest_data[manifest_key], indent=4)\n            except exceptions.AbortDeployment as e:\n                raise\n        return manifest_data"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef get_previous_thumbprint(self, components=None):\n        components = str_to_component_list(components)\n        tp_fn = self.manifest_filename\n        tp_text = None\n        if self.file_exists(tp_fn):\n            fd = six.BytesIO()\n            get(tp_fn, fd)\n            tp_text = fd.getvalue()\n            manifest_data = {}\n            raw_data = yaml.load(tp_text)\n            for k, v in raw_data.items():\n                manifest_key = assert_valid_satchel(k)\n                service_name = clean_service_name(k)\n                if components and service_name not in components:\n                    continue\n                manifest_data[manifest_key] = v\n            return manifest_data", "response": "Returns a dictionary representing the previous configuration state."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef lock(self):\n        self.init()\n        r = self.local_renderer\n        if self.file_exists(r.env.lockfile_path):\n            raise exceptions.AbortDeployment('Lock file %s exists. Perhaps another deployment is currently underway?' % r.env.lockfile_path)\n        else:\n            self.vprint('Locking %s.' % r.env.lockfile_path)\n            r.env.hostname = socket.gethostname()\n            r.run_or_local('echo \"{hostname}\" > {lockfile_path}')", "response": "Creates a deployment lock file on the target host, aborting if one already exists."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef unlock(self):\n        self.init()\n        r = self.local_renderer\n        if self.file_exists(r.env.lockfile_path):\n            self.vprint('Unlocking %s.' % r.env.lockfile_path)\n            r.run_or_local('rm -f {lockfile_path}')", "response": "Removes the deployment lock file, if it exists."}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\ncreates a fake deployment of the current satchel.", "response": "def fake(self, components=None):#, set_satchels=None):\n        \"\"\"\n        Update the thumbprint on the remote server but execute no satchel configurators.\n\n        components = A comma-delimited list of satchel names to limit the fake deployment to.\n        set_satchels = A semi-colon delimited list of key-value pairs to set in satchels before recording a fake deployment.\n        \"\"\"\n\n        self.init()\n\n        # In cases where we only want to fake deployment of a specific satchel, then simply copy the last thumbprint and overwrite with a subset\n        # of the current thumbprint filtered by our target components.\n        if components:\n            current_tp = self.get_previous_thumbprint() or {}\n            current_tp.update(self.get_current_thumbprint(components=components) or {})\n        else:\n            current_tp = self.get_current_thumbprint(components=components) or {}\n\n        tp_text = yaml.dump(current_tp)\n        r = self.local_renderer\n        r.upload_content(content=tp_text, fn=self.manifest_filename)\n\n        # Ensure all cached manifests are cleared, so they reflect the newly deployed changes.\n        self.reset_all_satchels()"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef get_component_funcs(self, components=None):\n\n        current_tp = self.get_current_thumbprint(components=components) or {}\n        previous_tp = self.get_previous_thumbprint(components=components) or {}\n\n        if self.verbose:\n            print('Current thumbprint:')\n            pprint(current_tp, indent=4)\n            print('Previous thumbprint:')\n            pprint(previous_tp, indent=4)\n\n        differences = list(iter_dict_differences(current_tp, previous_tp))\n        if self.verbose:\n            print('Differences:')\n            pprint(differences, indent=4)\n        component_order = get_component_order([k for k, (_, _) in differences])\n        if self.verbose:\n            print('component_order:')\n            pprint(component_order, indent=4)\n        plan_funcs = list(get_deploy_funcs(component_order, current_tp, previous_tp))\n\n        return component_order, plan_funcs", "response": "Calculates the component functions that need to be executed for a deployment."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef preview(self, components=None, ask=0):\n\n        ask = int(ask)\n\n        self.init()\n\n        component_order, plan_funcs = self.get_component_funcs(components=components)\n\n        print('\\n%i changes found for host %s.\\n' % (len(component_order), self.genv.host_string))\n        if component_order and plan_funcs:\n            if self.verbose:\n                print('These components have changed:\\n')\n                for component in sorted(component_order):\n                    print((' '*4)+component)\n            print('Deployment plan for host %s:\\n' % self.genv.host_string)\n            for func_name, _ in plan_funcs:\n                print(success_str((' '*4)+func_name))\n        if component_order:\n            print()\n\n        if ask and self.genv.host_string == self.genv.hosts[-1]:\n            if component_order:\n                if not raw_input('Begin deployment? [yn] ').strip().lower().startswith('y'):\n                    sys.exit(0)\n            else:\n                sys.exit(0)", "response": "Inspects differences between the last deployment and the current code state."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef push(self, components=None, yes=0):\n        from burlap import notifier\n        service = self.get_satchel('service')\n        self.lock()\n        try:\n\n            yes = int(yes)\n            if not yes:\n                # If we want to confirm the deployment with the user, and we're at the first server,\n                # then run the preview.\n                if self.genv.host_string == self.genv.hosts[0]:\n                    execute(partial(self.preview, components=components, ask=1))\n\n            notifier.notify_pre_deployment()\n            component_order, plan_funcs = self.get_component_funcs(components=components)\n\n            service.pre_deploy()\n            for func_name, plan_func in plan_funcs:\n                print('Executing %s...' % func_name)\n                plan_func()\n            self.fake(components=components)\n\n            service.post_deploy()\n            notifier.notify_post_deployment()\n\n        finally:\n            self.unlock()", "response": "Execute all satchel configurators to apply pending changes to the server."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef get_thumbprint(self):\n        d = {}\n        settings = dj.get_settings()\n        for name in self.names:\n            d[name] = getattr(settings, name)\n        return d", "response": "Calculates the current thumbprint of the item being tracked."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef get_settings(self, site=None, role=None):\n        r = self.local_renderer\n        _stdout = sys.stdout\n        _stderr = sys.stderr\n        if not self.verbose:\n            sys.stdout = StringIO()\n            sys.stderr = StringIO()\n        try:\n            sys.path.insert(0, r.env.src_dir)\n\n            # Temporarily override SITE.\n            tmp_site = self.genv.SITE\n            if site and site.endswith('_secure'):\n                site = site[:-7]\n            site = site or self.genv.SITE or self.genv.default_site\n            self.set_site(site)\n\n            # Temporarily override ROLE.\n            tmp_role = self.genv.ROLE\n            if role:\n                self.set_role(role)\n\n            try:\n                # We need to explicitly delete sub-modules from sys.modules. Otherwise, reload() skips\n                # them and they'll continue to contain obsolete settings.\n                if r.env.delete_module_with_prefixes:\n                    for name in sorted(sys.modules):\n                        for prefix in r.env.delete_module_with_prefixes:\n                            if name.startswith(prefix):\n                                if self.verbose:\n                                    print('Deleting module %s prior to re-import.' 
% name)\n                                del sys.modules[name]\n                                break\n\n                for name in list(sys.modules):\n                    for s in r.env.delete_module_containing:\n                        if s in name:\n                            del sys.modules[name]\n                            break\n\n                if r.env.settings_module in sys.modules:\n                    del sys.modules[r.env.settings_module]\n\n                #TODO:fix r.env.settings_module not loading from settings?\n#                 print('r.genv.django_settings_module:', r.genv.django_settings_module, file=_stdout)\n#                 print('r.genv.dj_settings_module:', r.genv.dj_settings_module, file=_stdout)\n#                 print('r.env.settings_module:', r.env.settings_module, file=_stdout)\n                if 'django_settings_module' in r.genv:\n                    r.env.settings_module = r.genv.django_settings_module\n                else:\n                    r.env.settings_module = r.env.settings_module or r.genv.dj_settings_module\n                if self.verbose:\n                    print('r.env.settings_module:', r.env.settings_module, r.format(r.env.settings_module))\n                module = import_module(r.format(r.env.settings_module))\n\n                if site:\n                    assert site == module.SITE, 'Unable to set SITE to \"%s\" Instead it is set to \"%s\".' 
% (site, module.SITE)\n\n                # Works as long as settings.py doesn't also reload anything.\n                import imp\n                imp.reload(module)\n\n            except ImportError as e:\n                print('Warning: Could not import settings for site \"%s\": %s' % (site, e), file=_stdout)\n                traceback.print_exc(file=_stdout)\n                #raise # breaks *_secure pseudo sites\n                return\n            finally:\n                if tmp_site:\n                    self.set_site(tmp_site)\n                if tmp_role:\n                    self.set_role(tmp_role)\n        finally:\n            sys.stdout = _stdout\n            sys.stderr = _stderr\n            sys.path.remove(r.env.src_dir)\n        return module", "response": "Retrieves the Django settings module for the given site and role, or None if it cannot be imported."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\ninstalling all custom SQL files.", "response": "def install_sql(self, site=None, database='default', apps=None, stop_on_error=0, fn=None):\n        \"\"\"\n        Installs all custom SQL.\n        \"\"\"\n        #from burlap.db import load_db_set\n\n        stop_on_error = int(stop_on_error)\n\n        site = site or ALL\n\n        name = database\n\n        r = self.local_renderer\n        paths = glob.glob(r.format(r.env.install_sql_path_template))\n\n        apps = [_ for _ in (apps or '').split(',') if _.strip()]\n        if self.verbose:\n            print('install_sql.apps:', apps)\n\n        def cmp_paths(d0, d1):\n            if d0[1] and d0[1] in d1[2]:\n                return -1\n            if d1[1] and d1[1] in d0[2]:\n                return +1\n            return cmp(d0[0], d1[0])\n\n        def get_paths(t):\n            \"\"\"\n            Returns SQL file paths in an execution order that respect dependencies.\n            \"\"\"\n            data = [] # [(path, view_name, content)]\n            for path in paths:\n                if fn and fn not in path:\n                    continue\n                parts = path.split('.')\n                if len(parts) == 3 and parts[1] != t:\n                    continue\n                if not path.lower().endswith('.sql'):\n                    continue\n                content = open(path, 'r').read()\n                matches = re.findall(r'[\\s\\t]+VIEW[\\s\\t]+([a-zA-Z0-9_]{3,})', content, flags=re.IGNORECASE)\n                view_name = ''\n                if matches:\n                    view_name = matches[0]\n                    print('Found view %s.' 
% view_name)\n                data.append((path, view_name, content))\n            for d in sorted(data, cmp=cmp_paths):\n                yield d[0]\n\n        def run_paths(paths, cmd_template, max_retries=3):\n            r = self.local_renderer\n            paths = list(paths)\n            error_counts = defaultdict(int) # {path:count}\n            terminal = set()\n            if self.verbose:\n                print('Checking %i paths.' % len(paths))\n            while paths:\n                path = paths.pop(0)\n                if self.verbose:\n                    print('path:', path)\n                app_name = re.findall(r'/([^/]+)/sql/', path)[0]\n                if apps and app_name not in apps:\n                    self.vprint('skipping because app_name %s not in apps' % app_name)\n                    continue\n                with self.settings(warn_only=True):\n                    if self.is_local:\n                        r.env.sql_path = path\n                    else:\n                        r.env.sql_path = '/tmp/%s' % os.path.split(path)[-1]\n                        r.put(local_path=path, remote_path=r.env.sql_path)\n                    ret = r.run_or_local(cmd_template)\n                    if ret and ret.return_code:\n\n                        if stop_on_error:\n                            raise Exception('Unable to execute file %s' % path)\n\n                        error_counts[path] += 1\n                        if error_counts[path] < max_retries:\n                            paths.append(path)\n                        else:\n                            terminal.add(path)\n            if terminal:\n                print('%i files could not be loaded.' 
% len(terminal), file=sys.stderr)\n                for path in sorted(list(terminal)):\n                    print(path, file=sys.stderr)\n                print(file=sys.stderr)\n\n        if self.verbose:\n            print('install_sql.db_engine:', r.env.db_engine)\n\n        for _site, site_data in self.iter_sites(site=site, no_secure=True):\n\n            self.set_db(name=name, site=_site)\n\n            if 'postgres' in r.env.db_engine or 'postgis' in r.env.db_engine:\n                paths = list(get_paths('postgresql'))\n                run_paths(\n                    paths=paths,\n                    cmd_template=\"psql --host={db_host} --user={db_user} --no-password -d {db_name} -f {sql_path}\")\n\n            elif 'mysql' in r.env.db_engine:\n                paths = list(get_paths('mysql'))\n                run_paths(\n                    paths=paths,\n                    cmd_template=\"mysql -v -h {db_host} -u {db_user} -p'{db_password}' {db_name} < {sql_path}\")\n\n            else:\n                raise NotImplementedError"}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\nruns the Django createsuperuser management command.", "response": "def createsuperuser(self, username='admin', email=None, password=None, site=None):\n        \"\"\"\n        Runs the Django createsuperuser management command.\n        \"\"\"\n        r = self.local_renderer\n        site = site or self.genv.SITE\n        self.set_site_specifics(site)\n        options = ['--username=%s' % username]\n        if email:\n            options.append('--email=%s' % email)\n        if password:\n            options.append('--password=%s' % password)\n        r.env.options_str = ' '.join(options)\n        if self.is_local:\n            r.env.project_dir = r.env.local_project_dir\n        r.genv.SITE = r.genv.SITE or site\n        r.run_or_local('export SITE={SITE}; export ROLE={ROLE}; cd {project_dir}; {manage_cmd} {createsuperuser_cmd} {options_str}')"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\nrunning the Django loaddata management command.", "response": "def loaddata(self, path, site=None):\n        \"\"\"\n        Runs the Django loaddata management command.\n\n        By default, runs only on the current site.\n\n        Pass site=all to run on all sites.\n        \"\"\"\n        site = site or self.genv.SITE\n        r = self.local_renderer\n        r.env._loaddata_path = path\n        for _site, site_data in self.iter_sites(site=site, no_secure=True):\n            try:\n                self.set_db(site=_site)\n                r.env.SITE = _site\n                r.sudo('export SITE={SITE}; export ROLE={ROLE}; '\n                    'cd {project_dir}; '\n                    '{manage_cmd} loaddata {_loaddata_path}')\n            except KeyError:\n                pass"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef manage_all(self, *args, **kwargs):\n        for site, site_data in self.iter_unique_databases(site='all'):\n            if self.verbose:\n                print('-'*80, file=sys.stderr)\n                print('site:', site, file=sys.stderr)\n            if self.env.available_sites_by_host:\n                hostname = self.current_hostname\n                sites_on_host = self.env.available_sites_by_host.get(hostname, [])\n                if sites_on_host and site not in sites_on_host:\n                    self.vprint('skipping site:', site, sites_on_host, file=sys.stderr)\n                    continue\n            self.manage(*args, **kwargs)", "response": "Runs manage on all unique site default databases."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef load_django_settings(self):\n        r = self.local_renderer\n\n        # Save environment variables so we can restore them later.\n        _env = {}\n        save_vars = ['ALLOW_CELERY', 'DJANGO_SETTINGS_MODULE']\n        for var_name in save_vars:\n            _env[var_name] = os.environ.get(var_name)\n\n        try:\n\n            # Allow us to import local app modules.\n            if r.env.local_project_dir:\n                sys.path.insert(0, r.env.local_project_dir)\n\n            #TODO:remove this once bug in django-celery has been fixed\n            os.environ['ALLOW_CELERY'] = '0'\n\n#             print('settings_module:', r.format(r.env.settings_module))\n            os.environ['DJANGO_SETTINGS_MODULE'] = r.format(r.env.settings_module)\n#             os.environ['CELERY_LOADER'] = 'django'\n#             os.environ['SITE'] = r.genv.SITE or r.genv.default_site\n#             os.environ['ROLE'] = r.genv.ROLE or r.genv.default_role\n\n            # In Django >= 1.7, fixes the error AppRegistryNotReady: Apps aren't loaded yet\n            # Disabling, in Django >= 1.10, throws exception:\n            # RuntimeError: Model class django.contrib.contenttypes.models.ContentType\n            # doesn't declare an explicit app_label and isn't in an application in INSTALLED_APPS.\n#             try:\n#                 from django.core.wsgi import get_wsgi_application\n#                 application = get_wsgi_application()\n#             except (ImportError, RuntimeError):\n#                 raise\n#                 print('Unable to get wsgi application.')\n#                 traceback.print_exc()\n\n            # In Django >= 1.7, fixes the error AppRegistryNotReady: Apps aren't loaded yet\n            try:\n                import django\n                django.setup()\n            except AttributeError:\n                # This doesn't exist in Django 
< 1.7, so ignore it.\n                pass\n\n            # Load Django settings.\n            settings = self.get_settings()\n            try:\n                from django.contrib import staticfiles\n                from django.conf import settings as _settings\n\n                # get_settings() doesn't raise ImportError but returns None instead\n                if settings is not None:\n                    for k, v in settings.__dict__.items():\n                        setattr(_settings, k, v)\n                else:\n                    raise ImportError\n            except (ImportError, RuntimeError):\n                print('Unable to load settings.')\n                traceback.print_exc()\n\n        finally:\n            # Restore environment variables.\n            for var_name, var_value in _env.items():\n                if var_value is None:\n                    del os.environ[var_name]\n                else:\n                    os.environ[var_name] = var_value\n\n        return settings", "response": "Loads Django settings for the current site and sets them so Django internals can be run later."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef shell(self):\n        r = self.local_renderer\n        if '@' in self.genv.host_string:\n            r.env.shell_host_string = self.genv.host_string\n        else:\n            r.env.shell_host_string = '{user}@{host_string}'\n        r.env.shell_default_dir = self.genv.shell_default_dir_template\n        r.env.shell_interactive_djshell_str = self.genv.interactive_shell_template\n        r.run_or_local('ssh -t -i {key_filename} {shell_host_string} \"{shell_interactive_djshell_str}\"')", "response": "Opens a Django-focused Python shell on the host over SSH."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef syncdb(self, site=None, all=0, database=None, ignore_errors=1): # pylint: disable=redefined-builtin\n        r = self.local_renderer\n\n        ignore_errors = int(ignore_errors)\n\n        post_south = self.version_tuple >= (1, 7, 0)\n\n        use_run_syncdb = self.version_tuple >= (1, 9, 0)\n\n        # DEPRECATED: removed in Django>=1.7\n        r.env.db_syncdb_all_flag = '--all' if int(all) else ''\n\n        r.env.db_syncdb_database = ''\n        if database:\n            r.env.db_syncdb_database = ' --database=%s' % database\n\n        if self.is_local:\n            r.env.project_dir = r.env.local_project_dir\n\n        site = site or self.genv.SITE\n        for _site, site_data in r.iter_unique_databases(site=site):\n            r.env.SITE = _site\n            with self.settings(warn_only=ignore_errors):\n                if post_south:\n                    if use_run_syncdb:\n                        r.run_or_local(\n                            'export SITE={SITE}; export ROLE={ROLE}; cd {project_dir}; '\n                            '{manage_cmd} migrate --run-syncdb --noinput {db_syncdb_database}')\n                    else:\n                        # Between Django>=1.7,<1.9 we can only do a regular migrate, no true syncdb.\n                        r.run_or_local(\n                            'export SITE={SITE}; export ROLE={ROLE}; cd {project_dir}; '\n                            '{manage_cmd} migrate --noinput {db_syncdb_database}')\n                else:\n                    r.run_or_local(\n                        'export SITE={SITE}; export ROLE={ROLE}; cd {project_dir}; '\n                        '{manage_cmd} syncdb --noinput {db_syncdb_all_flag} {db_syncdb_database}')", "response": "Runs the standard Django syncdb command for one or more sites."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\nrunning the standard South migrate command for one or more sites.", "response": "def migrate(self, app='', migration='', site=None, fake=0, ignore_errors=None, skip_databases=None, database=None, migrate_apps='', delete_ghosts=1):\n        \"\"\"\n        Runs the standard South migrate command for one or more sites.\n        \"\"\"\n    #     Note, to pass a comma-delimted list in a fab command, escape the comma with a back slash.\n    #\n    #         e.g.\n    #\n    #             fab staging dj.migrate:migrate_apps=oneapp\\,twoapp\\,threeapp\n\n        r = self.local_renderer\n\n        ignore_errors = int(r.env.ignore_migration_errors if ignore_errors is None else ignore_errors)\n\n        delete_ghosts = int(delete_ghosts)\n\n        post_south = self.version_tuple >= (1, 7, 0)\n\n        if self.version_tuple >= (1, 9, 0):\n            delete_ghosts = 0\n\n        skip_databases = (skip_databases or '')\n        if isinstance(skip_databases, six.string_types):\n            skip_databases = [_.strip() for _ in skip_databases.split(',') if _.strip()]\n\n        migrate_apps = migrate_apps or ''\n        migrate_apps = [\n            _.strip().split('.')[-1]\n            for _ in migrate_apps.strip().split(',')\n            if _.strip()\n        ]\n        if app:\n            migrate_apps.append(app)\n\n        r.env.migrate_migration = migration or ''\n        r.env.migrate_fake_str = '--fake' if int(fake) else ''\n        r.env.migrate_database = '--database=%s' % database if database else ''\n        r.env.migrate_merge = '--merge' if not post_south else ''\n        r.env.delete_ghosts = '--delete-ghost-migrations' if delete_ghosts and not post_south else ''\n        self.vprint('project_dir0:', r.env.project_dir, r.genv.get('dj_project_dir'), r.genv.get('project_dir'))\n        self.vprint('migrate_apps:', migrate_apps)\n\n        if self.is_local:\n            r.env.project_dir = 
r.env.local_project_dir\n\n        # CS 2017-3-29 Don't bypass the iterator. That causes reversion to the global env that could corrupt the generated commands.\n        #databases = list(self.iter_unique_databases(site=site))#TODO:remove\n        # CS 2017-4-24 Don't specify a single site as the default when none is supplied. Otherwise all other sites will be ignored.\n        #site = site or self.genv.SITE\n        site = site or ALL\n        databases = self.iter_unique_databases(site=site)\n        for _site, site_data in databases:\n            self.vprint('-'*80, file=sys.stderr)\n            self.vprint('site:', _site, file=sys.stderr)\n\n            if self.env.available_sites_by_host:\n                hostname = self.current_hostname\n                sites_on_host = self.env.available_sites_by_host.get(hostname, [])\n                if sites_on_host and _site not in sites_on_host:\n                    self.vprint('skipping site:', _site, sites_on_host, file=sys.stderr)\n                    continue\n\n            if not migrate_apps:\n                migrate_apps.append(' ')\n\n            for _app in migrate_apps:\n                # In cases where we're migrating built-in apps or apps with dotted names\n                # e.g. django.contrib.auth, extract the name used for the migrate command.\n                r.env.migrate_app = _app.split('.')[-1]\n                self.vprint('project_dir1:', r.env.project_dir, r.genv.get('dj_project_dir'), r.genv.get('project_dir'))\n                r.env.SITE = _site\n                with self.settings(warn_only=ignore_errors):\n                    r.run_or_local(\n                        'export SITE={SITE}; export ROLE={ROLE}; {migrate_pre_command} cd {project_dir}; '\n                        '{manage_cmd} migrate --noinput {migrate_merge} --traceback '\n                        '{migrate_database} {delete_ghosts} {migrate_app} {migrate_migration} '\n                        '{migrate_fake_str}')"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\nstart a Django management command in a screen.", "response": "def manage_async(self, command='', name='process', site=ALL, exclude_sites='', end_message='', recipients=''):\n        \"\"\"\n        Starts a Django management command in a screen.\n\n        Parameters:\n\n            command :- all arguments passed to `./manage` as a single string\n\n            site :- the site to run the command for (default is all)\n\n        Designed to be run like:\n\n            fab <role> dj.manage_async:\"some_management_command --force\"\n\n        \"\"\"\n        exclude_sites = exclude_sites.split(':')\n        r = self.local_renderer\n        for _site, site_data in self.iter_sites(site=site, no_secure=True):\n            if _site in exclude_sites:\n                continue\n            r.env.SITE = _site\n            r.env.command = command\n            r.env.end_email_command = ''\n            r.env.recipients = recipients or ''\n            r.env.end_email_command = ''\n            if end_message:\n                end_message = end_message + ' for ' + _site\n                end_message = end_message.replace(' ', '_')\n                r.env.end_message = end_message\n                r.env.end_email_command = r.format('{manage_cmd} send_mail --subject={end_message} --recipients={recipients}')\n            r.env.name = name.format(**r.genv)\n            r.run(\n                'screen -dmS {name} bash -c \"export SITE={SITE}; '\\\n                'export ROLE={ROLE}; cd {project_dir}; '\\\n                '{manage_cmd} {command} --traceback; {end_email_command}\"; sleep 3;')"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef get_media_timestamp(self, last_timestamp=None):\n        r = self.local_renderer\n        _latest_timestamp = -1e9999999999999999\n        for path in self.iter_static_paths():\n            path = r.env.static_root + '/' + path\n            self.vprint('checking timestamp of path:', path)\n            if not os.path.isfile(path):\n                continue\n            #print('path:', path)\n            _latest_timestamp = max(_latest_timestamp, get_last_modified_timestamp(path) or _latest_timestamp)\n            if last_timestamp is not None and _latest_timestamp > last_timestamp:\n                break\n        self.vprint('latest_timestamp:', _latest_timestamp)\n        return _latest_timestamp", "response": "Retrieves the most recent modification timestamp of the media files in the static root. If last_timestamp is given, stops at the first timestamp found that is more recent than that value and returns it."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\nset the root login for the given database on the given host and sets the root login to environment variables.", "response": "def set_root_login(self, r):\n        \"\"\"\n        Looks up the root login for the given database on the given host and sets\n        it to environment variables.\n\n        Populates these standard variables:\n\n            db_root_password\n            db_root_username\n\n        \"\"\"\n\n        # Check the legacy password location.\n        try:\n            r.env.db_root_username = r.env.root_username\n        except AttributeError:\n            pass\n        try:\n            r.env.db_root_password = r.env.root_password\n        except AttributeError:\n            pass\n\n        # Check the new password location.\n        key = r.env.get('db_host')\n        if self.verbose:\n            print('db.set_root_login.key:', key)\n            print('db.set_root_logins:', r.env.root_logins)\n        if key in r.env.root_logins:\n            data = r.env.root_logins[key]\n#             print('data:', data)\n            if 'username' in data:\n                r.env.db_root_username = data['username']\n                r.genv.db_root_username = data['username']\n            if 'password' in data:\n                r.env.db_root_password = data['password']\n                r.genv.db_root_password = data['password']\n        else:\n            msg = 'Warning: No root login entry found for host %s in role %s.' % (r.env.get('db_host'), self.genv.get('ROLE'))\n            print(msg, file=sys.stderr)"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef database_renderer(self, name=None, site=None, role=None):\n\n        name = name or self.env.default_db_name\n\n        site = site or self.genv.SITE\n\n        role = role or self.genv.ROLE\n\n        key = (name, site, role)\n        self.vprint('checking key:', key)\n        if key not in self._database_renderers:\n            self.vprint('No cached db renderer, generating...')\n\n            if self.verbose:\n                print('db.name:', name)\n                print('db.databases:', self.env.databases)\n                print('db.databases[%s]:' % name, self.env.databases.get(name))\n\n            d = type(self.genv)(self.lenv)\n            d.update(self.get_database_defaults())\n            d.update(self.env.databases.get(name, {}))\n            d['db_name'] = name\n            if self.verbose:\n                print('db.d:')\n                pprint(d, indent=4)\n                print('db.connection_handler:', d.connection_handler)\n\n            if d.connection_handler == CONNECTION_HANDLER_DJANGO:\n                self.vprint('Using django handler...')\n                dj = self.get_satchel('dj')\n                if self.verbose:\n                    print('Loading Django DB settings for site {} and role {}.'.format(site, role), file=sys.stderr)\n                dj.set_db(name=name, site=site, role=role)\n                _d = dj.local_renderer.collect_genv(include_local=True, include_global=False)\n\n                # Copy \"dj_db_*\" into \"db_*\".\n                for k, v in _d.items():\n                    if k.startswith('dj_db_'):\n                        _d[k[3:]] = v\n                    del _d[k]\n\n                if self.verbose:\n                    print('Loaded:')\n                    pprint(_d)\n                d.update(_d)\n\n            elif d.connection_handler and 
d.connection_handler.startswith(CONNECTION_HANDLER_CUSTOM+':'):\n\n                _callable_str = d.connection_handler[len(CONNECTION_HANDLER_CUSTOM+':'):]\n                self.vprint('Using custom handler %s...' % _callable_str)\n                _d = str_to_callable(_callable_str)(role=self.genv.ROLE)\n                if self.verbose:\n                    print('Loaded:')\n                    pprint(_d)\n                d.update(_d)\n\n            r = LocalRenderer(self, lenv=d)\n\n            # Optionally set any root logins needed for administrative commands.\n            self.set_root_login(r)\n\n            self._database_renderers[key] = r\n        else:\n            self.vprint('Cached db renderer found.')\n\n        return self._database_renderers[key]", "response": "Returns the cached database renderer for the given name, site and role, creating it first if none exists."}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\ngets the free space in bytes.", "response": "def get_free_space(self):\n        \"\"\"\n        Return free space in bytes.\n        \"\"\"\n        cmd = \"df -k | grep -vE '^Filesystem|tmpfs|cdrom|none|udev|cgroup' | awk '{ print $1 \\\" \\\" $4 }'\"\n        lines = [_ for _ in self.run(cmd).strip().split('\\n') if _.startswith('/')]\n        assert len(lines) == 1, 'Ambiguous devices: %s' % str(lines)\n        device, kb = lines[0].split(' ')\n        free_space = int(kb) * 1024\n        self.vprint('free_space (bytes):', free_space)\n        return free_space"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef load_db_set(self, name, r=None):\n        r = r or self\n        db_set = r.genv.db_sets.get(name, {})\n        r.genv.update(db_set)", "response": "Loads database parameters from a specific named set."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef loadable(self, src, dst):\n        from fabric import state\n        from fabric.task_utils import crawl\n\n        src_task = crawl(src, state.commands)\n        assert src_task, 'Unknown source role: %s' % src\n\n        dst_task = crawl(dst, state.commands)\n        assert dst_task, 'Unknown destination role: %s' % dst\n\n        # Get source database size.\n        src_task()\n        env.host_string = env.hosts[0]\n        src_size_bytes = self.get_size()\n\n        # Get target database size, if any.\n        dst_task()\n        env.host_string = env.hosts[0]\n        try:\n            dst_size_bytes = self.get_size()\n        except (ValueError, TypeError):\n            dst_size_bytes = 0\n\n        # Get target host disk size.\n        free_space_bytes = self.get_free_space()\n\n        # Deduct existing database size, because we'll be deleting it.\n        balance_bytes = free_space_bytes + dst_size_bytes - src_size_bytes\n        balance_bytes_scaled, units = pretty_bytes(balance_bytes)\n\n        viable = balance_bytes >= 0\n        if self.verbose:\n            print('src_db_size:', pretty_bytes(src_size_bytes))\n            print('dst_db_size:', pretty_bytes(dst_size_bytes))\n            print('dst_free_space:', pretty_bytes(free_space_bytes))\n            print()\n            if viable:\n                print('Viable! There will be %.02f %s of disk space left.' % (balance_bytes_scaled, units))\n            else:\n                print('Not viable! We would be %.02f %s short.' % (balance_bytes_scaled, units))\n\n        return viable", "response": "Determines whether the destination host has enough free disk space to load the source database."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef dump(self, dest_dir=None, to_local=1, from_local=0, archive=0, dump_fn=None, name=None, site=None, use_sudo=0, cleanup=1):\n        r = self.local_renderer\n\n        site = site or self.genv.SITE\n\n        r = self.database_renderer(name=name, site=site)\n\n        # Load optional site-specific command, if given.\n        try:\n            r.env.dump_command = self.genv.sites[site]['postgresql_dump_command']\n        except KeyError:\n            pass\n\n        use_sudo = int(use_sudo)\n\n        from_local = int(from_local)\n\n        to_local = int(to_local)\n\n        dump_fn = dump_fn or r.env.dump_fn_template\n\n        # Render the snapshot filename.\n        r.env.dump_fn = self.get_default_db_fn(\n            fn_template=dump_fn,\n            dest_dir=dest_dir,\n            name=name,\n            site=site,\n        )\n\n        # Dump the database to a snapshot file.\n        #if not os.path.isfile(os.path.abspath(r.env.dump_fn))):\n        r.pc('Dumping database snapshot.')\n        if from_local:\n            r.local(r.env.dump_command)\n        elif use_sudo:\n            r.sudo(r.env.dump_command)\n        else:\n            r.run(r.env.dump_command)\n\n        # Download the database dump file on the remote host to localhost.\n        if not from_local and to_local:\n            r.pc('Downloading database snapshot to localhost.')\n            r.local('rsync -rvz --progress --recursive --no-p --no-g '\n                '--rsh \"ssh -o StrictHostKeyChecking=no -i {key_filename}\" {user}@{host_string}:{dump_fn} {dump_fn}')\n\n            # Delete the snapshot file on the remote system.\n            if int(cleanup):\n                r.pc('Deleting database snapshot on remote host.')\n                r.sudo('rm {dump_fn}')\n\n        # Move the database snapshot to an archive directory.\n        if to_local and int(archive):\n            r.pc('Archiving 
database snapshot.')\n            db_fn = r.render_fn(r.env.dump_fn)\n            r.env.archive_fn = '%s/%s' % (env.db_dump_archive_dir, os.path.split(db_fn)[-1])\n            r.local('mv %s %s' % (db_fn, env.archive_fn))\n\n        return r.env.dump_fn", "response": "Dumps the database to a single transportable file on the localhost."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef show(keyword=''):\n    keyword = keyword.strip().lower()\n    max_len = max(len(k) for k in env.keys())\n    keyword_found = False\n    for k in sorted(env.keys()):\n        if keyword and keyword not in k.lower():\n            continue\n        keyword_found = True\n        #print '%s: %s' % (k, env[k])\n        print('%s: ' % (k.ljust(max_len),))\n        pprint(env[k], indent=4)\n    if keyword:\n        if not keyword_found:\n            print('Keyword \"%s\" not found.' % keyword)", "response": "Displays all environment key/value pairs for the current role, optionally filtered by keyword."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef record_manifest():\n    data = {}\n    # Record settings.\n    data['settings'] = dict(\n        (k, v)\n        for k, v in env.items()\n        if not isinstance(v, types.GeneratorType) and k.strip() and not k.startswith('_') and not callable(v)\n    )\n    # Record tarball hash.\n    # Record database migrations.\n    # Record media hash.\n    return data", "response": "Record the manifest for the current deployment."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\nfix the Ethernet name slot for the local renderer to eth0.", "response": "def fix_eth0_rename(self, hardware_addr):\n        \"\"\"\n        A bug as of 2016.10.10 causes eth0 to be renamed to enx*.\n        This renames it to eth0.\n\n        http://raspberrypi.stackexchange.com/q/43560/29103\n        \"\"\"\n        r = self.local_renderer\n        r.env.hardware_addr = hardware_addr\n        r.sudo('ln -s /dev/null /etc/udev/rules.d/80-net-name-slot.rules')\n        r.append(\n            text=r'SUBSYSTEM==\"net\", ACTION==\"add\", DRIVERS==\"?*\", '\\\n                r'ATTR\\{address\\}==\"{hardware_addr}\", '\\\n                r'ATTR\\{dev_id\\}==\"0x0\", '\\\n                r'ATTR\\{type\\}==\"1\", '\\\n                r'KERNEL==\"eth*\", NAME=\"eth0\"',\n            filename='/etc/udev/rules.d/70-persistent-net.rules',\n            use_sudo=True,\n        )"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef assume_localhost(self):\n        if not self.genv.host_string:\n            self.genv.host_string = 'localhost'\n            self.genv.hosts = ['localhost']\n            self.genv.user = getpass.getuser()", "response": "Defaults the host string and hosts to localhost, and the user to the current local user, when no host string has been set."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef init_raspbian_disk(self, yes=0):\n        self.assume_localhost()\n\n        yes = int(yes)\n        device_question = 'SD card present at %s? ' % self.env.sd_device\n        if not yes and not raw_input(device_question).lower().startswith('y'):\n            return\n\n        r = self.local_renderer\n        r.local_if_missing(\n            fn='{raspbian_image_zip}',\n            cmd='wget {raspbian_download_url} -O raspbian_lite_latest.zip')\n\n        r.lenv.img_fn = \\\n            r.local(\"unzip -l {raspbian_image_zip} | sed -n 4p | awk '{{print $4}}'\", capture=True) or '$IMG_FN'\n        r.local('echo {img_fn}')\n        r.local('[ ! -f {img_fn} ] && unzip {raspbian_image_zip} {img_fn} || true')\n        r.lenv.img_fn = r.local('readlink -f {img_fn}', capture=True)\n        r.local('echo {img_fn}')\n\n        with self.settings(warn_only=True):\n            r.sudo('[ -d \"{sd_media_mount_dir}\" ] && umount {sd_media_mount_dir} || true')\n        with self.settings(warn_only=True):\n            r.sudo('[ -d \"{sd_media_mount_dir2}\" ] && umount {sd_media_mount_dir2} || true')\n\n        r.pc('Writing the image onto the card.')\n        r.sudo('time dd bs=4M if={img_fn} of={sd_device}')\n\n        # Flush all writes to disk.\n        r.run('sync')", "response": "Downloads the latest Raspbian image if it is missing, unmounts the SD card, and writes the image onto it."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef init_ubuntu_disk(self, yes=0):\n        self.assume_localhost()\n\n        yes = int(yes)\n\n        if not self.dryrun:\n            device_question = 'SD card present at %s? ' % self.env.sd_device\n            inp = raw_input(device_question).strip()\n            print('inp:', inp)\n            if not yes and inp and not inp.lower().startswith('y'):\n                return\n\n        r = self.local_renderer\n\n        # Confirm SD card is present.\n        r.local('ls {sd_device}')\n\n        # Download image.\n        r.env.ubuntu_image_fn = os.path.abspath(os.path.split(self.env.ubuntu_download_url)[-1])\n        r.local('[ ! -f {ubuntu_image_fn} ] && wget {ubuntu_download_url} || true')\n\n        # Ensure SD card is unmounted.\n        with self.settings(warn_only=True):\n            r.sudo('[ -d \"{sd_media_mount_dir}\" ] && umount {sd_media_mount_dir}')\n        with self.settings(warn_only=True):\n            r.sudo('[ -d \"{sd_media_mount_dir2}\" ] && umount {sd_media_mount_dir2}')\n\n        r.pc('Writing the image onto the card.')\n        r.sudo('xzcat {ubuntu_image_fn} | dd bs=4M of={sd_device}')\n\n        # Flush all writes to disk.\n        r.run('sync')", "response": "Downloads the Ubuntu image if it is missing, unmounts the SD card, and writes the image onto it."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef init_raspbian_vm(self):\n\n        r = self.local_renderer\n\n        r.comment('Installing system packages.')\n        r.sudo('add-apt-repository ppa:linaro-maintainers/tools')\n        r.sudo('apt-get update')\n        r.sudo('apt-get install libsdl-dev qemu-system')\n\n        r.comment('Download image.')\n        r.local('wget https://downloads.raspberrypi.org/raspbian_lite_latest')\n        r.local('unzip raspbian_lite_latest.zip')\n        #TODO:fix name?\n        #TODO:resize image?\n\n        r.comment('Find start of the Linux ext4 partition.')\n        r.local(\n            \"parted -s 2016-03-18-raspbian-jessie-lite.img unit B print | \"\n            \"awk '/^Number/{{p=1;next}}; p{{gsub(/[^[:digit:]]/, \"\", $2); print $2}}' | sed -n 2p\", assign_to='START')\n\n        r.local('mkdir -p {raspbian_mount_point}')\n        r.sudo('mount -v -o offset=$START -t ext4 {raspbian_image} $MNT')\n\n        r.comment('Comment out everything in ld.so.preload')\n        r.local(\"sed -i 's/^/#/g' {raspbian_mount_point}/etc/ld.so.preload\")\n\n        r.comment('Comment out entries containing /dev/mmcblk in fstab.')\n        r.local(\"sed -i '/mmcblk/ s?^?#?' 
/etc/fstab\")\n\n        r.sudo('umount {raspbian_mount_point}')\n\n        r.comment('Download kernel.')\n        r.local('wget https://github.com/dhruvvyas90/qemu-rpi-kernel/blob/master/{raspbian_kernel}?raw=true')\n        r.local('mv {raspbian_kernel} {libvirt_images_dir}')\n\n        r.comment('Creating libvirt machine.')\n        r.local('virsh define libvirt-raspbian.xml')\n\n        r.comment('You should now be able to boot the VM by running:')\n        r.comment('')\n        r.comment('    qemu-system-arm -kernel {libvirt_boot_dir}/{raspbian_kernel} '\n            '-cpu arm1176 -m 256 -M versatilepb -serial stdio -append \"root=/dev/sda2 rootfstype=ext4 rw\" '\n            '-hda {libvirt_images_dir}/{raspbian_image}')\n        r.comment('')\n        r.comment('Or by running virt-manager.')", "response": "Initialize the Raspbian VM."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef create_raspbian_vagrant_box(self):\n\n        r = self.local_renderer\n\n        r.sudo('adduser --disabled-password --gecos \"\" vagrant')\n\n        #vagrant user should be able to run sudo commands without a password prompt\n\n        r.sudo('echo \"vagrant ALL=(ALL) NOPASSWD:ALL\" > /etc/sudoers.d/vagrant')\n        r.sudo('chmod 0440 /etc/sudoers.d/vagrant')\n\n        r.sudo('apt-get update')\n        r.sudo('apt-get install -y openssh-server')\n\n        #put ssh key from vagrant user\n\n        r.sudo('mkdir -p /home/vagrant/.ssh')\n        r.sudo('chmod 0700 /home/vagrant/.ssh')\n        r.sudo('wget --no-check-certificate https://raw.github.com/mitchellh/vagrant/master/keys/vagrant.pub -O /home/vagrant/.ssh/authorized_keys')\n        r.sudo('chmod 0600 /home/vagrant/.ssh/authorized_keys')\n        r.sudo('chown -R vagrant /home/vagrant/.ssh')\n\n        #open sudo vi /etc/ssh/sshd_config and change\n\n        #PubKeyAuthentication yes\n        #PermitEmptyPasswords no\n        r.sudo(\"sed -i '/AuthorizedKeysFile/s/^#//g' /etc/ssh/sshd_config\")\n        #PasswordAuthentication no\n        r.sudo(\"sed -i '/PasswordAuthentication/s/^#//g' /etc/ssh/sshd_config\")\n        r.sudo(\"sed -i 's/PasswordAuthentication yes/PasswordAuthentication no/g' /etc/ssh/sshd_config\")\n\n        #restart ssh service using\n\n        #sudo service ssh restart\n\n        #install additional development packages for the tools to properly compile and install\n        r.sudo('apt-get upgrade')\n        r.sudo('apt-get install -y gcc build-essential')\n        #TODO:fix? throws dpkg: error: fgets gave an empty string from `/var/lib/dpkg/triggers/File'\n        #r.sudo('apt-get install -y linux-headers-rpi')\n\n        #do any change that you want and shutdown the VM . 
now , come to host machine on which guest VM is running and goto\n        #the /var/lib/libvirt/images/ and choose raw image in which you did the change and copy somewhere for example /test\n\n        r.sudo('mkdir /tmp/test')\n        r.sudo('cp {libvirt_images_dir}/{raspbian_image} /tmp/test')\n        r.sudo('cp {libvirt_boot_dir}/{raspbian_kernel} /tmp/test')\n\n        #create two file metadata.json and Vagrantfile in /test do entry in metadata.json\n        r.render_to_file('rpi/metadata.json', '/tmp/test/metadata.json')\n        r.render_to_file('rpi/Vagrantfile', '/tmp/test/Vagrantfile')\n\n        #convert test.img to qcow2 format using\n        r.sudo('qemu-img convert -f raw -O qcow2  {libvirt_images_dir}/{raspbian_image}  {libvirt_images_dir}/{raspbian_image}.qcow2')\n\n        #rename ubuntu.qcow2 to box.img\n        r.sudo('mv {libvirt_images_dir}/{raspbian_image}.qcow2 {libvirt_images_dir}/box.img')\n\n        #Note: currently,libvirt-vagrant support only qcow2 format. so , don't change the format just rename to box.img.\n        #because it takes input with name box.img by default.\n        #create box\n\n        r.sudo('cd /tmp/test; tar cvzf custom_box.box ./metadata.json ./Vagrantfile ./{raspbian_kernel} ./box.img')", "response": "Create a box for easily spinning up a virtual machine with Vagrant."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef configure_hdmi(self):\n        r = self.local_renderer\n\n        # use HDMI mode even if no HDMI monitor is detected\n        r.enable_attr(\n            filename='/boot/config.txt',\n            key='hdmi_force_hotplug',\n            value=1,\n            use_sudo=True,\n        )\n\n        # to normal HDMI mode (Sound will be sent if supported and enabled). Without this line,\n        # the Raspbmc would switch to DVI (with no audio) mode by default.\n        r.enable_attr(\n            filename='/boot/config.txt',\n            key='hdmi_drive',\n            value=2,\n            use_sudo=True,\n        )", "response": "Forces HDMI output in /boot/config.txt even if no HDMI monitor is detected, and selects normal HDMI mode so that audio is sent."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef configure_camera(self):\n        #TODO:check per OS? Works on Raspbian Jessie\n        r = self.local_renderer\n        if self.env.camera_enabled:\n            r.pc('Enabling camera.')\n            #TODO:fix, doesn't work on Ubuntu, which uses commented-out values\n\n            # Set start_x=1\n            #r.sudo('if grep \"start_x=0\" /boot/config.txt; then sed -i \"s/start_x=0/start_x=1/g\" /boot/config.txt; fi')\n            #r.sudo('if grep \"start_x\" /boot/config.txt; then true; else echo \"start_x=1\" >> /boot/config.txt; fi')\n            r.enable_attr(\n                filename='/boot/config.txt',\n                key='start_x',\n                value=1,\n                use_sudo=True,\n            )\n\n            # Set gpu_mem=128\n#             r.sudo('if grep \"gpu_mem\" /boot/config.txt; then true; else echo \"gpu_mem=128\" >> /boot/config.txt; fi')\n            r.enable_attr(\n                filename='/boot/config.txt',\n                key='gpu_mem',\n                value=r.env.gpu_mem,\n                use_sudo=True,\n            )\n\n            # Compile the Raspberry Pi binaries.\n            #https://github.com/raspberrypi/userland\n            r.run('cd ~; git clone https://github.com/raspberrypi/userland.git; cd userland; ./buildme')\n            r.run('touch ~/.bash_aliases')\n            #r.run(\"echo 'PATH=$PATH:/opt/vc/bin\\nexport PATH' >> ~/.bash_aliases\")\n            r.append(r'PATH=$PATH:/opt/vc/bin\\nexport PATH', '~/.bash_aliases')\n            #r.run(\"echo 'LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/vc/lib\\nexport LD_LIBRARY_PATH' >> ~/.bash_aliases\")\n            r.append(r'LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/vc/lib\\nexport LD_LIBRARY_PATH', '~/.bash_aliases')\n            r.run('source ~/.bashrc')\n            r.sudo('ldconfig')\n\n            # Allow our user to access the video device.\n            r.sudo(\"echo 
'SUBSYSTEM==\\\"vchiq\\\",GROUP=\\\"video\\\",MODE=\\\"0660\\\"' > /etc/udev/rules.d/10-vchiq-permissions.rules\")\n            r.sudo(\"usermod -a -G video {user}\")\n\n            r.reboot(wait=300, timeout=60)\n\n            self.test_camera()\n\n        else:\n            r.disable_attr(\n                filename='/boot/config.txt',\n                key='start_x',\n                use_sudo=True,\n            )\n            r.disable_attr(\n                filename='/boot/config.txt',\n                key='gpu_mem',\n                use_sudo=True,\n            )\n            r.reboot(wait=300, timeout=60)", "response": "Configure the camera with the current configuration."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef fix_lsmod_for_pi3(self):\n        r = self.local_renderer\n        r.env.rpi2_conf = '/etc/modules-load.d/rpi2.conf'\n        r.sudo(\"sed -i '/bcm2808_rng/d' {rpi2_conf}\")\n        r.sudo(\"echo bcm2835_rng >> {rpi2_conf}\")", "response": "Replaces the bcm2808_rng kernel module with bcm2835_rng in /etc/modules-load.d/rpi2.conf."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\nrunning pre - deployments for services.", "response": "def pre_deploy(self):\n        \"\"\"\n        Runs methods services have requested be run before each deployment.\n        \"\"\"\n        for service in self.genv.services:\n            service = service.strip().upper()\n            funcs = common.service_pre_deployers.get(service)\n            if funcs:\n                print('Running pre-deployments for service %s...' % (service,))\n                for func in funcs:\n                    func()"}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\napply routine, typically application-level changes to each service.", "response": "def deploy(self):\n        \"\"\"\n        Applies routine, typically application-level changes to the service.\n        \"\"\"\n        for service in self.genv.services:\n            service = service.strip().upper()\n            funcs = common.service_deployers.get(service)\n            if funcs:\n                print('Deploying service %s...' % (service,))\n                for func in funcs:\n                    if not self.dryrun:\n                        func()"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\nrun post-deployment methods that services have requested be run after each deployment.", "response": "def post_deploy(self):\n        \"\"\"\n        Runs methods services have requested be run after each deployment.\n        \"\"\"\n        for service in self.genv.services:\n            service = service.strip().upper()\n            self.vprint('post_deploy:', service)\n            funcs = common.service_post_deployers.get(service)\n            if funcs:\n                self.vprint('Running post-deployments for service %s...' % (service,))\n                for func in funcs:\n                    try:\n                        func()\n                    except Exception as e:\n                        print('Post deployment error: %s' % e, file=sys.stderr)\n                        print(traceback.format_exc(), file=sys.stderr)"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef pre_db_dump(self):\n        for service in self.genv.services:\n            service = service.strip().upper()\n            funcs = common.service_pre_db_dumpers.get(service)\n            if funcs:\n                print('Running pre-database dump for service %s...' % (service,))\n                for func in funcs:\n                    func()", "response": "Runs the methods that services have requested be run before each database dump."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef post_db_dump(self):\n        for service in self.genv.services:\n            service = service.strip().upper()\n            funcs = common.service_post_db_dumpers.get(service)\n            if funcs:\n                print('Running post-database dump for service %s...' % (service,))\n                for func in funcs:\n                    func()", "response": "Runs the methods that services have requested be run after each database dump."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\napply one - time settings changes to the host.", "response": "def configure(self):\n        \"\"\"\n        Applies one-time settings changes to the host, usually to initialize the service.\n        \"\"\"\n        print('env.services:', self.genv.services)\n        for service in list(self.genv.services):\n            service = service.strip().upper()\n            funcs = common.service_configurators.get(service, [])\n            if funcs:\n                print('!'*80)\n                print('Configuring service %s...' % (service,))\n                for func in funcs:\n                    print('Function:', func)\n                    if not self.dryrun:\n                        func()"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef get_locale_dict(self, text=None):\n        text = text or self.cat_locale()\n        # Format NAME=\"value\".\n        return dict(re.findall(r'^([a-zA-Z_]+)\\s*=\\s*[\\'\\\"]*([0-9a-zA-Z_\\.\\:\\-]+)[\\'\\\"]*', text, re.MULTILINE))", "response": "Reads the locale file and returns a dictionary of its key-value pairs."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef enable_mods(self):\n        r = self.local_renderer\n        for mod_name in r.env.mods_enabled:\n            with self.settings(warn_only=True):\n                self.enable_mod(mod_name)", "response": "Enable all modules in the current module list."}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\nbases on the number of sites per server and the number of resources on the server, calculates the optimal number of processes that should be allocated for each WSGI site.", "response": "def optimize_wsgi_processes(self):\n        \"\"\"\n        Based on the number of sites per server and the number of resources on the server,\n        calculates the optimal number of processes that should be allocated for each WSGI site.\n        \"\"\"\n        r = self.local_renderer\n        #r.env.wsgi_processes = 5\n        r.env.wsgi_server_memory_gb = 8\n\n        verbose = self.verbose\n\n        all_sites = list(self.iter_sites(site=ALL, setter=self.set_site_specifics))"}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\ncreates a new local renderer.", "response": "def create_local_renderer(self):\n        \"\"\"\n        Instantiates a new local renderer.\n        Override this to do any additional initialization.\n        \"\"\"\n        r = super(ApacheSatchel, self).create_local_renderer()\n\n        # Dynamically set values based on target operating system.\n        os_version = self.os_version\n        apache_specifics = r.env.specifics[os_version.type][os_version.distro]\n        r.env.update(apache_specifics)\n\n        return r"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef install_auth_basic_user_file(self, site=None):\n        r = self.local_renderer\n\n        hostname = self.current_hostname\n\n        target_sites = self.genv.available_sites_by_host.get(hostname, None)\n\n        for _site, site_data in self.iter_sites(site=site, setter=self.set_site_specifics):\n            if self.verbose:\n                print('~'*80, file=sys.stderr)\n                print('Site:', _site, file=sys.stderr)\n                print('env.apache_auth_basic:', r.env.auth_basic, file=sys.stderr)\n\n            # Only load site configurations that are allowed for this host.\n            if target_sites is not None:\n                assert isinstance(target_sites, (tuple, list))\n                if _site not in target_sites:\n                    continue\n\n            if not r.env.auth_basic:\n                continue\n\n            assert r.env.auth_basic_users, 'No apache auth users specified.'\n            for username, password in r.env.auth_basic_users:\n                r.env.auth_basic_username = username\n                r.env.auth_basic_password = password\n                r.env.apache_site = _site\n                r.env.fn = r.format(r.env.auth_basic_authuserfile)\n                if self.files.exists(r.env.fn):\n                    r.sudo('htpasswd -b {fn} {auth_basic_username} {auth_basic_password}')\n                else:\n                    r.sudo('htpasswd -b -c {fn} {auth_basic_username} {auth_basic_password}')", "response": "Installs user file for apache auth."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nupload a set of media files to an Apache accessible directory.", "response": "def sync_media(self, sync_set=None, clean=0, iter_local_paths=0):\n        \"\"\"\n        Uploads select media to an Apache accessible directory.\n        \"\"\"\n\n        # Ensure a site is selected.\n        self.genv.SITE = self.genv.SITE or self.genv.default_site\n\n        r = self.local_renderer\n\n        clean = int(clean)\n        self.vprint('Getting site data for %s...' % self.genv.SITE)\n\n        self.set_site_specifics(self.genv.SITE)\n\n        sync_sets = r.env.sync_sets\n        if sync_set:\n            sync_sets = [sync_set]\n\n        ret_paths = []\n        for _sync_set in sync_sets:\n            for paths in r.env.sync_sets[_sync_set]:\n                r.env.sync_local_path = os.path.abspath(paths['local_path'] % self.genv)\n                if paths['local_path'].endswith('/') and not r.env.sync_local_path.endswith('/'):\n                    r.env.sync_local_path += '/'\n\n                if iter_local_paths:\n                    ret_paths.append(r.env.sync_local_path)\n                    continue\n\n                r.env.sync_remote_path = paths['remote_path'] % self.genv\n\n                if clean:\n                    r.sudo('rm -Rf {apache_sync_remote_path}')\n\n                print('Syncing %s to %s...' 
% (r.env.sync_local_path, r.env.sync_remote_path))\n\n                r.env.tmp_chmod = paths.get('chmod', r.env.chmod)\n                r.sudo('mkdir -p {apache_sync_remote_path}')\n                r.sudo('chmod -R {apache_tmp_chmod} {apache_sync_remote_path}')\n                r.local('rsync -rvz --progress --recursive --no-p --no-g '\n                    '--rsh \"ssh -o StrictHostKeyChecking=no -i {key_filename}\" {apache_sync_local_path} {user}@{host_string}:{apache_sync_remote_path}')\n                r.sudo('chown -R {apache_web_user}:{apache_web_group} {apache_sync_remote_path}')\n\n        if iter_local_paths:\n            return ret_paths"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef get_media_timestamp(self):\n        from burlap.common import get_last_modified_timestamp\n        data = 0\n        for path in self.sync_media(iter_local_paths=1):\n            data = max(data, get_last_modified_timestamp(path) or data)\n        #TODO:hash media names and content\n        if self.verbose:\n            print('date:', data)\n        return data", "response": "Get the most recent last-modified timestamp across all media in the local filesystem."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef record_manifest(self):\n        manifest = super(ApacheSatchel, self).record_manifest()\n        manifest['available_sites'] = self.genv.available_sites\n        manifest['available_sites_by_host'] = self.genv.available_sites_by_host\n        manifest['media_timestamp'] = self.get_media_timestamp()\n        return manifest", "response": "Returns a manifest with all available data necessary to detect changes."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef configure_modevasive(self):\n        r = self.local_renderer\n        if r.env.modevasive_enabled:\n            self.install_packages()\n\n            # Write conf for each Ubuntu version since they don't conflict.\n            fn = r.render_to_file('apache/apache_modevasive.template.conf')\n\n            # Ubuntu 12.04\n            r.put(\n                local_path=fn,\n                remote_path='/etc/apache2/mods-available/mod-evasive.conf',\n                use_sudo=True)\n\n            # Ubuntu 14.04\n            r.put(\n                local_path=fn,\n                remote_path='/etc/apache2/mods-available/evasive.conf',\n                use_sudo=True)\n\n            self.enable_mod('evasive')\n        else:\n#             print('self.last_manifest:', self.last_manifest)\n#             print('a:', self.last_manifest.apache_modevasive_enabled)\n#             print('b:', self.last_manifest.modevasive_enabled)\n            if self.last_manifest.modevasive_enabled:\n                self.disable_mod('evasive')", "response": "Configure the mod - evasive Apache module for combating DDOS attacks."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\nconfiguring the mod - security module.", "response": "def configure_modsecurity(self):\n        \"\"\"\n        Installs the mod-security Apache module.\n\n        https://www.modsecurity.org\n        \"\"\"\n        r = self.local_renderer\n        if r.env.modsecurity_enabled and not self.last_manifest.modsecurity_enabled:\n\n            self.install_packages()\n\n            # Write modsecurity.conf.\n            fn = self.render_to_file('apache/apache_modsecurity.template.conf')\n            r.put(local_path=fn, remote_path='/etc/modsecurity/modsecurity.conf', use_sudo=True)\n\n            # Write OWASP rules.\n            r.env.modsecurity_download_filename = '/tmp/owasp-modsecurity-crs.tar.gz'\n            r.sudo('cd /tmp; wget --output-document={apache_modsecurity_download_filename} {apache_modsecurity_download_url}')\n            r.env.modsecurity_download_top = r.sudo(\n                \"cd /tmp; \"\n                \"tar tzf %(apache_modsecurity_download_filename)s | sed -e 's@/.*@@' | uniq\" % self.genv)\n            r.sudo('cd /tmp; tar -zxvf %(apache_modsecurity_download_filename)s' % self.genv)\n            r.sudo('cd /tmp; cp -R %(apache_modsecurity_download_top)s/* /etc/modsecurity/' % self.genv)\n            r.sudo('mv /etc/modsecurity/modsecurity_crs_10_setup.conf.example  /etc/modsecurity/modsecurity_crs_10_setup.conf')\n\n            r.sudo('rm -f /etc/modsecurity/activated_rules/*')\n            r.sudo('cd /etc/modsecurity/base_rules; '\n                'for f in * ; do ln -s /etc/modsecurity/base_rules/$f /etc/modsecurity/activated_rules/$f ; done')\n            r.sudo('cd /etc/modsecurity/optional_rules; '\n                'for f in * ; do ln -s /etc/modsecurity/optional_rules/$f /etc/modsecurity/activated_rules/$f ; done')\n\n            r.env.httpd_conf_append.append('Include \"/etc/modsecurity/activated_rules/*.conf\"')\n\n            self.enable_mod('evasive')\n      
      self.enable_mod('headers')\n        elif not self.env.modsecurity_enabled and self.last_manifest.modsecurity_enabled:\n            self.disable_mod('modsecurity')"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\ninstalling the mod - rpaf Apache module.", "response": "def configure_modrpaf(self):\n        \"\"\"\n        Installs the mod-rpaf Apache module.\n\n        https://github.com/gnif/mod_rpaf\n        \"\"\"\n        r = self.local_renderer\n        if r.env.modrpaf_enabled:\n            self.install_packages()\n            self.enable_mod('rpaf')\n        else:\n            if self.last_manifest.modrpaf_enabled:\n                self.disable_mod('mod_rpaf')"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef configure_site(self, full=1, site=None, delete_old=0):\n        from burlap import service\n\n        r = self.local_renderer\n\n        print('Configuring Apache...', file=sys.stderr)\n\n        site = site or self.genv.SITE\n\n        if int(delete_old) and site == ALL:\n            # Delete all existing enabled and available sites.\n            r.sudo('rm -f {sites_available}/*')\n            r.sudo('rm -f {sites_enabled}/*')\n\n        if r.env.manage_site_conf:\n\n            # Run an optional customizable command to clear or delete old sites before writing the new ones.\n            if r.env.delete_site_command:\n                r.sudo(r.env.delete_site_command)\n\n            for _site, site_data in self.iter_sites(site=site, setter=self.set_site_specifics):\n                r = self.local_renderer\n\n                #r.env.site = site\n                if self.verbose:\n                    print('-'*80, file=sys.stderr)\n                    print('Site:', _site, file=sys.stderr)\n                    print('-'*80, file=sys.stderr)\n\n                r.env.ssl = _site.endswith('_secure')\n                r.env.apache_site = _site\n                r.env.server_name = r.format(r.env.domain_template)\n\n                # Write WSGI template\n                if r.env.wsgi_enabled:\n                    r.pc('Writing WSGI template for site %s...' % _site)\n                    r.env.wsgi_scriptalias = r.format(r.env.wsgi_scriptalias)\n                    fn = self.render_to_file(r.env.wsgi_template)\n                    r.env.wsgi_dir = r.env.remote_dir = os.path.split(r.env.wsgi_scriptalias)[0]\n                    r.sudo('mkdir -p {remote_dir}')\n                    r.put(local_path=fn, remote_path=r.env.wsgi_scriptalias, use_sudo=True)\n\n                # Write site configuration.\n                r.pc('Writing site configuration for site %s...' 
% _site)\n                r.env.auth_basic_authuserfile = r.format(self.env.auth_basic_authuserfile)\n                r.env.ssl_certificates = list(self.iter_certificates())\n                if r.env.server_aliases_template:\n                    r.env.server_aliases = r.format(r.env.server_aliases_template)\n                if r.env.domain_with_sub_template:\n                    r.env.domain_with_sub = r.format(r.env.domain_with_sub_template)\n                if r.env.domain_without_sub_template:\n                    r.env.domain_without_sub = r.format(r.env.domain_without_sub_template)\n                if r.env.domain_template:\n                    r.env.domain = r.format(r.env.domain_template)\n                genv = r.collect_genv()\n                genv['current_hostname'] = self.current_hostname\n                fn = self.render_to_file(\n                    self.env.site_template,\n                    extra=genv,\n                    formatter=partial(r.format, ignored_variables=self.env.ignored_template_variables))\n                r.env.site_conf = _site+'.conf'\n                r.env.site_conf_fqfn = os.path.join(r.env.sites_available, r.env.site_conf)\n                r.put(local_path=fn, remote_path=r.env.site_conf_fqfn, use_sudo=True)\n\n                self.enable_site(_site)\n\n                self.clear_local_renderer()\n\n        self.enable_mods()\n\n        if int(full):\n            # Write master Apache configuration file.\n            if r.env.manage_httpd_conf:\n                fn = self.render_to_file('apache/apache_httpd.template.conf')\n                r.put(local_path=fn, remote_path=r.env.conf, use_sudo=True)\n\n            # Write Apache listening ports configuration.\n            if r.env.manage_ports_conf:\n                fn = self.render_to_file('apache/apache_ports.template.conf')\n                r.put(local_path=fn, remote_path=r.env.ports_path, use_sudo=True)\n\n        r.sudo('chown -R {apache_web_user}:{apache_web_group} 
{apache_root}')", "response": "Configure Apache to host one or more websites."}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\nforwards all traffic to a page saying the server is down for maintenance.", "response": "def maint_up(self):\n        \"\"\"\n        Forwards all traffic to a page saying the server is down for maintenance.\n        \"\"\"\n        r = self.local_renderer\n        fn = self.render_to_file(r.env.maintenance_template, extra={'current_hostname': self.current_hostname})\n        r.put(local_path=fn, remote_path=r.env.maintenance_path, use_sudo=True)\n        r.sudo('chown -R {apache_web_user}:{apache_web_group} {maintenance_path}')"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef restart(self):\n        n = 60\n        sleep_n = int(self.env.max_restart_wait_minutes/10.*60)\n        for _ in range(n):\n            self.stop()\n            if self.dryrun or not self.is_running():\n                break\n            print('Waiting for supervisor to stop (%i of %i)...' % (_, n))\n            time.sleep(sleep_n)\n        self.start()\n        for _ in range(n):\n            if self.dryrun or self.is_running():\n                return\n            print('Waiting for supervisor to start (%i of %i)...' % (_, n))\n            time.sleep(sleep_n)\n        raise Exception('Failed to restart service %s!' % self.name)", "response": "Restarts supervisor. Supervisor can take a very long time to stop and start, so this waits for each step and raises an exception if the service fails to come back up."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\ncalling after a deployment to record any data necessary to detect changes in Apache site configs.", "response": "def record_manifest(self):\n        \"\"\"\n        Called after a deployment to record any data necessary to detect changes\n        for a future deployment.\n        \"\"\"\n        data = super(SupervisorSatchel, self).record_manifest()\n\n        # Celery deploys itself through supervisor, so monitor its changes too in Apache site configs.\n        for site_name, site_data in self.genv.sites.items():\n            if self.verbose:\n                print(site_name, site_data)\n            data['celery_has_worker_%s' % site_name] = site_data.get('celery_has_worker', False)\n\n        data['configured'] = True\n\n        # Generate services list.\n        self.write_configs(upload=0)\n        #data['services_rendered'] = ''\n\n        return data"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef deploy_services(self, site=None):\n\n        verbose = self.verbose\n\n        r = self.local_renderer\n        if not r.env.manage_configs:\n            return\n#\n#         target_sites = self.genv.available_sites_by_host.get(hostname, None)\n\n        self.render_paths()\n\n        supervisor_services = []\n\n        if r.env.purge_all_confs:\n            r.sudo('rm -Rf /etc/supervisor/conf.d/*')\n\n        #TODO:check available_sites_by_host and remove dead?\n        self.write_configs(site=site)\n        for _site, site_data in self.iter_sites(site=site, renderer=self.render_paths):\n            if verbose:\n                print('deploy_services.site:', _site)\n\n            # Only load site configurations that are allowed for this host.\n#             if target_sites is not None:\n#                 assert isinstance(target_sites, (tuple, list))\n#                 if site not in target_sites:\n#                     continue\n\n            for cb in self.genv._supervisor_create_service_callbacks:\n                if self.verbose:\n                    print('cb:', cb)\n                ret = cb(site=_site)\n                if self.verbose:\n                    print('ret:', ret)\n                if isinstance(ret, six.string_types):\n                    supervisor_services.append(ret)\n                elif isinstance(ret, tuple):\n                    assert len(ret) == 2\n                    conf_name, conf_content = ret\n                    if self.dryrun:\n                        print('supervisor conf filename:', conf_name)\n                        print(conf_content)\n                    self.write_to_file(conf_content)\n\n        self.env.services_rendered = '\\n'.join(supervisor_services)\n\n        fn = self.render_to_file(self.env.config_template)\n        r.put(local_path=fn, remote_path=self.env.config_path, use_sudo=True)\n\n        # We use 
supervisorctl to configure supervisor, but this will throw a uselessly vague\n        # error message is supervisor isn't running.\n        if not self.is_running():\n            self.start()\n\n        # Reload config and then add and remove as necessary (restarts programs)\n        r.sudo('supervisorctl update')", "response": "Deploy all services for a given site and writes the appropriate supervisord. conf file."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef clone(self, remote_url, path=None, use_sudo=False, user=None):\n\n        cmd = 'git clone --quiet %s' % remote_url\n        if path is not None:\n            cmd = cmd + ' %s' % path\n\n        if use_sudo and user is None:\n            run_as_root(cmd)\n        elif use_sudo:\n            sudo(cmd, user=user)\n        else:\n            run(cmd)", "response": "Clone a remote Git repository into a new directory."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\nadding a remote to a Git working copy.", "response": "def add_remote(self, path, name, remote_url, use_sudo=False, user=None, fetch=True):\n        \"\"\"\n        Add a remote Git repository into a directory.\n\n        :param path: Path of the working copy directory.  This directory must exist\n                     and be a Git working copy with a default remote to fetch from.\n        :type path: str\n\n        :param use_sudo: If ``True`` execute ``git`` with\n                         :func:`fabric.operations.sudo`, else with\n                         :func:`fabric.operations.run`.\n        :type use_sudo: bool\n\n        :param user: If ``use_sudo is True``, run :func:`fabric.operations.sudo`\n                     with the given user.  If ``use_sudo is False`` this parameter\n                     has no effect.\n        :type user: str\n\n        :param name: name for the remote repository\n        :type name: str\n\n        :param remote_url: URL of the remote repository\n        :type remote_url: str\n\n        :param fetch: If ``True`` execute ``git remote add -f``\n        :type fetch: bool\n        \"\"\"\n        if path is None:\n            raise ValueError(\"Path to the working copy is needed to add a remote\")\n\n        if fetch:\n            cmd = 'git remote add -f %s %s' % (name, remote_url)\n        else:\n            cmd = 'git remote add %s %s' % (name, remote_url)\n\n        with cd(path):\n            if use_sudo and user is None:\n                run_as_root(cmd)\n            elif use_sudo:\n                sudo(cmd, user=user)\n            else:\n                run(cmd)"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef fetch(self, path, use_sudo=False, user=None, remote=None):\n\n        if path is None:\n            raise ValueError(\"Path to the working copy is needed to fetch from a remote repository.\")\n\n        if remote is not None:\n            cmd = 'git fetch %s' % remote\n        else:\n            cmd = 'git fetch'\n\n        with cd(path):\n            if use_sudo and user is None:\n                run_as_root(cmd)\n            elif use_sudo:\n                sudo(cmd, user=user)\n            else:\n                run(cmd)", "response": "Fetch changes from the given remote repository, or from the default remote if none is given."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef pull(self, path, use_sudo=False, user=None, force=False):\n\n        if path is None:\n            raise ValueError(\"Path to the working copy is needed to pull from a remote repository.\")\n\n        options = []\n        if force:\n            options.append('--force')\n        options = ' '.join(options)\n\n        cmd = 'git pull %s' % options\n\n        with cd(path):\n            if use_sudo and user is None:\n                run_as_root(cmd)\n            elif use_sudo:\n                sudo(cmd, user=user)\n            else:\n                run(cmd)", "response": "Pulls changes from the default remote repository and merges them into the given working copy."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\ngetting all commits between the given commit numbers", "response": "def get_logs_between_commits(self, a, b):\n        \"\"\"\n        Retrieves all commit messages for all commits between the given commit numbers\n        on the current branch.\n        \"\"\"\n        ret = self.local('git --no-pager log --pretty=oneline %s...%s' % (a, b), capture=True)\n        if self.verbose:\n            print(ret)\n        return str(ret)"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef get_current_commit(self):\n        with hide('running', 'stdout', 'stderr', 'warnings'):\n            s = str(self.local('git rev-parse HEAD', capture=True))\n            self.vprint('current commit:', s)\n            return s", "response": "Returns the hash of the commit that HEAD currently points to."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\nreturning a manifest that contains all the information necessary to detect changes for this repository.", "response": "def record_manifest(self):\n        \"\"\"\n        Called after a deployment to record any data necessary to detect changes\n        for a future deployment.\n        \"\"\"\n        manifest = super(GitTrackerSatchel, self).record_manifest()\n        manifest[CURRENT_COMMIT] = self.get_current_commit()\n        return manifest"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef ssh_config(self, name=''):\n        r = self.local_renderer\n        with self.settings(hide('running')):\n            output = r.local('vagrant ssh-config %s' % name, capture=True)\n\n        config = {}\n        for line in output.splitlines()[1:]:\n            key, value = line.strip().split(' ', 1)\n            config[key] = value\n        return config", "response": "Get the SSH parameters for connecting to a vagrant VM."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef version(self):\n        r = self.local_renderer\n        with self.settings(hide('running', 'warnings'), warn_only=True):\n            res = r.local('vagrant --version', capture=True)\n        if res.failed:\n            return None\n        line = res.splitlines()[-1]\n        version = re.match(r'Vagrant (?:v(?:ersion )?)?(.*)', line).group(1)\n        return tuple(_to_int(part) for part in version.split('.'))", "response": "Get the Vagrant version."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nrunning the following tasks on a Vagrant box.", "response": "def vagrant(self, name=''):\n        \"\"\"\n        Run the following tasks on a vagrant box.\n\n        First, you need to import this task in your ``fabfile.py``::\n\n            from fabric.api import *\n            from burlap.vagrant import vagrant\n\n            @task\n            def some_task():\n                run('echo hello')\n\n        Then you can easily run tasks on your current Vagrant box::\n\n            $ fab vagrant some_task\n\n        \"\"\"\n        r = self.local_renderer\n        config = self.ssh_config(name)\n\n        extra_args = self._settings_dict(config)\n        r.genv.update(extra_args)"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef vagrant_settings(self, name='', *args, **kwargs):\n        config = self.ssh_config(name)\n\n        extra_args = self._settings_dict(config)\n        kwargs.update(extra_args)\n\n        return self.settings(*args, **kwargs)", "response": "Context manager that sets a vagrant VM as the remote host."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef base_boxes(self):\n        return sorted(list(set([name for name, provider in self._box_list()])))", "response": "Get the sorted list of unique base box names in this Vagrant instance."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef install_from_upstream(self):\n        from burlap.system import get_arch, distrib_family\n        r = self.local_renderer\n        content = urlopen(r.env.download_url).read()\n        print(len(content))\n        matches = DOWNLOAD_LINK_PATTERN.findall(content)\n        print(matches)\n        arch = get_arch() # e.g. 'x86_64'\n        family = distrib_family()\n        if family == DEBIAN:\n            ext = '.deb'\n            matches = [match for match in matches if match.endswith(ext) and arch in match]\n            print('matches:', matches)\n            assert matches, \"No matches found.\"\n            assert len(matches) == 1, \"Too many matches found: %s\" % (', '.join(matches))\n            r.env.final_download_url = matches[0]\n            r.env.local_filename = '/tmp/vagrant%s' % ext\n            r.run('wget -O {local_filename} {final_download_url}')\n            r.sudo('dpkg -i {local_filename}')\n        else:\n            raise NotImplementedError('Unsupported family: %s' % family)", "response": "Installs Vagrant from the most recent package available from their homepage."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef distrib_id():\n\n    with settings(hide('running', 'stdout')):\n        kernel = (run('uname -s') or '').strip().lower()\n        if kernel == LINUX:\n            # lsb_release works on Ubuntu and Debian >= 6.0\n            # but is not always included in other distros\n            if is_file('/usr/bin/lsb_release'):\n                id_ = run('lsb_release --id --short').strip().lower()\n                if id_ in ['arch', 'archlinux']:  # old IDs used before lsb-release 1.4-14\n                    id_ = ARCH\n                return id_\n            else:\n                if is_file('/etc/debian_version'):\n                    return DEBIAN\n                elif is_file('/etc/fedora-release'):\n                    return FEDORA\n                elif is_file('/etc/arch-release'):\n                    return ARCH\n                elif is_file('/etc/redhat-release'):\n                    release = run('cat /etc/redhat-release')\n                    if release.startswith('Red Hat Enterprise Linux'):\n                        return REDHAT\n                    elif release.startswith('CentOS'):\n                        return CENTOS\n                    elif release.startswith('Scientific Linux'):\n                        return SLES\n                elif is_file('/etc/gentoo-release'):\n                    return GENTOO\n        elif kernel == SUNOS:\n            return SUNOS", "response": "Get the OS distribution ID."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef distrib_release():\n    with settings(hide('running', 'stdout')):\n        kernel = (run('uname -s') or '').strip().lower()\n        if kernel == LINUX:\n            return run('lsb_release -r --short')\n\n        elif kernel == SUNOS:\n            return run('uname -v')", "response": "Get the release number of the distribution."}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\nreturns the distribution family.", "response": "def distrib_family():\n    \"\"\"\n    Get the distribution family.\n\n    Returns one of ``debian``, ``redhat``, ``arch``, ``gentoo``,\n    ``sun``, ``other``.\n    \"\"\"\n    distrib = (distrib_id() or '').lower()\n    if distrib in ['debian', 'ubuntu', 'linuxmint', 'elementary os']:\n        return DEBIAN\n    elif distrib in ['redhat', 'rhel', 'centos', 'sles', 'fedora']:\n        return REDHAT\n    elif distrib in ['sunos']:\n        return SUN\n    elif distrib in ['gentoo']:\n        return GENTOO\n    elif distrib in ['arch', 'manjarolinux']:\n        return ARCH\n    return 'other'"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef supported_locales():\n    family = distrib_family()\n    if family == 'debian':\n        return _parse_locales('/usr/share/i18n/SUPPORTED')\n    elif family == 'arch':\n        return _parse_locales('/etc/locale.gen')\n    elif family == 'redhat':\n        return _supported_locales_redhat()\n    else:\n        raise UnsupportedFamily(supported=['debian', 'arch', 'redhat'])", "response": "Returns the list of supported locales."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef set_permissions(self):\n        r = self.local_renderer\n        for path in r.env.paths_owned:\n            r.env.path_owned = path\n            r.sudo('chown {celery_daemon_user}:{celery_daemon_user} {celery_path_owned}')", "response": "Sets ownership of the Celery-related paths to the Celery daemon user."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef create_supervisor_services(self, site):\n\n        self.vprint('create_supervisor_services:', site)\n\n        self.set_site_specifics(site=site)\n\n        r = self.local_renderer\n        if self.verbose:\n            print('r.env:')\n            pprint(r.env, indent=4)\n\n        self.vprint('r.env.has_worker:', r.env.has_worker)\n        if not r.env.has_worker:\n            self.vprint('skipping: no celery worker')\n            return\n\n        if self.name.lower() not in self.genv.services:\n            self.vprint('skipping: celery not enabled')\n            return\n\n        hostname = self.current_hostname\n        target_sites = self.genv.available_sites_by_host.get(hostname, None)\n        if target_sites and site not in target_sites:\n            self.vprint('skipping: site not supported on this server')\n            return\n\n        self.render_paths()\n\n        conf_name = 'celery_%s.conf' % site\n        ret = r.render_to_string('celery/celery_supervisor.template.conf')\n        return conf_name, ret", "response": "Renders the Celery supervisor configuration for the given site and returns its filename and content, or returns nothing if Celery does not apply to the site or host."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\ncheck if all tests have passed for this branch.", "response": "def check_ok(self):\n        \"\"\"\n        Ensures all tests have passed for this branch.\n\n        This should be called before deployment, to prevent accidental deployment of code\n        that hasn't passed automated testing.\n        \"\"\"\n        import requests\n\n        if not self.env.check_ok:\n            return\n\n        # Find current git branch.\n        branch_name = self._local('git rev-parse --abbrev-ref HEAD', capture=True).strip()\n\n        check_ok_paths = self.env.check_ok_paths or {}\n\n        if branch_name in check_ok_paths:\n            check = check_ok_paths[branch_name]\n            if 'username' in check:\n                auth = (check['username'], check['password'])\n            else:\n                auth = None\n            ret = requests.get(check['url'], auth=auth)\n            passed = check['text'] in ret.text\n            assert passed, 'Check failed: %s' % check['url']"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef is_present(self, host=None):\n        r = self.local_renderer\n        r.env.host = host or self.genv.host_string\n        ret = r._local(\"getent hosts {host} | awk '{{ print $1 }}'\", capture=True) or ''\n        if self.verbose:\n            print('ret:', ret)\n        ret = ret.strip()\n        if self.verbose:\n            print('Host %s %s present.' % (r.env.host, 'IS' if bool(ret) else 'IS NOT'))\n        ip = ret\n        ret = bool(ret)\n        if not ret:\n            return False\n\n        r.env.ip = ip\n        with settings(warn_only=True):\n            ret = r._local('ping -c 1 {ip}', capture=True) or ''\n        packet_loss = re.findall(r'([0-9]+)% packet loss', ret)\n#         print('packet_loss:',packet_loss)\n        ip_accessible = packet_loss and int(packet_loss[0]) < 100\n        if self.verbose:\n            print('IP %s accessible: %s' % (ip, ip_accessible))\n        return bool(ip_accessible)", "response": "Returns true if the given host exists on the network."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef purge_keys(self):\n        r = self.local_renderer\n        r.env.default_ip = self.hostname_to_ip(self.env.default_hostname)\n        r.env.home_dir = '/home/%s' % getpass.getuser()\n        r.local('ssh-keygen -f \"{home_dir}/.ssh/known_hosts\" -R {host_string}')\n        if self.env.default_hostname:\n            r.local('ssh-keygen -f \"{home_dir}/.ssh/known_hosts\" -R {default_hostname}')\n        if r.env.default_ip:\n            r.local('ssh-keygen -f \"{home_dir}/.ssh/known_hosts\" -R {default_ip}')", "response": "Removes the known_hosts entries on the local machine for the current remote host, its default hostname, and its default IP."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef find_working_password(self, usernames=None, host_strings=None):\n        r = self.local_renderer\n\n        if host_strings is None:\n            host_strings = []\n\n        if not host_strings:\n            host_strings.append(self.genv.host_string)\n\n        if usernames is None:\n            usernames = []\n\n        if not usernames:\n            usernames.append(self.genv.user)\n\n        for host_string in host_strings:\n\n            for username in usernames:\n\n                passwords = []\n                passwords.append(self.genv.user_default_passwords[username])\n                passwords.append(self.genv.user_passwords[username])\n                passwords.append(self.env.default_password)\n\n                for password in passwords:\n\n                    with settings(warn_only=True):\n                        r.env.host_string = host_string\n                        r.env.password = password\n                        r.env.user = username\n                        ret = r._local(\"sshpass -p '{password}' ssh -o StrictHostKeyChecking=no {user}@{host_string} echo hello\", capture=True)\n                        #print('ret.return_code:', ret.return_code)\n            #             print('ret000:[%s]' % ret)\n                        #code 1 = good password, but prompts needed\n                        #code 5 = bad password\n                        #code 6 = good password, but host public key is unknown\n\n                    if ret.return_code in (1, 6) or 'hello' in ret:\n                        # Login succeeded, so we haven't yet changed the password, so use the default password.\n                        return host_string, username, password\n\n        raise Exception('No working login found.')", "response": "Find the first working password for the current host."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef needs_initrole(self, stop_on_error=False):\n\n        ret = False\n\n        target_host_present = self.is_present()\n\n        if not target_host_present:\n            default_host_present = self.is_present(self.env.default_hostname)\n            if default_host_present:\n                if self.verbose:\n                    print('Target host missing and default host present so host init required.')\n                ret = True\n            else:\n                if self.verbose:\n                    print('Target host missing but default host also missing, '\n                        'so no host init required.')\n#                 if stop_on_error:\n#                     raise Exception(\n#                         'Both target and default hosts missing! '\n#                         'Is the machine turned on and plugged into the network?')\n        else:\n            if self.verbose:\n                print('Target host is present so no host init required.')\n\n        return ret", "response": "Returns True if the target host is not present but the default host is, meaning the host still needs its initial configuration."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef initrole(self, check=True):\n\n        if self.env.original_user is None:\n            self.env.original_user = self.genv.user\n\n        if self.env.original_key_filename is None:\n            self.env.original_key_filename = self.genv.key_filename\n\n        host_string = None\n        user = None\n        password = None\n        if self.env.login_check:\n            host_string, user, password = self.find_working_password(\n                usernames=[self.genv.user, self.env.default_user],\n                host_strings=[self.genv.host_string, self.env.default_hostname],\n            )\n            if self.verbose:\n                print('host.initrole.host_string:', host_string)\n                print('host.initrole.user:', user)\n                print('host.initrole.password:', password)\n\n#         needs = True\n#         if check:\n#             needs = self.needs_initrole(stop_on_error=True)\n        needs = False\n\n        if host_string is not None:\n            self.genv.host_string = host_string\n        if user is not None:\n            self.genv.user = user\n        if password is not None:\n            self.genv.password = password\n\n        if not needs:\n            return\n\n        assert self.env.default_hostname, 'No default hostname set.'\n        assert self.env.default_user, 'No default user set.'\n\n        self.genv.host_string = self.env.default_hostname\n        if self.env.default_hosts:\n            self.genv.hosts = self.env.default_hosts\n        else:\n            self.genv.hosts = [self.env.default_hostname]\n\n        self.genv.user = self.env.default_user\n        self.genv.password = self.env.default_password\n        self.genv.key_filename = self.env.default_key_filename\n\n        # If the host has been reformatted, the SSH keys will mismatch, throwing an error, so clear them.\n        self.purge_keys()\n\n  
      # Do a test login with the default password to determine which password we should use.\n#         r.env.password = self.env.default_password\n#         with settings(warn_only=True):\n#             ret = r._local(\"sshpass -p '{password}' ssh -o StrictHostKeyChecking=no {user}@{host_string} echo hello\", capture=True)\n#             print('ret.return_code:', ret.return_code)\n# #             print('ret000:[%s]' % ret)\n#             #code 1 = good password, but prompts needed\n#             #code 5 = bad password\n#             #code 6 = good password, but host public key is unknown\n#         if ret.return_code in (1, 6) or 'hello' in ret:\n#             # Login succeeded, so we haven't yet changed the password, so use the default password.\n#             self.genv.password = self.env.default_password\n#         elif self.genv.user in self.genv.user_passwords:\n#             # Otherwise, use the password or key set in the config.\n#             self.genv.password = self.genv.user_passwords[self.genv.user]\n#         else:\n#             # Default password fails and there's no current password, so clear.\n#             self.genv.password = None\n#         self.genv.password = self.find_working_password()\n#         print('host.initrole,using password:', self.genv.password)\n\n        # Execute post-init callbacks.\n        for task_name in self.env.post_initrole_tasks:\n            if self.verbose:\n                print('Calling post initrole task %s' % task_name)\n            satchel_name, method_name = task_name.split('.')\n            satchel = self.get_satchel(name=satchel_name)\n            getattr(satchel, method_name)()\n\n        print('^'*80)\n        print('host.initrole.host_string:', self.genv.host_string)\n        print('host.initrole.user:', self.genv.user)\n        print('host.initrole.password:', self.genv.password)", "response": "Called to set default password login for systems that do not yet have passwordless\n        login setup."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef iter_hostnames(self):\n        from burlap.common import get_hosts_retriever\n        if self.env.use_retriever:\n            self.vprint('using retriever')\n            self.vprint('hosts:', self.genv.hosts)\n            retriever = get_hosts_retriever()\n            hosts = list(retriever(extended=1))\n            for _hostname, _data in hosts:\n\n                # Skip hosts that aren't selected for this run.\n                if self.genv.hosts \\\n                and _data.ip not in self.genv.hosts \\\n                and _data.public_dns_name not in self.genv.hosts \\\n                and _hostname not in self.genv.hosts:\n                    continue\n\n                assert _data.ip, 'Missing IP.'\n                yield _data.ip, _hostname#_data.public_dns_name\n        else:\n            self.vprint('using default')\n            for ip, hostname in self.env.hostnames.items():\n                self.vprint('ip lookup:', ip, hostname)\n                if ip == UNKNOWN:\n                    ip = self.hostname_to_ip(hostname)\n                    if not ip and hostname in self.env.default_hostnames:\n                        ip = self.hostname_to_ip(self.env.default_hostnames[hostname])\n                elif not ip[0].isdigit():\n                    ip = self.hostname_to_ip(ip)\n                assert ip, 'Invalid IP.'\n                yield ip, hostname", "response": "Yields (ip, hostname) tuples for each host, using the hosts retriever if enabled or the configured hostnames otherwise."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef get_public_ip(self):\n        r = self.local_renderer\n        ret = r.run(r.env.get_public_ip_command) or ''\n        ret = ret.strip()\n        print('ip:', ret)\n        return ret", "response": "Gets the public IP for a host."}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\nconfigures the local environment for the current server.", "response": "def configure(self, reboot=1):\n        \"\"\"\n        Assigns a name to the server accessible from user space.\n\n        Note, we add the name to /etc/hosts since not all programs use\n        /etc/hostname to reliably identify the server hostname.\n        \"\"\"\n        r = self.local_renderer\n        for ip, hostname in self.iter_hostnames():\n            self.vprint('ip/hostname:', ip, hostname)\n            r.genv.host_string = ip\n            r.env.hostname = hostname\n            with settings(warn_only=True):\n                r.sudo('echo \"{hostname}\" > /etc/hostname')\n                r.sudo('echo \"127.0.0.1 {hostname}\" | cat - /etc/hosts > /tmp/out && mv /tmp/out /etc/hosts')\n                r.sudo(r.env.set_hostname_command)\n                if r.env.auto_reboot and int(reboot):\n                    r.reboot()"}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nupdates the ticket numbers in JIRA based on the changes made in the current git repository.", "response": "def update_tickets_from_git(self, from_commit=None, to_commit=None):\n        \"\"\"\n        Run during a deployment.\n        Looks at all commits between now and the last deployment.\n        Finds all ticket numbers and updates their status in Jira.\n        \"\"\"\n        from jira import JIRA, JIRAError\n        #from burlap.deploy import get_last_current_diffs\n        from burlap.git import gittracker, CURRENT_COMMIT\n\n        r = self.local_renderer\n\n#         get_current_commit = gittracker.get_current_commit\n#         GITTRACKER = gittracker.name.upper()\n\n        # Ensure this is only run once per role.\n        if self.genv.host_string != self.genv.hosts[-1]:\n            self.vprint('Not first server. Aborting.')\n            return\n\n        print('self.env.update_from_git:', self.env.update_from_git)\n        print('self.genv.jirahelper_update_from_git:', self.genv.jirahelper_update_from_git)\n        if not self.env.update_from_git:\n            self.vprint('Update from git disabled. Aborting.')\n            return\n\n        if not self.env.ticket_pattern:\n            self.vprint('No ticket pattern defined. Aborting.')\n            return\n\n        if not self.env.basic_auth_username or not self.env.basic_auth_password:\n            self.vprint('Username or password not given. 
Aborting.')\n            return\n\n        # During a deployment, we should be given these, but for testing,\n        # lookup the diffs dynamically.\n        last = gittracker.last_manifest\n        current = gittracker.current_manifest\n\n        last_commit = from_commit or last.current_commit#[CURRENT_COMMIT]\n        print('last_commit:', last_commit)\n        current_commit = to_commit or current[CURRENT_COMMIT]\n        print('current_commit:', current_commit)\n\n        if not last_commit or not current_commit:\n            print('Missing commit ID. Aborting.')\n            return\n\n        if self.verbose:\n            print('-'*80)\n            print('last.keys:', last.keys())\n            print('-'*80)\n            print('current.keys:', current.keys())\n\n#         try:\n#             last_commit = last['GITTRACKER']['current_commit']\n#         except KeyError:\n#             return\n#         current_commit = current['GITTRACKER']['current_commit']\n\n        # Find all tickets deployed between last deployment and now.\n        tickets = self.get_tickets_between_commits(current_commit, last_commit)\n        if self.verbose:\n            print('tickets:', tickets)\n\n        # Update all tickets in Jira.\n        jira = JIRA({\n            'server': self.env.server\n        }, basic_auth=(self.env.basic_auth_username, self.env.basic_auth_password))\n        for ticket in tickets:\n\n            # Mention this Jira updated.\n            r.env.role = r.genv.ROLE.lower()\n            comment = r.format(self.env.ticket_update_message_template)\n            print('Commenting on ticket %s: %s' % (ticket, comment))\n            if not self.dryrun:\n                jira.add_comment(ticket, comment)\n\n            # Update ticket status.\n            recheck = False\n            while 1:\n                print('Looking up jira ticket %s...' % ticket)\n                issue = jira.issue(ticket)\n                self.vprint('Ticket %s retrieved.' 
% ticket)\n                transition_to_id = dict((t['name'], t['id']) for t in jira.transitions(issue))\n                self.vprint('%i allowable transitions found:' % len(transition_to_id))\n                pprint(transition_to_id)\n                self.vprint('issue.fields.status.id:', issue.fields.status.id)\n                self.vprint('issue.fields.status.name:', issue.fields.status.name)\n                jira_status_id = issue.fields.status.name.title()\n                self.vprint('jira_status_id:', jira_status_id)\n                next_transition_name = self.env.deploy_workflow.get(jira_status_id)\n                self.vprint('next_transition_name:', next_transition_name)\n                next_transition_id = transition_to_id.get(next_transition_name)\n                self.vprint('next_transition_id:', next_transition_id)\n                if next_transition_name:\n                    if issue.fields.assignee:\n                        if issue.fields.assignee.raw:\n                            assignee_name = issue.fields.assignee.name\n                        else:\n                            # Get assignee name directly\n                            # https://community.atlassian.com/t5/Jira-questions/Jira-in-Python-issue-fields-reporter-name-\n                            # errors-with-TypeError/qaq-p/937924\n                            assignee_name = issue.fields.assignee._session['name']\n                    else:\n                        assignee_name = None\n\n                    # Get new assignee by status\n                    new_assignee = self.env.assignee_by_status.get(\n                        #issue.fields.status.name.title(),\n                        next_transition_name,\n                        assignee_name,\n                    )\n\n                    # If assigning to reporter, get reporter name.\n                    if new_assignee == 'reporter':\n                        if issue.fields.reporter.raw:\n                            
new_assignee = issue.fields.reporter.name\n                        else:\n                            # Get reporter name directly\n                            # https://community.atlassian.com/t5/Jira-questions/Jira-in-Python-issue-fields-reporter-name-\n                            # errors-with-TypeError/qaq-p/937924\n                            new_assignee = issue.fields.reporter._session['name']\n\n                    print('Updating ticket %s to status %s (%s) and assigning it to %s.' % (ticket, next_transition_name, next_transition_id, new_assignee))\n                    if not self.dryrun:\n\n                        if next_transition_id:\n                            try:\n                                jira.transition_issue(issue, next_transition_id)\n                                recheck = True\n                            except AttributeError as e:\n                                print('Unable to transition ticket %s to %s: %s' % (ticket, next_transition_name, e), file=sys.stderr)\n                                traceback.print_exc()\n\n                        # Note assignment should happen after transition, since the assignment may\n                        # effect remove transitions that we need.\n                        try:\n                            if new_assignee:\n                                print('Assigning ticket %s to %s.' % (ticket, new_assignee))\n                                jira.assign_issue(issue, new_assignee)\n                            else:\n                                print('No new assignee found.')\n                        except JIRAError as e:\n                            print('Unable to reassign ticket %s to %s: %s' % (ticket, new_assignee, e), file=sys.stderr)\n                else:\n                    recheck = False\n                    print('No transitions found for ticket %s currently in status \"%s\".' 
% (ticket, issue.fields.status.name))\n\n                if not recheck:\n                    break"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef partitions(device=\"\"):\n    partitions_list = {}\n    with settings(hide('running', 'stdout')):\n        res = run_as_root('sfdisk -d %(device)s' % locals())\n\n        spart = re.compile(r'(?P<pname>^/.*) : .* Id=(?P<ptypeid>[0-9a-z]+)')\n        for line in res.splitlines():\n            m = spart.search(line)\n            if m:\n                partitions_list[m.group('pname')] = int(m.group('ptypeid'), 16)\n\n    return partitions_list", "response": "Return a dict mapping each partition name to its partition type ID, for all disks or for the selected device only."}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nget a HDD device by uuid", "response": "def getdevice_by_uuid(uuid):\n    \"\"\"\n    Get a HDD device by uuid\n\n    Example::\n\n        from burlap.disk import getdevice_by_uuid\n\n        device = getdevice_by_uuid(\"356fafdc-21d5-408e-a3e9-2b3f32cb2a8c\")\n        if device:\n            mount(device,'/mountpoint')\n    \"\"\"\n    with settings(hide('running', 'warnings', 'stdout'), warn_only=True):\n        res = run_as_root('blkid -U %s' % uuid)\n\n        if not res.succeeded:\n            return None\n\n        return res"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef ismounted(device):\n    # Check filesystem\n    with settings(hide('running', 'stdout')):\n        res = run_as_root('mount')\n    for line in res.splitlines():\n        fields = line.split()\n        if fields[0] == device:\n            return True\n\n    # Check swap\n    with settings(hide('running', 'stdout')):\n        res = run_as_root('swapon -s')\n    for line in res.splitlines():\n        fields = line.split()\n        if fields[0] == device:\n            return True\n\n    return False", "response": "Check if partition is mounted"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef query(query, use_sudo=True, **kwargs):\n    func = use_sudo and run_as_root or run\n\n    user = kwargs.get('mysql_user') or env.get('mysql_user')\n    password = kwargs.get('mysql_password') or env.get('mysql_password')\n\n    options = [\n        '--batch',\n        '--raw',\n        '--skip-column-names',\n    ]\n    if user:\n        options.append('--user=%s' % quote(user))\n    if password:\n        options.append('--password=%s' % quote(password))\n    options = ' '.join(options)\n\n    return func('mysql %(options)s --execute=%(query)s' % {\n        'options': options,\n        'query': quote(query),\n    })", "response": "Run a MySQL query."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef user_exists(name, host='localhost', **kwargs):\n    with settings(hide('running', 'stdout', 'stderr', 'warnings'), warn_only=True):\n        res = query(\"\"\"\n            use mysql;\n            SELECT COUNT(*) FROM user\n                WHERE User = '%(name)s' AND Host = '%(host)s';\n            \"\"\" % {\n                'name': name,\n                'host': host,\n            }, **kwargs)\n    return res.succeeded and (int(res) == 1)", "response": "Check if a MySQL user exists."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef create_user(name, password, host='localhost', **kwargs):\n    with settings(hide('running')):\n        query(\"CREATE USER '%(name)s'@'%(host)s' IDENTIFIED BY '%(password)s';\" % {\n            'name': name,\n            'password': password,\n            'host': host\n        }, **kwargs)\n    puts(\"Created MySQL user '%s'.\" % name)", "response": "Create a MySQL user."}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\nchecks if a MySQL database exists.", "response": "def database_exists(name, **kwargs):\n    \"\"\"\n    Check if a MySQL database exists.\n    \"\"\"\n    with settings(hide('running', 'stdout', 'stderr', 'warnings'), warn_only=True):\n        res = query(\"SHOW DATABASES LIKE '%(name)s';\" % {\n            'name': name\n        }, **kwargs)\n\n    return res.succeeded and (res == name)"}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\ncreate a MySQL database.", "response": "def create_database(name, owner=None, owner_host='localhost', charset='utf8',\n                    collate='utf8_general_ci', **kwargs):\n    \"\"\"\n    Create a MySQL database.\n\n    Example::\n\n        import burlap\n\n        # Create DB if it does not exist\n        if not burlap.mysql.database_exists('myapp'):\n            burlap.mysql.create_database('myapp', owner='dbuser')\n\n    \"\"\"\n    with settings(hide('running')):\n\n        query(\"CREATE DATABASE %(name)s CHARACTER SET %(charset)s COLLATE %(collate)s;\" % {\n            'name': name,\n            'charset': charset,\n            'collate': collate\n        }, **kwargs)\n\n        if owner:\n            query(\"GRANT ALL PRIVILEGES ON %(name)s.* TO '%(owner)s'@'%(owner_host)s' WITH GRANT OPTION;\" % {\n                'name': name,\n                'owner': owner,\n                'owner_host': owner_host\n            }, **kwargs)\n\n    puts(\"Created MySQL database '%s'.\" % name)"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef conf_path(self):\n        from burlap.system import distrib_id, distrib_release\n        hostname = self.current_hostname\n        if hostname not in self._conf_cache:\n            self.env.conf_specifics[hostname] = self.env.conf_default\n            d_id = distrib_id()\n            d_release = distrib_release()\n            for key in ((d_id, d_release), (d_id,)):\n                if key in self.env.conf_specifics:\n                    self._conf_cache[hostname] = self.env.conf_specifics[key]\n        return self._conf_cache[hostname]", "response": "Retrieves the path to the MySQL configuration file."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef prep_root_password(self, password=None, **kwargs):\n        r = self.database_renderer(**kwargs)\n        r.env.root_password = password or r.genv.get('db_root_password')\n        r.sudo(\"DEBIAN_FRONTEND=noninteractive dpkg --configure -a\")\n        r.sudo(\"debconf-set-selections <<< 'mysql-server mysql-server/root_password password {root_password}'\")\n        r.sudo(\"debconf-set-selections <<< 'mysql-server mysql-server/root_password_again password {root_password}'\")", "response": "Set the root password prompt entries into the debconf cache."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef exists(self, **kwargs):\n        name = kwargs.pop('name', 'default')\n        site = kwargs.pop('site', None)\n        r = self.database_renderer(name=name, site=site)\n        ret = r.run('mysql -h {db_host} -u {db_root_username} '\\\n            '-p\"{db_root_password}\" -N -B -e \"SELECT IF(\\'{db_name}\\''\\\n            ' IN(SELECT SCHEMA_NAME FROM INFORMATION_SCHEMA.SCHEMATA), '\\\n            '\\'exists\\', \\'notexists\\') AS found;\"')\n        if ret is not None:\n            ret = 'notexists' not in (ret or 'notexists')\n        if ret is not None:\n            msg = '%s database on site %s %s exist.' \\\n                % (name.title(), env.SITE, 'DOES' if ret else 'DOES NOT')\n            if ret:\n                print(green(msg))\n            else:\n                print(red(msg))\n            return ret", "response": "Returns True if a database with the given name exists, False otherwise."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef load(self, dump_fn='', prep_only=0, force_upload=0, from_local=0, name=None, site=None, dest_dir=None, force_host=None):\n\n        r = self.database_renderer(name=name, site=site)\n        r.pc('Loading database snapshot.')\n\n        # Render the snapshot filename.\n        r.env.dump_fn = self.get_default_db_fn(fn_template=dump_fn, dest_dir=dest_dir).strip()\n\n        from_local = int(from_local)\n\n        prep_only = int(prep_only)\n\n        missing_local_dump_error = r.format(\"Database dump file {dump_fn} does not exist.\")\n\n        # Copy snapshot file to target.\n        if self.is_local:\n            r.env.remote_dump_fn = dump_fn\n        else:\n            r.env.remote_dump_fn = '/tmp/' + os.path.split(r.env.dump_fn)[-1]\n\n        if not prep_only and not self.is_local:\n            #if int(force_upload) or (not self.is_local and not r.file_exists(r.env.remote_dump_fn)):\n            if not self.dryrun:\n                assert os.path.isfile(r.env.dump_fn), missing_local_dump_error\n            #if self.verbose:\n                #print('Uploading MySQL database snapshot...')\n            #r.put(\n                #local_path=r.env.dump_fn,\n                #remote_path=r.env.remote_dump_fn)\n            self.upload_snapshot(name=name, site=site)\n\n        if self.is_local and not prep_only and not self.dryrun:\n            assert os.path.isfile(r.env.dump_fn), missing_local_dump_error\n\n        if force_host:\n            r.env.db_host = force_host\n\n        # Drop the database if it's there.\n        r.run(\"mysql -v -h {db_host} -u {db_root_username} -p'{db_root_password}' --execute='DROP DATABASE IF EXISTS {db_name}'\")\n\n        # Now, create the database.\n        r.run(\"mysqladmin -h {db_host} -u {db_root_username} -p'{db_root_password}' create {db_name}\")\n\n        # Create user\n        with settings(warn_only=True):\n       
     r.run(\"mysql -v -h {db_host} -u {db_root_username} -p'{db_root_password}' --execute=\\\"DROP USER '{db_user}'@'%%';\"\n                \"FLUSH PRIVILEGES;\\\"\")\n        with settings(warn_only=True):\n            r.run(\"mysql -v -h {db_host} -u {db_root_username} -p'{db_root_password}' --execute=\\\"CREATE USER '{db_user}'@'%%' IDENTIFIED BY '{db_password}'; \"\n                \"GRANT ALL PRIVILEGES ON *.* TO '{db_user}'@'%%' WITH GRANT OPTION; \"\n                \"FLUSH PRIVILEGES;\\\"\")\n        self.set_collation(name=name, site=site)\n\n        self.set_max_packet_size(name=name, site=site)\n\n        # Run any server-specific commands (e.g. to setup permissions) before\n        # we load the data.\n        for command in r.env.preload_commands:\n            r.run(command)\n\n        # Restore the database content from the dump file.\n        if not prep_only:\n            r.run(r.env.load_command)\n\n        self.set_collation(name=name, site=site)", "response": "Load a database snapshot into the target database server."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef shell(self, name='default', site=None, use_root=0, **kwargs):\n        r = self.database_renderer(name=name, site=site)\n\n        if int(use_root):\n            kwargs = dict(\n                db_user=r.env.db_root_username,\n                db_password=r.env.db_root_password,\n                db_host=r.env.db_host,\n                db_name=r.env.db_name,\n            )\n            r.env.update(kwargs)\n\n        if not name:\n            r.env.db_name = ''\n\n        r.run('/bin/bash -i -c \"mysql -u {db_user} -p\\'{db_password}\\' -h {db_host} {db_name}\"')", "response": "Opens a SQL shell to the given database."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef tic_single_object_crossmatch(ra, dec, radius):\n    '''This does a cross-match against the TIC catalog on MAST.\n\n    Speed tests: about 10 crossmatches per second. (-> 3 hours for 10^5 objects\n    to crossmatch).\n\n    Parameters\n    ----------\n\n    ra,dec : float\n        The coordinates to cross-match against, both in decimal degrees.\n\n    radius : float\n        The cross-match radius to use, in decimal degrees.\n\n    Returns\n    -------\n\n    dict\n        Returns the match results JSON from MAST loaded into a dict.\n\n    '''\n    for val in ra,dec,radius:\n        if not isinstance(val, float):\n            raise AssertionError('plz input ra,dec,radius in decimal degrees')\n\n    # This is a json object\n    crossmatchInput = {\"fields\":[{\"name\":\"ra\",\"type\":\"float\"},\n                                 {\"name\":\"dec\",\"type\":\"float\"}],\n                       \"data\":[{\"ra\":ra,\"dec\":dec}]}\n\n    request = {\"service\":\"Mast.Tic.Crossmatch\",\n               \"data\":crossmatchInput,\n               \"params\":{\n                   \"raColumn\":\"ra\",\n                   \"decColumn\":\"dec\",\n                   \"radius\":radius\n               },\n               \"format\":\"json\",\n               'removecache':True}\n\n    headers,out_string = _mast_query(request)\n\n    out_data = json.loads(out_string)\n\n    return out_data", "response": "This function performs a cross-match of a single object against the TIC catalog on MAST."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef normalized_flux_to_mag(lcdict,\n                           columns=('sap.sap_flux',\n                                    'sap.sap_flux_err',\n                                    'sap.sap_bkg',\n                                    'sap.sap_bkg_err',\n                                    'pdc.pdcsap_flux',\n                                    'pdc.pdcsap_flux_err')):\n    '''This converts the normalized fluxes in the TESS lcdicts to TESS mags.\n\n    Uses the object's TESS mag stored in lcdict['objectinfo']['tessmag']::\n\n        mag - object_tess_mag = -2.5 log (flux/median_flux)\n\n    Parameters\n    ----------\n\n    lcdict : lcdict\n        An `lcdict` produced by `read_tess_fitslc` or\n        `consolidate_tess_fitslc`. This must have normalized fluxes in its\n        measurement columns (use the `normalize` kwarg for these functions).\n\n    columns : sequence of str\n        The column keys of the normalized flux and background measurements in\n        the `lcdict` to operate on and convert to magnitudes in TESS band (T).\n\n    Returns\n    -------\n\n    lcdict\n        The returned `lcdict` will contain extra columns corresponding to\n        magnitudes for each input normalized flux/background column.\n\n    '''\n\n    tess_mag = lcdict['objectinfo']['tessmag']\n\n    for key in columns:\n\n        k1, k2 = key.split('.')\n\n        if 'err' not in k2:\n\n            lcdict[k1][k2.replace('flux','mag')] = (\n                tess_mag - 2.5*np.log10(lcdict[k1][k2])\n            )\n\n        else:\n\n            lcdict[k1][k2.replace('flux','mag')] = (\n                - 2.5*np.log10(1.0 - lcdict[k1][k2])\n            )\n\n    return lcdict", "response": "This function converts the normalized fluxes in the TESS lcdict to TESS mags."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef get_time_flux_errs_from_Ames_lightcurve(infile,\n                                            lctype,\n                                            cadence_min=2):\n    '''Reads TESS Ames-format FITS light curve files.\n\n    MIT TOI alerts include Ames lightcurve files. This function gets the finite,\n    nonzero times, fluxes, and errors with QUALITY == 0.\n\n    NOTE: the PDCSAP lightcurve typically still need \"pre-whitening\" after this\n    step.\n\n    .. deprecated:: 0.3.20\n        This function will be removed in astrobase v0.4.2. Use the\n        `read_tess_fitslc` and `consolidate_tess_fitslc` functions instead.\n\n    Parameters\n    ----------\n\n    infile : str\n        The path to `*.fits.gz` TOI alert file, from Ames pipeline.\n\n    lctype : {'PDCSAP','SAP'}\n        The type of light curve to extract from the FITS LC file.\n\n    cadence_min : int\n        The expected frame cadence in units of minutes. Raises ValueError if you\n        use the wrong cadence.\n\n    Returns\n    -------\n\n    tuple\n        The tuple returned is of the form:\n\n        (times, normalized (to median) fluxes, flux errors)\n\n    '''\n\n    warnings.warn(\n        \"Use the astrotess.read_tess_fitslc and \"\n        \"astrotess.consolidate_tess_fitslc functions instead of this function. \"\n        \"This function will be removed in astrobase v0.4.2.\",\n        FutureWarning\n    )\n\n    if lctype not in ('PDCSAP','SAP'):\n        raise ValueError('unknown light curve type requested: %s' % lctype)\n\n    hdulist = pyfits.open(infile)\n\n    main_hdr = hdulist[0].header\n    lc_hdr = hdulist[1].header\n    lc = hdulist[1].data\n\n    if (('Ames' not in main_hdr['ORIGIN']) or\n        ('LIGHTCURVE' not in lc_hdr['EXTNAME'])):\n        raise ValueError(\n            'could not understand input LC format. 
'\n            'Is it a TESS TOI LC file?'\n        )\n\n    time = lc['TIME']\n    flux = lc['{:s}_FLUX'.format(lctype)]\n    err_flux = lc['{:s}_FLUX_ERR'.format(lctype)]\n\n    # REMOVE POINTS FLAGGED WITH:\n    # attitude tweaks, safe mode, coarse/earth pointing, Argabrightening events,\n    # reaction wheel desaturation events, cosmic rays in optimal aperture\n    # pixels, manual excludes, discontinuities, stray light from Earth or Moon\n    # in camera FoV.\n    # (Note: it's not clear to me what a lot of these mean. Also most of these\n    # columns are probably not correctly propagated right now.)\n    sel = (lc['QUALITY'] == 0)\n    sel &= np.isfinite(time)\n    sel &= np.isfinite(flux)\n    sel &= np.isfinite(err_flux)\n    sel &= ~np.isnan(time)\n    sel &= ~np.isnan(flux)\n    sel &= ~np.isnan(err_flux)\n    sel &= (time != 0)\n    sel &= (flux != 0)\n    sel &= (err_flux != 0)\n\n    time = time[sel]\n    flux = flux[sel]\n    err_flux = err_flux[sel]\n\n    # ensure desired cadence\n    lc_cadence_diff = np.abs(np.nanmedian(np.diff(time))*24*60 - cadence_min)\n\n    if lc_cadence_diff > 1.0e-2:\n        raise ValueError(\n            'the light curve is not at the required cadence specified: %.2f' %\n            cadence_min\n        )\n\n    fluxmedian = np.nanmedian(flux)\n    flux /= fluxmedian\n    err_flux /= fluxmedian\n\n    return time, flux, err_flux", "response": "Reads TESS Ames-format FITS light curve files and returns the times, median-normalized fluxes, and flux errors."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef filter_tess_lcdict(lcdict,\n                       filterqualityflags=True,\n                       nanfilter='sap,pdc,time',\n                       timestoignore=None,\n                       quiet=False):\n    '''This filters the provided TESS `lcdict`, removing nans and bad\n    observations.\n\n    By default, this function removes points in the TESS LC that have ANY\n    quality flags set.\n\n    Parameters\n    ----------\n\n    lcdict : lcdict\n        An `lcdict` produced by `consolidate_tess_fitslc` or\n        `read_tess_fitslc`.\n\n    filterflags : bool\n        If True, will remove any measurements that have non-zero quality flags\n        present. This usually indicates an issue with the instrument or\n        spacecraft.\n\n    nanfilter : {'sap','pdc','sap,pdc'}\n        Indicates the flux measurement type(s) to apply the filtering to.\n\n    timestoignore : list of tuples or None\n        This is of the form::\n\n            [(time1_start, time1_end), (time2_start, time2_end), ...]\n\n        and indicates the start and end times to mask out of the final\n        lcdict. Use this to remove anything that wasn't caught by the quality\n        flags.\n\n    Returns\n    -------\n\n    lcdict\n        Returns an `lcdict` (this is useable by most astrobase functions for LC\n        processing). The `lcdict` is filtered IN PLACE!\n\n    '''\n\n    cols = lcdict['columns']\n\n    # filter all bad LC points as noted by quality flags\n    if filterqualityflags:\n\n        nbefore = lcdict['time'].size\n        filterind = lcdict['quality'] == 0\n\n        for col in cols:\n            if '.' 
in col:\n                key, subkey = col.split('.')\n                lcdict[key][subkey] = lcdict[key][subkey][filterind]\n            else:\n                lcdict[col] = lcdict[col][filterind]\n\n        nafter = lcdict['time'].size\n        if not quiet:\n            LOGINFO('applied quality flag filter, '\n                    'ndet before = %s, ndet after = %s'\n                    % (nbefore, nafter))\n\n\n    if nanfilter and nanfilter == 'sap,pdc,time':\n        notnanind = (\n            np.isfinite(lcdict['sap']['sap_flux']) &\n            np.isfinite(lcdict['sap']['sap_flux_err']) &\n            np.isfinite(lcdict['pdc']['pdcsap_flux']) &\n            np.isfinite(lcdict['pdc']['pdcsap_flux_err']) &\n            np.isfinite(lcdict['time'])\n        )\n    elif nanfilter and nanfilter == 'sap,time':\n        notnanind = (\n            np.isfinite(lcdict['sap']['sap_flux']) &\n            np.isfinite(lcdict['sap']['sap_flux_err']) &\n            np.isfinite(lcdict['time'])\n        )\n    elif nanfilter and nanfilter == 'pdc,time':\n        notnanind = (\n            np.isfinite(lcdict['pdc']['pdcsap_flux']) &\n            np.isfinite(lcdict['pdc']['pdcsap_flux_err']) &\n            np.isfinite(lcdict['time'])\n        )\n    elif nanfilter is None:\n        pass\n    else:\n        raise NotImplementedError\n\n    # remove nans from all columns\n    if nanfilter:\n\n        nbefore = lcdict['time'].size\n        for col in cols:\n            if '.' 
in col:\n                key, subkey = col.split('.')\n                lcdict[key][subkey] = lcdict[key][subkey][notnanind]\n            else:\n                lcdict[col] = lcdict[col][notnanind]\n\n        nafter = lcdict['time'].size\n\n        if not quiet:\n            LOGINFO('removed nans, ndet before = %s, ndet after = %s'\n                    % (nbefore, nafter))\n\n\n    # exclude all times in timestoignore\n    if (timestoignore and\n        isinstance(timestoignore, list) and\n        len(timestoignore) > 0):\n\n        exclind = np.full_like(lcdict['time'],True).astype(bool)\n        nbefore = exclind.size\n\n        # get all the masks\n        for ignoretime in timestoignore:\n            time0, time1 = ignoretime[0], ignoretime[1]\n            thismask = ~((lcdict['time'] >= time0) & (lcdict['time'] <= time1))\n            exclind = exclind & thismask\n\n        # apply the masks\n        for col in cols:\n            if '.' in col:\n                key, subkey = col.split('.')\n                lcdict[key][subkey] = lcdict[key][subkey][exclind]\n            else:\n                lcdict[col] = lcdict[col][exclind]\n\n        nafter = lcdict['time'].size\n        if not quiet:\n            LOGINFO('removed timestoignore, ndet before = %s, ndet after = %s'\n                    % (nbefore, nafter))\n\n    return lcdict", "response": "This function filters a TESS lcdict in place, removing points with non-zero quality flags, non-finite measurements, and any time ranges listed in timestoignore."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef _pkl_finder_objectinfo(\n        objectinfo,\n        varinfo,\n        findercmap,\n        finderconvolve,\n        sigclip,\n        normto,\n        normmingap,\n        deredden_object=True,\n        custom_bandpasses=None,\n        lclistpkl=None,\n        nbrradiusarcsec=30.0,\n        maxnumneighbors=5,\n        plotdpi=100,\n        findercachedir='~/.astrobase/stamp-cache',\n        verbose=True,\n        gaia_submit_timeout=10.0,\n        gaia_submit_tries=3,\n        gaia_max_timeout=180.0,\n        gaia_mirror=None,\n        fast_mode=False,\n        complete_query_later=True\n):\n    '''This returns the finder chart and object information as a dict.\n\n    Parameters\n    ----------\n\n    objectinfo : dict or None\n        If provided, this is a dict containing information on the object whose\n        light curve is being processed. This function will then be able to look\n        up and download a finder chart for this object and write that to the\n        output checkplotdict. External services such as GAIA, SIMBAD, TIC@MAST,\n        etc. will also be used to look up this object by its coordinates, and\n        will add in information available from those services.\n\n        The `objectinfo` dict must be of the form and contain at least the keys\n        described below::\n\n            {'objectid': the name of the object,\n             'ra': the right ascension of the object in decimal degrees,\n             'decl': the declination of the object in decimal degrees,\n             'ndet': the number of observations of this object}\n\n        You can also provide magnitudes and proper motions of the object using\n        the following keys and the appropriate values in the `objectinfo`\n        dict. These will be used to calculate colors, total and reduced proper\n        motion, etc. 
and display these in the output checkplot PNG::\n\n            'pmra' -> the proper motion in mas/yr in right ascension,\n            'pmdecl' -> the proper motion in mas/yr in declination,\n            'umag'  -> U mag\t\t -> colors: U-B, U-V, U-g\n            'bmag'  -> B mag\t\t -> colors: U-B, B-V\n            'vmag'  -> V mag\t\t -> colors: U-V, B-V, V-R, V-I, V-K\n            'rmag'  -> R mag\t\t -> colors: V-R, R-I\n            'imag'  -> I mag\t\t -> colors: g-I, V-I, R-I, B-I\n            'jmag'  -> 2MASS J mag\t -> colors: J-H, J-K, g-J, i-J\n            'hmag'  -> 2MASS H mag\t -> colors: J-H, H-K\n            'kmag'  -> 2MASS Ks mag\t -> colors: g-Ks, H-Ks, J-Ks, V-Ks\n            'sdssu' -> SDSS u mag\t -> colors: u-g, u-V\n            'sdssg' -> SDSS g mag\t -> colors: g-r, g-i, g-K, u-g, U-g, g-J\n            'sdssr' -> SDSS r mag\t -> colors: r-i, g-r\n            'sdssi' -> SDSS i mag\t -> colors: r-i, i-z, g-i, i-J, i-W1\n            'sdssz' -> SDSS z mag\t -> colors: i-z, z-W2, g-z\n            'ujmag' -> UKIRT J mag\t -> colors: J-H, H-K, J-K, g-J, i-J\n            'uhmag' -> UKIRT H mag\t -> colors: J-H, H-K\n            'ukmag' -> UKIRT K mag\t -> colors: g-K, H-K, J-K, V-K\n            'irac1' -> Spitzer IRAC1 mag -> colors: i-I1, I1-I2\n            'irac2' -> Spitzer IRAC2 mag -> colors: I1-I2, I2-I3\n            'irac3' -> Spitzer IRAC3 mag -> colors: I2-I3\n            'irac4' -> Spitzer IRAC4 mag -> colors: I3-I4\n            'wise1' -> WISE W1 mag\t -> colors: i-W1, W1-W2\n            'wise2' -> WISE W2 mag\t -> colors: W1-W2, W2-W3\n            'wise3' -> WISE W3 mag\t -> colors: W2-W3\n            'wise4' -> WISE W4 mag\t -> colors: W3-W4\n\n        If you have magnitude measurements in other bands, use the\n        `custom_bandpasses` kwarg to pass these in.\n\n        If this is None, no object information will be incorporated into the\n        checkplot (kind of making it effectively useless for anything other than\n        glancing 
at the phased light curves at various 'best' periods from the\n        period-finder results).\n\n    varinfo : dict or None\n        If this is None, a blank dict of the form below will be added to the\n        checkplotdict::\n\n            {'objectisvar': None -> variability flag (None indicates unset),\n             'vartags': CSV str containing variability type tags from review,\n             'varisperiodic': None -> periodic variability flag (None -> unset),\n             'varperiod': the period associated with the periodic variability,\n             'varepoch': the epoch associated with the periodic variability}\n\n        If you provide a dict matching this format in this kwarg, this will be\n        passed unchanged to the output checkplotdict produced.\n\n    findercmap : str or matplotlib.cm.ColorMap object\n        The Colormap object to use for the finder chart image.\n\n    finderconvolve : astropy.convolution.Kernel object or None\n        If not None, the Kernel object to use for convolving the finder image.\n\n    sigclip : float or int or sequence of two floats/ints or None\n        If a single float or int, a symmetric sigma-clip will be performed using\n        the number provided as the sigma-multiplier to cut out from the input\n        time-series.\n\n        If a list of two ints/floats is provided, the function will perform an\n        'asymmetric' sigma-clip. The first element in this list is the sigma\n        value to use for fainter flux/mag values; the second element in this\n        list is the sigma value to use for brighter flux/mag values. For\n        example, `sigclip=[10., 3.]`, will sigclip out greater than 10-sigma\n        dimmings and greater than 3-sigma brightenings. 
Here the meaning of\n        \"dimming\" and \"brightening\" is set by *physics* (not the magnitude\n        system), which is why the `magsarefluxes` kwarg must be correctly set.\n\n        If `sigclip` is None, no sigma-clipping will be performed, and the\n        time-series (with non-finite elems removed) will be passed through to\n        the output.\n\n    normto : {'globalmedian', 'zero'} or a float\n        This is specified as below::\n\n            'globalmedian' -> norms each mag to global median of the LC column\n            'zero'         -> norms each mag to zero\n            a float        -> norms each mag to this specified float value.\n\n    normmingap : float\n        This defines how much the difference between consecutive measurements is\n        allowed to be to consider them as parts of different timegroups. By\n        default it is set to 4.0 days.\n\n    deredden_object : bool\n        If this is True, will use the 2MASS DUST service to get extinction\n        coefficients in various bands, and then try to deredden the magnitudes\n        and colors of the object already present in the checkplot's objectinfo\n        dict.\n\n    custom_bandpasses : dict\n        This is a dict used to provide custom bandpass definitions for any\n        magnitude measurements in the objectinfo dict that are not automatically\n        recognized by :py:func:`astrobase.varclass.starfeatures.color_features`.\n\n    lclistpkl : dict or str\n        If this is provided, must be a dict resulting from reading a catalog\n        produced by the `lcproc.catalogs.make_lclist` function or a str path\n        pointing to the pickle file produced by that function. This catalog is\n        used to find neighbors of the current object in the current light curve\n        collection. 
Looking at neighbors of the object within the radius\n        specified by `nbrradiusarcsec` is useful for light curves produced by\n        instruments that have a large pixel scale, so are susceptible to\n        blending of variability and potential confusion of neighbor variability\n        with that of the actual object being looked at. If this is None, no\n        neighbor lookups will be performed.\n\n    nbrradiusarcsec : float\n        The radius in arcseconds to use for a search conducted around the\n        coordinates of this object to look for any potential confusion and\n        blending of variability amplitude caused by their proximity.\n\n    maxnumneighbors : int\n        The maximum number of neighbors that will have their light curves and\n        magnitudes noted in this checkplot as potential blends with the target\n        object.\n\n    plotdpi : int\n        The resolution in DPI of the plots to generate in this function\n        (e.g. the finder chart, etc.)\n\n    findercachedir : str\n        The path to the astrobase cache directory for finder chart downloads\n        from the NASA SkyView service.\n\n    verbose : bool\n        If True, will indicate progress and warn about potential problems.\n\n    gaia_submit_timeout : float\n        Sets the timeout in seconds to use when submitting a request to look up\n        the object's information to the GAIA service. Note that if `fast_mode`\n        is set, this is ignored.\n\n    gaia_submit_tries : int\n        Sets the maximum number of times the GAIA services will be contacted to\n        obtain this object's information. 
If `fast_mode` is set, this is\n        ignored, and the services will be contacted only once (meaning that a\n        failure to respond will be silently ignored and no GAIA data will be\n        added to the checkplot's objectinfo dict).\n\n    gaia_max_timeout : float\n        Sets the timeout in seconds to use when waiting for the GAIA service to\n        respond to our request for the object's information. Note that if\n        `fast_mode` is set, this is ignored.\n\n    gaia_mirror : str\n        This sets the GAIA mirror to use. This is a key in the\n        `services.gaia.GAIA_URLS` dict which defines the URLs to hit for each\n        mirror.\n\n    fast_mode : bool or float\n        This runs the external catalog operations in a \"fast\" mode, with short\n        timeouts and not trying to hit external catalogs that take a long time\n        to respond.\n\n        If this is set to True, the default settings for the external requests\n        will then become::\n\n            skyview_lookup = False\n            skyview_timeout = 10.0\n            skyview_retry_failed = False\n            dust_timeout = 10.0\n            gaia_submit_timeout = 7.0\n            gaia_max_timeout = 10.0\n            gaia_submit_tries = 2\n            complete_query_later = False\n            search_simbad = False\n\n        If this is a float, will run in \"fast\" mode with the provided timeout\n        value in seconds and the following settings::\n\n            skyview_lookup = True\n            skyview_timeout = fast_mode\n            skyview_retry_failed = False\n            dust_timeout = fast_mode\n            gaia_submit_timeout = 0.66*fast_mode\n            gaia_max_timeout = fast_mode\n            gaia_submit_tries = 2\n            complete_query_later = False\n            search_simbad = False\n\n    complete_query_later : bool\n        If this is True, saves the state of GAIA queries that are not yet\n        complete when `gaia_max_timeout` is reached while waiting 
for the GAIA\n        service to respond to our request. A later call for GAIA info on the\n        same object will attempt to pick up the results from the existing query\n        if it's completed. If `fast_mode` is True, this is ignored.\n\n    Returns\n    -------\n\n    dict\n        A checkplotdict is returned containing the objectinfo and varinfo dicts,\n        ready to use with the functions below to add in light curve plots,\n        phased LC plots, xmatch info, etc.\n\n    '''\n\n    # optional mode to hit external services and fail fast if they timeout\n    if fast_mode is True:\n        skyview_lookup = False\n        skyview_timeout = 10.0\n        skyview_retry_failed = False\n        dust_timeout = 10.0\n        gaia_submit_timeout = 7.0\n        gaia_max_timeout = 10.0\n        gaia_submit_tries = 2\n        complete_query_later = False\n        search_simbad = False\n\n    elif isinstance(fast_mode, (int, float)) and fast_mode > 0.0:\n        skyview_lookup = True\n        skyview_timeout = fast_mode\n        skyview_retry_failed = False\n        dust_timeout = fast_mode\n        gaia_submit_timeout = 0.66*fast_mode\n        gaia_max_timeout = fast_mode\n        gaia_submit_tries = 2\n        complete_query_later = False\n        search_simbad = False\n\n    else:\n        skyview_lookup = True\n        skyview_timeout = 10.0\n        skyview_retry_failed = True\n        dust_timeout = 10.0\n        search_simbad = True\n\n\n    if (isinstance(objectinfo, dict) and\n        ('objectid' in objectinfo or 'hatid' in objectinfo) and\n        'ra' in objectinfo and 'decl' in objectinfo and\n        objectinfo['ra'] and objectinfo['decl']):\n\n        if 'objectid' not in objectinfo:\n            objectid = objectinfo['hatid']\n        else:\n            objectid = objectinfo['objectid']\n\n        if verbose and skyview_lookup:\n            LOGINFO('adding in object information and '\n                    'finder chart for %s at RA: %.3f, DEC: %.3f' 
%\n                    (objectid, objectinfo['ra'], objectinfo['decl']))\n        elif verbose and not skyview_lookup:\n            LOGINFO('adding in object information '\n                    'for %s at RA: %.3f, DEC: %.3f. '\n                    'skipping finder chart because skyview_lookup = False' %\n                    (objectid, objectinfo['ra'], objectinfo['decl']))\n\n        # get the finder chart\n        try:\n\n            if skyview_lookup:\n\n                try:\n\n                    # generate the finder chart\n                    finder, finderheader = skyview_stamp(\n                        objectinfo['ra'],\n                        objectinfo['decl'],\n                        convolvewith=finderconvolve,\n                        verbose=verbose,\n                        flip=False,\n                        cachedir=findercachedir,\n                        timeout=skyview_timeout,\n                        retry_failed=skyview_retry_failed,\n                    )\n\n                except OSError as e:\n\n                    if not fast_mode:\n\n                        LOGERROR(\n                            'finder image appears to be corrupt, retrying...'\n                        )\n\n                        # generate the finder chart\n                        finder, finderheader = skyview_stamp(\n                            objectinfo['ra'],\n                            objectinfo['decl'],\n                            convolvewith=finderconvolve,\n                            verbose=verbose,\n                            flip=False,\n                            cachedir=findercachedir,\n                            forcefetch=True,\n                            timeout=skyview_timeout,\n                            retry_failed=False  # do not start an infinite loop\n                        )\n\n\n                finderfig = plt.figure(figsize=(3,3),dpi=plotdpi)\n\n                # initialize the finder WCS\n                finderwcs = 
WCS(finderheader)\n\n                # use the WCS transform for the plot\n                ax = finderfig.add_subplot(111, frameon=False)\n                ax.imshow(finder, cmap=findercmap, origin='lower')\n\n            else:\n                finder, finderheader, finderfig, finderwcs = (\n                    None, None, None, None\n                )\n\n            # skip down to after nbr stuff for the rest of the finderchart...\n\n            # search around the target's location and get its neighbors if\n            # lclistpkl is provided and it exists\n            if (lclistpkl is not None and\n                nbrradiusarcsec is not None and\n                nbrradiusarcsec > 0.0):\n\n                # if lclistpkl is a string, open it as a pickle\n                if isinstance(lclistpkl, str) and os.path.exists(lclistpkl):\n\n                    if lclistpkl.endswith('.gz'):\n                        infd = gzip.open(lclistpkl,'rb')\n                    else:\n                        infd = open(lclistpkl,'rb')\n\n                    lclist = pickle.load(infd)\n                    infd.close()\n\n                # otherwise, if it's a dict, we get it directly\n                elif isinstance(lclistpkl, dict):\n\n                    lclist = lclistpkl\n\n                # finally, if it's nothing we recognize, ignore it\n                else:\n\n                    LOGERROR('could not understand lclistpkl kwarg, '\n                             'not getting neighbor info')\n\n                    lclist = dict()\n\n                # check if we have a KDTree to use\n                # if we don't, skip neighbor stuff\n                if 'kdtree' not in lclist:\n\n                    LOGERROR('neighbors within %.1f arcsec for %s could '\n                             'not be found, no kdtree in lclistpkl: %s'\n                             % (nbrradiusarcsec, objectid, lclistpkl))\n                    neighbors = None\n                    kdt = None\n\n                # otherwise, 
do neighbor processing\n                else:\n\n                    kdt = lclist['kdtree']\n\n                    obj_cosdecl = np.cos(np.radians(objectinfo['decl']))\n                    obj_sindecl = np.sin(np.radians(objectinfo['decl']))\n                    obj_cosra = np.cos(np.radians(objectinfo['ra']))\n                    obj_sinra = np.sin(np.radians(objectinfo['ra']))\n\n                    obj_xyz = np.column_stack((obj_cosra*obj_cosdecl,\n                                               obj_sinra*obj_cosdecl,\n                                               obj_sindecl))\n                    match_xyzdist = (\n                        2.0 * np.sin(np.radians(nbrradiusarcsec/3600.0)/2.0)\n                    )\n                    matchdists, matchinds = kdt.query(\n                        obj_xyz,\n                        k=maxnumneighbors+1,  # get maxnumneighbors + tgt\n                        distance_upper_bound=match_xyzdist\n                    )\n\n                    # sort by matchdist\n                    mdsorted = np.argsort(matchdists[0])\n                    matchdists = matchdists[0][mdsorted]\n                    matchinds = matchinds[0][mdsorted]\n\n                    # luckily, the indices to the kdtree are the same as that\n                    # for the objects (I think)\n                    neighbors = []\n\n                    nbrind = 0\n\n                    for md, mi in zip(matchdists, matchinds):\n\n                        if np.isfinite(md) and md > 0.0:\n\n                            if skyview_lookup:\n\n                                # generate the xy for the finder we'll use a\n                                # HTML5 canvas and these pixcoords to highlight\n                                # each neighbor when we mouse over its row in\n                                # the neighbors tab\n\n                                # we use coord origin = 0 here and not the usual\n                                # 1 because we're 
annotating a numpy array\n                                pixcoords = finderwcs.all_world2pix(\n                                    np.array([[lclist['objects']['ra'][mi],\n                                               lclist['objects']['decl'][mi]]]),\n                                    0\n                                )\n\n                                # each elem is {'objectid',\n                                #               'ra','decl',\n                                #               'xpix','ypix',\n                                #               'dist','lcfpath'}\n                                thisnbr = {\n                                    'objectid':(\n                                        lclist['objects']['objectid'][mi]\n                                    ),\n                                    'ra':lclist['objects']['ra'][mi],\n                                    'decl':lclist['objects']['decl'][mi],\n                                    'xpix':pixcoords[0,0],\n                                    'ypix':300.0 - pixcoords[0,1],\n                                    'dist':_xyzdist_to_distarcsec(md),\n                                    'lcfpath': lclist['objects']['lcfname'][mi]\n                                }\n                                neighbors.append(thisnbr)\n                                nbrind = nbrind+1\n\n                                # put in a nice marker for this neighbor into\n                                # the overall finder chart\n                                annotatex = pixcoords[0,0]\n                                annotatey = pixcoords[0,1]\n\n                                if ((300.0 - annotatex) > 50.0):\n                                    offx = annotatex + 30.0\n                                    xha = 'center'\n                                else:\n                                    offx = annotatex - 30.0\n                                    xha = 'center'\n                                if 
((300.0 - annotatey) > 50.0):\n                                    offy = annotatey - 30.0\n                                    yha = 'center'\n                                else:\n                                    offy = annotatey + 30.0\n                                    yha = 'center'\n\n                                ax.annotate('N%s' % nbrind,\n                                            (annotatex, annotatey),\n                                            xytext=(offx, offy),\n                                            arrowprops={'facecolor':'blue',\n                                                        'edgecolor':'blue',\n                                                        'width':1.0,\n                                                        'headwidth':1.0,\n                                                        'headlength':0.1,\n                                                        'shrink':0.0},\n                                            color='blue',\n                                            horizontalalignment=xha,\n                                            verticalalignment=yha)\n\n                            else:\n\n                                thisnbr = {\n                                    'objectid':(\n                                        lclist['objects']['objectid'][mi]\n                                    ),\n                                    'ra':lclist['objects']['ra'][mi],\n                                    'decl':lclist['objects']['decl'][mi],\n                                    'xpix':0.0,\n                                    'ypix':0.0,\n                                    'dist':_xyzdist_to_distarcsec(md),\n                                    'lcfpath': lclist['objects']['lcfname'][mi]\n                                }\n                                neighbors.append(thisnbr)\n                                nbrind = nbrind+1\n\n            # if there are no neighbors, set the 'neighbors' key to 
None\n            else:\n\n                neighbors = None\n                kdt = None\n\n            if skyview_lookup:\n\n                #\n                # finish up the finder chart after neighbors are processed\n                #\n                ax.set_xticks([])\n                ax.set_yticks([])\n\n                # add a reticle pointing to the object's coordinates\n                # we use coord origin = 0 here and not the usual\n                # 1 because we're annotating a numpy array\n                object_pixcoords = finderwcs.all_world2pix(\n                    [[objectinfo['ra'],\n                      objectinfo['decl']]],\n                    0\n                )\n\n                ax.axvline(\n                    # x=150.0,\n                    x=object_pixcoords[0,0],\n                    ymin=0.375,\n                    ymax=0.45,\n                    linewidth=1,\n                    color='b'\n                )\n                ax.axhline(\n                    # y=150.0,\n                    y=object_pixcoords[0,1],\n                    xmin=0.375,\n                    xmax=0.45,\n                    linewidth=1,\n                    color='b'\n                )\n                ax.set_frame_on(False)\n\n                # this is the output instance\n                finderpng = StrIO()\n                finderfig.savefig(finderpng,\n                                  bbox_inches='tight',\n                                  pad_inches=0.0, format='png')\n                plt.close()\n\n                # encode the finderpng instance to base64\n                finderpng.seek(0)\n                finderb64 = base64.b64encode(finderpng.read())\n\n                # close the stringio buffer\n                finderpng.close()\n\n            else:\n\n                finderb64 = None\n\n        except Exception as e:\n\n            LOGEXCEPTION('could not fetch a DSS stamp for this '\n                         'object %s using coords (%.3f,%.3f)' %\n  
                       (objectid, objectinfo['ra'], objectinfo['decl']))\n            finderb64 = None\n            neighbors = None\n            kdt = None\n\n    # if we don't have ra, dec info, then everything is none up to this point\n    else:\n\n        finderb64 = None\n        neighbors = None\n        kdt = None\n\n    #\n    # end of finder chart operations\n    #\n\n    # now that we have the finder chart, get the rest of the object\n    # information\n\n    # get the rest of the features, these don't necessarily rely on ra, dec and\n    # should degrade gracefully if these aren't provided\n    if isinstance(objectinfo, dict):\n\n        if 'objectid' not in objectinfo and 'hatid' in objectinfo:\n            objectid = objectinfo['hatid']\n            objectinfo['objectid'] = objectid\n        elif 'objectid' in objectinfo:\n            objectid = objectinfo['objectid']\n        else:\n            objectid = os.urandom(12).hex()[:7]\n            objectinfo['objectid'] = objectid\n            LOGWARNING('no objectid found in objectinfo dict, '\n                       'making up a random one: %s' % objectid)\n\n\n        # get the neighbor features and GAIA info\n        nbrfeat = neighbor_gaia_features(\n            objectinfo,\n            kdt,\n            nbrradiusarcsec,\n            verbose=False,\n            gaia_submit_timeout=gaia_submit_timeout,\n            gaia_submit_tries=gaia_submit_tries,\n            gaia_max_timeout=gaia_max_timeout,\n            gaia_mirror=gaia_mirror,\n            complete_query_later=complete_query_later,\n            search_simbad=search_simbad\n        )\n        objectinfo.update(nbrfeat)\n\n        # see if the objectinfo dict has pmra/pmdecl entries.  if it doesn't,\n        # then we'll see if the nbrfeat dict has pmra/pmdecl from GAIA. 
we'll\n        # set the appropriate provenance keys as well so we know where the PM\n        # came from\n        if ( ('pmra' not in objectinfo) or\n             ( ('pmra' in objectinfo) and\n               ( (objectinfo['pmra'] is None) or\n                 (not np.isfinite(objectinfo['pmra'])) ) ) ):\n\n            if 'ok' in nbrfeat['gaia_status']:\n\n                objectinfo['pmra'] = nbrfeat['gaia_pmras'][0]\n                objectinfo['pmra_err'] = nbrfeat['gaia_pmra_errs'][0]\n                objectinfo['pmra_source'] = 'gaia'\n\n                if verbose:\n                    LOGWARNING('pmRA not found in provided objectinfo dict, '\n                               'using value from GAIA')\n\n        else:\n            objectinfo['pmra_source'] = 'light curve'\n\n        if ( ('pmdecl' not in objectinfo) or\n             ( ('pmdecl' in objectinfo) and\n               ( (objectinfo['pmdecl'] is None) or\n                 (not np.isfinite(objectinfo['pmdecl'])) ) ) ):\n\n            if 'ok' in nbrfeat['gaia_status']:\n\n                objectinfo['pmdecl'] = nbrfeat['gaia_pmdecls'][0]\n                objectinfo['pmdecl_err'] = nbrfeat['gaia_pmdecl_errs'][0]\n                objectinfo['pmdecl_source'] = 'gaia'\n\n                if verbose:\n                    LOGWARNING('pmDEC not found in provided objectinfo dict, '\n                               'using value from GAIA')\n\n        else:\n            objectinfo['pmdecl_source'] = 'light curve'\n\n        #\n        # update GAIA info so it's available at the first level\n        #\n        if 'ok' in objectinfo['gaia_status']:\n            objectinfo['gaiaid'] = objectinfo['gaia_ids'][0]\n            objectinfo['gaiamag'] = objectinfo['gaia_mags'][0]\n            objectinfo['gaia_absmag'] = objectinfo['gaia_absolute_mags'][0]\n            objectinfo['gaia_parallax'] = objectinfo['gaia_parallaxes'][0]\n            objectinfo['gaia_parallax_err'] = (\n                
objectinfo['gaia_parallax_errs'][0]\n            )\n            objectinfo['gaia_pmra'] = objectinfo['gaia_pmras'][0]\n            objectinfo['gaia_pmra_err'] = objectinfo['gaia_pmra_errs'][0]\n            objectinfo['gaia_pmdecl'] = objectinfo['gaia_pmdecls'][0]\n            objectinfo['gaia_pmdecl_err'] = objectinfo['gaia_pmdecl_errs'][0]\n\n        else:\n            objectinfo['gaiaid'] = None\n            objectinfo['gaiamag'] = np.nan\n            objectinfo['gaia_absmag'] = np.nan\n            objectinfo['gaia_parallax'] = np.nan\n            objectinfo['gaia_parallax_err'] = np.nan\n            objectinfo['gaia_pmra'] = np.nan\n            objectinfo['gaia_pmra_err'] = np.nan\n            objectinfo['gaia_pmdecl'] = np.nan\n            objectinfo['gaia_pmdecl_err'] = np.nan\n\n        #\n        # get the object's TIC information\n        #\n        if ('ra' in objectinfo and\n            objectinfo['ra'] is not None and\n            np.isfinite(objectinfo['ra']) and\n            'decl' in objectinfo and\n            objectinfo['decl'] is not None and\n            np.isfinite(objectinfo['decl'])):\n\n            try:\n                ticres = tic_conesearch(objectinfo['ra'],\n                                        objectinfo['decl'],\n                                        radius_arcmin=5.0/60.0,\n                                        verbose=verbose,\n                                        timeout=gaia_max_timeout,\n                                        maxtries=gaia_submit_tries)\n\n                if ticres is not None:\n\n                    with open(ticres['cachefname'],'r') as infd:\n                        ticinfo = json.load(infd)\n\n                    if ('data' in ticinfo and\n                        len(ticinfo['data']) > 0 and\n                        isinstance(ticinfo['data'][0], dict)):\n\n                        objectinfo['ticid'] = str(ticinfo['data'][0]['ID'])\n                        objectinfo['tessmag'] = 
ticinfo['data'][0]['Tmag']\n                        objectinfo['tic_version'] = (\n                            ticinfo['data'][0]['version']\n                        )\n                        objectinfo['tic_distarcsec'] = (\n                            ticinfo['data'][0]['dstArcSec']\n                        )\n                        objectinfo['tessmag_origin'] = (\n                            ticinfo['data'][0]['TESSflag']\n                        )\n\n                        objectinfo['tic_starprop_origin'] = (\n                            ticinfo['data'][0]['SPFlag']\n                        )\n                        objectinfo['tic_lumclass'] = (\n                            ticinfo['data'][0]['lumclass']\n                        )\n                        objectinfo['tic_teff'] = (\n                            ticinfo['data'][0]['Teff']\n                        )\n                        objectinfo['tic_teff_err'] = (\n                            ticinfo['data'][0]['e_Teff']\n                        )\n                        objectinfo['tic_logg'] = (\n                            ticinfo['data'][0]['logg']\n                        )\n                        objectinfo['tic_logg_err'] = (\n                            ticinfo['data'][0]['e_logg']\n                        )\n                        objectinfo['tic_mh'] = (\n                            ticinfo['data'][0]['MH']\n                        )\n                        objectinfo['tic_mh_err'] = (\n                            ticinfo['data'][0]['e_MH']\n                        )\n                        objectinfo['tic_radius'] = (\n                            ticinfo['data'][0]['rad']\n                        )\n                        objectinfo['tic_radius_err'] = (\n                            ticinfo['data'][0]['e_rad']\n                        )\n                        objectinfo['tic_mass'] = (\n                            ticinfo['data'][0]['mass']\n                        )\n              
          objectinfo['tic_mass_err'] = (\n                            ticinfo['data'][0]['e_mass']\n                        )\n                        objectinfo['tic_density'] = (\n                            ticinfo['data'][0]['rho']\n                        )\n                        objectinfo['tic_density_err'] = (\n                            ticinfo['data'][0]['e_rho']\n                        )\n                        objectinfo['tic_luminosity'] = (\n                            ticinfo['data'][0]['lum']\n                        )\n                        objectinfo['tic_luminosity_err'] = (\n                            ticinfo['data'][0]['e_lum']\n                        )\n                        objectinfo['tic_distancepc'] = (\n                            ticinfo['data'][0]['d']\n                        )\n                        objectinfo['tic_distancepc_err'] = (\n                            ticinfo['data'][0]['e_d']\n                        )\n\n                        #\n                        # fill in any missing info using the TIC entry\n                        #\n                        if ('gaiaid' not in objectinfo or\n                            ('gaiaid' in objectinfo and\n                             (objectinfo['gaiaid'] is None))):\n                            objectinfo['gaiaid'] = ticinfo['data'][0]['GAIA']\n\n                        if ('gaiamag' not in objectinfo or\n                            ('gaiamag' in objectinfo and\n                             (objectinfo['gaiamag'] is None or\n                              not np.isfinite(objectinfo['gaiamag'])))):\n                            objectinfo['gaiamag'] = (\n                                ticinfo['data'][0]['GAIAmag']\n                            )\n                            objectinfo['gaiamag_err'] = (\n                                ticinfo['data'][0]['e_GAIAmag']\n                            )\n\n                        if ('gaia_parallax' not in objectinfo or\n        
                    ('gaia_parallax' in objectinfo and\n                             (objectinfo['gaia_parallax'] is None or\n                              not np.isfinite(objectinfo['gaia_parallax'])))):\n\n                            objectinfo['gaia_parallax'] = (\n                                ticinfo['data'][0]['plx']\n                            )\n                            objectinfo['gaia_parallax_err'] = (\n                                ticinfo['data'][0]['e_plx']\n                            )\n\n                            if (objectinfo['gaiamag'] is not None and\n                                np.isfinite(objectinfo['gaiamag']) and\n                                objectinfo['gaia_parallax'] is not None and\n                                np.isfinite(objectinfo['gaia_parallax'])):\n\n                                objectinfo['gaia_absmag'] = (\n                                    magnitudes.absolute_gaia_magnitude(\n                                        objectinfo['gaiamag'],\n                                        objectinfo['gaia_parallax']\n                                    )\n                                )\n\n                        if ('pmra' not in objectinfo or\n                            ('pmra' in objectinfo and\n                             (objectinfo['pmra'] is None or\n                              not np.isfinite(objectinfo['pmra'])))):\n                            objectinfo['pmra'] = ticinfo['data'][0]['pmRA']\n                            objectinfo['pmra_err'] = (\n                                ticinfo['data'][0]['e_pmRA']\n                            )\n                            objectinfo['pmra_source'] = 'TIC'\n\n                        if ('pmdecl' not in objectinfo or\n                            ('pmdecl' in objectinfo and\n                             (objectinfo['pmdecl'] is None or\n                              not np.isfinite(objectinfo['pmdecl'])))):\n                            objectinfo['pmdecl'] = 
ticinfo['data'][0]['pmDEC']\n                            objectinfo['pmdecl_err'] = (\n                                ticinfo['data'][0]['e_pmDEC']\n                            )\n                            objectinfo['pmdecl_source'] = 'TIC'\n\n                        if ('bmag' not in objectinfo or\n                            ('bmag' in objectinfo and\n                             (objectinfo['bmag'] is None or\n                              not np.isfinite(objectinfo['bmag'])))):\n                            objectinfo['bmag'] = ticinfo['data'][0]['Bmag']\n                            objectinfo['bmag_err'] = (\n                                ticinfo['data'][0]['e_Bmag']\n                            )\n\n                        if ('vmag' not in objectinfo or\n                            ('vmag' in objectinfo and\n                             (objectinfo['vmag'] is None or\n                              not np.isfinite(objectinfo['vmag'])))):\n                            objectinfo['vmag'] = ticinfo['data'][0]['Vmag']\n                            objectinfo['vmag_err'] = (\n                                ticinfo['data'][0]['e_Vmag']\n                            )\n\n                        if ('sdssu' not in objectinfo or\n                            ('sdssu' in objectinfo and\n                             (objectinfo['sdssu'] is None or\n                              not np.isfinite(objectinfo['sdssu'])))):\n                            objectinfo['sdssu'] = ticinfo['data'][0]['umag']\n                            objectinfo['sdssu_err'] = (\n                                ticinfo['data'][0]['e_umag']\n                            )\n\n                        if ('sdssg' not in objectinfo or\n                            ('sdssg' in objectinfo and\n                             (objectinfo['sdssg'] is None or\n                              not np.isfinite(objectinfo['sdssg'])))):\n                            objectinfo['sdssg'] = ticinfo['data'][0]['gmag']\n    
                        objectinfo['sdssg_err'] = (\n                                ticinfo['data'][0]['e_gmag']\n                            )\n\n                        if ('sdssr' not in objectinfo or\n                            ('sdssr' in objectinfo and\n                             (objectinfo['sdssr'] is None or\n                              not np.isfinite(objectinfo['sdssr'])))):\n                            objectinfo['sdssr'] = ticinfo['data'][0]['rmag']\n                            objectinfo['sdssr_err'] = (\n                                ticinfo['data'][0]['e_rmag']\n                            )\n\n                        if ('sdssi' not in objectinfo or\n                            ('sdssi' in objectinfo and\n                             (objectinfo['sdssi'] is None or\n                              not np.isfinite(objectinfo['sdssi'])))):\n                            objectinfo['sdssi'] = ticinfo['data'][0]['imag']\n                            objectinfo['sdssi_err'] = (\n                                ticinfo['data'][0]['e_imag']\n                            )\n\n                        if ('sdssz' not in objectinfo or\n                            ('sdssz' in objectinfo and\n                             (objectinfo['sdssz'] is None or\n                              not np.isfinite(objectinfo['sdssz'])))):\n                            objectinfo['sdssz'] = ticinfo['data'][0]['zmag']\n                            objectinfo['sdssz_err'] = (\n                                ticinfo['data'][0]['e_zmag']\n                            )\n\n                        if ('jmag' not in objectinfo or\n                            ('jmag' in objectinfo and\n                             (objectinfo['jmag'] is None or\n                              not np.isfinite(objectinfo['jmag'])))):\n                            objectinfo['jmag'] = ticinfo['data'][0]['Jmag']\n                            objectinfo['jmag_err'] = (\n                                
ticinfo['data'][0]['e_Jmag']\n                            )\n\n                        if ('hmag' not in objectinfo or\n                            ('hmag' in objectinfo and\n                             (objectinfo['hmag'] is None or\n                              not np.isfinite(objectinfo['hmag'])))):\n                            objectinfo['hmag'] = ticinfo['data'][0]['Hmag']\n                            objectinfo['hmag_err'] = (\n                                ticinfo['data'][0]['e_Hmag']\n                            )\n\n                        if ('kmag' not in objectinfo or\n                            ('kmag' in objectinfo and\n                             (objectinfo['kmag'] is None or\n                              not np.isfinite(objectinfo['kmag'])))):\n                            objectinfo['kmag'] = ticinfo['data'][0]['Kmag']\n                            objectinfo['kmag_err'] = (\n                                ticinfo['data'][0]['e_Kmag']\n                            )\n\n                        if ('wise1' not in objectinfo or\n                            ('wise1' in objectinfo and\n                             (objectinfo['wise1'] is None or\n                              not np.isfinite(objectinfo['wise1'])))):\n                            objectinfo['wise1'] = ticinfo['data'][0]['w1mag']\n                            objectinfo['wise1_err'] = (\n                                ticinfo['data'][0]['e_w1mag']\n                            )\n\n                        if ('wise2' not in objectinfo or\n                            ('wise2' in objectinfo and\n                             (objectinfo['wise2'] is None or\n                              not np.isfinite(objectinfo['wise2'])))):\n                            objectinfo['wise2'] = ticinfo['data'][0]['w2mag']\n                            objectinfo['wise2_err'] = (\n                                ticinfo['data'][0]['e_w2mag']\n                            )\n\n                        if 
('wise3' not in objectinfo or\n                            ('wise3' in objectinfo and\n                             (objectinfo['wise3'] is None or\n                              not np.isfinite(objectinfo['wise3'])))):\n                            objectinfo['wise3'] = ticinfo['data'][0]['w3mag']\n                            objectinfo['wise3_err'] = (\n                                ticinfo['data'][0]['e_w3mag']\n                            )\n\n                        if ('wise4' not in objectinfo or\n                            ('wise4' in objectinfo and\n                             (objectinfo['wise4'] is None or\n                              not np.isfinite(objectinfo['wise4'])))):\n                            objectinfo['wise4'] = ticinfo['data'][0]['w4mag']\n                            objectinfo['wise4_err'] = (\n                                ticinfo['data'][0]['e_w4mag']\n                            )\n\n                else:\n                    LOGERROR('could not look up TIC '\n                             'information for object: %s '\n                             'at (%.3f, %.3f)' %\n                             (objectinfo['objectid'],\n                              objectinfo['ra'],\n                              objectinfo['decl']))\n\n            except Exception as e:\n\n                LOGEXCEPTION('could not look up TIC '\n                             'information for object: %s '\n                             'at (%.3f, %.3f)' %\n                             (objectinfo['objectid'],\n                              objectinfo['ra'],\n                              objectinfo['decl']))\n\n\n        # try to get the object's coord features\n        coordfeat = coord_features(objectinfo)\n\n        # get the color features\n        colorfeat = color_features(objectinfo,\n                                   deredden=deredden_object,\n                                   custom_bandpasses=custom_bandpasses,\n                                   
dust_timeout=dust_timeout)\n\n        # get the object's color classification\n        colorclass = color_classification(colorfeat, coordfeat)\n\n        # update the objectinfo dict with everything\n        objectinfo.update(colorfeat)\n        objectinfo.update(coordfeat)\n        objectinfo.update(colorclass)\n\n        # put together the initial checkplot pickle dictionary\n        # this will be updated by the functions below as appropriate\n        # and will be written out as a gzipped pickle at the end of processing\n        checkplotdict = {'objectid':objectid,\n                         'neighbors':neighbors,\n                         'objectinfo':objectinfo,\n                         'finderchart':finderb64,\n                         'sigclip':sigclip,\n                         'normto':normto,\n                         'normmingap':normmingap}\n\n        # add the objecttags key to objectinfo\n        checkplotdict['objectinfo']['objecttags'] = None\n\n    # if there's no objectinfo, we can't do anything.\n    else:\n\n        # empty objectinfo dict\n        checkplotdict = {'objectid':None,\n                         'neighbors':None,\n                         'objectinfo':{\n                             'available_bands':[],\n                             'available_band_labels':[],\n                             'available_dereddened_bands':[],\n                             'available_dereddened_band_labels':[],\n                             'available_colors':[],\n                             'available_color_labels':[],\n                             'bmag':None,\n                             'bmag-vmag':None,\n                             'decl':None,\n                             'hatid':None,\n                             'hmag':None,\n                             'imag-jmag':None,\n                             'jmag-kmag':None,\n                             'jmag':None,\n                             'kmag':None,\n                             
'ndet':None,\n                             'network':None,\n                             'objecttags':None,\n                             'pmdecl':None,\n                             'pmdecl_err':None,\n                             'pmra':None,\n                             'pmra_err':None,\n                             'propermotion':None,\n                             'ra':None,\n                             'rpmj':None,\n                             'sdssg':None,\n                             'sdssi':None,\n                             'sdssr':None,\n                             'stations':None,\n                             'twomassid':None,\n                             'ucac4id':None,\n                             'vmag':None\n                         },\n                         'finderchart':None,\n                         'sigclip':sigclip,\n                         'normto':normto,\n                         'normmingap':normmingap}\n\n    # end of objectinfo processing\n\n    # add the varinfo dict\n    if isinstance(varinfo, dict):\n        checkplotdict['varinfo'] = varinfo\n    else:\n        checkplotdict['varinfo'] = {\n            'objectisvar':None,\n            'vartags':None,\n            'varisperiodic':None,\n            'varperiod':None,\n            'varepoch':None,\n        }\n\n    return checkplotdict", "response": "This function returns the finder chart and object information as a dict."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef _pkl_magseries_plot(stimes, smags, serrs,\n                        plotdpi=100,\n                        magsarefluxes=False):\n    '''This returns the magseries plot PNG as base64, plus arrays as dict.\n\n    Parameters\n    ----------\n\n    stimes,smags,serrs : np.array\n        The mag/flux time-series arrays along with associated errors. These\n        should all have been run through nan-stripping and sigma-clipping\n        beforehand.\n\n    plotdpi : int\n        The resolution of the plot to make in DPI.\n\n    magsarefluxes : bool\n        If True, indicates the input time-series is fluxes and not mags so the\n        plot y-axis direction and range can be set appropriately.\n\n    Returns\n    -------\n\n    dict\n        A dict of the following form is returned::\n\n            {'magseries': {'plot': base64 encoded str representation of the\n                                   magnitude/flux time-series plot,\n                           'times': the `stimes` array,\n                           'mags': the `smags` array,\n                           'errs': the 'serrs' array}}\n\n        The dict is returned in this format so it can be directly incorporated\n        in a checkplotdict, using Python's dict `update()` method.\n\n    '''\n\n    scaledplottime = stimes - npmin(stimes)\n\n    # open the figure instance\n    magseriesfig = plt.figure(figsize=(7.5,4.8),dpi=plotdpi)\n\n    plt.plot(scaledplottime,\n             smags,\n             marker='o',\n             ms=2.0, ls='None',mew=0,\n             color='green',\n             rasterized=True)\n\n    # flip y axis for mags\n    if not magsarefluxes:\n        plot_ylim = plt.ylim()\n        plt.ylim((plot_ylim[1], plot_ylim[0]))\n\n    # set the x axis limit\n    plt.xlim((npmin(scaledplottime)-2.0,\n              npmax(scaledplottime)+2.0))\n\n    # make a grid\n    
plt.grid(color='#a9a9a9',\n             alpha=0.9,\n             zorder=0,\n             linewidth=1.0,\n             linestyle=':')\n\n    # make the x and y axis labels\n    plot_xlabel = 'JD - %.3f' % npmin(stimes)\n    if magsarefluxes:\n        plot_ylabel = 'flux'\n    else:\n        plot_ylabel = 'magnitude'\n\n    plt.xlabel(plot_xlabel)\n    plt.ylabel(plot_ylabel)\n\n    # fix the yaxis ticks (turns off offset and uses the full\n    # value of the yaxis tick)\n    plt.gca().get_yaxis().get_major_formatter().set_useOffset(False)\n    plt.gca().get_xaxis().get_major_formatter().set_useOffset(False)\n\n    # this is the output instance\n    magseriespng = StrIO()\n    magseriesfig.savefig(magseriespng,\n                         # bbox_inches='tight',\n                         pad_inches=0.05, format='png')\n    plt.close()\n\n    # encode the finderpng instance to base64\n    magseriespng.seek(0)\n    magseriesb64 = base64.b64encode(magseriespng.read())\n\n    # close the stringio buffer\n    magseriespng.close()\n\n    checkplotdict = {\n        'magseries':{\n            'plot':magseriesb64,\n            'times':stimes,\n            'mags':smags,\n            'errs':serrs\n        }\n    }\n\n    return checkplotdict", "response": "This function returns the magseries plot PNG as base64 plus arrays as dict."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef _pkl_phased_magseries_plot(\n        checkplotdict,\n        lspmethod,\n        periodind,\n        stimes, smags, serrs,\n        varperiod, varepoch,\n        lspmethodind=0,\n        phasewrap=True,\n        phasesort=True,\n        phasebin=0.002,\n        minbinelems=7,\n        plotxlim=(-0.8,0.8),\n        plotdpi=100,\n        bestperiodhighlight=None,\n        xgridlines=None,\n        xliminsetmode=False,\n        magsarefluxes=False,\n        directreturn=False,\n        overplotfit=None,\n        verbose=True,\n        override_pfmethod=None\n):\n    '''This returns the phased magseries plot PNG as base64 plus info as a dict.\n\n    Parameters\n    ----------\n\n    checkplotdict : dict\n        This is an existing checkplotdict to update. If it's None or\n        `directreturn` = True, then the generated dict result for this magseries\n        plot will be returned directly.\n\n    lspmethod : str\n        lspmethod is a string indicating the type of period-finding algorithm\n        that produced the period. If this is not in\n        `astrobase.plotbase.METHODSHORTLABELS`, it will be used verbatim. In\n        most cases, this will come directly from the lspinfo dict produced by a\n        period-finder function.\n\n    periodind : int\n        This is the index of the current periodogram period being operated\n        on::\n\n            If == 0 -> best period and `bestperiodhighlight` is applied if not\n                       None\n            If > 0   -> some other peak of the periodogram\n            If == -1 -> special mode w/ no periodogram labels and enabled\n                        highlight\n\n    stimes,smags,serrs : np.array\n        The mag/flux time-series arrays along with associated errors. 
These\n        should all have been run through nan-stripping and sigma-clipping\n        beforehand.\n\n    varperiod : float or None\n        The period to use for this phased light curve plot tile.\n\n    varepoch : 'min' or float or list of lists or None\n        The epoch to use for this phased light curve plot tile. If this is a\n        float, will use the provided value directly. If this is 'min', will\n        automatically figure out the time-of-minimum of the phased light\n        curve. If this is None, will use the minimum value of `stimes` as the\n        epoch of the phased light curve plot. If this is a list of lists, will\n        use the provided value of `lspmethodind` to look up the current\n        period-finder method and the provided value of `periodind` to look up\n        the epoch associated with that method and the current period. This is\n        mostly only useful when `twolspmode` is True.\n\n    phasewrap : bool\n        If this is True, the phased time-series will be wrapped around\n        phase 0.0.\n\n    phasesort : bool\n        If True, will sort the phased light curve in order of increasing phase.\n\n    phasebin: float\n        The bin size to use to group together measurements closer than this\n        amount in phase. This is in units of phase. If this is a float, a\n        phase-binned version of the phased light curve will be overplotted on\n        top of the regular phased light curve.\n\n    minbinelems : int\n        The minimum number of elements required per phase bin to include it in\n        the phased LC plot.\n\n    plotxlim : sequence of two floats or None\n        The x-range (min, max) of the phased light curve plot. 
If None, will be\n        determined automatically.\n\n    plotdpi : int\n        The resolution of the output plot PNGs in dots per inch.\n\n    bestperiodhighlight : str or None\n        If not None, this is a str with a matplotlib color specification to use\n        as the background color to highlight the phased light curve plot of the\n        'best' period and epoch combination. If None, no highlight will be\n        applied.\n\n    xgridlines : list of floats or None\n        If this is provided, must be a list of floats corresponding to the phase\n        values where to draw vertical dashed lines as a means of highlighting\n        these.\n\n    xliminsetmode : bool\n        If this is True, the generated phased light curve plot will use the\n        values of `plotxlim` as the main plot x-axis limits (i.e. zoomed-in if\n        `plotxlim` is a range smaller than the full phase range), and will show\n        the full phased light curve plot as a smaller inset. Useful for\n        planetary transit light curves.\n\n    magsarefluxes : bool\n        If True, indicates the input time-series is fluxes and not mags so the\n        plot y-axis direction and range can be set appropriately.\n\n    directreturn : bool\n        If this is set to True, will return only the dict corresponding to the\n        phased LC plot for the input `periodind` and `lspmethod` and not return\n        this result embedded in a checkplotdict.\n\n    overplotfit : dict\n        If this is provided, it must be a dict of the form returned by one of\n        the astrobase.lcfit.fit_XXXXX_magseries functions. This can be used to\n        overplot a light curve model fit on top of the phased light curve plot\n        returned by this function. 
The `overplotfit` dict has the following\n        form, including at least the keys listed here::\n\n            {'fittype':str: name of fit method,\n            'fitchisq':float: the chi-squared value of the fit,\n            'fitredchisq':float: the reduced chi-squared value of the fit,\n            'fitinfo':{'fitmags':array: model mags or fluxes from fit function},\n            'magseries':{'times':array: times where the fitmags are evaluated}}\n\n        `fitmags` and `times` should all be of the same size. The input\n        `overplotfit` dict is copied over to the checkplotdict for each specific\n        phased LC plot to save all of this information for use later.\n\n    verbose : bool\n        If True, will indicate progress and warn about problems.\n\n    override_pfmethod : str or None\n        This is used to set a custom label for the periodogram method. Normally,\n        this is taken from the 'method' key in the input `lspinfo` dict, but if\n        you want to override the output method name, provide this as a string\n        here. This can be useful if you have multiple results you want to\n        incorporate into a checkplotdict from a single period-finder (e.g. 
if\n        you ran BLS over several period ranges separately).\n\n    Returns\n    -------\n\n    dict\n        Returns a dict of the following form::\n\n            {lspmethod: {'plot': the phased LC plot as base64 str,\n                         'period': the period used for this phased LC,\n                         'epoch': the epoch used for this phased LC,\n                         'phase': phase value array,\n                         'phasedmags': mags/fluxes sorted in phase order,\n                         'binphase': array of binned phase values,\n                         'binphasedmags': mags/fluxes sorted in binphase order,\n                         'phasewrap': value of the input `phasewrap` kwarg,\n                         'phasesort': value of the input `phasesort` kwarg,\n                         'phasebin': value of the input `phasebin` kwarg,\n                         'minbinelems': value of the input `minbinelems` kwarg,\n                         'plotxlim': value of the input `plotxlim` kwarg,\n                         'lcfit': the provided `overplotfit` dict}}\n\n        The dict is in this form because we can use Python dicts' `update()`\n        method to update an existing checkplotdict. 
If `directreturn` is True,\n        only the inner dict is returned.\n\n    '''\n\n    # open the figure instance\n    phasedseriesfig = plt.figure(figsize=(7.5,4.8),dpi=plotdpi)\n\n    plotvarepoch = None\n\n    # figure out the epoch, if it's None, use the min of the time\n    if varepoch is None:\n        plotvarepoch = npmin(stimes)\n\n    # if the varepoch is 'min', then fit a spline to the light curve\n    # phased using the min of the time, find the fit mag minimum and use\n    # the time for that as the varepoch\n    elif isinstance(varepoch,str) and varepoch == 'min':\n\n        try:\n            spfit = spline_fit_magseries(stimes,\n                                         smags,\n                                         serrs,\n                                         varperiod,\n                                         magsarefluxes=magsarefluxes,\n                                         sigclip=None,\n                                         verbose=verbose)\n            plotvarepoch = spfit['fitinfo']['fitepoch']\n            if len(plotvarepoch) != 1:\n                plotvarepoch = plotvarepoch[0]\n\n\n        except Exception as e:\n\n            LOGERROR('spline fit failed, trying SavGol fit')\n\n            sgfit = savgol_fit_magseries(stimes,\n                                         smags,\n                                         serrs,\n                                         varperiod,\n                                         sigclip=None,\n                                         magsarefluxes=magsarefluxes,\n                                         verbose=verbose)\n            plotvarepoch = sgfit['fitinfo']['fitepoch']\n            if len(plotvarepoch) != 1:\n                plotvarepoch = plotvarepoch[0]\n\n        finally:\n\n            if plotvarepoch is None:\n\n                LOGERROR('could not find a min epoch time, '\n                         'using min(times) as the epoch for '\n                         'the phase-folded 
LC')\n\n                plotvarepoch = npmin(stimes)\n\n    # special case with varepoch lists per each period-finder method\n    elif isinstance(varepoch, list):\n\n        try:\n            thisvarepochlist = varepoch[lspmethodind]\n            plotvarepoch = thisvarepochlist[periodind]\n        except Exception as e:\n            LOGEXCEPTION(\n                \"varepoch provided in list form either doesn't match \"\n                \"the length of nbestperiods from the period-finder \"\n                \"result, or something else went wrong. using min(times) \"\n                \"as the epoch instead\"\n            )\n            plotvarepoch = npmin(stimes)\n\n    # the final case is to use the provided varepoch directly\n    else:\n        plotvarepoch = varepoch\n\n\n    if verbose:\n        LOGINFO('plotting %s phased LC with period %s: %.6f, epoch: %.5f' %\n                (lspmethod, periodind, varperiod, plotvarepoch))\n\n    # make the plot title based on the lspmethod\n    if periodind == 0:\n        plottitle = '%s best period: %.6f d - epoch: %.5f' % (\n            (METHODSHORTLABELS[lspmethod] if lspmethod in METHODSHORTLABELS\n             else lspmethod),\n            varperiod,\n            plotvarepoch\n        )\n    elif periodind > 0:\n        plottitle = '%s peak %s: %.6f d - epoch: %.5f' % (\n            (METHODSHORTLABELS[lspmethod] if lspmethod in METHODSHORTLABELS\n             else lspmethod),\n            periodind+1,\n            varperiod,\n            plotvarepoch\n        )\n    elif periodind == -1:\n        plottitle = '%s period: %.6f d - epoch: %.5f' % (\n            lspmethod,\n            varperiod,\n            plotvarepoch\n        )\n\n\n    # phase the magseries\n    phasedlc = phase_magseries(stimes,\n                               smags,\n                               varperiod,\n                               plotvarepoch,\n                               wrap=phasewrap,\n                               
sort=phasesort)\n    plotphase = phasedlc['phase']\n    plotmags = phasedlc['mags']\n\n    # if we're supposed to bin the phases, do so\n    if phasebin:\n\n        binphasedlc = phase_bin_magseries(plotphase,\n                                          plotmags,\n                                          binsize=phasebin,\n                                          minbinelems=minbinelems)\n        binplotphase = binphasedlc['binnedphases']\n        binplotmags = binphasedlc['binnedmags']\n\n    else:\n        binplotphase = None\n        binplotmags = None\n\n\n    # finally, make the phased LC plot\n    plt.plot(plotphase,\n             plotmags,\n             marker='o',\n             ms=2.0, ls='None',mew=0,\n             color='gray',\n             rasterized=True)\n\n    # overlay the binned phased LC plot if we're making one\n    if phasebin:\n        plt.plot(binplotphase,\n                 binplotmags,\n                 marker='o',\n                 ms=4.0, ls='None',mew=0,\n                 color='#1c1e57',\n                 rasterized=True)\n\n\n    # if we're making a overplotfit, then plot the fit over the other stuff\n    if overplotfit and isinstance(overplotfit, dict):\n\n        fitmethod = overplotfit['fittype']\n        fitredchisq = overplotfit['fitredchisq']\n\n        plotfitmags = overplotfit['fitinfo']['fitmags']\n        plotfittimes = overplotfit['magseries']['times']\n\n        # phase the fit magseries\n        fitphasedlc = phase_magseries(plotfittimes,\n                                      plotfitmags,\n                                      varperiod,\n                                      plotvarepoch,\n                                      wrap=phasewrap,\n                                      sort=phasesort)\n        plotfitphase = fitphasedlc['phase']\n        plotfitmags = fitphasedlc['mags']\n\n        plotfitlabel = (r'%s fit ${\\chi}^2/{\\mathrm{dof}} = %.3f$' %\n                        (fitmethod, fitredchisq))\n\n        # 
plot the fit phase and mags\n        plt.plot(plotfitphase, plotfitmags,'k-',\n                 linewidth=3, rasterized=True,label=plotfitlabel)\n\n        plt.legend(loc='upper left', frameon=False)\n\n    # flip y axis for mags\n    if not magsarefluxes:\n        plot_ylim = plt.ylim()\n        plt.ylim((plot_ylim[1], plot_ylim[0]))\n\n    # set the x axis limit\n    if not plotxlim:\n        plt.xlim((npmin(plotphase)-0.1,\n                  npmax(plotphase)+0.1))\n    else:\n        plt.xlim((plotxlim[0],plotxlim[1]))\n\n    # make a grid\n    ax = plt.gca()\n    if isinstance(xgridlines, (list, tuple)):\n        ax.set_xticks(xgridlines, minor=False)\n\n    plt.grid(color='#a9a9a9',\n             alpha=0.9,\n             zorder=0,\n             linewidth=1.0,\n             linestyle=':')\n\n\n    # make the x and y axis labels\n    plot_xlabel = 'phase'\n    if magsarefluxes:\n        plot_ylabel = 'flux'\n    else:\n        plot_ylabel = 'magnitude'\n\n    plt.xlabel(plot_xlabel)\n    plt.ylabel(plot_ylabel)\n\n    # fix the yaxis ticks (turns off offset and uses the full\n    # value of the yaxis tick)\n    plt.gca().get_yaxis().get_major_formatter().set_useOffset(False)\n    plt.gca().get_xaxis().get_major_formatter().set_useOffset(False)\n\n    # set the plot title\n    plt.title(plottitle)\n\n    # make sure the best period phased LC plot stands out\n    if (periodind == 0 or periodind == -1) and bestperiodhighlight:\n        if MPLVERSION >= (2,0,0):\n            plt.gca().set_facecolor(bestperiodhighlight)\n        else:\n            plt.gca().set_axis_bgcolor(bestperiodhighlight)\n\n    # if we're making an inset plot showing the full range\n    if (plotxlim and isinstance(plotxlim, (list, tuple)) and\n        len(plotxlim) == 2 and xliminsetmode is True):\n\n        # bump the ylim of the plot so that the inset can fit in this axes plot\n        axesylim = plt.gca().get_ylim()\n\n        if magsarefluxes:\n            plt.gca().set_ylim(\n             
   axesylim[0],\n                axesylim[1] + 0.5*npabs(axesylim[1]-axesylim[0])\n            )\n        else:\n            plt.gca().set_ylim(\n                axesylim[0],\n                axesylim[1] - 0.5*npabs(axesylim[1]-axesylim[0])\n            )\n\n        # put the inset axes in\n        inset = inset_axes(plt.gca(), width=\"40%\", height=\"40%\", loc=1)\n\n        # make the scatter plot for the phased LC plot\n        inset.plot(plotphase,\n                   plotmags,\n                   marker='o',\n                   ms=2.0, ls='None',mew=0,\n                   color='gray',\n                   rasterized=True)\n\n        if phasebin:\n            # make the scatter plot for the phased LC plot\n            inset.plot(binplotphase,\n                       binplotmags,\n                       marker='o',\n                       ms=4.0, ls='None',mew=0,\n                       color='#1c1e57',\n                       rasterized=True)\n\n        # show the full phase coverage\n        if phasewrap:\n            inset.set_xlim(-0.2,0.8)\n        else:\n            inset.set_xlim(-0.1,1.1)\n\n        # flip y axis for mags\n        if not magsarefluxes:\n            inset_ylim = inset.get_ylim()\n            inset.set_ylim((inset_ylim[1], inset_ylim[0]))\n\n        # set the plot title\n        inset.text(0.5,0.9,'full phased light curve',\n                   ha='center',va='center',transform=inset.transAxes)\n        # don't show axes labels or ticks\n        inset.set_xticks([])\n        inset.set_yticks([])\n\n    # this is the output instance\n    phasedseriespng = StrIO()\n    phasedseriesfig.savefig(phasedseriespng,\n                            # bbox_inches='tight',\n                            pad_inches=0.0, format='png')\n    plt.close()\n\n    # encode the finderpng instance to base64\n    phasedseriespng.seek(0)\n    phasedseriesb64 = base64.b64encode(phasedseriespng.read())\n\n    # close the stringio buffer\n    phasedseriespng.close()\n\n   
 # this includes a fitinfo dict if one is provided in overplotfit\n    retdict = {\n        'plot':phasedseriesb64,\n        'period':varperiod,\n        'epoch':plotvarepoch,\n        'phase':plotphase,\n        'phasedmags':plotmags,\n        'binphase':binplotphase,\n        'binphasedmags':binplotmags,\n        'phasewrap':phasewrap,\n        'phasesort':phasesort,\n        'phasebin':phasebin,\n        'minbinelems':minbinelems,\n        'plotxlim':plotxlim,\n        'lcfit':overplotfit,\n    }\n\n    # if we're returning stuff directly, i.e. not being used embedded within\n    # the checkplot_dict function\n    if directreturn or checkplotdict is None:\n\n        return retdict\n\n    # this requires the checkplotdict to be present already, we'll just update\n    # it at the appropriate lspmethod and periodind\n    else:\n\n        if override_pfmethod:\n            checkplotdict[override_pfmethod][periodind] = retdict\n        else:\n            checkplotdict[lspmethod][periodind] = retdict\n\n        return checkplotdict", "response": "This function returns the phased magseries plot PNG as base64 plus info as a dict."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef prewhiten_magseries(times, mags, errs,\n                        whitenperiod,\n                        whitenparams,\n                        sigclip=3.0,\n                        magsarefluxes=False,\n                        plotfit=None,\n                        plotfitphasedlconly=True,\n                        rescaletomedian=True):\n    '''Removes a periodic sinusoidal signal generated using whitenparams from\n    the input magnitude time series.\n\n    Parameters\n    ----------\n\n    times,mags,errs : np.array\n        The input mag/flux time-series to prewhiten.\n\n    whitenperiod : float\n        The period of the sinusoidal signal to remove.\n\n    whitenparams : list of floats\n        This contains the Fourier amplitude and phase coefficients of the\n        sinusoidal signal to remove::\n\n            [ampl_1, ampl_2, ampl_3, ..., ampl_X,\n             pha_1, pha_2, pha_3, ..., pha_X]\n\n        where `X` is the Fourier order. These are usually the output of a\n        previous Fourier fit to the light curve (from\n        :py:func:`astrobase.lcfit.sinusoidal.fourier_fit_magseries` for\n        example).\n\n    sigclip : float or int or sequence of two floats/ints or None\n        If a single float or int, a symmetric sigma-clip will be performed using\n        the number provided as the sigma-multiplier to cut out from the input\n        time-series.\n\n        If a list of two ints/floats is provided, the function will perform an\n        'asymmetric' sigma-clip. The first element in this list is the sigma\n        value to use for fainter flux/mag values; the second element in this\n        list is the sigma value to use for brighter flux/mag values. For\n        example, `sigclip=[10., 3.]`, will sigclip out greater than 10-sigma\n        dimmings and greater than 3-sigma brightenings. 
Here the meaning of\n        \"dimming\" and \"brightening\" is set by *physics* (not the magnitude\n        system), which is why the `magsarefluxes` kwarg must be correctly set.\n\n        If `sigclip` is None, no sigma-clipping will be performed, and the\n        time-series (with non-finite elems removed) will be passed through to\n        the output.\n\n    magsarefluxes : bool\n        If True, will treat the input values of `mags` as fluxes for purposes of\n        plotting the fit and sig-clipping.\n\n    plotfit : str or False\n        If this is a string, this function will make a plot showing the effect\n        of the pre-whitening on the mag/flux time-series and write the plot to\n        the path specified here.\n\n    plotfitphasedlconly : bool\n        If True, will plot only the phased LC for showing the effect of\n        pre-whitening, and skip plotting the unphased LC.\n\n    rescaletomedian : bool\n        If this is True, then we add back the constant median term of the\n        magnitudes to the final pre-whitened mag series.\n\n    Returns\n    -------\n\n    dict\n        Returns a dict of the form::\n\n            {'wtimes':times array after pre-whitening,\n             'wphase':phase array after pre-whitening,\n             'wmags':mags array after pre-whitening,\n             'werrs':errs array after pre-whitening,\n             'whitenparams':the input pre-whitening params used,\n             'whitenperiod':the input pre-whitening period used,\n             'fitplotfile':the output plot file if plotfit was set}\n\n    '''\n\n    stimes, smags, serrs = sigclip_magseries(times, mags, errs,\n                                             sigclip=sigclip,\n                                             magsarefluxes=magsarefluxes)\n\n    median_mag = np.median(smags)\n\n\n    # phase the mag series using the given period and epoch = min(stimes)\n    mintime = np.min(stimes)\n\n    # calculate the unsorted phase, then sort it\n    iphase = (\n   
     (stimes - mintime)/whitenperiod -\n        np.floor((stimes - mintime)/whitenperiod)\n    )\n    phasesortind = np.argsort(iphase)\n\n    # these are the final quantities to use for the Fourier fits\n    phase = iphase[phasesortind]\n    pmags = smags[phasesortind]\n    perrs = serrs[phasesortind]\n\n    # get the times sorted in phase order (useful to get the fit mag minimum\n    # with respect to phase -- the light curve minimum)\n    ptimes = stimes[phasesortind]\n\n    # now subtract the harmonic series from the phased LC\n    # these are still in phase order\n    wmags = pmags - _fourier_func(whitenparams, phase, pmags)\n\n    # resort everything by time order\n    wtimeorder = np.argsort(ptimes)\n    wtimes = ptimes[wtimeorder]\n    wphase = phase[wtimeorder]\n    wmags = wmags[wtimeorder]\n    werrs = perrs[wtimeorder]\n\n    if rescaletomedian:\n        wmags = wmags + median_mag\n\n    # prepare the returndict\n    returndict = {'wtimes':wtimes,  # these are in the new time order\n                  'wphase':wphase,\n                  'wmags':wmags,\n                  'werrs':werrs,\n                  'whitenparams':whitenparams,\n                  'whitenperiod':whitenperiod}\n\n\n    # make the fit plot if required\n    if plotfit and (isinstance(plotfit, str) or isinstance(plotfit, Strio)):\n\n        if plotfitphasedlconly:\n            plt.figure(figsize=(10,4.8))\n        else:\n            plt.figure(figsize=(16,9.6))\n\n        if plotfitphasedlconly:\n\n            # phased series before whitening\n            plt.subplot(121)\n            plt.plot(phase,pmags,\n                     marker='.',\n                     color='k',\n                     linestyle='None',\n                     markersize=2.0,\n                     markeredgewidth=0)\n\n            if not magsarefluxes:\n                plt.gca().invert_yaxis()\n                plt.ylabel('magnitude')\n            else:\n                plt.ylabel('fluxes')\n\n            
plt.xlabel('phase')\n            plt.title('phased LC before pre-whitening')\n\n            # phased series after whitening\n            plt.subplot(122)\n            plt.plot(wphase,wmags,\n                     marker='.',\n                     color='g',\n                     linestyle='None',\n                     markersize=2.0,\n                     markeredgewidth=0)\n\n            if not magsarefluxes:\n                plt.gca().invert_yaxis()\n                plt.ylabel('magnitude')\n            else:\n                plt.ylabel('fluxes')\n\n            plt.xlabel('phase')\n            plt.title('phased LC after pre-whitening')\n\n        else:\n\n            # time series before whitening\n            plt.subplot(221)\n            plt.plot(stimes,smags,\n                     marker='.',\n                     color='k',\n                     linestyle='None',\n                     markersize=2.0,\n                     markeredgewidth=0)\n\n            if not magsarefluxes:\n                plt.gca().invert_yaxis()\n                plt.ylabel('magnitude')\n            else:\n                plt.ylabel('fluxes')\n\n            plt.xlabel('JD')\n            plt.title('LC before pre-whitening')\n\n            # time series after whitening\n            plt.subplot(222)\n            plt.plot(wtimes,wmags,\n                     marker='.',\n                     color='g',\n                     linestyle='None',\n                     markersize=2.0,\n                     markeredgewidth=0)\n\n            if not magsarefluxes:\n                plt.gca().invert_yaxis()\n                plt.ylabel('magnitude')\n            else:\n                plt.ylabel('fluxes')\n\n            plt.xlabel('JD')\n            plt.title('LC after pre-whitening with period: %.6f' % whitenperiod)\n\n            # phased series before whitening\n            plt.subplot(223)\n            plt.plot(phase,pmags,\n                     marker='.',\n                     color='k',\n             
        linestyle='None',\n                     markersize=2.0,\n                     markeredgewidth=0)\n\n            if not magsarefluxes:\n                plt.gca().invert_yaxis()\n                plt.ylabel('magnitude')\n            else:\n                plt.ylabel('fluxes')\n\n            plt.xlabel('phase')\n            plt.title('phased LC before pre-whitening')\n\n            # phased series after whitening\n            plt.subplot(224)\n            plt.plot(wphase,wmags,\n                     marker='.',\n                     color='g',\n                     linestyle='None',\n                     markersize=2.0,\n                     markeredgewidth=0)\n\n            if not magsarefluxes:\n                plt.gca().invert_yaxis()\n                plt.ylabel('magnitude')\n            else:\n                plt.ylabel('fluxes')\n\n            plt.xlabel('phase')\n            plt.title('phased LC after pre-whitening')\n\n        plt.tight_layout()\n        plt.savefig(plotfit, format='png', pad_inches=0.0)\n        plt.close()\n\n        if isinstance(plotfit, str) or isinstance(plotfit, Strio):\n            returndict['fitplotfile'] = plotfit\n\n    return returndict", "response": "Pre-whitens the input magnitude/flux time-series by subtracting the harmonic (Fourier) series defined by whitenparams, phased at whitenperiod."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef mask_signal(times, mags, errs,\n                signalperiod,\n                signalepoch,\n                magsarefluxes=False,\n                maskphases=(0,0,0.5,1.0),\n                maskphaselength=0.1,\n                plotfit=None,\n                plotfitphasedlconly=True,\n                sigclip=30.0):\n    '''This removes repeating signals in the magnitude time series.\n\n    Useful for masking planetary transit signals in light curves to search for\n    other variability.\n\n    A small worked example of using this and `prewhiten_magseries` above:\n\n    https://github.com/waqasbhatti/astrobase/issues/77#issuecomment-463803558\n\n    Parameters\n    ----------\n\n    times,mags,errs : np.array\n        The input mag/flux time-series to run the masking on.\n\n    signalperiod : float\n        The period of the signal to mask.\n\n    signalepoch : float\n        The epoch of the signal to mask.\n\n    magsarefluxes : bool\n        Set to True if `mags` is actually an array of fluxes.\n\n    maskphases : sequence of floats\n        This defines which phase values will be masked. For each item in this\n        sequence, this function will mask a length of phase given by\n        `maskphaselength` centered on each `maskphases` value, and remove all LC\n        points in these regions from the light curve.\n\n    maskphaselength : float\n        The length in phase to mask for each phase value provided in\n        `maskphases`.\n\n    plotfit : str or None\n        If provided as a str, indicates the output plot file.\n\n    plotfitphasedlconly : bool\n        If True, will only plot the effect of masking the signal as requested on\n        the phased LC. 
If False, will also plot the unphased LC.\n\n    sigclip : float or int or sequence of two floats/ints or None\n        If a single float or int, a symmetric sigma-clip will be performed using\n        the number provided as the sigma-multiplier to cut out from the input\n        time-series.\n\n        If a list of two ints/floats is provided, the function will perform an\n        'asymmetric' sigma-clip. The first element in this list is the sigma\n        value to use for fainter flux/mag values; the second element in this\n        list is the sigma value to use for brighter flux/mag values. For\n        example, `sigclip=[10., 3.]`, will sigclip out greater than 10-sigma\n        dimmings and greater than 3-sigma brightenings. Here the meaning of\n        \"dimming\" and \"brightening\" is set by *physics* (not the magnitude\n        system), which is why the `magsarefluxes` kwarg must be correctly set.\n\n        If `sigclip` is None, no sigma-clipping will be performed, and the\n        time-series (with non-finite elems removed) will be passed through to\n        the output.\n\n    '''\n\n    stimes, smags, serrs = sigclip_magseries(times, mags, errs,\n                                             sigclip=sigclip,\n                                             magsarefluxes=magsarefluxes)\n\n\n    # now phase the light curve using the period and epoch provided\n    phases = (\n        (stimes - signalepoch)/signalperiod -\n        np.floor((stimes - signalepoch)/signalperiod)\n    )\n\n    # mask the requested phases using the mask length (in phase units)\n    # this gets all the masks into one array\n    masks = np.array([(np.abs(phases - x) > maskphaselength)\n                      for x in maskphases])\n    # this flattens the masks to a single array for all combinations\n    masks = np.all(masks,axis=0)\n\n    # apply the mask to the times, mags, and errs\n    mphases = phases[masks]\n    mtimes = stimes[masks]\n    mmags = smags[masks]\n    merrs = 
serrs[masks]\n\n    returndict = {'mphases':mphases,\n                  'mtimes':mtimes,\n                  'mmags':mmags,\n                  'merrs':merrs}\n\n    # make the fit plot if required\n    if plotfit and isinstance(plotfit, str) or isinstance(plotfit, Strio):\n\n        if plotfitphasedlconly:\n            plt.figure(figsize=(10,4.8))\n        else:\n            plt.figure(figsize=(16,9.6))\n\n        if plotfitphasedlconly:\n\n            # phased series before whitening\n            plt.subplot(121)\n            plt.plot(phases,smags,\n                     marker='.',\n                     color='k',\n                     linestyle='None',\n                     markersize=2.0,\n                     markeredgewidth=0)\n\n            if not magsarefluxes:\n                plt.gca().invert_yaxis()\n                plt.ylabel('magnitude')\n            else:\n                plt.ylabel('fluxes')\n\n            plt.xlabel('phase')\n            plt.title('phased LC before signal masking')\n\n            # phased series after whitening\n            plt.subplot(122)\n            plt.plot(mphases,mmags,\n                     marker='.',\n                     color='g',\n                     linestyle='None',\n                     markersize=2.0,\n                     markeredgewidth=0)\n\n            if not magsarefluxes:\n                plt.gca().invert_yaxis()\n                plt.ylabel('magnitude')\n            else:\n                plt.ylabel('fluxes')\n\n            plt.xlabel('phase')\n            plt.title('phased LC after signal masking')\n\n        else:\n\n            # time series before whitening\n            plt.subplot(221)\n            plt.plot(stimes,smags,\n                     marker='.',\n                     color='k',\n                     linestyle='None',\n                     markersize=2.0,\n                     markeredgewidth=0)\n\n            if not magsarefluxes:\n                plt.gca().invert_yaxis()\n                
plt.ylabel('magnitude')\n            else:\n                plt.ylabel('fluxes')\n\n            plt.xlabel('JD')\n            plt.title('LC before signal masking')\n\n            # time series after whitening\n            plt.subplot(222)\n            plt.plot(mtimes,mmags,\n                     marker='.',\n                     color='g',\n                     linestyle='None',\n                     markersize=2.0,\n                     markeredgewidth=0)\n\n            if not magsarefluxes:\n                plt.gca().invert_yaxis()\n                plt.ylabel('magnitude')\n            else:\n                plt.ylabel('fluxes')\n\n            plt.xlabel('JD')\n            plt.title('LC after signal masking')\n\n            # phased series before whitening\n            plt.subplot(223)\n            plt.plot(phases,smags,\n                     marker='.',\n                     color='k',\n                     linestyle='None',\n                     markersize=2.0,\n                     markeredgewidth=0)\n\n            if not magsarefluxes:\n                plt.gca().invert_yaxis()\n                plt.ylabel('magnitude')\n            else:\n                plt.ylabel('fluxes')\n\n            plt.xlabel('phase')\n            plt.title('phased LC before signal masking')\n\n            # phased series after whitening\n            plt.subplot(224)\n            plt.plot(mphases,mmags,\n                     marker='.',\n                     color='g',\n                     linestyle='None',\n                     markersize=2.0,\n                     markeredgewidth=0)\n\n            if not magsarefluxes:\n                plt.gca().invert_yaxis()\n                plt.ylabel('magnitude')\n            else:\n                plt.ylabel('fluxes')\n\n            plt.xlabel('phase')\n            plt.title('phased LC after signal masking')\n\n        plt.tight_layout()\n        plt.savefig(plotfit, format='png', pad_inches=0.0)\n        plt.close()\n\n        if 
isinstance(plotfit, str) or isinstance(plotfit, Strio):\n            returndict['fitplotfile'] = plotfit\n\n\n    return returndict", "response": "Masks a repeating signal in a mag/flux time-series by removing all light curve points that fall within the requested phase windows."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef find_lc_timegroups(lctimes, mingap=4.0):\n    '''Finds gaps in the provided time-series and indexes them into groups.\n\n    This finds the gaps in the provided `lctimes` array, so we can figure out\n    which times are for consecutive observations and which represent gaps\n    between seasons or observing eras.\n\n    Parameters\n    ----------\n\n    lctimes : array-like\n        This contains the times to analyze for gaps; assumed to be some form of\n        Julian date.\n\n    mingap : float\n        This defines how much the difference between consecutive measurements is\n        allowed to be to consider them as parts of different timegroups. By\n        default it is set to 4.0 days.\n\n    Returns\n    -------\n\n    tuple\n        A tuple of the form: `(ngroups, [slice(start_ind_1, end_ind_1), ...])`\n        is returned.  This contains the number of groups as the first element,\n        and a list of Python `slice` objects for each time-group found. 
These\n        can be used directly to index into the array of times to quickly get\n        measurements associated with each group.\n\n    '''\n\n    lc_time_diffs = np.diff(lctimes)\n    group_start_indices = np.where(lc_time_diffs > mingap)[0]\n\n    if len(group_start_indices) > 0:\n\n        group_indices = []\n\n        for i, gindex in enumerate(group_start_indices):\n\n            if i == 0:\n                group_indices.append(slice(0,gindex+1))\n            else:\n                group_indices.append(slice(group_start_indices[i-1]+1,gindex+1))\n\n\n        # at the end, add the slice for the last group to the end of the times\n        # array\n        group_indices.append(slice(group_start_indices[-1]+1,len(lctimes)))\n\n    # if there's no large gap in the LC, then there's only one group to worry\n    # about\n    else:\n        group_indices = [slice(0,len(lctimes))]\n\n\n    return len(group_indices), group_indices", "response": "Finds gaps in the provided time-series and indexes them into groups."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef normalize_magseries(times,\n                        mags,\n                        mingap=4.0,\n                        normto='globalmedian',\n                        magsarefluxes=False,\n                        debugmode=False):\n    '''This normalizes the magnitude time-series to a specified value.\n\n    This is used to normalize time series measurements that may have large time\n    gaps and vertical offsets in mag/flux measurement between these\n    'timegroups', either due to instrument changes or different filters.\n\n    NOTE: this works in-place! The mags array will be replaced with normalized\n    mags when this function finishes.\n\n    Parameters\n    ----------\n\n    times,mags : array-like\n        The times (assumed to be some form of JD) and mags (or flux)\n        measurements to be normalized.\n\n    mingap : float\n        This defines how much the difference between consecutive measurements is\n        allowed to be to consider them as parts of different timegroups. By\n        default it is set to 4.0 days.\n\n    normto : {'globalmedian', 'zero'} or a float\n        Specifies the normalization type::\n\n          'globalmedian' -> norms each mag to the global median of the LC column\n          'zero'         -> norms each mag to zero\n          a float        -> norms each mag to this specified float value.\n\n    magsarefluxes : bool\n        Indicates if the input `mags` array is actually an array of flux\n        measurements instead of magnitude measurements. 
If this is set to True,\n        then:\n\n        - if `normto` is 'zero', then the median flux is divided from each\n          observation's flux value to yield normalized fluxes with 1.0 as the\n          global median.\n\n        - if `normto` is 'globalmedian', then the global median flux value\n          across the entire time series is multiplied with each measurement.\n\n        - if `norm` is set to a `float`, then this number is multiplied with the\n          flux value for each measurement.\n\n    debugmode : bool\n        If this is True, will print out verbose info on each timegroup found.\n\n    Returns\n    -------\n\n    times,normalized_mags : np.arrays\n        Normalized magnitude values after normalization. If normalization fails\n        for some reason, `times` and `normalized_mags` will both be None.\n\n    '''\n\n    ngroups, timegroups = find_lc_timegroups(times,\n                                             mingap=mingap)\n\n    # find all the non-nan indices\n    finite_ind = np.isfinite(mags)\n\n    if any(finite_ind):\n\n        # find the global median\n        global_mag_median = np.median(mags[finite_ind])\n\n        # go through the groups and normalize them to the median for\n        # each group\n        for tgind, tg in enumerate(timegroups):\n\n            finite_ind = np.isfinite(mags[tg])\n\n            # find this timegroup's median mag and normalize the mags in\n            # it to this median\n            group_median = np.median((mags[tg])[finite_ind])\n\n            if magsarefluxes:\n                mags[tg] = mags[tg]/group_median\n            else:\n                mags[tg] = mags[tg] - group_median\n\n            if debugmode:\n                LOGDEBUG('group %s: elems %s, '\n                         'finite elems %s, median mag %s' %\n                         (tgind,\n                          len(mags[tg]),\n                          len(finite_ind),\n                          group_median))\n\n        # now that 
everything is normalized to 0.0, add the global median\n        # offset back to all the mags and write the result back to the dict\n        if isinstance(normto, str) and normto == 'globalmedian':\n\n            if magsarefluxes:\n                mags = mags * global_mag_median\n            else:\n                mags = mags + global_mag_median\n\n        # if the normto is a float, add everything to that float and return\n        elif isinstance(normto, float):\n\n            if magsarefluxes:\n                mags = mags * normto\n            else:\n                mags = mags + normto\n\n        # anything else just returns the normalized mags as usual\n        return times, mags\n\n    else:\n        LOGERROR('measurements are all nan!')\n        return None, None", "response": "Normalizes a magnitude (or flux) time-series to a specified value, removing median offsets between timegroups."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef sigclip_magseries(times, mags, errs,\n                      sigclip=None,\n                      iterative=False,\n                      niterations=None,\n                      meanormedian='median',\n                      magsarefluxes=False):\n    '''Sigma-clips a magnitude or flux time-series.\n\n    Selects the finite times, magnitudes (or fluxes), and errors from the passed\n    values, and apply symmetric or asymmetric sigma clipping to them.\n\n    Parameters\n    ----------\n\n    times,mags,errs : np.array\n        The magnitude or flux time-series arrays to sigma-clip. This doesn't\n        assume all values are finite or if they're positive/negative. All of\n        these arrays will have their non-finite elements removed, and then will\n        be sigma-clipped based on the arguments to this function.\n\n        `errs` is optional. Set it to None if you don't have values for these. A\n        'faked' `errs` array will be generated if necessary, which can be\n        ignored in the output as well.\n\n    sigclip : float or int or sequence of two floats/ints or None\n        If a single float or int, a symmetric sigma-clip will be performed using\n        the number provided as the sigma-multiplier to cut out from the input\n        time-series.\n\n        If a list of two ints/floats is provided, the function will perform an\n        'asymmetric' sigma-clip. The first element in this list is the sigma\n        value to use for fainter flux/mag values; the second element in this\n        list is the sigma value to use for brighter flux/mag values. For\n        example, `sigclip=[10., 3.]`, will sigclip out greater than 10-sigma\n        dimmings and greater than 3-sigma brightenings. 
Here the meaning of\n        \"dimming\" and \"brightening\" is set by *physics* (not the magnitude\n        system), which is why the `magsarefluxes` kwarg must be correctly set.\n\n        If `sigclip` is None, no sigma-clipping will be performed, and the\n        time-series (with non-finite elems removed) will be passed through to\n        the output.\n\n    iterative : bool\n        If this is set to True, will perform iterative sigma-clipping. If\n        `niterations` is not set and this is True, sigma-clipping is iterated\n        until no more points are removed.\n\n    niterations : int\n        The maximum number of iterations to perform for sigma-clipping. If None,\n        the `iterative` arg takes precedence, and `iterative=True` will\n        sigma-clip until no more points are removed.  If `niterations` is not\n        None and `iterative` is False, `niterations` takes precedence and\n        iteration will occur for the specified number of iterations.\n\n    meanormedian : {'mean', 'median'}\n        Use 'mean' for sigma-clipping based on the mean value, or 'median' for\n        sigma-clipping based on the median value.  Default is 'median'.\n\n    magsarefluxes : bool\n        True if your \"mags\" are in fact fluxes, i.e. 
if \"fainter\" corresponds to\n        `mags` getting smaller.\n\n    Returns\n    -------\n\n    (stimes, smags, serrs) : tuple\n        The sigma-clipped and nan-stripped time-series.\n\n    '''\n\n    returnerrs = True\n\n    # fake the errors if they don't exist\n    # this is inconsequential to sigma-clipping\n    # we don't return these dummy values if the input errs are None\n    if errs is None:\n        # assume 0.1% errors if not given\n        # this should work for mags and fluxes\n        errs = 0.001*mags\n        returnerrs = False\n\n    # filter the input times, mags, errs; do sigclipping and normalization\n    find = npisfinite(times) & npisfinite(mags) & npisfinite(errs)\n    ftimes, fmags, ferrs = times[find], mags[find], errs[find]\n\n    # get the center value and stdev\n    if meanormedian == 'median':  # stddev = 1.483 x MAD\n\n        center_mag = npmedian(fmags)\n        stddev_mag = (npmedian(npabs(fmags - center_mag))) * 1.483\n\n    elif meanormedian == 'mean':\n\n        center_mag = npmean(fmags)\n        stddev_mag = npstddev(fmags)\n\n    else:\n        LOGWARNING(\"unrecognized meanormedian value given to \"\n                   \"sigclip_magseries: %s, defaulting to 'median'\" %\n                   meanormedian)\n        meanormedian = 'median'\n        center_mag = npmedian(fmags)\n        stddev_mag = (npmedian(npabs(fmags - center_mag))) * 1.483\n\n    # sigclip next for a single sigclip value\n    if sigclip and isinstance(sigclip, (float, int)):\n\n        if not iterative and niterations is None:\n\n            sigind = (npabs(fmags - center_mag)) < (sigclip * stddev_mag)\n\n            stimes = ftimes[sigind]\n            smags = fmags[sigind]\n            serrs = ferrs[sigind]\n\n        else:\n\n            #\n            # iterative version adapted from scipy.stats.sigmaclip\n            #\n\n            # First, if niterations is not set, iterate until covergence\n            if niterations is None:\n\n                
delta = 1\n\n                this_times = ftimes\n                this_mags = fmags\n                this_errs = ferrs\n\n                while delta:\n\n                    if meanormedian == 'mean':\n                        this_center = npmean(this_mags)\n                        this_stdev = npstddev(this_mags)\n                    elif meanormedian == 'median':\n                        this_center = npmedian(this_mags)\n                        this_stdev = (\n                            npmedian(npabs(this_mags - this_center))\n                        ) * 1.483\n                    this_size = this_mags.size\n\n                    # apply the sigclip\n                    tsi = (\n                        (npabs(this_mags - this_center)) <\n                        (sigclip * this_stdev)\n                    )\n\n                    # update the arrays\n                    this_times = this_times[tsi]\n                    this_mags = this_mags[tsi]\n                    this_errs = this_errs[tsi]\n\n                    # update delta and go to the top of the loop\n                    delta = this_size - this_mags.size\n\n            else:  # If iterating only a certain number of times\n\n                this_times = ftimes\n                this_mags = fmags\n                this_errs = ferrs\n\n                iter_num = 0\n                delta = 1\n                while iter_num < niterations and delta:\n\n                    if meanormedian == 'mean':\n\n                        this_center = npmean(this_mags)\n                        this_stdev = npstddev(this_mags)\n\n                    elif meanormedian == 'median':\n\n                        this_center = npmedian(this_mags)\n                        this_stdev = (npmedian(npabs(this_mags -\n                                                     this_center))) * 1.483\n                    this_size = this_mags.size\n\n                    # apply the sigclip\n                    tsi = (\n                        
(npabs(this_mags - this_center)) <\n                        (sigclip * this_stdev)\n                    )\n\n                    # update the arrays\n                    this_times = this_times[tsi]\n                    this_mags = this_mags[tsi]\n                    this_errs = this_errs[tsi]\n\n                    # update the number of iterations and delta and\n                    # go to the top of the loop\n                    delta = this_size - this_mags.size\n                    iter_num += 1\n\n            # final sigclipped versions\n            stimes, smags, serrs = this_times, this_mags, this_errs\n\n\n    # this handles sigclipping for asymmetric +ve and -ve clip values\n    elif sigclip and isinstance(sigclip, (list,tuple)) and len(sigclip) == 2:\n\n        # sigclip is passed as [dimmingclip, brighteningclip]\n        dimmingclip = sigclip[0]\n        brighteningclip = sigclip[1]\n\n        if not iterative and niterations is None:\n\n            if magsarefluxes:\n                nottoodimind = (\n                    (fmags - center_mag) > (-dimmingclip*stddev_mag)\n                )\n                nottoobrightind = (\n                    (fmags - center_mag) < (brighteningclip*stddev_mag)\n                )\n            else:\n                nottoodimind = (\n                    (fmags - center_mag) < (dimmingclip*stddev_mag)\n                )\n                nottoobrightind = (\n                    (fmags - center_mag) > (-brighteningclip*stddev_mag)\n                )\n\n            sigind = nottoodimind & nottoobrightind\n\n            stimes = ftimes[sigind]\n            smags = fmags[sigind]\n            serrs = ferrs[sigind]\n\n        else:\n\n            #\n            # iterative version adapted from scipy.stats.sigmaclip\n            #\n            if niterations is None:\n\n                delta = 1\n\n                this_times = ftimes\n                this_mags = fmags\n                this_errs = ferrs\n\n                while 
delta:\n\n                    if meanormedian == 'mean':\n\n                        this_center = npmean(this_mags)\n                        this_stdev = npstddev(this_mags)\n\n                    elif meanormedian == 'median':\n                        this_center = npmedian(this_mags)\n                        this_stdev = (npmedian(npabs(this_mags -\n                                                     this_center))) * 1.483\n                    this_size = this_mags.size\n\n                    if magsarefluxes:\n                        nottoodimind = (\n                            (this_mags - this_center) >\n                            (-dimmingclip*this_stdev)\n                        )\n                        nottoobrightind = (\n                            (this_mags - this_center) <\n                            (brighteningclip*this_stdev)\n                        )\n                    else:\n                        nottoodimind = (\n                            (this_mags - this_center) <\n                            (dimmingclip*this_stdev)\n                        )\n                        nottoobrightind = (\n                            (this_mags - this_center) >\n                            (-brighteningclip*this_stdev)\n                        )\n\n                    # apply the sigclip\n                    tsi = nottoodimind & nottoobrightind\n\n                    # update the arrays\n                    this_times = this_times[tsi]\n                    this_mags = this_mags[tsi]\n                    this_errs = this_errs[tsi]\n\n                    # update delta and go to top of the loop\n                    delta = this_size - this_mags.size\n\n            else:  # If iterating only a certain number of times\n                this_times = ftimes\n                this_mags = fmags\n                this_errs = ferrs\n\n                iter_num = 0\n                delta = 1\n\n                while iter_num < niterations and delta:\n\n            
        if meanormedian == 'mean':\n                        this_center = npmean(this_mags)\n                        this_stdev = npstddev(this_mags)\n                    elif meanormedian == 'median':\n                        this_center = npmedian(this_mags)\n                        this_stdev = (npmedian(npabs(this_mags -\n                                                     this_center))) * 1.483\n                    this_size = this_mags.size\n\n                    if magsarefluxes:\n                        nottoodimind = (\n                            (this_mags - this_center) >\n                            (-dimmingclip*this_stdev)\n                        )\n                        nottoobrightind = (\n                            (this_mags - this_center) <\n                            (brighteningclip*this_stdev)\n                        )\n                    else:\n                        nottoodimind = (\n                            (this_mags - this_center) < (dimmingclip*this_stdev)\n                        )\n                        nottoobrightind = (\n                            (this_mags - this_center) >\n                            (-brighteningclip*this_stdev)\n                        )\n\n                    # apply the sigclip\n                    tsi = nottoodimind & nottoobrightind\n\n                    # update the arrays\n                    this_times = this_times[tsi]\n                    this_mags = this_mags[tsi]\n                    this_errs = this_errs[tsi]\n\n                    # update the number of iterations and delta\n                    # and go to top of the loop\n                    delta = this_size - this_mags.size\n                    iter_num += 1\n\n            # final sigclipped versions\n            stimes, smags, serrs = this_times, this_mags, this_errs\n\n    else:\n\n        stimes = ftimes\n        smags = fmags\n        serrs = ferrs\n\n    if returnerrs:\n        return stimes, smags, serrs\n    else:\n       
 return stimes, smags, None", "response": "Sigma-clips a magnitude or flux time-series: selects the finite times, magnitudes (or fluxes), and errors, applies symmetric or asymmetric sigma-clipping, and returns the clipped arrays."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef sigclip_magseries_with_extparams(times, mags, errs, extparams,\n                                     sigclip=None,\n                                     iterative=False,\n                                     magsarefluxes=False):\n    '''Sigma-clips a magnitude or flux time-series and associated measurement\n    arrays.\n\n    Selects the finite times, magnitudes (or fluxes), and errors from the passed\n    values, and apply symmetric or asymmetric sigma clipping to them.  Uses the\n    same array indices as these values to filter out the values of all arrays in\n    the `extparams` list. This can be useful for simultaneously sigma-clipping a\n    magnitude/flux time-series along with their associated values of external\n    parameters, such as telescope hour angle, zenith distance, temperature, moon\n    phase, etc.\n\n    Parameters\n    ----------\n\n    times,mags,errs : np.array\n        The magnitude or flux time-series arrays to sigma-clip. This doesn't\n        assume all values are finite or if they're positive/negative. All of\n        these arrays will have their non-finite elements removed, and then will\n        be sigma-clipped based on the arguments to this function.\n\n        `errs` is optional. Set it to None if you don't have values for these. A\n        'faked' `errs` array will be generated if necessary, which can be\n        ignored in the output as well.\n\n    extparams : list of np.array\n        This is a list of all external parameter arrays to simultaneously filter\n        along with the magnitude/flux time-series. 
All of these arrays should\n        have the same length as the `times`, `mags`, and `errs` arrays.\n\n    sigclip : float or int or sequence of two floats/ints or None\n        If a single float or int, a symmetric sigma-clip will be performed using\n        the number provided as the sigma-multiplier to cut out from the input\n        time-series.\n\n        If a list of two ints/floats is provided, the function will perform an\n        'asymmetric' sigma-clip. The first element in this list is the sigma\n        value to use for fainter flux/mag values; the second element in this\n        list is the sigma value to use for brighter flux/mag values. For\n        example, `sigclip=[10., 3.]`, will sigclip out greater than 10-sigma\n        dimmings and greater than 3-sigma brightenings. Here the meaning of\n        \"dimming\" and \"brightening\" is set by *physics* (not the magnitude\n        system), which is why the `magsarefluxes` kwarg must be correctly set.\n\n        If `sigclip` is None, no sigma-clipping will be performed, and the\n        time-series (with non-finite elems removed) will be passed through to\n        the output.\n\n    iterative : bool\n        If this is set to True, will perform iterative sigma-clipping. If\n        `niterations` is not set and this is True, sigma-clipping is iterated\n        until no more points are removed.\n\n    magsareflux : bool\n        True if your \"mags\" are in fact fluxes, i.e. 
if \"fainter\" corresponds to\n        `mags` getting smaller.\n\n    Returns\n    -------\n\n    (stimes, smags, serrs) : tuple\n        The sigma-clipped and nan-stripped time-series in `stimes`, `smags`,\n        `serrs` and the associated values of the `extparams` in `sextparams`.\n\n    '''\n\n    returnerrs = True\n\n    # fake the errors if they don't exist\n    # this is inconsequential to sigma-clipping\n    # we don't return these dummy values if the input errs are None\n    if errs is None:\n        # assume 0.1% errors if not given\n        # this should work for mags and fluxes\n        errs = 0.001*mags\n        returnerrs = False\n\n    # filter the input times, mags, errs; do sigclipping and normalization\n    find = npisfinite(times) & npisfinite(mags) & npisfinite(errs)\n    ftimes, fmags, ferrs = times[find], mags[find], errs[find]\n\n    # apply the same indices to the external parameters\n    for epi, eparr in enumerate(extparams):\n        extparams[epi] = eparr[find]\n\n    # get the median and stdev = 1.483 x MAD\n    median_mag = npmedian(fmags)\n    stddev_mag = (npmedian(npabs(fmags - median_mag))) * 1.483\n\n    # sigclip next for a single sigclip value\n    if sigclip and isinstance(sigclip, (float, int)):\n\n        if not iterative:\n\n            sigind = (npabs(fmags - median_mag)) < (sigclip * stddev_mag)\n\n            stimes = ftimes[sigind]\n            smags = fmags[sigind]\n            serrs = ferrs[sigind]\n\n            # apply the same indices to the external parameters\n            for epi, eparr in enumerate(extparams):\n                extparams[epi] = eparr[sigind]\n\n        else:\n\n            #\n            # iterative version adapted from scipy.stats.sigmaclip\n            #\n            delta = 1\n\n            this_times = ftimes\n            this_mags = fmags\n            this_errs = ferrs\n\n            while delta:\n\n                this_median = npmedian(this_mags)\n                this_stdev = 
(npmedian(npabs(this_mags - this_median))) * 1.483\n                this_size = this_mags.size\n\n                # apply the sigclip\n                tsi = (npabs(this_mags - this_median)) < (sigclip * this_stdev)\n\n                # update the arrays\n                this_times = this_times[tsi]\n                this_mags = this_mags[tsi]\n                this_errs = this_errs[tsi]\n\n                # apply the same indices to the external parameters\n                for epi, eparr in enumerate(extparams):\n                    extparams[epi] = eparr[tsi]\n\n                # update delta and go to the top of the loop\n                delta = this_size - this_mags.size\n\n            # final sigclipped versions\n            stimes, smags, serrs = this_times, this_mags, this_errs\n\n\n    # this handles sigclipping for asymmetric +ve and -ve clip values\n    elif sigclip and isinstance(sigclip, (list, tuple)) and len(sigclip) == 2:\n\n        # sigclip is passed as [dimmingclip, brighteningclip]\n        dimmingclip = sigclip[0]\n        brighteningclip = sigclip[1]\n\n        if not iterative:\n\n            if magsarefluxes:\n                nottoodimind = (\n                    (fmags - median_mag) > (-dimmingclip*stddev_mag)\n                )\n                nottoobrightind = (\n                    (fmags - median_mag) < (brighteningclip*stddev_mag)\n                )\n            else:\n                nottoodimind = (\n                    (fmags - median_mag) < (dimmingclip*stddev_mag)\n                )\n                nottoobrightind = (\n                    (fmags - median_mag) > (-brighteningclip*stddev_mag)\n                )\n\n            sigind = nottoodimind & nottoobrightind\n\n            stimes = ftimes[sigind]\n            smags = fmags[sigind]\n            serrs = ferrs[sigind]\n\n            # apply the same indices to the external parameters\n            for epi, eparr in enumerate(extparams):\n                extparams[epi] = 
eparr[sigind]\n\n        else:\n\n            #\n            # iterative version adapted from scipy.stats.sigmaclip\n            #\n            delta = 1\n\n            this_times = ftimes\n            this_mags = fmags\n            this_errs = ferrs\n\n            while delta:\n\n                this_median = npmedian(this_mags)\n                this_stdev = (npmedian(npabs(this_mags - this_median))) * 1.483\n                this_size = this_mags.size\n\n                if magsarefluxes:\n                    nottoodimind = (\n                        (this_mags - this_median) > (-dimmingclip*this_stdev)\n                    )\n                    nottoobrightind = (\n                        (this_mags - this_median) < (brighteningclip*this_stdev)\n                    )\n                else:\n                    nottoodimind = (\n                        (this_mags - this_median) < (dimmingclip*this_stdev)\n                    )\n                    nottoobrightind = (\n                        (this_mags - this_median) >\n                        (-brighteningclip*this_stdev)\n                    )\n\n                # apply the sigclip\n                tsi = nottoodimind & nottoobrightind\n\n                # update the arrays\n                this_times = this_times[tsi]\n                this_mags = this_mags[tsi]\n                this_errs = this_errs[tsi]\n\n                # apply the same indices to the external parameters\n                for epi, eparr in enumerate(extparams):\n                    extparams[epi] = eparr[tsi]\n\n                # update delta and go to top of the loop\n                delta = this_size - this_mags.size\n\n            # final sigclipped versions\n            stimes, smags, serrs = this_times, this_mags, this_errs\n\n    else:\n\n        stimes = ftimes\n        smags = fmags\n        serrs = ferrs\n\n    if returnerrs:\n        return stimes, smags, serrs, extparams\n    else:\n        return stimes, smags, None, extparams", 
"response": "Sigma-clips a magnitude or flux time-series and applies the same index selection to every array in the `extparams` list of external parameters."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef phase_magseries(times, mags, period, epoch, wrap=True, sort=True):\n    '''Phases a magnitude/flux time-series using a given period and epoch.\n\n    The equation used is::\n\n        phase = (times - epoch)/period - floor((times - epoch)/period)\n\n    This phases the given magnitude timeseries using the given period and\n    epoch. If wrap is True, wraps the result around 0.0 (and returns an array\n    that has twice the number of the original elements). If sort is True,\n    returns the magnitude timeseries in phase sorted order.\n\n    Parameters\n    ----------\n\n    times,mags : np.array\n        The magnitude/flux time-series values to phase using the provided\n        `period` and `epoch`. Non-finite values will be removed.\n\n    period : float\n        The period to use to phase the time-series.\n\n    epoch : float\n        The epoch to phase the time-series. This is usually the time-of-minimum\n        or time-of-maximum of some periodic light curve\n        phenomenon. Alternatively, one can use the minimum time value in\n        `times`.\n\n    wrap : bool\n        If this is True, the returned phased time-series will be wrapped around\n        phase 0.0, which is useful for plotting purposes. 
The arrays returned\n        will have twice the number of input elements because of this wrapping.\n\n    sort : bool\n        If this is True, the returned phased time-series will be sorted in\n        increasing phase order.\n\n    Returns\n    -------\n\n    dict\n        A dict of the following form is returned::\n\n            {'phase': the phase values,\n             'mags': the mags/flux values at each phase,\n             'period': the input `period` used to phase the time-series,\n             'epoch': the input `epoch` used to phase the time-series}\n\n    '''\n\n    # find all the finite values of the magnitudes and times\n    finiteind = np.isfinite(mags) & np.isfinite(times)\n\n    finite_times = times[finiteind]\n    finite_mags = mags[finiteind]\n\n    magseries_phase = (\n        (finite_times - epoch)/period -\n        np.floor(((finite_times - epoch)/period))\n    )\n\n    outdict = {'phase':magseries_phase,\n               'mags':finite_mags,\n               'period':period,\n               'epoch':epoch}\n\n    if sort:\n        sortorder = np.argsort(outdict['phase'])\n        outdict['phase'] = outdict['phase'][sortorder]\n        outdict['mags'] = outdict['mags'][sortorder]\n\n    if wrap:\n        outdict['phase'] = np.concatenate((outdict['phase']-1.0,\n                                           outdict['phase']))\n        outdict['mags'] = np.concatenate((outdict['mags'],\n                                          outdict['mags']))\n\n    return outdict", "response": "Phases a magnitude time - series using a given period and epoch."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef time_bin_magseries(times, mags,\n                       binsize=540.0,\n                       minbinelems=7):\n    '''Bins the given mag/flux time-series in time using the bin size given.\n\n    Parameters\n    ----------\n\n    times,mags : np.array\n        The magnitude/flux time-series to bin in time. Non-finite elements will\n        be removed from these arrays. At least 10 elements in each array are\n        required for this function to operate.\n\n    binsize : float\n        The bin size to use to group together measurements closer than this\n        amount in time. This is in seconds.\n\n    minbinelems : int\n        The minimum number of elements required per bin to include it in the\n        output.\n\n    Returns\n    -------\n\n    dict\n        A dict of the following form is returned::\n\n            {'jdbin_indices': a list of the index arrays into the nan-filtered\n                              input arrays per each bin,\n             'jdbins': list of bin boundaries for each bin,\n             'nbins': the number of bins generated,\n             'binnedtimes': the time values associated with each time bin;\n                            this is the median of the times in each bin,\n             'binnedmags': the mag/flux values associated with each time bin;\n                           this is the median of the mags/fluxes in each bin}\n\n    '''\n\n    # check if the input arrays are ok\n    if not(times.shape and mags.shape and len(times) > 9 and len(mags) > 9):\n\n        LOGERROR(\"input time/mag arrays don't have enough elements\")\n        return\n\n    # find all the finite values of the magnitudes and times\n    finiteind = np.isfinite(mags) & np.isfinite(times)\n    finite_times = times[finiteind]\n    finite_mags = mags[finiteind]\n\n    # convert binsize in seconds to JD units\n    binsizejd = binsize/(86400.0)\n    
nbins = int(np.ceil((np.nanmax(finite_times) -\n                         np.nanmin(finite_times))/binsizejd) + 1)\n\n    minjd = np.nanmin(finite_times)\n    jdbins = [(minjd + x*binsizejd) for x in range(nbins)]\n\n    # make a KD-tree on the JDs so we can do fast distance calculations.  we\n    # need to add a bogus y coord to make this a problem that KD-trees can\n    # solve.\n    time_coords = np.array([[x,1.0] for x in finite_times])\n    jdtree = cKDTree(time_coords)\n    binned_finite_timeseries_indices = []\n\n    collected_binned_mags = {}\n\n    for jd in jdbins:\n        # find all bin indices close to within binsizejd of this point\n        # using the cKDTree query. we use the p-norm = 1 (I think this\n        # means straight-up pairwise distance? FIXME: check this)\n        bin_indices = jdtree.query_ball_point(np.array([jd,1.0]),\n                                              binsizejd/2.0, p=1.0)\n\n        # if the bin_indices have already been collected, then we're\n        # done with this bin, move to the next one. 
if they haven't,\n        # then this is the start of a new bin.\n        if (bin_indices not in binned_finite_timeseries_indices and\n            len(bin_indices) >= minbinelems):\n\n            binned_finite_timeseries_indices.append(bin_indices)\n\n    # convert to ndarrays\n    binned_finite_timeseries_indices = [np.array(x) for x in\n                                        binned_finite_timeseries_indices]\n\n    collected_binned_mags['jdbins_indices'] = binned_finite_timeseries_indices\n    collected_binned_mags['jdbins'] = jdbins\n    collected_binned_mags['nbins'] = len(binned_finite_timeseries_indices)\n\n    # collect the finite_times\n    binned_jd = np.array([np.median(finite_times[x])\n                          for x in binned_finite_timeseries_indices])\n    collected_binned_mags['binnedtimes'] = binned_jd\n    collected_binned_mags['binsize'] = binsize\n\n    # median bin the magnitudes according to the calculated indices\n    collected_binned_mags['binnedmags'] = (\n        np.array([np.median(finite_mags[x])\n                  for x in binned_finite_timeseries_indices])\n    )\n\n    return collected_binned_mags", "response": "Bins a magnitude/flux time-series in time using the given bin size and returns a dict containing the median time and median mag/flux of each bin."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef phase_bin_magseries(phases, mags,\n                        binsize=0.005,\n                        minbinelems=7):\n    '''Bins a phased magnitude/flux time-series using the bin size provided.\n\n    Parameters\n    ----------\n\n    phases,mags : np.array\n        The phased magnitude/flux time-series to bin in phase. Non-finite\n        elements will be removed from these arrays. At least 10 elements in each\n        array are required for this function to operate.\n\n    binsize : float\n        The bin size to use to group together measurements closer than this\n        amount in phase. This is in units of phase.\n\n    minbinelems : int\n        The minimum number of elements required per bin to include it in the\n        output.\n\n    Returns\n    -------\n\n    dict\n        A dict of the following form is returned::\n\n            {'phasebin_indices': a list of the index arrays into the\n                                 nan-filtered input arrays per each bin,\n             'phasebins': list of bin boundaries for each bin,\n             'nbins': the number of bins generated,\n             'binnedphases': the phase values associated with each phase bin;\n                            this is the median of the phase value in each bin,\n             'binnedmags': the mag/flux values associated with each phase bin;\n                           this is the median of the mags/fluxes in each bin}\n\n    '''\n\n    # check if the input arrays are ok\n    if not(phases.shape and mags.shape and len(phases) > 10 and len(mags) > 10):\n\n        LOGERROR(\"input time/mag arrays don't have enough elements\")\n        return\n\n    # find all the finite values of the magnitudes and phases\n    finiteind = np.isfinite(mags) & np.isfinite(phases)\n    finite_phases = phases[finiteind]\n    finite_mags = mags[finiteind]\n\n    nbins = int(np.ceil((np.nanmax(finite_phases) -\n          
               np.nanmin(finite_phases))/binsize) + 1)\n\n    minphase = np.nanmin(finite_phases)\n    phasebins = [(minphase + x*binsize) for x in range(nbins)]\n\n    # make a KD-tree on the PHASEs so we can do fast distance calculations.  we\n    # need to add a bogus y coord to make this a problem that KD-trees can\n    # solve.\n    time_coords = np.array([[x,1.0] for x in finite_phases])\n    phasetree = cKDTree(time_coords)\n    binned_finite_phaseseries_indices = []\n\n    collected_binned_mags = {}\n\n    for phase in phasebins:\n\n        # find all bin indices close to within binsize of this point using the\n        # cKDTree query. we use the p-norm = 1 for pairwise Euclidean distance.\n        bin_indices = phasetree.query_ball_point(np.array([phase,1.0]),\n                                                 binsize/2.0, p=1.0)\n\n        # if the bin_indices have already been collected, then we're\n        # done with this bin, move to the next one. if they haven't,\n        # then this is the start of a new bin.\n        if (bin_indices not in binned_finite_phaseseries_indices and\n            len(bin_indices) >= minbinelems):\n\n            binned_finite_phaseseries_indices.append(bin_indices)\n\n    # convert to ndarrays\n    binned_finite_phaseseries_indices = [np.array(x) for x in\n                                         binned_finite_phaseseries_indices]\n\n    collected_binned_mags['phasebins_indices'] = (\n        binned_finite_phaseseries_indices\n    )\n    collected_binned_mags['phasebins'] = phasebins\n    collected_binned_mags['nbins'] = len(binned_finite_phaseseries_indices)\n\n    # collect the finite_phases\n    binned_phase = np.array([np.median(finite_phases[x])\n                             for x in binned_finite_phaseseries_indices])\n    collected_binned_mags['binnedphases'] = binned_phase\n    collected_binned_mags['binsize'] = binsize\n\n    # median bin the magnitudes according to the calculated indices\n    
collected_binned_mags['binnedmags'] = (\n        np.array([np.median(finite_mags[x])\n                  for x in binned_finite_phaseseries_indices])\n    )\n\n    return collected_binned_mags", "response": "Bins a phased magnitude/flux time-series in phase using the given bin size and returns a dict containing the median phase and median mag/flux of each bin."}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\ncalculate the total SNR of a transit assuming gaussian uncertainties.", "response": "def get_snr_of_dip(times,\n                   mags,\n                   modeltimes,\n                   modelmags,\n                   atol_normalization=1e-8,\n                   indsforrms=None,\n                   magsarefluxes=False,\n                   verbose=True,\n                   transitdepth=None,\n                   npoints_in_transit=None):\n    '''Calculate the total SNR of a transit assuming gaussian uncertainties.\n\n    `modelmags` gets interpolated onto the cadence of `mags`. The noise is\n    calculated as the 1-sigma std deviation of the residual (see below).\n\n    Following Carter et al. 2009::\n\n        Q = sqrt( \u0393 T ) * \u03b4 / \u03c3\n\n    for Q the total SNR of the transit in the r->0 limit, where::\n\n        r = Rp/Rstar,\n        T = transit duration,\n        \u03b4 = transit depth,\n        \u03c3 = RMS of the lightcurve in transit.\n        \u0393 = sampling rate\n\n    Thus \u0393 * T is roughly the number of points obtained during transit.\n    (This doesn't correctly account for the SNR during ingress/egress, but this\n    is a second-order correction).\n\n    Note this is the same total SNR as described by e.g., Kovacs et al. 2002,\n    their Equation 11.\n\n    NOTE: this only works with fluxes at the moment.\n\n    Parameters\n    ----------\n\n    times,mags : np.array\n        The input flux time-series to process.\n\n    modeltimes,modelmags : np.array\n        A transiting planet model, either from BLS, a trapezoid model, or a\n        Mandel-Agol model.\n\n    atol_normalization : float\n        The absolute tolerance to which the median of the passed model fluxes\n        must be equal to 1.\n\n    indsforrms : np.array\n        A array of bools of `len(mags)` used to select points for the RMS\n        measurement. 
If not passed, the RMS of the entire passed timeseries is\n        used as an approximation. Genearlly, it's best to use out of transit\n        points, so the RMS measurement is not model-dependent.\n\n    magsarefluxes : bool\n        Currently forced to be True because this function only works with\n        fluxes.\n\n    verbose : bool\n        If True, indicates progress and warns about problems.\n\n    transitdepth : float or None\n        If the transit depth is known, pass it in here. Otherwise, it is\n        calculated assuming OOT flux is 1.\n\n    npoints_in_transits : int or None\n        If the number of points in transit is known, pass it in here. Otherwise,\n        the function will guess at this value.\n\n    Returns\n    -------\n\n    (snr, transit_depth, noise) : tuple\n        The returned tuple contains the calculated SNR, transit depth, and noise\n        of the residual lightcurve calculated using the relation described\n        above.\n\n    '''\n\n    if magsarefluxes:\n        if not np.isclose(np.nanmedian(modelmags), 1, atol=atol_normalization):\n            raise AssertionError('snr calculation assumes modelmags are '\n                                 'median-normalized')\n    else:\n        raise NotImplementedError(\n            'need to implement a method for identifying in-transit points when'\n            'mags are mags, and not fluxes'\n        )\n\n    if not transitdepth:\n        # calculate transit depth from whatever model magnitudes are passed.\n        transitdepth = np.abs(np.max(modelmags) - np.min(modelmags))\n\n    # generally, mags (data) and modelmags are at different cadence.\n    # interpolate modelmags onto the cadence of mags.\n    if not len(mags) == len(modelmags):\n        from scipy.interpolate import interp1d\n\n        fn = interp1d(modeltimes, modelmags, kind='cubic', bounds_error=True,\n                      fill_value=np.nan)\n\n        modelmags = fn(times)\n\n        if verbose:\n            
LOGINFO('interpolated model timeseries onto the data timeseries')\n\n    subtractedmags = mags - modelmags\n\n    if isinstance(indsforrms, np.ndarray):\n        subtractedrms = np.std(subtractedmags[indsforrms])\n        if verbose:\n            LOGINFO('using selected points to measure RMS')\n    else:\n        subtractedrms = np.std(subtractedmags)\n        if verbose:\n            LOGINFO('using all points to measure RMS')\n\n    def _get_npoints_in_transit(modelmags):\n        # assumes median-normalized fluxes are input\n        if np.nanmedian(modelmags) == 1:\n            return len(modelmags[(modelmags != 1)])\n        else:\n            raise NotImplementedError\n\n    if not npoints_in_transit:\n        npoints_in_transit = _get_npoints_in_transit(modelmags)\n\n    snr = np.sqrt(npoints_in_transit) * transitdepth/subtractedrms\n\n    if verbose:\n\n        LOGINFO('\\npoints in transit: {:d}'.format(npoints_in_transit) +\n                '\\ndepth: {:.2e}'.format(transitdepth) +\n                '\\nrms in residual: {:.2e}'.format(subtractedrms) +\n                '\\n\\t SNR: {:.2e}'.format(snr))\n\n    return snr, transitdepth, subtractedrms"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef estimate_achievable_tmid_precision(snr, t_ingress_min=10,\n                                       t_duration_hr=2.14):\n    '''Using Carter et al. 2009's estimate, calculate the theoretical optimal\n    precision on mid-transit time measurement possible given a transit of a\n    particular SNR.\n\n    The relation used is::\n\n        sigma_tc = Q^{-1} * T * sqrt(\u03b8/2)\n\n        Q = SNR of the transit.\n        T = transit duration, which is 2.14 hours from discovery paper.\n        \u03b8 = \u03c4/T = ratio of ingress to total duration\n                ~= (few minutes [guess]) / 2.14 hours\n\n    Parameters\n    ----------\n\n    snr : float\n        The measured signal-to-noise of the transit, e.g. from\n        :py:func:`astrobase.periodbase.kbls.bls_stats_singleperiod` or from\n        running the `.compute_stats()` method on an Astropy BoxLeastSquares\n        object.\n\n    t_ingress_min : float\n        The ingress duration in minutes. This is t_I to t_II in Winn (2010)\n        nomenclature.\n\n    t_duration_hr : float\n        The transit duration in hours. This is t_I to t_IV in Winn (2010)\n        nomenclature.\n\n    Returns\n    -------\n\n    float\n        Returns the precision achievable for transit-center time as calculated\n        from the relation above. 
This is in days.\n\n    '''\n\n    t_ingress = t_ingress_min*u.minute\n    t_duration = t_duration_hr*u.hour\n\n    theta = t_ingress/t_duration\n\n    sigma_tc = (1/snr * t_duration * np.sqrt(theta/2))\n\n    LOGINFO('assuming t_ingress = {:.1f}'.format(t_ingress))\n    LOGINFO('assuming t_duration = {:.1f}'.format(t_duration))\n    LOGINFO('measured SNR={:.2f}\\n\\t'.format(snr) +\n            '-->theoretical sigma_tc = {:.2e} = {:.2e} = {:.2e}'.format(\n                sigma_tc.to(u.minute), sigma_tc.to(u.hour), sigma_tc.to(u.day)))\n\n    return sigma_tc.to(u.day).value", "response": "Estimates the theoretical optimal precision, in days, on the mid-transit time measurement for a transit of a given SNR, using the relation from Carter et al. 2009."}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\ngives a BLS period, epoch, and transit ingress/egress points (usually from :py:func:`astrobase.periodbase.kbls.bls_stats_singleperiod`), return the times within transit durations + `extra_maskfrac` of each transit. Optionally, can use the (more accurate) trapezoidal fit period and epoch, if it's passed. Useful for inspecting individual transits, and masking them out if desired. Parameters ---------- blsd : dict This is the dict returned by :py:func:`astrobase.periodbase.kbls.bls_stats_singleperiod`. time : np.array The times from the time-series of transit observations used to calculate the initial period. extra_maskfrac : float This is the separation from in-transit points you desire, in units of the transit duration. `extra_maskfrac = 0` if you just want points inside transit (see below). trapd : dict This is a dict returned by :py:func:`astrobase.lcfit.transits.traptransit_fit_magseries` containing the trapezoid transit model. nperiodint : int This indicates how many periods backwards/forwards to try and identify transits from the epochs reported in `blsd` or `trapd`. Returns ------- (tmids_obsd, t_starts, t_ends) : tuple of np.array The returned items are:: tmids_obsd (np.ndarray): best guess of transit midtimes in lightcurve. Has length number of transits in lightcurve. t_starts (np.ndarray): t_Is - extra_maskfrac*tdur, for t_Is transit first contact point. 
t_ends (np.ndarray): t_Is + extra_maskfrac*tdur, for t_Is transit first contact point.", "response": "def get_transit_times(\n        blsd,\n        time,\n        extra_maskfrac,\n        trapd=None,\n        nperiodint=1000\n):\n    '''Given a BLS period, epoch, and transit ingress/egress points (usually\n    from :py:func:`astrobase.periodbase.kbls.bls_stats_singleperiod`), return\n    the times within transit durations + `extra_maskfrac` of each transit.\n\n    Optionally, can use the (more accurate) trapezoidal fit period and epoch, if\n    it's passed.  Useful for inspecting individual transits, and masking them\n    out if desired.\n\n    Parameters\n    ----------\n\n    blsd : dict\n        This is the dict returned by\n        :py:func:`astrobase.periodbase.kbls.bls_stats_singleperiod`.\n\n    time : np.array\n        The times from the time-series of transit observations used to calculate\n        the initial period.\n\n    extra_maskfrac : float\n        This is the separation from in-transit points you desire, in units of\n        the transit duration. `extra_maskfrac = 0` if you just want points\n        inside transit (see below).\n\n    trapd : dict\n        This is a dict returned by\n        :py:func:`astrobase.lcfit.transits.traptransit_fit_magseries` containing\n        the trapezoid transit model.\n\n    nperiodint : int\n        This indicates how many periods backwards/forwards to try and identify\n        transits from the epochs reported in `blsd` or `trapd`.\n\n    Returns\n    -------\n\n    (tmids_obsd, t_starts, t_ends) : tuple of np.array\n        The returned items are::\n\n            tmids_obsd (np.ndarray): best guess of transit midtimes in\n            lightcurve. 
Has length number of transits in lightcurve.\n\n            t_starts (np.ndarray): t_Is - extra_maskfrac*tdur, for t_Is transit\n            first contact point.\n\n            t_ends (np.ndarray): t_Is + extra_maskfrac*tdur, for t_Is transit\n            first contact point.\n\n    '''\n\n    if trapd:\n        period = trapd['fitinfo']['finalparams'][0]\n        t0 = trapd['fitinfo']['fitepoch']\n        transitduration_phase = trapd['fitinfo']['finalparams'][3]\n        tdur = period * transitduration_phase\n    else:\n        period = blsd['period']\n        t0 = blsd['epoch']\n        tdur = (\n            period *\n            (blsd['transegressbin']-blsd['transingressbin'])/blsd['nphasebins']\n        )\n        if not blsd['transegressbin'] > blsd['transingressbin']:\n\n            raise NotImplementedError(\n                'careful of the width. '\n                'this edge case must be dealt with separately.'\n            )\n\n    tmids = [t0 + ix*period for ix in range(-nperiodint,nperiodint)]\n\n    sel = (tmids > np.nanmin(time)) & (tmids < np.nanmax(time))\n    tmids_obsd = np.array(tmids)[sel]\n\n    t_Is = tmids_obsd - tdur/2\n    t_IVs = tmids_obsd + tdur/2\n\n    # focus on the times around transit\n    t_starts = t_Is - extra_maskfrac * tdur\n    t_ends = t_IVs + extra_maskfrac * tdur\n\n    return tmids_obsd, t_starts, t_ends"}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\ngets the transit start, middle, and end times for transits in a given time-series of observations. Parameters ---------- time,flux,err_flux : np.array The input flux time-series measurements and their associated measurement errors blsfit_savpath : str or None If provided as a str, indicates the path of the fit plot to make for a simple BLS model fit to the transit using the obtained period and epoch. trapfit_savpath : str or None If provided as a str, indicates the path of the fit plot to make for a trapezoidal transit model fit to the transit using the obtained period and epoch. sigclip : float or int or sequence of two floats/ints or None If a single float or int, a symmetric sigma-clip will be performed using the number provided as the sigma-multiplier to cut out from the input time-series. If a list of two ints/floats is provided, the function will perform an 'asymmetric' sigma-clip. The first element in this list is the sigma value to use for fainter flux/mag values; the second element in this list is the sigma value to use for brighter flux/mag values. For example, `sigclip=[10., 3.]`, will sigclip out greater than 10-sigma dimmings and greater than 3-sigma brightenings. Here the meaning of \"dimming\" and \"brightening\" is set by *physics* (not the magnitude system), which is why the `magsarefluxes` kwarg must be correctly set. If `sigclip` is None, no sigma-clipping will be performed, and the time-series (with non-finite elems removed) will be passed through to the output. magsarefluxes : bool This is by default True for this function, since it works on fluxes only at the moment. nworkers : int The number of parallel BLS period-finder workers to use. extra_maskfrac : float This is the separation (N) from in-transit points you desire, in units of the transit duration. 
`extra_maskfrac = 0` if you just want points inside transit, otherwise:: t_starts = t_Is - N*tdur, t_ends = t_IVs + N*tdur Thus setting N=0.03 masks slightly more than the guessed transit duration. Returns ------- (tmids_obsd, t_starts, t_ends) : tuple The returned items are:: tmids_obsd (np.ndarray): best guess of transit midtimes in lightcurve. Has length number of transits in lightcurve. t_starts (np.ndarray): t_Is - extra_maskfrac*tdur, for t_Is transit first contact point. t_ends (np.ndarray): t_IVs + extra_maskfrac*tdur, for t_IVs transit fourth contact point.", "response": "def given_lc_get_transit_tmids_tstarts_tends(\n        time,\n        flux,\n        err_flux,\n        blsfit_savpath=None,\n        trapfit_savpath=None,\n        magsarefluxes=True,\n        nworkers=1,\n        sigclip=None,\n        extra_maskfrac=0.03\n):\n    '''Gets the transit start, middle, and end times for transits in a given\n    time-series of observations.\n\n    Parameters\n    ----------\n\n    time,flux,err_flux : np.array\n        The input flux time-series measurements and their associated measurement\n        errors\n\n    blsfit_savpath : str or None\n        If provided as a str, indicates the path of the fit plot to make for a\n        simple BLS model fit to the transit using the obtained period and epoch.\n\n    trapfit_savpath : str or None\n        If provided as a str, indicates the path of the fit plot to make for a\n        trapezoidal transit model fit to the transit using the obtained period\n        and epoch.\n\n    sigclip : float or int or sequence of two floats/ints or None\n        If a single float or int, a symmetric sigma-clip will be performed using\n        the number provided as the sigma-multiplier to cut out from the input\n        time-series.\n\n        If a list of two ints/floats is provided, the function will perform an\n        'asymmetric' sigma-clip. 
The first element in this list is the sigma\n        value to use for fainter flux/mag values; the second element in this\n        list is the sigma value to use for brighter flux/mag values. For\n        example, `sigclip=[10., 3.]`, will sigclip out greater than 10-sigma\n        dimmings and greater than 3-sigma brightenings. Here the meaning of\n        \"dimming\" and \"brightening\" is set by *physics* (not the magnitude\n        system), which is why the `magsarefluxes` kwarg must be correctly set.\n\n        If `sigclip` is None, no sigma-clipping will be performed, and the\n        time-series (with non-finite elems removed) will be passed through to\n        the output.\n\n    magsarefluxes : bool\n        This is by default True for this function, since it works on fluxes only\n        at the moment.\n\n    nworkers : int\n        The number of parallel BLS period-finder workers to use.\n\n    extra_maskfrac : float\n        This is the separation (N) from in-transit points you desire, in units\n        of the transit duration. `extra_maskfrac = 0` if you just want points\n        inside transit, otherwise::\n\n            t_starts = t_Is - N*tdur, t_ends = t_IVs + N*tdur\n\n        Thus setting N=0.03 masks slightly more than the guessed transit\n        duration.\n\n    Returns\n    -------\n\n    (tmids_obsd, t_starts, t_ends) : tuple\n        The returned items are::\n\n            tmids_obsd (np.ndarray): best guess of transit midtimes in\n            lightcurve. 
Has length number of transits in lightcurve.\n\n            t_starts (np.ndarray): t_Is - extra_maskfrac*tdur, for t_Is transit\n            first contact point.\n\n            t_ends (np.ndarray): t_IVs + extra_maskfrac*tdur, for t_IVs transit\n            fourth contact point.\n\n    '''\n\n    # first, run BLS to get an initial epoch and period.\n    endp = 1.05*(np.nanmax(time) - np.nanmin(time))/2\n\n    blsdict = kbls.bls_parallel_pfind(time, flux, err_flux,\n                                      magsarefluxes=magsarefluxes, startp=0.1,\n                                      endp=endp, maxtransitduration=0.3,\n                                      nworkers=nworkers, sigclip=sigclip)\n\n    blsd = kbls.bls_stats_singleperiod(time, flux, err_flux,\n                                       blsdict['bestperiod'],\n                                       magsarefluxes=True, sigclip=sigclip,\n                                       perioddeltapercent=5)\n    #  plot the BLS model.\n    if blsfit_savpath:\n        make_fit_plot(blsd['phases'], blsd['phasedmags'], None,\n                      blsd['blsmodel'], blsd['period'], blsd['epoch'],\n                      blsd['epoch'], blsfit_savpath,\n                      magsarefluxes=magsarefluxes)\n\n    ingduration_guess = blsd['transitduration'] * 0.2  # a guesstimate.\n    transitparams = [\n        blsd['period'], blsd['epoch'], blsd['transitdepth'],\n        blsd['transitduration'], ingduration_guess\n    ]\n\n    # fit a trapezoidal transit model; plot the resulting phased LC.\n    # trapd stays None if no trapezoidal fit was requested, so that the BLS\n    # results are used to get the transit times instead.\n    trapd = None\n    if trapfit_savpath:\n        trapd = traptransit_fit_magseries(time, flux, err_flux,\n                                          transitparams,\n                                          magsarefluxes=magsarefluxes,\n                                          sigclip=sigclip,\n                                          plotfit=trapfit_savpath)\n\n    # use the trapezoidal model's epoch as the guess to identify (roughly) in\n    # and out of 
transit points\n    tmids, t_starts, t_ends = get_transit_times(blsd,\n                                                time,\n                                                extra_maskfrac,\n                                                trapd=trapd)\n\n    return tmids, t_starts, t_ends"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef _pycompress_sqlitecurve(sqlitecurve, force=False):\n    '''This just compresses the sqlitecurve. Should be independent of OS.\n\n    '''\n\n    outfile = '%s.gz' % sqlitecurve\n\n    try:\n\n        if os.path.exists(outfile) and not force:\n            os.remove(sqlitecurve)\n            return outfile\n\n        else:\n\n            with open(sqlitecurve,'rb') as infd:\n                with gzip.open(outfile,'wb') as outfd:\n                    shutil.copyfileobj(infd, outfd)\n\n            if os.path.exists(outfile):\n                os.remove(sqlitecurve)\n                return outfile\n\n    except Exception as e:\n        return None", "response": "This function compresses the sqlitecurve file and returns the filename of the compressed file."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef _pyuncompress_sqlitecurve(sqlitecurve, force=False):\n    '''This just uncompresses the sqlitecurve. Should be independent of OS.\n\n    '''\n\n    outfile = sqlitecurve.replace('.gz','')\n\n    try:\n\n        if os.path.exists(outfile) and not force:\n            return outfile\n\n        else:\n\n            with gzip.open(sqlitecurve,'rb') as infd:\n                with open(outfile,'wb') as outfd:\n                    shutil.copyfileobj(infd, outfd)\n\n            # do not remove the input file yet\n            if os.path.exists(outfile):\n                return outfile\n\n    except Exception as e:\n        return None", "response": "This uncompresses a gzipped sqlitecurve and returns the path of the uncompressed file, or None if that fails. Should be independent of OS."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef _parse_csv_header_lcc_csv_v1(headerlines):\n    '''\n    This parses the header of the LCC CSV V1 LC format.\n\n    '''\n\n    # the first three lines indicate the format name, comment char, separator\n    commentchar = headerlines[1]\n    separator = headerlines[2]\n\n    headerlines = [x.lstrip('%s ' % commentchar) for x in headerlines[3:]]\n\n    # next, find the indices of the various LC sections\n    metadatastart = headerlines.index('OBJECT METADATA')\n    columnstart = headerlines.index('COLUMN DEFINITIONS')\n    lcstart = headerlines.index('LIGHTCURVE')\n\n    metadata = ' ' .join(headerlines[metadatastart+1:columnstart-1])\n    columns = ' ' .join(headerlines[columnstart+1:lcstart-1])\n    metadata = json.loads(metadata)\n    columns = json.loads(columns)\n\n    return metadata, columns, separator", "response": "This parses the header of the LCC CSV V1 LC format."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef read_csvlc(lcfile):\n    '''This reads a HAT data server or LCC-Server produced CSV light curve\n    into an lcdict.\n\n    This will automatically figure out the format of the file\n    provided. Currently, it can read:\n\n    - legacy HAT data server CSV LCs (e.g. from\n      https://hatsouth.org/planets/lightcurves.html) with an extension of the\n      form: `.hatlc.csv.gz`.\n    - all LCC-Server produced LCC-CSV-V1 LCs (e.g. from\n      https://data.hatsurveys.org) with an extension of the form: `-csvlc.gz`.\n\n\n    Parameters\n    ----------\n\n    lcfile : str\n        The light curve file to read.\n\n    Returns\n    -------\n\n    dict\n        Returns an lcdict that can be read and used by many astrobase processing\n        functions.\n\n    '''\n\n    # read in the file and split by lines\n    if '.gz' in os.path.basename(lcfile):\n        LOGINFO('reading gzipped HATLC: %s' % lcfile)\n        infd = gzip.open(lcfile,'rb')\n    else:\n        LOGINFO('reading HATLC: %s' % lcfile)\n        infd = open(lcfile,'rb')\n\n\n    # this transparently reads LCC CSVLCs\n    lcformat_check = infd.read(12).decode()\n    if 'LCC-CSVLC' in lcformat_check:\n        infd.close()\n        return read_lcc_csvlc(lcfile)\n    else:\n        infd.seek(0)\n\n    # below is reading the HATLC v2 CSV LCs\n\n    lctext = infd.read().decode()  # argh Python 3\n    infd.close()\n\n    # figure out the header and get the LC columns\n    lcstart = lctext.index('# LIGHTCURVE\\n')\n    lcheader = lctext[:lcstart+12]\n    lccolumns = lctext[lcstart+13:].split('\\n')\n    lccolumns = [x for x in lccolumns if len(x) > 0]\n\n    # initialize the lcdict and parse the CSV header\n    lcdict = _parse_csv_header(lcheader)\n\n    # tranpose the LC rows into columns\n    lccolumns = [x.split(',') for x in lccolumns]\n    lccolumns = list(zip(*lccolumns))  # argh more Python 3\n\n    # 
write the columns to the dict\n    for colind, col in enumerate(lcdict['columns']):\n\n        if (col.split('_')[0] in LC_MAG_COLUMNS or\n            col.split('_')[0] in LC_ERR_COLUMNS or\n            col.split('_')[0] in LC_FLAG_COLUMNS):\n            lcdict[col] = np.array([_smartcast(x,\n                                               COLUMNDEFS[col.split('_')[0]][2])\n                                    for x in lccolumns[colind]])\n\n        elif col in COLUMNDEFS:\n            lcdict[col] = np.array([_smartcast(x,COLUMNDEFS[col][2])\n                                    for x in lccolumns[colind]])\n\n        else:\n            LOGWARNING('lcdict col %s has no formatter available' % col)\n            continue\n\n    return lcdict", "response": "This function reads a HAT data server or LCC-Server produced CSV light curve file (HATLC or LCC-CSV-V1 format) into an lcdict."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef find_lc_timegroups(lctimes, mingap=4.0):\n    '''This finds the time gaps in the light curve, so we can figure out which\n    times are for consecutive observations and which represent gaps\n    between seasons.\n\n    Parameters\n    ----------\n\n    lctimes : np.array\n        This is the input array of times, assumed to be in some form of JD.\n\n    mingap : float\n        This defines how much the difference between consecutive measurements is\n        allowed to be to consider them as parts of different timegroups. By\n        default it is set to 4.0 days.\n\n    Returns\n    -------\n\n    tuple\n        A tuple of the form below is returned, containing the number of time\n        groups found and Python slice objects for each group::\n\n            (ngroups, [slice(start_ind_1, end_ind_1), ...])\n\n    '''\n\n    lc_time_diffs = [(lctimes[x] - lctimes[x-1]) for x in range(1,len(lctimes))]\n    lc_time_diffs = np.array(lc_time_diffs)\n\n    group_start_indices = np.where(lc_time_diffs > mingap)[0]\n\n    if len(group_start_indices) > 0:\n\n        group_indices = []\n\n        for i, gindex in enumerate(group_start_indices):\n\n            if i == 0:\n                group_indices.append(slice(0,gindex+1))\n            else:\n                group_indices.append(slice(group_start_indices[i-1]+1,gindex+1))\n\n\n        # at the end, add the slice for the last group to the end of the times\n        # array\n        group_indices.append(slice(group_start_indices[-1]+1,len(lctimes)))\n\n    # if there's no large gap in the LC, then there's only one group to worry\n    # about\n    else:\n        group_indices = [slice(0,len(lctimes))]\n\n\n    return len(group_indices), group_indices", "response": "This function finds the time gaps in the light curve so we can figure out which time gaps are for consecutive observations and which represent gaps between consecutive 
observing seasons."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef normalize_lcdict_byinst(\n        lcdict,\n        magcols='all',\n        normto='sdssr',\n        normkeylist=('stf','ccd','flt','fld','prj','exp'),\n        debugmode=False,\n        quiet=False\n):\n    '''This is a function to normalize light curves across all instrument\n    combinations present.\n\n    Use this to normalize a light curve containing a variety of:\n\n    - HAT station IDs ('stf')\n    - camera IDs ('ccd')\n    - filters ('flt')\n    - observed field names ('fld')\n    - HAT project IDs ('prj')\n    - exposure times ('exp')\n\n    Parameters\n    ----------\n\n    lcdict : dict\n        The input lcdict to process.\n\n    magcols : 'all' or list of str\n        If this is 'all', all of the columns in the lcdict that are indicated to\n        be magnitude measurement columns are normalized. If this is a list of\n        str, must contain the keys of the lcdict specifying which magnitude\n        columns will be normalized.\n\n    normto : {'zero', 'jmag', 'hmag', 'kmag', 'bmag', 'vmag', 'sdssg', 'sdssr', 'sdssi'}\n        This indicates which column will be the normalization target. If this is\n        'zero', will normalize to 0.0 for each LC column. Otherwise, will\n        normalize to the value of one of the other keys in the\n        lcdict['objectinfo'][magkey], meaning the normalization will be to some\n        form of catalog magnitude.\n\n    normkeylist : list of str\n        These are the column keys to use to form the normalization\n        index. Measurements in the specified `magcols` with identical\n        normalization index values will be considered as part of a single\n        measurement 'era', and will be normalized to zero. 
Once all eras have\n        been normalized this way, the final light curve will be re-normalized as\n        specified in `normto`.\n\n    debugmode : bool\n        If True, will indicate progress as time-groups are found and processed.\n\n    quiet : bool\n        If True, will not emit any messages when processing.\n\n    Returns\n    -------\n\n    dict\n        Returns the lcdict with the magnitude measurements normalized as\n        specified. The normalization happens IN PLACE.\n\n    '''\n\n    # check if this lc has been normalized already. return as-is if so\n    if 'lcinstnormcols' in lcdict and len(lcdict['lcinstnormcols']) > 0:\n        if not quiet:\n            LOGWARNING('this lightcurve is already '\n                       'normalized by instrument keys, '\n                       'returning...')\n        return lcdict\n\n    # generate the normalization key\n    normkeycols = []\n    availablenormkeys = []\n    for key in normkeylist:\n        if key in lcdict and lcdict[key] is not None:\n            normkeycols.append(lcdict[key])\n            availablenormkeys.append(key)\n\n    # transpose to turn these into rows\n    normkeycols = list(zip(*normkeycols))\n\n    # convert to a string rep for each key and post-process for simplicity\n    allkeys = [repr(x) for x in normkeycols]\n    allkeys = [a.replace('(','').replace(')','').replace(\"'\",'').replace(' ','')\n               for a in allkeys]\n\n    # turn these into a numpy array and get the unique values\n    allkeys = np.array(allkeys)\n    normkeys = np.unique(allkeys)\n\n    # figure out the apertures\n    # HATLC V2 format\n    if 'lcapertures' in lcdict:\n        apertures = sorted(lcdict['lcapertures'].keys())\n    # LCC-CSV-V1 format HATLC\n    elif 'objectinfo' in lcdict and 'lcapertures' in lcdict['objectinfo']:\n        apertures = sorted(lcdict['objectinfo']['lcapertures'].keys())\n\n    # put together the column names\n    aimcols = [('aim_%s' % x) for x in apertures if ('aim_%s' 
% x) in lcdict]\n    armcols = [('arm_%s' % x) for x in apertures if ('arm_%s' % x) in lcdict]\n    aepcols = [('aep_%s' % x) for x in apertures if ('aep_%s' % x) in lcdict]\n    atfcols = [('atf_%s' % x) for x in apertures if ('atf_%s' % x) in lcdict]\n    psimcols = [x for x in ['psim','psrm','psep','pstf'] if x in lcdict]\n    irmcols = [('irm_%s' % x) for x in apertures if ('irm_%s' % x) in lcdict]\n    iepcols = [('iep_%s' % x) for x in apertures if ('iep_%s' % x) in lcdict]\n    itfcols = [('itf_%s' % x) for x in apertures if ('itf_%s' % x) in lcdict]\n\n    # next, find all the mag columns to normalize\n    if magcols == 'all':\n        cols_to_normalize = (aimcols + armcols + aepcols + atfcols +\n                             psimcols + irmcols + iepcols + itfcols)\n    elif magcols == 'redmags':\n        cols_to_normalize = (armcols + (['psrm'] if 'psrm' in lcdict else []) +\n                             irmcols)\n    elif magcols == 'epdmags':\n        cols_to_normalize = (aepcols + (['psep'] if 'psep' in lcdict else []) +\n                             iepcols)\n    elif magcols == 'tfamags':\n        cols_to_normalize = (atfcols + (['pstf'] if 'pstf' in lcdict else []) +\n                             itfcols)\n    elif magcols == 'epdtfa':\n        cols_to_normalize = (aepcols + (['psep'] if 'psep' in lcdict else []) +\n                             iepcols + atfcols +\n                             (['pstf'] if 'pstf' in lcdict else []) +\n                             itfcols)\n    else:\n        # accept either a comma-separated str or a list of column keys\n        if isinstance(magcols, str):\n            magcols = magcols.split(',')\n        cols_to_normalize = [x.strip() for x in magcols]\n\n    colsnormalized = []\n\n    # go through each column and normalize them\n    for col in cols_to_normalize:\n\n        if col in lcdict:\n\n            # note: this requires the columns in ndarray format\n            # unlike normalize_lcdict\n            thismags = lcdict[col]\n\n            # go through each key in normusing\n            for nkey in 
normkeys:\n\n                thisind = allkeys == nkey\n\n                # make sure we have at least 3 elements in the matched set of\n                # magnitudes corresponding to this key. also make sure that the\n                # magnitudes corresponding to this key aren't all nan.\n                thismagsize = thismags[thisind].size\n                thismagfinite = np.where(np.isfinite(thismags[thisind]))[0].size\n\n                if thismagsize > 2 and thismagfinite > 2:\n\n                    # do the normalization and update the thismags in the lcdict\n                    medmag = np.nanmedian(thismags[thisind])\n                    lcdict[col][thisind] = lcdict[col][thisind] - medmag\n\n                    if debugmode:\n                        LOGDEBUG('magcol: %s, currkey: \"%s\", nelem: %s, '\n                                 'medmag: %s' %\n                                 (col, nkey, len(thismags[thisind]), medmag))\n\n                # we remove mags that correspond to keys with less than 3\n                # (finite) elements because we can't get the median mag\n                # correctly and renormalizing them to zero would just set them\n                # to zero\n                else:\n\n                    lcdict[col][thisind] = np.nan\n\n            # everything should now be normalized to zero\n            # add back the requested normto\n            if normto in ('jmag', 'hmag', 'kmag',\n                          'bmag', 'vmag',\n                          'sdssg', 'sdssr', 'sdssi'):\n\n                if (normto in lcdict['objectinfo'] and\n                    lcdict['objectinfo'][normto] is not None):\n                    lcdict[col] = lcdict[col] + lcdict['objectinfo'][normto]\n\n                else:\n                    if not quiet:\n                        LOGWARNING('no %s available in lcdict, '\n                                   'normalizing to 0.0' % normto)\n                    normto = 'zero'\n\n            # update the 
colsnormalized list\n            colsnormalized.append(col)\n\n        else:\n            if not quiet:\n                LOGWARNING('column %s is not present, skipping...' % col)\n            continue\n\n    # add the lcinstnormcols key to the lcdict\n    lcinstnormcols = ('cols normalized: %s - '\n                      'normalized to: %s - '\n                      'norm keys used: %s') % (repr(colsnormalized),\n                                               normto,\n                                               repr(availablenormkeys))\n    lcdict['lcinstnormcols'] = lcinstnormcols\n\n    return lcdict", "response": "This function normalizes the magnitude columns of an lcdict in place across all instrument combinations present (station, camera, filter, field, project, and exposure time)."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef coord_features(objectinfo):\n\n    '''Calculates object coordinates features, including:\n\n    - galactic coordinates\n    - total proper motion from pmra, pmdecl\n    - reduced J proper motion from propermotion and Jmag\n\n    Parameters\n    ----------\n\n    objectinfo : dict\n        This is an objectinfo dict from a light curve file read into an\n        `lcdict`. The format and the minimum keys required are::\n\n            {'ra': the right ascension of the object in decimal degrees,\n             'decl': the declination of the object in decimal degrees,\n             'pmra': the proper motion in right ascension in mas/yr,\n             'pmdecl': the proper motion in declination in mas/yr,\n             'jmag': the 2MASS J mag of this object}\n\n    Returns\n    -------\n\n    dict\n        A dict with the keys 'propermotion', 'gl', 'gb', and 'rpmj'. Any value\n        that can't be calculated from the input is left as np.nan.\n\n    '''\n\n    retdict = {'propermotion': np.nan,\n               'gl':np.nan,\n               'gb':np.nan,\n               'rpmj':np.nan}\n\n    if ('ra' in objectinfo and\n        objectinfo['ra'] is not None and\n        np.isfinite(objectinfo['ra']) and\n        'decl' in objectinfo and\n        objectinfo['decl'] is not None and\n        np.isfinite(objectinfo['decl'])):\n\n        retdict['gl'], retdict['gb'] = coordutils.equatorial_to_galactic(\n            objectinfo['ra'],\n            objectinfo['decl']\n        )\n\n    if ('pmra' in objectinfo and\n        objectinfo['pmra'] is not None and\n        np.isfinite(objectinfo['pmra']) and\n        'pmdecl' in objectinfo and\n        objectinfo['pmdecl'] is not None and\n        np.isfinite(objectinfo['pmdecl']) and\n        'decl' in objectinfo and\n        objectinfo['decl'] is not None and\n        np.isfinite(objectinfo['decl'])):\n\n        retdict['propermotion'] = coordutils.total_proper_motion(\n            objectinfo['pmra'],\n            
objectinfo['pmdecl'],\n            objectinfo['decl']\n        )\n\n    if ('jmag' in objectinfo and\n        objectinfo['jmag'] is not None and\n        np.isfinite(objectinfo['jmag']) and\n        np.isfinite(retdict['propermotion'])):\n\n        retdict['rpmj'] = coordutils.reduced_proper_motion(\n            objectinfo['jmag'],\n            retdict['propermotion']\n        )\n\n    return retdict", "response": "Calculates the object coordinates features for a single object in an objectinfo dict."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef color_features(in_objectinfo,\n                   deredden=True,\n                   custom_bandpasses=None,\n                   dust_timeout=10.0):\n    '''Stellar colors and dereddened stellar colors using 2MASS DUST API:\n\n    http://irsa.ipac.caltech.edu/applications/DUST/docs/dustProgramInterface.html\n\n    Parameters\n    ----------\n\n    in_objectinfo : dict\n        This is a dict that contains the object's magnitudes and positions. This\n        requires at least 'ra', and 'decl' as keys which correspond to the right\n        ascension and declination of the object, and one or more of the\n        following keys for object magnitudes::\n\n            'umag'  -> U mag             -> colors: U-B, U-V, U-g\n            'bmag'  -> B mag             -> colors: U-B, B-V\n            'vmag'  -> V mag             -> colors: U-V, B-V, V-R, V-I, V-K\n            'rmag'  -> R mag             -> colors: V-R, R-I\n            'imag'  -> I mag             -> colors: g-I, V-I, R-I, B-I\n            'jmag'  -> 2MASS J mag       -> colors: J-H, J-K, g-J, i-J\n            'hmag'  -> 2MASS H mag       -> colors: J-H, H-K\n            'kmag'  -> 2MASS Ks mag      -> colors: g-Ks, H-Ks, J-Ks, V-Ks\n            'sdssu' -> SDSS u mag        -> colors: u-g, u-V\n            'sdssg' -> SDSS g mag        -> colors: g-r, g-i, g-K, u-g, U-g, g-J\n            'sdssr' -> SDSS r mag        -> colors: r-i, g-r\n            'sdssi' -> SDSS i mag        -> colors: r-i, i-z, g-i, i-J, i-W1\n            'sdssz' -> SDSS z mag        -> colors: i-z, z-W2, g-z\n            'ujmag' -> UKIRT J mag       -> colors: J-H, H-K, J-K, g-J, i-J\n            'uhmag' -> UKIRT H mag       -> colors: J-H, H-K\n            'ukmag' -> UKIRT K mag       -> colors: g-K, H-K, J-K, V-K\n            'irac1' -> Spitzer IRAC1 mag -> colors: i-I1, I1-I2\n            'irac2' -> Spitzer IRAC2 mag -> 
colors: I1-I2, I2-I3\n            'irac3' -> Spitzer IRAC3 mag -> colors: I2-I3\n            'irac4' -> Spitzer IRAC4 mag -> colors: I3-I4\n            'wise1' -> WISE W1 mag       -> colors: i-W1, W1-W2\n            'wise2' -> WISE W2 mag       -> colors: W1-W2, W2-W3\n            'wise3' -> WISE W3 mag       -> colors: W2-W3\n            'wise4' -> WISE W4 mag       -> colors: W3-W4\n\n        These are basically taken from the available reddening bandpasses from\n        the 2MASS DUST service. If B, V, u, g, r, i, z aren't provided but 2MASS\n        J, H, Ks are all provided, the former will be calculated using the 2MASS\n        JHKs -> BVugriz conversion functions in :py:mod:`astrobase.magnitudes`.\n\n    deredden : bool\n        If True, will make sure all colors use dereddened mags where possible.\n\n    custom_bandpasses : dict\n        This is a dict used to define any custom bandpasses in the\n        `in_objectinfo` dict you want to make this function aware of and\n        generate colors for. 
Use the format below for this dict::\n\n            {\n            '<bandpass_key_1>':{'dustkey':'<twomass_dust_key_1>',\n                                'label':'<band_label_1>'\n                                'colors':[['<bandkey1>-<bandkey2>',\n                                           '<BAND1> - <BAND2>'],\n                                          ['<bandkey3>-<bandkey4>',\n                                           '<BAND3> - <BAND4>']]},\n            .\n            ...\n            .\n            '<bandpass_key_N>':{'dustkey':'<twomass_dust_key_N>',\n                                'label':'<band_label_N>'\n                                'colors':[['<bandkey1>-<bandkey2>',\n                                           '<BAND1> - <BAND2>'],\n                                          ['<bandkey3>-<bandkey4>',\n                                           '<BAND3> - <BAND4>']]},\n            }\n\n        Where:\n\n        `bandpass_key` is a key to use to refer to this bandpass in the\n        `objectinfo` dict, e.g. 'sdssg' for SDSS g band\n\n        `twomass_dust_key` is the key to use in the 2MASS DUST result table for\n        reddening per band-pass. For example, given the following DUST result\n        table (using http://irsa.ipac.caltech.edu/applications/DUST/)::\n\n            |Filter_name|LamEff |A_over_E_B_V_SandF|A_SandF|A_over_E_B_V_SFD|A_SFD|\n            |char       |float  |float             |float  |float           |float|\n            |           |microns|                  |mags   |                |mags |\n             CTIO U       0.3734              4.107   0.209            4.968 0.253\n             CTIO B       0.4309              3.641   0.186            4.325 0.221\n             CTIO V       0.5517              2.682   0.137            3.240 0.165\n            .\n            .\n            ...\n\n        The `twomass_dust_key` for 'vmag' would be 'CTIO V'. 
If you want to\n        skip DUST lookup and want to pass in a specific reddening magnitude\n        for your bandpass, use a float for the value of\n        `twomass_dust_key`. If you want to skip DUST lookup entirely for\n        this bandpass, use None for the value of `twomass_dust_key`.\n\n        `band_label` is the label to use for this bandpass, e.g. 'W1' for\n        WISE-1 band, 'u' for SDSS u, etc.\n\n        The 'colors' list contains color definitions for all colors you want\n        to generate using this bandpass. This list contains elements of the\n        form::\n\n            ['<bandkey1>-<bandkey2>','<BAND1> - <BAND2>']\n\n        where the first item is the bandpass keys making up this color,\n        and the second item is the label for this color to be used by the\n        frontends. An example::\n\n            ['sdssu-sdssg','u - g']\n\n    dust_timeout : float\n        The timeout to use when contacting the 2MASS DUST web service.\n\n    Returns\n    -------\n\n    dict\n        An `objectinfo` dict with all of the generated colors, dereddened\n        magnitudes, and dereddened colors, as specified in the input args, is\n        returned.\n\n    '''\n\n    objectinfo = in_objectinfo.copy()\n\n    # this is the initial output dict\n    outdict = {\n        'available_bands':[],\n        'available_band_labels':[],\n        'available_dereddened_bands':[],\n        'available_dereddened_band_labels':[],\n        'available_colors':[],\n        'available_color_labels':[],\n        'dereddened':False\n    }\n\n    #\n    # get the BVugriz mags from the JHK mags if necessary\n    #\n    # FIXME: should these be direct dered mag_0 = f(J_0, H_0, K_0) instead?\n    # Bilir+ 2008 uses dereddened colors for their transforms, should check if\n    # we need to do so here\n\n    if ('jmag' in objectinfo and\n        objectinfo['jmag'] is not None and\n        np.isfinite(objectinfo['jmag']) and\n        'hmag' in objectinfo and\n        
objectinfo['hmag'] is not None and\n        np.isfinite(objectinfo['hmag']) and\n        'kmag' in objectinfo and\n        objectinfo['kmag'] is not None and\n        np.isfinite(objectinfo['kmag'])):\n\n        if ('bmag' not in objectinfo or\n            ('bmag' in objectinfo and objectinfo['bmag'] is None) or\n            ('bmag' in objectinfo and not np.isfinite(objectinfo['bmag']))):\n            objectinfo['bmag'] = magnitudes.jhk_to_bmag(objectinfo['jmag'],\n                                                        objectinfo['hmag'],\n                                                        objectinfo['kmag'])\n            outdict['bmagfromjhk'] = True\n        else:\n            outdict['bmagfromjhk'] = False\n\n        if ('vmag' not in objectinfo or\n            ('vmag' in objectinfo and objectinfo['vmag'] is None) or\n            ('vmag' in objectinfo and not np.isfinite(objectinfo['vmag']))):\n            objectinfo['vmag'] = magnitudes.jhk_to_vmag(objectinfo['jmag'],\n                                                        objectinfo['hmag'],\n                                                        objectinfo['kmag'])\n            outdict['vmagfromjhk'] = True\n        else:\n            outdict['vmagfromjhk'] = False\n\n\n        if ('sdssu' not in objectinfo or\n            ('sdssu' in objectinfo and objectinfo['sdssu'] is None) or\n            ('sdssu' in objectinfo and not np.isfinite(objectinfo['sdssu']))):\n            objectinfo['sdssu'] = magnitudes.jhk_to_sdssu(objectinfo['jmag'],\n                                                          objectinfo['hmag'],\n                                                          objectinfo['kmag'])\n            outdict['sdssufromjhk'] = True\n        else:\n            outdict['sdssufromjhk'] = False\n\n        if ('sdssg' not in objectinfo or\n            ('sdssg' in objectinfo and objectinfo['sdssg'] is None) or\n            ('sdssg' in objectinfo and not np.isfinite(objectinfo['sdssg']))):\n            
objectinfo['sdssg'] = magnitudes.jhk_to_sdssg(objectinfo['jmag'],\n                                                          objectinfo['hmag'],\n                                                          objectinfo['kmag'])\n            outdict['sdssgfromjhk'] = True\n        else:\n            outdict['sdssgfromjhk'] = False\n\n        if ('sdssr' not in objectinfo or\n            ('sdssr' in objectinfo and objectinfo['sdssr'] is None) or\n            ('sdssr' in objectinfo and not np.isfinite(objectinfo['sdssr']))):\n            objectinfo['sdssr'] = magnitudes.jhk_to_sdssr(objectinfo['jmag'],\n                                                          objectinfo['hmag'],\n                                                          objectinfo['kmag'])\n            outdict['sdssrfromjhk'] = True\n        else:\n            outdict['sdssrfromjhk'] = False\n\n        if ('sdssi' not in objectinfo or\n            ('sdssi' in objectinfo and objectinfo['sdssi'] is None) or\n            ('sdssi' in objectinfo and not np.isfinite(objectinfo['sdssi']))):\n            objectinfo['sdssi'] = magnitudes.jhk_to_sdssi(objectinfo['jmag'],\n                                                          objectinfo['hmag'],\n                                                          objectinfo['kmag'])\n            outdict['sdssifromjhk'] = True\n        else:\n            outdict['sdssifromjhk'] = False\n\n        if ('sdssz' not in objectinfo or\n            ('sdssz' in objectinfo and objectinfo['sdssz'] is None) or\n            ('sdssz' in objectinfo and not np.isfinite(objectinfo['sdssz']))):\n            objectinfo['sdssz'] = magnitudes.jhk_to_sdssz(objectinfo['jmag'],\n                                                          objectinfo['hmag'],\n                                                          objectinfo['kmag'])\n            outdict['sdsszfromjhk'] = True\n        else:\n            outdict['sdsszfromjhk'] = False\n\n\n    # now handle dereddening if possible\n    if 
deredden:\n\n        try:\n\n            # first, get the extinction table for this object\n            extinction = dust.extinction_query(objectinfo['ra'],\n                                               objectinfo['decl'],\n                                               verbose=False,\n                                               timeout=dust_timeout)\n\n        except Exception as e:\n\n            LOGERROR(\"deredden = True but 'ra', 'decl' keys not present \"\n                     \"or invalid in objectinfo dict, ignoring reddening...\")\n            extinction = None\n            outdict['dereddened'] = False\n\n    else:\n\n        extinction = None\n        outdict['dereddened'] = False\n\n    # handle timeout from DUST service\n    if not extinction:\n        outdict['dereddened'] = False\n\n    # go through the objectdict and pick out the mags we have available from the\n    # BANDPASSES_COLORS dict\n\n    # update our bandpasses_colors dict with any custom ones the user defined\n    our_bandpasses_colors = BANDPASSES_COLORS.copy()\n    our_bandpass_list = BANDPASS_LIST[::]\n\n    if custom_bandpasses is not None and isinstance(custom_bandpasses, dict):\n\n        our_bandpasses_colors.update(custom_bandpasses)\n\n        # also update the list\n        for key in custom_bandpasses:\n            if key not in our_bandpass_list:\n                our_bandpass_list.append(key)\n\n    for mk in our_bandpass_list:\n\n        if (mk in objectinfo and\n            objectinfo[mk] is not None and\n            np.isfinite(objectinfo[mk])):\n\n            thisbandlabel = our_bandpasses_colors[mk]['label']\n            thisdustkey = our_bandpasses_colors[mk]['dustkey']\n\n            # add this to the outdict\n            outdict[mk] = objectinfo[mk]\n\n            outdict['available_bands'].append(mk)\n            outdict['available_band_labels'].append(thisbandlabel)\n\n            #\n            # deredden if possible\n            #\n            # calculating 
dereddened mags:\n            # A_x = m - m0_x where m is measured mag, m0 is intrinsic mag\n            # m0_x = m - A_x\n            #\n            # so for two bands x, y:\n            # intrinsic color (m_x - m_y)_0 = (m_x - m_y) - (A_x - A_y)\n            if (deredden and extinction):\n\n                outdict['dereddened'] = True\n\n                # check if the dustkey is None, float, or str to figure out how\n                # to retrieve the reddening\n                if (thisdustkey is not None and\n                    isinstance(thisdustkey, str) and\n                    thisdustkey in extinction['Amag'] and\n                    np.isfinite(extinction['Amag'][thisdustkey]['sf11'])):\n\n                    outdict['extinction_%s' % mk] = (\n                        extinction['Amag'][thisdustkey]['sf11']\n                    )\n\n                elif (thisdustkey is not None and\n                      isinstance(thisdustkey, float)):\n\n                    outdict['extinction_%s' % mk] = thisdustkey\n\n                else:\n\n                    outdict['extinction_%s' % mk] = 0.0\n\n                # apply the extinction\n                outdict['dered_%s' % mk] = (\n                    outdict[mk] - outdict['extinction_%s' % mk]\n                )\n                outdict['available_dereddened_bands'].append('dered_%s' % mk)\n                outdict['available_dereddened_band_labels'].append(\n                    thisbandlabel\n                )\n\n                # get all the colors to generate for this bandpass\n                for colorspec in our_bandpasses_colors[mk]['colors']:\n\n                    # only add this if the color's not there already\n                    if colorspec[0] not in outdict:\n\n                        colorkey, colorlabel = colorspec\n\n                        # look for the bands to make this color\n\n                        # if it's not found now, this should work when we come\n                        # around for the 
next bandpass for this color\n\n                        band1, band2 = colorkey.split('-')\n\n                        if ('dered_%s' % band1 in outdict and\n                            'dered_%s' % band2 in outdict and\n                            np.isfinite(outdict['dered_%s' % band1]) and\n                            np.isfinite(outdict['dered_%s' % band2])):\n\n                            outdict[colorkey] = (\n                                outdict['dered_%s' % band1] -\n                                outdict['dered_%s' % band2]\n                            )\n                            outdict['available_colors'].append(colorkey)\n                            outdict['available_color_labels'].append(colorlabel)\n\n\n            # handle no dereddening\n            else:\n\n                outdict['dereddened'] = False\n                outdict['extinction_%s' % mk] = 0.0\n                outdict['dered_%s' % mk] = np.nan\n\n                # get all the colors to generate for this bandpass\n                for colorspec in our_bandpasses_colors[mk]['colors']:\n\n                    # only add this if the color's not there already\n                    if colorspec[0] not in outdict:\n\n                        colorkey, colorlabel = colorspec\n\n                        # look for the bands to make this color\n\n                        # if it's not found now, this should work when we come\n                        # around for the next bandpass for this color\n\n                        band1, band2 = colorkey.split('-')\n\n                        if (band1 in outdict and\n                            band2 in outdict and\n                            outdict[band1] is not None and\n                            outdict[band2] is not None and\n                            np.isfinite(outdict[band1]) and\n                            np.isfinite(outdict[band2])):\n\n                            outdict[colorkey] = (\n                                outdict[band1] -\n    
                            outdict[band2]\n                            )\n                            outdict['available_colors'].append(colorkey)\n                            outdict['available_color_labels'].append(colorlabel)\n\n\n        # if this bandpass was not found in the objectinfo dict, ignore it\n        else:\n\n            outdict[mk] = np.nan\n\n\n    return outdict", "response": "Color features of the object in the objectinfo dict."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef mdwarf_subtype_from_sdsscolor(ri_color, iz_color):\n    '''This calculates the M-dwarf subtype given SDSS `r-i` and `i-z` colors.\n\n    Parameters\n    ----------\n\n    ri_color : float\n        The SDSS `r-i` color of the object.\n\n    iz_color : float\n        The SDSS `i-z` color of the object.\n\n    Returns\n    -------\n\n    (subtype, index1, index2) : tuple\n        `subtype`: if the star appears to be an M dwarf, will return a str\n        from 'M0' to 'M9' indicating its subtype, e.g. will return 'M4' for an\n        M4 dwarf. If the object isn't an M dwarf, will return None.\n\n        `index1`, `index2`: the M-dwarf color locus value and spread of this\n        object calculated from the `r-i` and `i-z` colors.\n\n    '''\n\n    # calculate the spectral type index and the spectral type spread of the\n    # object. sti is calculated by fitting a line to the locus in r-i and i-z\n    # space for M dwarfs in West+ 2007\n    if np.isfinite(ri_color) and np.isfinite(iz_color):\n        obj_sti = 0.875274*ri_color + 0.483628*(iz_color + 0.00438)\n        obj_sts = -0.483628*ri_color + 0.875274*(iz_color + 0.00438)\n    else:\n        obj_sti = np.nan\n        obj_sts = np.nan\n\n    # possible M star if sti is > 0.666 and < 3.4559\n    if (np.isfinite(obj_sti) and np.isfinite(obj_sts) and\n        (obj_sti > 0.666) and (obj_sti < 3.4559)):\n\n        # default to None so that an sti exactly on a subclass boundary does\n        # not leave m_class undefined\n        m_class = None\n\n        # decide which M subclass object this is\n        if ((obj_sti > 0.6660) and (obj_sti < 0.8592)):\n            m_class = 'M0'\n\n        if ((obj_sti > 0.8592) and (obj_sti < 1.0822)):\n            m_class = 'M1'\n\n        if ((obj_sti > 1.0822) and (obj_sti < 1.2998)):\n            m_class = 'M2'\n\n        if ((obj_sti > 1.2998) and (obj_sti < 1.6378)):\n            m_class = 'M3'\n\n        if ((obj_sti > 1.6378) and (obj_sti < 2.0363)):\n            m_class = 'M4'\n\n        if ((obj_sti > 2.0363) and (obj_sti < 
2.2411)):\n            m_class = 'M5'\n\n        if ((obj_sti > 2.2411) and (obj_sti < 2.4126)):\n            m_class = 'M6'\n\n        if ((obj_sti > 2.4126) and (obj_sti < 2.9213)):\n            m_class = 'M7'\n\n        if ((obj_sti > 2.9213) and (obj_sti < 3.2418)):\n            m_class = 'M8'\n\n        if ((obj_sti > 3.2418) and (obj_sti < 3.4559)):\n            m_class = 'M9'\n\n    else:\n        m_class = None\n\n    return m_class, obj_sti, obj_sts", "response": "This function calculates the M - dwarf subtype given SDSS r - i and i - z colors."}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\ngets several neighbor, GAIA, and SIMBAD features: From the KD-Tree in the given light curve catalog the object is in: `lclist_kdtree`: - distance to closest neighbor in arcsec - total number of all neighbors within 2 x `neighbor_radius_arcsec` From the GAIA DR2 catalog: - distance to closest neighbor in arcsec - total number of all neighbors within 2 x `neighbor_radius_arcsec` - gets the parallax for the object and neighbors - calculates the absolute GAIA mag and `G-K` color for use in CMDs - gets the proper motion in RA/Dec if available From the SIMBAD catalog: - the name of the object - the type of the object Parameters ---------- objectinfo : dict This is the objectinfo dict from an object's light curve. This must contain at least the following keys:: {'ra': the right ascension of the object, 'decl': the declination of the object} lclist_kdtree : scipy.spatial.cKDTree object This is a KD-Tree built on the Cartesian xyz coordinates from (ra, dec) of all objects in the same field as this object. It is similar to that produced by :py:func:`astrobase.lcproc.catalogs.make_lclist`, and is used to carry out the spatial search required to find neighbors for this object. neighbor_radius_arcsec : float The maximum radius in arcseconds around this object to search for neighbors in both the light curve catalog and in the GAIA DR2 catalog. gaia_matchdist_arcsec : float The maximum distance in arcseconds to use for a GAIA cross-match to this object. verbose : bool If True, indicates progress and warns of problems. gaia_submit_timeout : float Sets the timeout in seconds to use when submitting a request to look up the object's information to the GAIA service. Note that if `fast_mode` is set, this is ignored. gaia_submit_tries : int Sets the maximum number of times the GAIA services will be contacted to obtain this object's information. 
If `fast_mode` is set, this is ignored, and the services will be contacted only once (meaning that a failure to respond will be silently ignored and no GAIA data will be added to the checkplot's objectinfo dict). gaia_max_timeout : float Sets the timeout in seconds to use when waiting for the GAIA service to respond to our request for the object's information. Note that if `fast_mode` is set, this is ignored. gaia_mirror : str This sets the GAIA mirror to use. This is a key in the `services.gaia.GAIA_URLS` dict which defines the URLs to hit for each mirror. search_simbad : bool If this is True, searches for objects in SIMBAD at this object's location and gets the object's SIMBAD main ID, type, and stellar classification if available. Returns ------- dict Returns a dict with neighbor, GAIA, and SIMBAD features.", "response": "def neighbor_gaia_features(objectinfo,\n                           lclist_kdtree,\n                           neighbor_radius_arcsec,\n                           gaia_matchdist_arcsec=3.0,\n                           verbose=True,\n                           gaia_submit_timeout=10.0,\n                           gaia_submit_tries=3,\n                           gaia_max_timeout=180.0,\n                           gaia_mirror=None,\n                           complete_query_later=True,\n                           search_simbad=False):\n    '''Gets several neighbor, GAIA, and SIMBAD features:\n\n    From the KD-Tree in the given light curve catalog the object is in:\n    `lclist_kdtree`:\n\n    - distance to closest neighbor in arcsec\n    - total number of all neighbors within 2 x `neighbor_radius_arcsec`\n\n    From the GAIA DR2 catalog:\n\n    - distance to closest neighbor in arcsec\n    - total number of all neighbors within 2 x `neighbor_radius_arcsec`\n    - gets the parallax for the object and neighbors\n    - calculates the absolute GAIA mag and `G-K` color for use in CMDs\n    - gets the proper motion in RA/Dec if available\n\n    From the 
SIMBAD catalog:\n\n    - the name of the object\n    - the type of the object\n\n    Parameters\n    ----------\n\n    objectinfo : dict\n        This is the objectinfo dict from an object's light curve. This must\n        contain at least the following keys::\n\n            {'ra': the right ascension of the object,\n             'decl': the declination of the object}\n\n    lclist_kdtree : scipy.spatial.cKDTree object\n        This is a KD-Tree built on the Cartesian xyz coordinates from (ra, dec)\n        of all objects in the same field as this object. It is similar to that\n        produced by :py:func:`astrobase.lcproc.catalogs.make_lclist`, and is\n        used to carry out the spatial search required to find neighbors for this\n        object.\n\n    neighbor_radius_arcsec : float\n        The maximum radius in arcseconds around this object to search for\n        neighbors in both the light curve catalog and in the GAIA DR2 catalog.\n\n    gaia_matchdist_arcsec : float\n        The maximum distance in arcseconds to use for a GAIA cross-match to this\n        object.\n\n    verbose : bool\n        If True, indicates progress and warns of problems.\n\n    gaia_submit_timeout : float\n        Sets the timeout in seconds to use when submitting a request to look up\n        the object's information to the GAIA service. Note that if `fast_mode`\n        is set, this is ignored.\n\n    gaia_submit_tries : int\n        Sets the maximum number of times the GAIA services will be contacted to\n        obtain this object's information. If `fast_mode` is set, this is\n        ignored, and the services will be contacted only once (meaning that a\n        failure to respond will be silently ignored and no GAIA data will be\n        added to the checkplot's objectinfo dict).\n\n    gaia_max_timeout : float\n        Sets the timeout in seconds to use when waiting for the GAIA service to\n        respond to our request for the object's information. 
Note that if\n        `fast_mode` is set, this is ignored.\n\n    gaia_mirror : str\n        This sets the GAIA mirror to use. This is a key in the\n        `services.gaia.GAIA_URLS` dict which defines the URLs to hit for each\n        mirror.\n\n    search_simbad : bool\n        If this is True, searches for objects in SIMBAD at this object's\n        location and gets the object's SIMBAD main ID, type, and stellar\n        classification if available.\n\n    Returns\n    -------\n\n    dict\n        Returns a dict with neighbor, GAIA, and SIMBAD features.\n\n    '''\n\n    # kdtree search for neighbors in light curve catalog\n    if ('ra' in objectinfo and 'decl' in objectinfo and\n        objectinfo['ra'] is not None and objectinfo['decl'] is not None and\n        (isinstance(lclist_kdtree, cKDTree) or\n         isinstance(lclist_kdtree, KDTree))):\n\n        ra, decl = objectinfo['ra'], objectinfo['decl']\n\n        cosdecl = np.cos(np.radians(decl))\n        sindecl = np.sin(np.radians(decl))\n        cosra = np.cos(np.radians(ra))\n        sinra = np.sin(np.radians(ra))\n\n        # this is the search distance in xyz unit vectors\n        xyzdist = 2.0 * np.sin(np.radians(neighbor_radius_arcsec/3600.0)/2.0)\n\n        # look up the coordinates for the closest 100 objects in the kdtree\n        # within 2 x neighbor_radius_arcsec\n        kdt_dist, kdt_ind = lclist_kdtree.query(\n            [cosra*cosdecl,\n             sinra*cosdecl,\n             sindecl],\n            k=100,\n            distance_upper_bound=xyzdist\n        )\n\n        # the first match is the object itself\n        finite_distind = (np.isfinite(kdt_dist)) & (kdt_dist > 0)\n        finite_dists = kdt_dist[finite_distind]\n        nbrindices = kdt_ind[finite_distind]\n        n_neighbors = finite_dists.size\n\n        if n_neighbors > 0:\n\n            closest_dist = finite_dists.min()\n            closest_dist_arcsec = (\n                
np.degrees(2.0*np.arcsin(closest_dist/2.0))*3600.0\n            )\n            closest_dist_nbrind = nbrindices[finite_dists == finite_dists.min()]\n\n            resultdict = {\n                'neighbors':n_neighbors,\n                'nbrindices':nbrindices,\n                'distarcsec':np.degrees(2.0*np.arcsin(finite_dists/2.0))*3600.0,\n                'closestdistarcsec':closest_dist_arcsec,\n                'closestdistnbrind':closest_dist_nbrind,\n                'searchradarcsec':neighbor_radius_arcsec,\n            }\n\n        else:\n\n            resultdict = {\n                'neighbors':0,\n                'nbrindices':np.array([]),\n                'distarcsec':np.array([]),\n                'closestdistarcsec':np.nan,\n                'closestdistnbrind':np.array([]),\n                'searchradarcsec':neighbor_radius_arcsec,\n            }\n\n\n    else:\n        if verbose:\n            LOGWARNING(\"one of ra, decl, kdtree is missing in \"\n                       \"objectinfo dict or lclistpkl, \"\n                       \"can't get observed neighbors\")\n\n        resultdict = {\n            'neighbors':np.nan,\n            'nbrindices':np.array([]),\n            'distarcsec':np.array([]),\n            'closestdistarcsec':np.nan,\n            'closestdistnbrind':np.array([]),\n            'searchradarcsec':neighbor_radius_arcsec,\n        }\n\n\n    # next, search for this object in GAIA\n    if ('ra' in objectinfo and 'decl' in objectinfo and\n        objectinfo['ra'] is not None and objectinfo['decl'] is not None):\n\n        gaia_result = gaia.objectlist_conesearch(\n            objectinfo['ra'],\n            objectinfo['decl'],\n            neighbor_radius_arcsec,\n            verbose=verbose,\n            timeout=gaia_submit_timeout,\n            maxtimeout=gaia_max_timeout,\n            maxtries=gaia_submit_tries,\n            gaia_mirror=gaia_mirror,\n            complete_query_later=complete_query_later\n        )\n\n        if 
gaia_result:\n\n            gaia_objlistf = gaia_result['result']\n\n            with gzip.open(gaia_objlistf,'rb') as infd:\n\n                try:\n                    gaia_objlist = np.genfromtxt(\n                        infd,\n                        names=True,\n                        delimiter=',',\n                        dtype='U20,f8,f8,f8,f8,f8,f8,f8,f8,f8,f8,f8,f8',\n                        usecols=(0,1,2,3,4,5,6,7,8,9,10,11,12)\n                    )\n                except Exception as e:\n                    gaia_objlist = []\n\n            gaia_objlist = np.atleast_1d(gaia_objlist)\n\n            if gaia_objlist.size > 0:\n\n                # if we have GAIA results, we can get xypositions of all of\n                # these objects on the object skyview stamp\n                stampres = skyview.get_stamp(objectinfo['ra'],\n                                             objectinfo['decl'])\n\n                if (stampres and\n                    'fitsfile' in stampres and\n                    stampres['fitsfile'] is not None and\n                    os.path.exists(stampres['fitsfile'])):\n\n                    stampwcs = WCS(stampres['fitsfile'])\n\n                    gaia_xypos = stampwcs.all_world2pix(\n                        np.column_stack((gaia_objlist['ra'],\n                                         gaia_objlist['dec'])),\n                        1\n                    )\n\n                else:\n\n                    gaia_xypos = None\n\n\n                # the first object is likely the match to the object itself\n                if gaia_objlist['dist_arcsec'][0] < gaia_matchdist_arcsec:\n\n                    if gaia_objlist.size > 1:\n\n                        gaia_nneighbors = gaia_objlist[1:].size\n\n                        gaia_status = (\n                            'ok: object found with %s neighbors' %\n                            gaia_nneighbors\n                        )\n\n                        # the first in each array is the 
object\n                        gaia_ids = gaia_objlist['source_id']\n                        gaia_mags = gaia_objlist['phot_g_mean_mag']\n                        gaia_parallaxes = gaia_objlist['parallax']\n                        gaia_parallax_errs = gaia_objlist['parallax_error']\n                        gaia_pmra = gaia_objlist['pmra']\n                        gaia_pmra_err = gaia_objlist['pmra_error']\n                        gaia_pmdecl = gaia_objlist['pmdec']\n                        gaia_pmdecl_err = gaia_objlist['pmdec_error']\n\n                        gaia_absolute_mags = magnitudes.absolute_gaia_magnitude(\n                            gaia_mags, gaia_parallaxes\n                        )\n                        if ('kmag' in objectinfo and\n                            objectinfo['kmag'] is not None and\n                            np.isfinite(objectinfo['kmag'])):\n                            gaiak_colors = gaia_mags - objectinfo['kmag']\n                        else:\n                            gaiak_colors = None\n\n                        gaia_dists = gaia_objlist['dist_arcsec']\n                        gaia_closest_distarcsec = gaia_objlist['dist_arcsec'][1]\n                        gaia_closest_gmagdiff = (\n                            gaia_objlist['phot_g_mean_mag'][0] -\n                            gaia_objlist['phot_g_mean_mag'][1]\n                        )\n\n                    else:\n\n                        LOGWARNING('object found in GAIA at (%.3f,%.3f), '\n                                   'but no neighbors' % (objectinfo['ra'],\n                                                         objectinfo['decl']))\n\n                        gaia_nneighbors = 0\n\n                        gaia_status = (\n                            'ok: object found but no neighbors'\n                        )\n\n                        # the first in each array is the object\n                        gaia_ids = gaia_objlist['source_id']\n                        
gaia_mags = gaia_objlist['phot_g_mean_mag']\n                        gaia_parallaxes = gaia_objlist['parallax']\n                        gaia_parallax_errs = gaia_objlist['parallax_error']\n                        gaia_pmra = gaia_objlist['pmra']\n                        gaia_pmra_err = gaia_objlist['pmra_error']\n                        gaia_pmdecl = gaia_objlist['pmdec']\n                        gaia_pmdecl_err = gaia_objlist['pmdec_error']\n\n                        gaia_absolute_mags = magnitudes.absolute_gaia_magnitude(\n                            gaia_mags, gaia_parallaxes\n                        )\n                        if ('kmag' in objectinfo and\n                            objectinfo['kmag'] is not None and\n                            np.isfinite(objectinfo['kmag'])):\n                            gaiak_colors = gaia_mags - objectinfo['kmag']\n                        else:\n                            gaiak_colors = None\n\n                        gaia_dists = gaia_objlist['dist_arcsec']\n                        gaia_closest_distarcsec = np.nan\n                        gaia_closest_gmagdiff = np.nan\n\n\n                # otherwise, the object wasn't found in GAIA for some reason\n                else:\n\n                    LOGWARNING('no GAIA objects found within '\n                               '%.3f arcsec of object position (%.3f, %.3f), '\n                               'closest object is at %.3f arcsec away' %\n                               (gaia_matchdist_arcsec,\n                                objectinfo['ra'], objectinfo['decl'],\n                                gaia_objlist['dist_arcsec'][0]))\n\n                    gaia_status = ('failed: no object within %.3f '\n                                   'arcsec, closest = %.3f arcsec' %\n                                   (gaia_matchdist_arcsec,\n                                    gaia_objlist['dist_arcsec'][0]))\n\n                    gaia_nneighbors = np.nan\n\n                    
gaia_ids = gaia_objlist['source_id']\n                    gaia_mags = gaia_objlist['phot_g_mean_mag']\n                    gaia_parallaxes = gaia_objlist['parallax']\n                    gaia_parallax_errs = gaia_objlist['parallax_error']\n                    gaia_pmra = gaia_objlist['pmra']\n                    gaia_pmra_err = gaia_objlist['pmra_error']\n                    gaia_pmdecl = gaia_objlist['pmdec']\n                    gaia_pmdecl_err = gaia_objlist['pmdec_error']\n\n                    gaia_absolute_mags = magnitudes.absolute_gaia_magnitude(\n                        gaia_mags, gaia_parallaxes\n                    )\n                    if ('kmag' in objectinfo and\n                        objectinfo['kmag'] is not None and\n                        np.isfinite(objectinfo['kmag'])):\n                        gaiak_colors = gaia_mags - objectinfo['kmag']\n                    else:\n                        gaiak_colors = None\n\n                    gaia_dists = gaia_objlist['dist_arcsec']\n                    gaia_closest_distarcsec = np.nan\n                    gaia_closest_gmagdiff = np.nan\n\n            # if there are no neighbors within neighbor_radius_arcsec\n            # or this object is not covered by GAIA. 
return nothing\n            else:\n\n                LOGERROR('no GAIA objects at this '\n                         'position or GAIA query failed')\n\n                gaia_status = (\n                    'failed: no GAIA objects at this '\n                    'position or GAIA query failed.'\n                )\n                gaia_nneighbors = np.nan\n                gaia_ids = None\n                gaia_mags = None\n\n                gaia_xypos = None\n                gaia_parallaxes = None\n                gaia_parallax_errs = None\n                gaia_pmra = None\n                gaia_pmra_err = None\n                gaia_pmdecl = None\n                gaia_pmdecl_err = None\n                gaia_absolute_mags = None\n                gaiak_colors = None\n\n                gaia_dists = None\n                gaia_closest_distarcsec = np.nan\n                gaia_closest_gmagdiff = np.nan\n\n            # update the resultdict with gaia stuff\n            resultdict.update(\n                {'gaia_status':gaia_status,\n                 'gaia_neighbors':gaia_nneighbors,\n                 'gaia_ids':gaia_ids,\n                 'gaia_xypos':gaia_xypos,\n                 'gaia_mags':gaia_mags,\n                 'gaia_parallaxes':gaia_parallaxes,\n                 'gaia_parallax_errs':gaia_parallax_errs,\n                 'gaia_pmras':gaia_pmra,\n                 'gaia_pmra_errs':gaia_pmra_err,\n                 'gaia_pmdecls':gaia_pmdecl,\n                 'gaia_pmdecl_errs':gaia_pmdecl_err,\n                 'gaia_absolute_mags':gaia_absolute_mags,\n                 'gaiak_colors':gaiak_colors,\n                 'gaia_dists':gaia_dists,\n                 'gaia_closest_distarcsec':gaia_closest_distarcsec,\n                 'gaia_closest_gmagdiff':gaia_closest_gmagdiff}\n            )\n\n        else:\n\n            LOGERROR('GAIA query did not return a '\n                     'result for object at (%.3f, %.3f)' % (objectinfo['ra'],\n                                   
                         objectinfo['decl']))\n\n            resultdict.update(\n                {'gaia_status':'failed: GAIA TAP query failed',\n                 'gaia_neighbors':np.nan,\n                 'gaia_ids':None,\n                 'gaia_xypos':None,\n                 'gaia_mags':None,\n                 'gaia_parallaxes':None,\n                 'gaia_parallax_errs':None,\n                 'gaia_pmras':None,\n                 'gaia_pmra_errs':None,\n                 'gaia_pmdecls':None,\n                 'gaia_pmdecl_errs':None,\n                 'gaia_absolute_mags':None,\n                 'gaiak_colors':None,\n                 'gaia_dists':None,\n                 'gaia_closest_distarcsec':np.nan,\n                 'gaia_closest_gmagdiff':np.nan}\n            )\n\n\n    else:\n\n        LOGERROR(\"one or more of the 'ra', 'decl' keys \"\n                 \"are missing from the objectinfo dict, \"\n                 \"can't get GAIA or LC collection neighbor features\")\n\n        resultdict.update(\n            {'gaia_status':'failed: no ra/decl for object',\n             'gaia_neighbors':np.nan,\n             'gaia_ids':None,\n             'gaia_xypos':None,\n             'gaia_mags':None,\n             'gaia_parallaxes':None,\n             'gaia_parallax_errs':None,\n             'gaia_pmras':None,\n             'gaia_pmra_errs':None,\n             'gaia_pmdecls':None,\n             'gaia_pmdecl_errs':None,\n             'gaia_absolute_mags':None,\n             'gaiak_colors':None,\n             'gaia_dists':None,\n             'gaia_closest_distarcsec':np.nan,\n             'gaia_closest_gmagdiff':np.nan}\n        )\n\n\n    # finally, search for this object in SIMBAD\n    if ('ra' in objectinfo and 'decl' in objectinfo and\n        objectinfo['ra'] is not None and objectinfo['decl'] is not None and\n        search_simbad):\n\n        simbad_result = simbad.objectnames_conesearch(\n            objectinfo['ra'],\n            objectinfo['decl'],\n          
  neighbor_radius_arcsec,\n            verbose=verbose,\n            timeout=gaia_submit_timeout,\n            maxtimeout=gaia_max_timeout,\n            maxtries=gaia_submit_tries,\n            complete_query_later=complete_query_later\n        )\n\n    else:\n\n        simbad_result = None\n\n    if (simbad_result and\n        simbad_result['result'] and\n        os.path.exists(simbad_result['result'])):\n\n        with gzip.open(simbad_result['result'],'rb') as infd:\n\n            try:\n                simbad_objectnames = np.genfromtxt(\n                    infd,\n                    names=True,\n                    delimiter=',',\n                    dtype='U20,f8,f8,U20,U20,U20,i8,U600,f8',\n                    usecols=(0,1,2,3,4,5,6,7,8),\n                    comments='?',\n                )\n            except Exception as e:\n                simbad_objectnames = []\n\n            simbad_objectnames = np.atleast_1d(simbad_objectnames)\n\n            if simbad_objectnames.size > 0:\n\n                simbad_mainid = simbad_objectnames['main_id'].tolist()\n                simbad_allids = simbad_objectnames['all_ids'].tolist()\n                simbad_objtype = simbad_objectnames['otype_txt'].tolist()\n                simbad_distarcsec = simbad_objectnames['dist_arcsec'].tolist()\n                simbad_nmatches = len(simbad_mainid)\n\n                simbad_mainid = [x.replace('\"','') for x in simbad_mainid]\n                simbad_allids = [x.replace('\"','') for x in simbad_allids]\n                simbad_objtype = [x.replace('\"','') for x in simbad_objtype]\n\n\n                resultdict.update({\n                    'simbad_nmatches':simbad_nmatches,\n                    'simbad_mainid':simbad_mainid,\n                    'simbad_objtype':simbad_objtype,\n                    'simbad_allids':simbad_allids,\n                    'simbad_distarcsec':simbad_distarcsec\n                })\n\n\n                if simbad_nmatches > 1:\n\n                    
resultdict['simbad_status'] = (\n                        'ok: multiple SIMBAD matches found'\n                    )\n\n                else:\n\n                    resultdict['simbad_status'] = 'ok: single SIMBAD match'\n\n\n\n                # get the closest match\n                if simbad_distarcsec[0] < gaia_matchdist_arcsec:\n\n                    resultdict.update({\n                        'simbad_best_mainid':simbad_mainid[0],\n                        'simbad_best_objtype':simbad_objtype[0],\n                        'simbad_best_allids':simbad_allids[0],\n                        'simbad_best_distarcsec':simbad_distarcsec[0],\n                        'simbad_status':'ok: object found within match radius'\n                    })\n\n                else:\n\n                    LOGWARNING('no SIMBAD objects found within '\n                               '%.3f arcsec of object position (%.3f, %.3f), '\n                               'closest object: %s at %.3f arcsec away' %\n                               (gaia_matchdist_arcsec,\n                                objectinfo['ra'],\n                                objectinfo['decl'],\n                                simbad_mainid[0],\n                                simbad_distarcsec[0]))\n\n                    simbad_status = ('failed: no object within %.3f '\n                                     'arcsec, closest = %.3f arcsec' %\n                                     (gaia_matchdist_arcsec,\n                                      simbad_distarcsec[0]))\n\n\n                    resultdict.update({\n                        'simbad_best_mainid':None,\n                        'simbad_best_objtype':None,\n                        'simbad_best_allids':None,\n                        'simbad_best_distarcsec':None,\n                        'simbad_status':simbad_status\n                    })\n\n\n            else:\n\n                resultdict.update({\n                    'simbad_status':'failed: no SIMBAD matches 
found',\n                    'simbad_nmatches':None,\n                    'simbad_mainid':None,\n                    'simbad_objtype':None,\n                    'simbad_allids':None,\n                    'simbad_distarcsec':None,\n                    'simbad_best_mainid':None,\n                    'simbad_best_objtype':None,\n                    'simbad_best_allids':None,\n                    'simbad_best_distarcsec':None,\n                })\n\n    else:\n\n        if search_simbad:\n            simbad_status = 'failed: SIMBAD query failed'\n        else:\n            simbad_status = 'failed: SIMBAD query not tried'\n\n        resultdict.update({\n            'simbad_status':simbad_status,\n            'simbad_nmatches':None,\n            'simbad_mainid':None,\n            'simbad_objtype':None,\n            'simbad_allids':None,\n            'simbad_distarcsec':None,\n            'simbad_best_mainid':None,\n            'simbad_best_objtype':None,\n            'simbad_best_allids':None,\n            'simbad_best_distarcsec':None,\n        })\n\n    return resultdict"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef parallel_epd_lclist(lclist,\n                        externalparams,\n                        timecols=None,\n                        magcols=None,\n                        errcols=None,\n                        lcformat='hat-sql',\n                        lcformatdir=None,\n                        epdsmooth_sigclip=3.0,\n                        epdsmooth_windowsize=21,\n                        epdsmooth_func=smooth_magseries_savgol,\n                        epdsmooth_extraparams=None,\n                        nworkers=NCPUS,\n                        maxworkertasks=1000):\n    '''This applies EPD in parallel to all LCs in the input list.\n\n    Parameters\n    ----------\n\n    lclist : list of str\n        This is the list of light curve files to run EPD on.\n\n    externalparams : dict or None\n        This is a dict that indicates which keys in the lcdict obtained from the\n        lcfile correspond to the required external parameters. As with timecol,\n        magcol, and errcol, these can be simple keys (e.g. 'rjd') or compound\n        keys ('magaperture1.mags'). 
The dict should look something like::\n\n          {'fsv':'<lcdict key>' array: S values for each observation,\n           'fdv':'<lcdict key>' array: D values for each observation,\n           'fkv':'<lcdict key>' array: K values for each observation,\n           'xcc':'<lcdict key>' array: x coords for each observation,\n           'ycc':'<lcdict key>' array: y coords for each observation,\n           'bgv':'<lcdict key>' array: sky background for each observation,\n           'bge':'<lcdict key>' array: sky background err for each observation,\n           'iha':'<lcdict key>' array: hour angle for each observation,\n           'izd':'<lcdict key>' array: zenith distance for each observation}\n\n        Alternatively, if these exact keys are already present in the lcdict,\n        indicate this by setting externalparams to None.\n\n    timecols,magcols,errcols : lists of str\n        The keys in the lcdict produced by your light curve reader function that\n        correspond to the times, mags/fluxes, and associated measurement errors\n        that will be used as inputs to the EPD process. If these are None, the\n        default values for `timecols`, `magcols`, and `errcols` for your light\n        curve format will be used here.\n\n    lcformat : str\n        This is the `formatkey` associated with your light curve format, which\n        you previously passed in to the `lcproc.register_lcformat`\n        function. This will be used to look up how to find and read the light\n        curve files.\n\n    lcformatdir : str or None\n        If this is provided, gives the path to a directory when you've stored\n        your lcformat description JSONs, other than the usual directories lcproc\n        knows to search for them in. 
Use this along with `lcformat` to specify\n        an LC format JSON file that's not currently registered with lcproc.\n\n    epdsmooth_sigclip : float or int or sequence of two floats/ints or None\n        This specifies how to sigma-clip the input LC before fitting the EPD\n        function to it.\n\n        If a single float or int, a symmetric sigma-clip will be performed using\n        the number provided as the sigma-multiplier to cut out from the input\n        time-series.\n\n        If a list of two ints/floats is provided, the function will perform an\n        'asymmetric' sigma-clip. The first element in this list is the sigma\n        value to use for fainter flux/mag values; the second element in this\n        list is the sigma value to use for brighter flux/mag values. For\n        example, `sigclip=[10., 3.]`, will sigclip out greater than 10-sigma\n        dimmings and greater than 3-sigma brightenings. Here the meaning of\n        \"dimming\" and \"brightening\" is set by *physics* (not the magnitude\n        system), which is why the `magsarefluxes` kwarg must be correctly set.\n\n        If `sigclip` is None, no sigma-clipping will be performed, and the\n        time-series (with non-finite elems removed) will be passed through to\n        the output.\n\n    epdsmooth_windowsize : int\n        This is the number of LC points to smooth over to generate a smoothed\n        light curve that will be used to fit the EPD function.\n\n    epdsmooth_func : Python function\n        This sets the smoothing filter function to use. A Savitzky-Golay filter\n        is used to smooth the light curve by default. The functions that can be\n        used with this kwarg are listed in `varbase.trends`. 
If you want to use\n        your own function, it MUST have the following signature::\n\n                def smoothfunc(mags_array, window_size, **extraparams)\n\n        and return a numpy array of the same size as `mags_array` with the\n        smoothed time-series. Any extra params can be provided using the\n        `extraparams` dict.\n\n    epdsmooth_extraparams : dict\n        This is a dict of any extra filter params to supply to the smoothing\n        function.\n\n    nworkers : int\n        The number of parallel workers to launch when processing the LCs.\n\n    maxworkertasks : int\n        The maximum number of tasks a parallel worker will complete before it is\n        replaced with a new one (sometimes helps with memory-leaks).\n\n    Returns\n    -------\n\n    dict\n        Returns a dict organized by all the keys in the input `magcols` list,\n        containing lists of EPD pickle light curves for that `magcol`.\n\n    Notes\n    -----\n\n    - S -> measure of PSF sharpness (~1/sigma^2 so smaller S = wider PSF)\n    - D -> measure of PSF ellipticity in xy direction\n    - K -> measure of PSF ellipticity in cross direction\n\n    S, D, K are related to the PSF's variance and covariance, see eqn 30-33 in\n    A. 
Pal's thesis: https://arxiv.org/abs/0906.3486\n\n    '''\n\n    try:\n        formatinfo = get_lcformat(lcformat,\n                                  use_lcformat_dir=lcformatdir)\n        if formatinfo:\n            (fileglob, readerfunc,\n             dtimecols, dmagcols, derrcols,\n             magsarefluxes, normfunc) = formatinfo\n        else:\n            LOGERROR(\"can't figure out the light curve format\")\n            return None\n    except Exception as e:\n        LOGEXCEPTION(\"can't figure out the light curve format\")\n        return None\n\n    # override the default timecols, magcols, and errcols\n    # using the ones provided to the function\n    if timecols is None:\n        timecols = dtimecols\n    if magcols is None:\n        magcols = dmagcols\n    if errcols is None:\n        errcols = derrcols\n\n    outdict = {}\n\n    # run by magcol\n    for t, m, e in zip(timecols, magcols, errcols):\n\n        tasks = [(x, t, m, e, externalparams, lcformat, lcformatdir,\n                  epdsmooth_sigclip, epdsmooth_windowsize,\n                  epdsmooth_func, epdsmooth_extraparams) for\n                 x in lclist]\n\n        pool = mp.Pool(nworkers, maxtasksperchild=maxworkertasks)\n        results = pool.map(parallel_epd_worker, tasks)\n        pool.close()\n        pool.join()\n\n        outdict[m] = results\n\n    return outdict", "response": "This function applies EPD in parallel to all light curve files in the input list."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef parallel_epd_lcdir(\n        lcdir,\n        externalparams,\n        lcfileglob=None,\n        timecols=None,\n        magcols=None,\n        errcols=None,\n        lcformat='hat-sql',\n        lcformatdir=None,\n        epdsmooth_sigclip=3.0,\n        epdsmooth_windowsize=21,\n        epdsmooth_func=smooth_magseries_savgol,\n        epdsmooth_extraparams=None,\n        nworkers=NCPUS,\n        maxworkertasks=1000\n):\n    '''This applies EPD in parallel to all LCs in a directory.\n\n    Parameters\n    ----------\n\n    lcdir : str\n        The light curve directory to process.\n\n    externalparams : dict or None\n        This is a dict that indicates which keys in the lcdict obtained from the\n        lcfile correspond to the required external parameters. As with timecol,\n        magcol, and errcol, these can be simple keys (e.g. 'rjd') or compound\n        keys ('magaperture1.mags'). The dict should look something like::\n\n          {'fsv':'<lcdict key>' array: S values for each observation,\n           'fdv':'<lcdict key>' array: D values for each observation,\n           'fkv':'<lcdict key>' array: K values for each observation,\n           'xcc':'<lcdict key>' array: x coords for each observation,\n           'ycc':'<lcdict key>' array: y coords for each observation,\n           'bgv':'<lcdict key>' array: sky background for each observation,\n           'bge':'<lcdict key>' array: sky background err for each observation,\n           'iha':'<lcdict key>' array: hour angle for each observation,\n           'izd':'<lcdict key>' array: zenith distance for each observation}\n\n    lcfileglob : str or None\n        A UNIX fileglob to use to select light curve files in `lcdir`. 
If this\n        is not None, the value provided will override the default fileglob for\n        your light curve format.\n\n    timecols,magcols,errcols : lists of str\n        The keys in the lcdict produced by your light curve reader function that\n        correspond to the times, mags/fluxes, and associated measurement errors\n        that will be used as inputs to the EPD process. If these are None, the\n        default values for `timecols`, `magcols`, and `errcols` for your light\n        curve format will be used here.\n\n    lcformat : str\n        This is the `formatkey` associated with your light curve format, which\n        you previously passed in to the `lcproc.register_lcformat`\n        function. This will be used to look up how to find and read the light\n        curves specified in `basedir` or `use_list_of_filenames`.\n\n    lcformatdir : str or None\n        If this is provided, gives the path to a directory when you've stored\n        your lcformat description JSONs, other than the usual directories lcproc\n        knows to search for them in. Use this along with `lcformat` to specify\n        an LC format JSON file that's not currently registered with lcproc.\n\n    epdsmooth_sigclip : float or int or sequence of two floats/ints or None\n        This specifies how to sigma-clip the input LC before fitting the EPD\n        function to it.\n\n        If a single float or int, a symmetric sigma-clip will be performed using\n        the number provided as the sigma-multiplier to cut out from the input\n        time-series.\n\n        If a list of two ints/floats is provided, the function will perform an\n        'asymmetric' sigma-clip. The first element in this list is the sigma\n        value to use for fainter flux/mag values; the second element in this\n        list is the sigma value to use for brighter flux/mag values. 
For\n        example, `sigclip=[10., 3.]`, will sigclip out greater than 10-sigma\n        dimmings and greater than 3-sigma brightenings. Here the meaning of\n        \"dimming\" and \"brightening\" is set by *physics* (not the magnitude\n        system), which is why the `magsarefluxes` kwarg must be correctly set.\n\n        If `sigclip` is None, no sigma-clipping will be performed, and the\n        time-series (with non-finite elems removed) will be passed through to\n        the output.\n\n    epdsmooth_windowsize : int\n        This is the number of LC points to smooth over to generate a smoothed\n        light curve that will be used to fit the EPD function.\n\n    epdsmooth_func : Python function\n        This sets the smoothing filter function to use. A Savitzky-Golay filter\n        is used to smooth the light curve by default. The functions that can be\n        used with this kwarg are listed in `varbase.trends`. If you want to use\n        your own function, it MUST have the following signature::\n\n                def smoothfunc(mags_array, window_size, **extraparams)\n\n        and return a numpy array of the same size as `mags_array` with the\n        smoothed time-series. 
Any extra params can be provided using the\n        `extraparams` dict.\n\n    epdsmooth_extraparams : dict\n        This is a dict of any extra filter params to supply to the smoothing\n        function.\n\n    nworkers : int\n        The number of parallel workers to launch when processing the LCs.\n\n    maxworkertasks : int\n        The maximum number of tasks a parallel worker will complete before it is\n        replaced with a new one (sometimes helps with memory-leaks).\n\n    Returns\n    -------\n\n    dict\n        Returns a dict organized by all the keys in the input `magcols` list,\n        containing lists of EPD pickle light curves for that `magcol`.\n\n    Notes\n    -----\n\n    - S -> measure of PSF sharpness (~1/sigma^2 so smaller S = wider PSF)\n    - D -> measure of PSF ellipticity in xy direction\n    - K -> measure of PSF ellipticity in cross direction\n\n    S, D, K are related to the PSF's variance and covariance, see eqn 30-33 in\n    A. Pal's thesis: https://arxiv.org/abs/0906.3486\n\n    '''\n\n    try:\n        formatinfo = get_lcformat(lcformat,\n                                  use_lcformat_dir=lcformatdir)\n        if formatinfo:\n            (fileglob, readerfunc,\n             dtimecols, dmagcols, derrcols,\n             magsarefluxes, normfunc) = formatinfo\n        else:\n            LOGERROR(\"can't figure out the light curve format\")\n            return None\n    except Exception as e:\n        LOGEXCEPTION(\"can't figure out the light curve format\")\n        return None\n\n    # find all the files matching the lcglob in lcdir\n    if lcfileglob is None:\n        lcfileglob = fileglob\n\n    lclist = sorted(glob.glob(os.path.join(lcdir, lcfileglob)))\n\n    return parallel_epd_lclist(\n        lclist,\n        externalparams,\n        timecols=timecols,\n        magcols=magcols,\n        errcols=errcols,\n        lcformat=lcformat,\n        epdsmooth_sigclip=epdsmooth_sigclip,\n        
epdsmooth_windowsize=epdsmooth_windowsize,\n        epdsmooth_func=epdsmooth_func,\n        epdsmooth_extraparams=epdsmooth_extraparams,\n        lcformatdir=lcformatdir,\n        nworkers=nworkers,\n        maxworkertasks=maxworkertasks\n    )", "response": "This function applies EPD in parallel to all LCs in a directory."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef bls_serial_pfind(times, mags, errs,\n                     magsarefluxes=False,\n                     startp=0.1,  # search from 0.1 d to...\n                     endp=100.0,  # ... 100.0 d -- don't search full timebase\n                     stepsize=5.0e-4,\n                     mintransitduration=0.01,  # minimum transit length in phase\n                     maxtransitduration=0.4,   # maximum transit length in phase\n                     ndurations=100,\n                     autofreq=True,  # figure out f0, nf, and df automatically\n                     blsobjective='likelihood',\n                     blsmethod='fast',\n                     blsoversample=10,\n                     blsmintransits=3,\n                     blsfreqfactor=10.0,\n                     periodepsilon=0.1,\n                     nbestpeaks=5,\n                     sigclip=10.0,\n                     verbose=True,\n                     raiseonfail=False):\n    '''Runs the Box Least Squares Fitting Search for transit-shaped signals.\n\n    Based on the version of BLS in Astropy 3.1:\n    `astropy.stats.BoxLeastSquares`. If you don't have Astropy 3.1, this module\n    will fail to import. Note that by default, this implementation of\n    `bls_serial_pfind` doesn't use the `.autoperiod()` function from\n    `BoxLeastSquares` but uses the same auto frequency-grid generation as the\n    functions in `periodbase.kbls`. If you want to use Astropy's implementation,\n    set the value of `autofreq` kwarg to 'astropy'.\n\n    The dict returned from this function contains a `blsmodel` key, which is the\n    generated model from Astropy's BLS. 
Use the `.compute_stats()` method to\n    calculate the required stats like SNR, depth, duration, etc.\n\n    Parameters\n    ----------\n\n    times,mags,errs : np.array\n        The magnitude/flux time-series to search for transits.\n\n    magsarefluxes : bool\n        If the input measurement values in `mags` and `errs` are in fluxes, set\n        this to True.\n\n    startp,endp : float\n        The minimum and maximum periods to consider for the transit search.\n\n    stepsize : float\n        The step-size in frequency to use when constructing a frequency grid for\n        the period search.\n\n    mintransitduration,maxtransitduration : float\n        The minimum and maximum transitdurations (in units of phase) to consider\n        for the transit search.\n\n    ndurations : int\n        The number of transit durations to use in the period-search.\n\n    autofreq : bool or str\n        If this is True, the values of `stepsize` and `nphasebins` will be\n        ignored, and these, along with a frequency-grid, will be determined\n        based on the following relations::\n\n            nphasebins = int(ceil(2.0/mintransitduration))\n            if nphasebins > 3000:\n                nphasebins = 3000\n\n            stepsize = 0.25*mintransitduration/(times.max()-times.min())\n\n            minfreq = 1.0/endp\n            maxfreq = 1.0/startp\n            nfreq = int(ceil((maxfreq - minfreq)/stepsize))\n\n        If this is False, you must set `startp`, `endp`, and `stepsize` as\n        appropriate.\n\n        If this is str == 'astropy', will use the\n        `astropy.stats.BoxLeastSquares.autoperiod()` function to calculate the\n        frequency grid instead of the kbls method.\n\n    blsobjective : {'likelihood','snr'}\n        Sets the type of objective to optimize in the `BoxLeastSquares.power()`\n        function.\n\n    blsmethod : {'fast','slow'}\n        Sets the type of method to use in the `BoxLeastSquares.power()`\n        function.\n\n    
blsoversample : int\n        Sets the `oversample` kwarg for the `BoxLeastSquares.power()` function.\n\n    blsmintransits : int\n        Sets the `minimum_n_transit` kwarg for the `BoxLeastSquares.autoperiod()`\n        function.\n\n    blsfreqfactor : float\n        Sets the `frequency_factor` kwarg for the `BoxLeastSquares.autoperiod()`\n        function.\n\n    periodepsilon : float\n        The fractional difference between successive values of 'best' periods\n        when sorting by periodogram power to consider them as separate periods\n        (as opposed to part of the same periodogram peak). This is used to avoid\n        broad peaks in the periodogram and make sure the 'best' periods returned\n        are all actually independent.\n\n    nbestpeaks : int\n        The number of 'best' peaks to return from the periodogram results,\n        starting from the global maximum of the periodogram peak values.\n\n    sigclip : float or int or sequence of two floats/ints or None\n        If a single float or int, a symmetric sigma-clip will be performed using\n        the number provided as the sigma-multiplier to cut out from the input\n        time-series.\n\n        If a list of two ints/floats is provided, the function will perform an\n        'asymmetric' sigma-clip. The first element in this list is the sigma\n        value to use for fainter flux/mag values; the second element in this\n        list is the sigma value to use for brighter flux/mag values. For\n        example, `sigclip=[10., 3.]`, will sigclip out greater than 10-sigma\n        dimmings and greater than 3-sigma brightenings. 
Here the meaning of\n        \"dimming\" and \"brightening\" is set by *physics* (not the magnitude\n        system), which is why the `magsarefluxes` kwarg must be correctly set.\n\n        If `sigclip` is None, no sigma-clipping will be performed, and the\n        time-series (with non-finite elems removed) will be passed through to\n        the output.\n\n    verbose : bool\n        If this is True, will indicate progress and details about the frequency\n        grid used for the period search.\n\n    raiseonfail : bool\n        If True, raises an exception if something goes wrong. Otherwise, returns\n        None.\n\n    Returns\n    -------\n\n    dict\n        This function returns a dict, referred to as an `lspinfo` dict in other\n        astrobase functions that operate on periodogram results. This is a\n        standardized format across all astrobase period-finders, and is of the\n        form below::\n\n            {'bestperiod': the best period value in the periodogram,\n             'bestlspval': the periodogram peak associated with the best period,\n             'nbestpeaks': the input value of nbestpeaks,\n             'nbestlspvals': nbestpeaks-size list of best period peak values,\n             'nbestperiods': nbestpeaks-size list of best periods,\n             'lspvals': the full array of periodogram powers,\n             'frequencies': the full array of frequencies considered,\n             'periods': the full array of periods considered,\n             'durations': the array of durations used to run BLS,\n             'blsresult': Astropy BLS result object (BoxLeastSquaresResult),\n             'blsmodel': Astropy BLS BoxLeastSquares object used for work,\n             'stepsize': the actual stepsize used,\n             'nfreq': the actual nfreq used,\n             'durations': the durations array used,\n             'mintransitduration': the input mintransitduration,\n             'maxtransitduration': the input maxtransitdurations,\n            
 'method':'bls' -> the name of the period-finder method,\n             'kwargs':{ dict of all of the input kwargs for record-keeping}}\n\n    '''\n\n    # get rid of nans first and sigclip\n    stimes, smags, serrs = sigclip_magseries(times,\n                                             mags,\n                                             errs,\n                                             magsarefluxes=magsarefluxes,\n                                             sigclip=sigclip)\n\n    # make sure there are enough points to calculate a spectrum\n    if len(stimes) > 9 and len(smags) > 9 and len(serrs) > 9:\n\n        # if we're setting up everything automatically\n        if isinstance(autofreq, bool) and autofreq:\n\n            # use heuristic to figure out best timestep\n            stepsize = 0.25*mintransitduration/(stimes.max()-stimes.min())\n\n            # now figure out the frequencies to use\n            minfreq = 1.0/endp\n            maxfreq = 1.0/startp\n            nfreq = int(npceil((maxfreq - minfreq)/stepsize))\n\n            # say what we're using\n            if verbose:\n                LOGINFO('min P: %s, max P: %s, nfreq: %s, '\n                        'minfreq: %s, maxfreq: %s' % (startp, endp, nfreq,\n                                                      minfreq, maxfreq))\n                LOGINFO('autofreq = True: using AUTOMATIC values for '\n                        'freq stepsize: %s, ndurations: %s, '\n                        'min transit duration: %s, max transit duration: %s' %\n                        (stepsize, ndurations,\n                         mintransitduration, maxtransitduration))\n\n            use_autoperiod = False\n\n        elif isinstance(autofreq, bool) and not autofreq:\n\n            minfreq = 1.0/endp\n            maxfreq = 1.0/startp\n            nfreq = int(npceil((maxfreq - minfreq)/stepsize))\n\n            # say what we're using\n            if verbose:\n                LOGINFO('min P: %s, max P: %s, nfreq: %s, 
'\n                        'minfreq: %s, maxfreq: %s' % (startp, endp, nfreq,\n                                                      minfreq, maxfreq))\n                LOGINFO('autofreq = False: using PROVIDED values for '\n                        'freq stepsize: %s, ndurations: %s, '\n                        'min transit duration: %s, max transit duration: %s' %\n                        (stepsize, ndurations,\n                         mintransitduration, maxtransitduration))\n\n            use_autoperiod = False\n\n        elif isinstance(autofreq, str) and autofreq == 'astropy':\n\n            use_autoperiod = True\n            minfreq = 1.0/endp\n            maxfreq = 1.0/startp\n\n        else:\n\n            LOGERROR(\"unknown autofreq kwarg encountered. can't continue...\")\n            return None\n\n        # check the time-base vs. endp value\n        if minfreq < (1.0/(stimes.max() - stimes.min())):\n\n            if verbose:\n                LOGWARNING('the requested max P = %.3f is larger than '\n                           'the time base of the observations = %.3f, '\n                           ' will make minfreq = 2 x 1/timebase'\n                           % (endp, stimes.max() - stimes.min()))\n            minfreq = 2.0/(stimes.max() - stimes.min())\n            if verbose:\n                LOGINFO('new minfreq: %s, maxfreq: %s' %\n                        (minfreq, maxfreq))\n\n\n        # run BLS\n        try:\n\n            # astropy's BLS requires durations in units of time\n            durations = nplinspace(mintransitduration*startp,\n                                   maxtransitduration*startp,\n                                   ndurations)\n\n            # set up the correct units for the BLS model\n            if magsarefluxes:\n\n                blsmodel = BoxLeastSquares(\n                    stimes*u.day,\n                    smags*u.dimensionless_unscaled,\n                    dy=serrs*u.dimensionless_unscaled\n                )\n\n    
        else:\n\n                blsmodel = BoxLeastSquares(\n                    stimes*u.day,\n                    smags*u.mag,\n                    dy=serrs*u.mag\n                )\n\n            # use autoperiod if requested\n            if use_autoperiod:\n                periods = nparray(\n                    blsmodel.autoperiod(\n                        durations,\n                        minimum_period=startp,\n                        maximum_period=endp,\n                        minimum_n_transit=blsmintransits,\n                        frequency_factor=blsfreqfactor\n                    )\n                )\n                nfreq = periods.size\n\n                if verbose:\n                    LOGINFO(\n                        \"autofreq = 'astropy', used .autoperiod() with \"\n                        \"minimum_n_transit = %s, freq_factor = %s \"\n                        \"to generate the frequency grid\" %\n                        (blsmintransits, blsfreqfactor)\n                    )\n                    LOGINFO('stepsize = %.5f, nfreq = %s, minfreq = %.5f, '\n                            'maxfreq = %.5f, ndurations = %s' %\n                            (abs(1.0/periods[1] - 1.0/periods[0]),\n                             nfreq,\n                             1.0/periods.max(),\n                             1.0/periods.min(),\n                             durations.size))\n\n            # otherwise, use kbls method\n            else:\n                frequencies = minfreq + nparange(nfreq)*stepsize\n                periods = 1.0/frequencies\n\n            if nfreq > 5.0e5:\n                if verbose:\n                    LOGWARNING('more than 5.0e5 frequencies to go through; '\n                               'this will take a while. 
'\n                               'you might want to use the '\n                               'abls.bls_parallel_pfind function instead')\n\n            # run the periodogram\n            blsresult = blsmodel.power(\n                periods*u.day,\n                durations*u.day,\n                objective=blsobjective,\n                method=blsmethod,\n                oversample=blsoversample\n            )\n\n            # get the peak values\n            lsp = nparray(blsresult.power)\n\n            # find the nbestpeaks for the periodogram: 1. sort the lsp array\n            # by highest value first 2. go down the values until we find\n            # five values that are separated by at least periodepsilon in\n            # period\n            # make sure to get only the finite peaks in the periodogram\n            # this is needed because BLS may produce infs for some peaks\n            finitepeakind = npisfinite(lsp)\n            finlsp = lsp[finitepeakind]\n            finperiods = periods[finitepeakind]\n\n            # make sure that finlsp has finite values before we work on it\n            try:\n\n                bestperiodind = npargmax(finlsp)\n\n            except ValueError:\n\n                LOGERROR('no finite periodogram values '\n                         'for this mag series, skipping...')\n\n                return {'bestperiod':npnan,\n                        'bestlspval':npnan,\n                        'nbestpeaks':nbestpeaks,\n                        'nbestinds':None,\n                        'nbestlspvals':None,\n                        'nbestperiods':None,\n                        'lspvals':None,\n                        'periods':None,\n                        'durations':None,\n                        'method':'bls',\n                        'blsresult':None,\n                        'blsmodel':None,\n                        'kwargs':{'startp':startp,\n                                  'endp':endp,\n                                  
'stepsize':stepsize,\n                                  'mintransitduration':mintransitduration,\n                                  'maxtransitduration':maxtransitduration,\n                                  'ndurations':ndurations,\n                                  'blsobjective':blsobjective,\n                                  'blsmethod':blsmethod,\n                                  'blsoversample':blsoversample,\n                                  'blsntransits':blsmintransits,\n                                  'blsfreqfactor':blsfreqfactor,\n                                  'autofreq':autofreq,\n                                  'periodepsilon':periodepsilon,\n                                  'nbestpeaks':nbestpeaks,\n                                  'sigclip':sigclip,\n                                  'magsarefluxes':magsarefluxes}}\n\n            sortedlspind = npargsort(finlsp)[::-1]\n            sortedlspperiods = finperiods[sortedlspind]\n            sortedlspvals = finlsp[sortedlspind]\n\n            # now get the nbestpeaks\n            nbestperiods, nbestlspvals, nbestinds, peakcount = (\n                [finperiods[bestperiodind]],\n                [finlsp[bestperiodind]],\n                [bestperiodind],\n                1\n            )\n            prevperiod = sortedlspperiods[0]\n\n            # find the best nbestpeaks in the lsp and their periods\n            for period, lspval, ind in zip(sortedlspperiods,\n                                           sortedlspvals,\n                                           sortedlspind):\n\n                if peakcount == nbestpeaks:\n                    break\n                perioddiff = abs(period - prevperiod)\n                bestperiodsdiff = [abs(period - x) for x in nbestperiods]\n\n                # print('prevperiod = %s, thisperiod = %s, '\n                #       'perioddiff = %s, peakcount = %s' %\n                #       (prevperiod, period, perioddiff, peakcount))\n\n                # 
this ensures that this period is different from the last\n                # period and from all the other existing best periods by\n                # periodepsilon to make sure we jump to an entire different\n                # peak in the periodogram\n                if (perioddiff > (periodepsilon*prevperiod) and\n                    all(x > (periodepsilon*period)\n                        for x in bestperiodsdiff)):\n                    nbestperiods.append(period)\n                    nbestlspvals.append(lspval)\n                    nbestinds.append(ind)\n                    peakcount = peakcount + 1\n\n                prevperiod = period\n\n\n            # generate the return dict\n            resultdict = {\n                'bestperiod':finperiods[bestperiodind],\n                'bestlspval':finlsp[bestperiodind],\n                'nbestpeaks':nbestpeaks,\n                'nbestinds':nbestinds,\n                'nbestlspvals':nbestlspvals,\n                'nbestperiods':nbestperiods,\n                'lspvals':lsp,\n                'frequencies':frequencies,\n                'periods':periods,\n                'durations':durations,\n                'blsresult':blsresult,\n                'blsmodel':blsmodel,\n                'stepsize':stepsize,\n                'nfreq':nfreq,\n                'mintransitduration':mintransitduration,\n                'maxtransitduration':maxtransitduration,\n                'method':'bls',\n                'kwargs':{'startp':startp,\n                          'endp':endp,\n                          'stepsize':stepsize,\n                          'mintransitduration':mintransitduration,\n                          'maxtransitduration':maxtransitduration,\n                          'ndurations':ndurations,\n                          'blsobjective':blsobjective,\n                          'blsmethod':blsmethod,\n                          'blsoversample':blsoversample,\n                          'blsntransits':blsmintransits,\n    
                      'blsfreqfactor':blsfreqfactor,\n                          'autofreq':autofreq,\n                          'periodepsilon':periodepsilon,\n                          'nbestpeaks':nbestpeaks,\n                          'sigclip':sigclip,\n                          'magsarefluxes':magsarefluxes}\n            }\n\n            return resultdict\n\n        except Exception as e:\n\n            LOGEXCEPTION('BLS failed!')\n\n            if raiseonfail:\n                raise\n\n            return {'bestperiod':npnan,\n                    'bestlspval':npnan,\n                    'nbestinds':None,\n                    'nbestpeaks':nbestpeaks,\n                    'nbestlspvals':None,\n                    'nbestperiods':None,\n                    'lspvals':None,\n                    'periods':None,\n                    'durations':None,\n                    'blsresult':None,\n                    'blsmodel':None,\n                    'stepsize':stepsize,\n                    'nfreq':nfreq,\n                    'mintransitduration':mintransitduration,\n                    'maxtransitduration':maxtransitduration,\n                    'method':'bls',\n                    'kwargs':{'startp':startp,\n                              'endp':endp,\n                              'stepsize':stepsize,\n                              'mintransitduration':mintransitduration,\n                              'maxtransitduration':maxtransitduration,\n                              'ndurations':ndurations,\n                              'blsobjective':blsobjective,\n                              'blsmethod':blsmethod,\n                              'blsoversample':blsoversample,\n                              'blsntransits':blsmintransits,\n                              'blsfreqfactor':blsfreqfactor,\n                              'autofreq':autofreq,\n                              'periodepsilon':periodepsilon,\n                              'nbestpeaks':nbestpeaks,\n         
                     'sigclip':sigclip,\n                              'magsarefluxes':magsarefluxes}}\n\n\n    else:\n\n        LOGERROR('no good detections for these times and mags, skipping...')\n        return {'bestperiod':npnan,\n                'bestlspval':npnan,\n                'nbestinds':None,\n                'nbestpeaks':nbestpeaks,\n                'nbestlspvals':None,\n                'nbestperiods':None,\n                'lspvals':None,\n                'periods':None,\n                'durations':None,\n                'blsresult':None,\n                'blsmodel':None,\n                'stepsize':stepsize,\n                'nfreq':None,\n                'nphasebins':None,\n                'mintransitduration':mintransitduration,\n                'maxtransitduration':maxtransitduration,\n                'method':'bls',\n                'kwargs':{'startp':startp,\n                          'endp':endp,\n                          'stepsize':stepsize,\n                          'mintransitduration':mintransitduration,\n                          'maxtransitduration':maxtransitduration,\n                          'ndurations':ndurations,\n                          'blsobjective':blsobjective,\n                          'blsmethod':blsmethod,\n                          'blsoversample':blsoversample,\n                          'blsntransits':blsmintransits,\n                          'blsfreqfactor':blsfreqfactor,\n                          'autofreq':autofreq,\n                          'periodepsilon':periodepsilon,\n                          'nbestpeaks':nbestpeaks,\n                          'sigclip':sigclip,\n                          'magsarefluxes':magsarefluxes}}", "response": "This function runs the BoxLeastSquares Fitting Search for transit - shaped signals."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\nrunning the Box Least Squares Fitting Search for transit-shaped signals. Breaks up the full frequency space into chunks and passes them to parallel BLS workers. Based on the version of BLS in Astropy 3.1: `astropy.stats.BoxLeastSquares`. If you don't have Astropy 3.1, this module will fail to import. Note that by default, this implementation of `bls_parallel_pfind` doesn't use the `.autoperiod()` function from `BoxLeastSquares` but uses the same auto frequency-grid generation as the functions in `periodbase.kbls`. If you want to use Astropy's implementation, set the value of `autofreq` kwarg to 'astropy'. The generated period array will then be broken up into chunks and sent to the individual workers. NOTE: the combined BLS spectrum produced by this function is not identical to that produced by running BLS in one shot for the entire frequency space. There are differences on the order of 1.0e-3 or so in the respective peak values, but peaks appear at the same frequencies for both methods. This is likely due to different aliasing caused by smaller chunks of the frequency space used by the parallel workers in this function. When in doubt, confirm results for this parallel implementation by comparing to those from the serial implementation above. In particular, when you want to get reliable estimates of the SNR, transit depth, duration, etc. that Astropy's BLS gives you, rerun `bls_serial_pfind` with `startp`, and `endp` close to the best period you want to characterize the transit at. The dict returned from that function contains a `blsmodel` key, which is the generated model from Astropy's BLS. Use the `.compute_stats()` method to calculate the required stats. Parameters ---------- times,mags,errs : np.array The magnitude/flux time-series to search for transits. magsarefluxes : bool If the input measurement values in `mags` and `errs` are in fluxes, set this to True. 
startp,endp : float The minimum and maximum periods to consider for the transit search. stepsize : float The step-size in frequency to use when constructing a frequency grid for the period search. mintransitduration,maxtransitduration : float The minimum and maximum transit durations (in units of phase) to consider for the transit search. ndurations : int The number of transit durations to use in the period-search. autofreq : bool or str If this is True, the values of `stepsize` and `nphasebins` will be ignored, and these, along with a frequency-grid, will be determined based on the following relations:: nphasebins = int(ceil(2.0/mintransitduration)) if nphasebins > 3000: nphasebins = 3000 stepsize = 0.25*mintransitduration/(times.max()-times.min()) minfreq = 1.0/endp maxfreq = 1.0/startp nfreq = int(ceil((maxfreq - minfreq)/stepsize)) If this is False, you must set `startp`, `endp`, and `stepsize` as appropriate. If this is str == 'astropy', will use the `astropy.stats.BoxLeastSquares.autoperiod()` function to calculate the frequency grid instead of the kbls method. blsobjective : {'likelihood','snr'} Sets the type of objective to optimize in the `BoxLeastSquares.power()` function. blsmethod : {'fast','slow'} Sets the type of method to use in the `BoxLeastSquares.power()` function. blsoversample : int Sets the `oversample` kwarg for the `BoxLeastSquares.power()` function. blsmintransits : int Sets the `minimum_n_transit` kwarg for the `BoxLeastSquares.autoperiod()` function. blsfreqfactor : float Sets the `frequency_factor` kwarg for the `BoxLeastSquares.autoperiod()` function. periodepsilon : float The fractional difference between successive values of 'best' periods when sorting by periodogram power to consider them as separate periods (as opposed to part of the same periodogram peak). This is used to avoid broad peaks in the periodogram and make sure the 'best' periods returned are all actually independent. 
nbestpeaks : int The number of 'best' peaks to return from the periodogram results, starting from the global maximum of the periodogram peak values. sigclip : float or int or sequence of two floats/ints or None If a single float or int, a symmetric sigma-clip will be performed using the number provided as the sigma-multiplier to cut out from the input time-series. If a list of two ints/floats is provided, the function will perform an 'asymmetric' sigma-clip. The first element in this list is the sigma value to use for fainter flux/mag values; the second element in this list is the sigma value to use for brighter flux/mag values. For example, `sigclip=[10., 3.]`, will sigclip out greater than 10-sigma dimmings and greater than 3-sigma brightenings. Here the meaning of \"dimming\" and \"brightening\" is set by *physics* (not the magnitude system), which is why the `magsarefluxes` kwarg must be correctly set. If `sigclip` is None, no sigma-clipping will be performed, and the time-series (with non-finite elems removed) will be passed through to the output. verbose : bool If this is True, will indicate progress and details about the frequency grid used for the period search. nworkers : int or None The number of parallel workers to launch for period-search. If None, nworkers = NCPUS. Returns ------- dict This function returns a dict, referred to as an `lspinfo` dict in other astrobase functions that operate on periodogram results. 
This is a standardized format across all astrobase period-finders, and is of the form below:: {'bestperiod': the best period value in the periodogram, 'bestlspval': the periodogram peak associated with the best period, 'nbestpeaks': the input value of nbestpeaks, 'nbestlspvals': nbestpeaks-size list of best period peak values, 'nbestperiods': nbestpeaks-size list of best periods, 'lspvals': the full array of periodogram powers, 'frequencies': the full array of frequencies considered, 'periods': the full array of periods considered, 'durations': the array of durations used to run BLS, 'blsresult': Astropy BLS result object (BoxLeastSquaresResult), 'blsmodel': Astropy BLS BoxLeastSquares object used for work, 'stepsize': the actual stepsize used, 'nfreq': the actual nfreq used, 'durations': the durations array used, 'mintransitduration': the input mintransitduration, 'maxtransitduration': the input maxtransitdurations, 'method':'bls' -> the name of the period-finder method, 'kwargs':{ dict of all of the input kwargs for record-keeping}}", "response": "def bls_parallel_pfind(\n        times, mags, errs,\n        magsarefluxes=False,\n        startp=0.1,  # by default, search from 0.1 d to...\n        endp=100.0,  # ... 
100.0 d -- don't search full timebase\n        stepsize=1.0e-4,\n        mintransitduration=0.01,  # minimum transit length in phase\n        maxtransitduration=0.4,   # maximum transit length in phase\n        ndurations=100,\n        autofreq=True,  # figure out f0, nf, and df automatically\n        blsobjective='likelihood',\n        blsmethod='fast',\n        blsoversample=5,\n        blsmintransits=3,\n        blsfreqfactor=10.0,\n        nbestpeaks=5,\n        periodepsilon=0.1,  # 0.1\n        sigclip=10.0,\n        verbose=True,\n        nworkers=None,\n):\n    '''Runs the Box Least Squares Fitting Search for transit-shaped signals.\n\n    Breaks up the full frequency space into chunks and passes them to parallel\n    BLS workers.\n\n    Based on the version of BLS in Astropy 3.1:\n    `astropy.stats.BoxLeastSquares`. If you don't have Astropy 3.1, this module\n    will fail to import. Note that by default, this implementation of\n    `bls_parallel_pfind` doesn't use the `.autoperiod()` function from\n    `BoxLeastSquares` but uses the same auto frequency-grid generation as the\n    functions in `periodbase.kbls`. If you want to use Astropy's implementation,\n    set the value of `autofreq` kwarg to 'astropy'. The generated period array\n    will then be broken up into chunks and sent to the individual workers.\n\n    NOTE: the combined BLS spectrum produced by this function is not identical\n    to that produced by running BLS in one shot for the entire frequency\n    space. There are differences on the order of 1.0e-3 or so in the respective\n    peak values, but peaks appear at the same frequencies for both methods. This\n    is likely due to different aliasing caused by smaller chunks of the\n    frequency space used by the parallel workers in this function. 
When in\n    doubt, confirm results for this parallel implementation by comparing to\n    those from the serial implementation above.\n\n    In particular, when you want to get reliable estimates of the SNR, transit\n    depth, duration, etc. that Astropy's BLS gives you, rerun `bls_serial_pfind`\n    with `startp`, and `endp` close to the best period you want to characterize\n    the transit at. The dict returned from that function contains a `blsmodel`\n    key, which is the generated model from Astropy's BLS. Use the\n    `.compute_stats()` method to calculate the required stats.\n\n    Parameters\n    ----------\n\n    times,mags,errs : np.array\n        The magnitude/flux time-series to search for transits.\n\n    magsarefluxes : bool\n        If the input measurement values in `mags` and `errs` are in fluxes, set\n        this to True.\n\n    startp,endp : float\n        The minimum and maximum periods to consider for the transit search.\n\n    stepsize : float\n        The step-size in frequency to use when constructing a frequency grid for\n        the period search.\n\n    mintransitduration,maxtransitduration : float\n        The minimum and maximum transitdurations (in units of phase) to consider\n        for the transit search.\n\n    ndurations : int\n        The number of transit durations to use in the period-search.\n\n    autofreq : bool or str\n        If this is True, the values of `stepsize` and `nphasebins` will be\n        ignored, and these, along with a frequency-grid, will be determined\n        based on the following relations::\n\n            nphasebins = int(ceil(2.0/mintransitduration))\n            if nphasebins > 3000:\n                nphasebins = 3000\n\n            stepsize = 0.25*mintransitduration/(times.max()-times.min())\n\n            minfreq = 1.0/endp\n            maxfreq = 1.0/startp\n            nfreq = int(ceil((maxfreq - minfreq)/stepsize))\n\n        If this is False, you must set `startp`, `endp`, and `stepsize` as\n   
     appropriate.\n\n        If this is str == 'astropy', will use the\n        `astropy.stats.BoxLeastSquares.autoperiod()` function to calculate the\n        frequency grid instead of the kbls method.\n\n    blsobjective : {'likelihood','snr'}\n        Sets the type of objective to optimize in the `BoxLeastSquares.power()`\n        function.\n\n    blsmethod : {'fast','slow'}\n        Sets the type of method to use in the `BoxLeastSquares.power()`\n        function.\n\n    blsoversample : int\n        Sets the `oversample` kwarg for the `BoxLeastSquares.power()` function.\n\n    blsmintransits : int\n        Sets the `minimum_n_transit` kwarg for the\n        `BoxLeastSquares.autoperiod()` function.\n\n    blsfreqfactor : float\n        Sets the `frequency_factor` kwarg for the `BoxLeastSquares.autoperiod()`\n        function.\n\n    periodepsilon : float\n        The fractional difference between successive values of 'best' periods\n        when sorting by periodogram power to consider them as separate periods\n        (as opposed to part of the same periodogram peak). This is used to avoid\n        broad peaks in the periodogram and make sure the 'best' periods returned\n        are all actually independent.\n\n    nbestpeaks : int\n        The number of 'best' peaks to return from the periodogram results,\n        starting from the global maximum of the periodogram peak values.\n\n    sigclip : float or int or sequence of two floats/ints or None\n        If a single float or int, a symmetric sigma-clip will be performed using\n        the number provided as the sigma-multiplier to cut out from the input\n        time-series.\n\n        If a list of two ints/floats is provided, the function will perform an\n        'asymmetric' sigma-clip. The first element in this list is the sigma\n        value to use for fainter flux/mag values; the second element in this\n        list is the sigma value to use for brighter flux/mag values. 
For\n        example, `sigclip=[10., 3.]`, will sigclip out greater than 10-sigma\n        dimmings and greater than 3-sigma brightenings. Here the meaning of\n        \"dimming\" and \"brightening\" is set by *physics* (not the magnitude\n        system), which is why the `magsarefluxes` kwarg must be correctly set.\n\n        If `sigclip` is None, no sigma-clipping will be performed, and the\n        time-series (with non-finite elems removed) will be passed through to\n        the output.\n\n    verbose : bool\n        If this is True, will indicate progress and details about the frequency\n        grid used for the period search.\n\n    nworkers : int or None\n        The number of parallel workers to launch for period-search. If None,\n        nworkers = NCPUS.\n\n    Returns\n    -------\n\n    dict\n        This function returns a dict, referred to as an `lspinfo` dict in other\n        astrobase functions that operate on periodogram results. This is a\n        standardized format across all astrobase period-finders, and is of the\n        form below::\n\n            {'bestperiod': the best period value in the periodogram,\n             'bestlspval': the periodogram peak associated with the best period,\n             'nbestpeaks': the input value of nbestpeaks,\n             'nbestlspvals': nbestpeaks-size list of best period peak values,\n             'nbestperiods': nbestpeaks-size list of best periods,\n             'lspvals': the full array of periodogram powers,\n             'frequencies': the full array of frequencies considered,\n             'periods': the full array of periods considered,\n             'durations': the array of durations used to run BLS,\n             'blsresult': Astropy BLS result object (BoxLeastSquaresResult),\n             'blsmodel': Astropy BLS BoxLeastSquares object used for work,\n             'stepsize': the actual stepsize used,\n             'nfreq': the actual nfreq used,\n             'durations': the durations array 
used,\n             'mintransitduration': the input mintransitduration,\n             'maxtransitduration': the input maxtransitdurations,\n             'method':'bls' -> the name of the period-finder method,\n             'kwargs':{ dict of all of the input kwargs for record-keeping}}\n\n    '''\n\n    # get rid of nans first and sigclip\n    stimes, smags, serrs = sigclip_magseries(times,\n                                             mags,\n                                             errs,\n                                             magsarefluxes=magsarefluxes,\n                                             sigclip=sigclip)\n\n    # make sure there are enough points to calculate a spectrum\n    if len(stimes) > 9 and len(smags) > 9 and len(serrs) > 9:\n\n        # if we're setting up everything automatically\n        if isinstance(autofreq, bool) and autofreq:\n\n            # use heuristic to figure out best timestep\n            stepsize = 0.25*mintransitduration/(stimes.max()-stimes.min())\n\n            # now figure out the frequencies to use\n            minfreq = 1.0/endp\n            maxfreq = 1.0/startp\n            nfreq = int(npceil((maxfreq - minfreq)/stepsize))\n\n            # say what we're using\n            if verbose:\n                LOGINFO('min P: %s, max P: %s, nfreq: %s, '\n                        'minfreq: %s, maxfreq: %s' % (startp, endp, nfreq,\n                                                      minfreq, maxfreq))\n                LOGINFO('autofreq = True: using AUTOMATIC values for '\n                        'freq stepsize: %s, ndurations: %s, '\n                        'min transit duration: %s, max transit duration: %s' %\n                        (stepsize, ndurations,\n                         mintransitduration, maxtransitduration))\n\n            use_autoperiod = False\n\n        elif isinstance(autofreq, bool) and not autofreq:\n\n            minfreq = 1.0/endp\n            maxfreq = 1.0/startp\n            nfreq = 
int(npceil((maxfreq - minfreq)/stepsize))\n\n            # say what we're using\n            if verbose:\n                LOGINFO('min P: %s, max P: %s, nfreq: %s, '\n                        'minfreq: %s, maxfreq: %s' % (startp, endp, nfreq,\n                                                      minfreq, maxfreq))\n                LOGINFO('autofreq = False: using PROVIDED values for '\n                        'freq stepsize: %s, ndurations: %s, '\n                        'min transit duration: %s, max transit duration: %s' %\n                        (stepsize, ndurations,\n                         mintransitduration, maxtransitduration))\n\n            use_autoperiod = False\n\n        elif isinstance(autofreq, str) and autofreq == 'astropy':\n\n            use_autoperiod = True\n            minfreq = 1.0/endp\n            maxfreq = 1.0/startp\n\n        else:\n\n            LOGERROR(\"unknown autofreq kwarg encountered. can't continue...\")\n            return None\n\n        # check the minimum frequency\n        if minfreq < (1.0/(stimes.max() - stimes.min())):\n\n            minfreq = 2.0/(stimes.max() - stimes.min())\n            if verbose:\n                LOGWARNING('the requested max P = %.3f is larger than '\n                           'the time base of the observations = %.3f, '\n                           ' will make minfreq = 2 x 1/timebase'\n                           % (endp, stimes.max() - stimes.min()))\n                LOGINFO('new minfreq: %s, maxfreq: %s' %\n                        (minfreq, maxfreq))\n\n\n        #############################\n        ## NOW RUN BLS IN PARALLEL ##\n        #############################\n\n        # fix number of CPUs if needed\n        if not nworkers or nworkers > NCPUS:\n            nworkers = NCPUS\n            if verbose:\n                LOGINFO('using %s workers...' 
% nworkers)\n\n        # check if autoperiod is True and get the correct period-grid\n        if use_autoperiod:\n\n            # astropy's BLS requires durations in units of time\n            durations = nplinspace(mintransitduration*startp,\n                                   maxtransitduration*startp,\n                                   ndurations)\n\n            # set up the correct units for the BLS model\n            if magsarefluxes:\n\n                blsmodel = BoxLeastSquares(\n                    stimes*u.day,\n                    smags*u.dimensionless_unscaled,\n                    dy=serrs*u.dimensionless_unscaled\n                )\n\n            else:\n\n                blsmodel = BoxLeastSquares(\n                    stimes*u.day,\n                    smags*u.mag,\n                    dy=serrs*u.mag\n                )\n\n            periods = nparray(\n                blsmodel.autoperiod(\n                    durations*u.day,\n                    minimum_period=startp,\n                    maximum_period=endp,\n                    minimum_n_transit=blsmintransits,\n                    frequency_factor=blsfreqfactor\n                )\n            )\n\n            frequencies = 1.0/periods\n            nfreq = frequencies.size\n\n            if verbose:\n                LOGINFO(\n                    \"autofreq = 'astropy', used .autoperiod() with \"\n                    \"minimum_n_transit = %s, freq_factor = %s \"\n                    \"to generate the frequency grid\" %\n                    (blsmintransits, blsfreqfactor)\n                )\n                LOGINFO('stepsize = %s, nfreq = %s, minfreq = %.5f, '\n                        'maxfreq = %.5f, ndurations = %s' %\n                        (abs(frequencies[1] - frequencies[0]),\n                         nfreq,\n                         1.0/periods.max(),\n                         1.0/periods.min(),\n                         durations.size))\n\n            del blsmodel\n            del 
durations\n\n        # otherwise, use kbls method\n        else:\n\n            frequencies = minfreq + nparange(nfreq)*stepsize\n\n\n        # break up the tasks into chunks\n        csrem = int(fmod(nfreq, nworkers))\n        csint = int(float(nfreq/nworkers))\n        chunk_minfreqs, chunk_nfreqs = [], []\n\n        for x in range(nworkers):\n\n            this_minfreqs = frequencies[x*csint]\n\n            # handle usual nfreqs\n            if x < (nworkers - 1):\n                this_nfreqs = frequencies[x*csint:x*csint+csint].size\n            else:\n                this_nfreqs = frequencies[x*csint:x*csint+csint+csrem].size\n\n            chunk_minfreqs.append(this_minfreqs)\n            chunk_nfreqs.append(this_nfreqs)\n\n\n        # populate the tasks list\n        #\n        # task[0] = times\n        # task[1] = mags\n        # task[2] = errs\n        # task[3] = magsarefluxes\n\n        # task[4] = minfreq\n        # task[5] = nfreq\n        # task[6] = stepsize\n\n        # task[7] = nphasebins\n        # task[8] = mintransitduration\n        # task[9] = maxtransitduration\n\n        # task[10] = blsobjective\n        # task[11] = blsmethod\n        # task[12] = blsoversample\n\n        # populate the tasks list\n        tasks = [(stimes, smags, serrs, magsarefluxes,\n                  chunk_minf, chunk_nf, stepsize,\n                  ndurations, mintransitduration, maxtransitduration,\n                  blsobjective, blsmethod, blsoversample)\n                 for (chunk_minf, chunk_nf)\n                 in zip(chunk_minfreqs, chunk_nfreqs)]\n\n        if verbose:\n            for ind, task in enumerate(tasks):\n                LOGINFO('worker %s: minfreq = %.6f, nfreqs = %s' %\n                        (ind+1, task[4], task[5]))\n            LOGINFO('running...')\n\n        # return tasks\n\n        # start the pool\n        pool = Pool(nworkers)\n        results = pool.map(_parallel_bls_worker, tasks)\n\n        pool.close()\n        pool.join()\n   
     del pool\n\n        # now concatenate the output lsp arrays\n        lsp = npconcatenate([x['power'] for x in results])\n        periods = 1.0/frequencies\n\n        # find the nbestpeaks for the periodogram: 1. sort the lsp array\n        # by highest value first 2. go down the values until we find\n        # five values that are separated by at least periodepsilon in\n        # period\n        # make sure to get only the finite peaks in the periodogram\n        # this is needed because BLS may produce infs for some peaks\n        finitepeakind = npisfinite(lsp)\n        finlsp = lsp[finitepeakind]\n        finperiods = periods[finitepeakind]\n\n        # make sure that finlsp has finite values before we work on it\n        try:\n\n            bestperiodind = npargmax(finlsp)\n\n        except ValueError:\n\n            LOGERROR('no finite periodogram values '\n                     'for this mag series, skipping...')\n\n            return {'bestperiod':npnan,\n                    'bestlspval':npnan,\n                    'nbestpeaks':nbestpeaks,\n                    'nbestinds':None,\n                    'nbestlspvals':None,\n                    'nbestperiods':None,\n                    'lspvals':None,\n                    'periods':None,\n                    'durations':None,\n                    'method':'bls',\n                    'blsresult':None,\n                    'blsmodel':None,\n                    'kwargs':{'startp':startp,\n                              'endp':endp,\n                              'stepsize':stepsize,\n                              'mintransitduration':mintransitduration,\n                              'maxtransitduration':maxtransitduration,\n                              'ndurations':ndurations,\n                              'blsobjective':blsobjective,\n                              'blsmethod':blsmethod,\n                              'blsoversample':blsoversample,\n                              'autofreq':autofreq,\n          
                    'periodepsilon':periodepsilon,\n                              'nbestpeaks':nbestpeaks,\n                              'sigclip':sigclip,\n                              'magsarefluxes':magsarefluxes}}\n\n        sortedlspind = npargsort(finlsp)[::-1]\n        sortedlspperiods = finperiods[sortedlspind]\n        sortedlspvals = finlsp[sortedlspind]\n\n        # now get the nbestpeaks\n        nbestperiods, nbestlspvals, nbestinds, peakcount = (\n            [finperiods[bestperiodind]],\n            [finlsp[bestperiodind]],\n            [bestperiodind],\n            1\n        )\n        prevperiod = sortedlspperiods[0]\n\n        # find the best nbestpeaks in the lsp and their periods\n        for period, lspval, ind in zip(sortedlspperiods,\n                                       sortedlspvals,\n                                       sortedlspind):\n\n            if peakcount == nbestpeaks:\n                break\n            perioddiff = abs(period - prevperiod)\n            bestperiodsdiff = [abs(period - x) for x in nbestperiods]\n\n            # this ensures that this period is different from the last\n            # period and from all the other existing best periods by\n            # periodepsilon to make sure we jump to an entire different\n            # peak in the periodogram\n            if (perioddiff > (periodepsilon*prevperiod) and\n                all(x > (periodepsilon*period)\n                    for x in bestperiodsdiff)):\n                nbestperiods.append(period)\n                nbestlspvals.append(lspval)\n                nbestinds.append(ind)\n                peakcount = peakcount + 1\n\n            prevperiod = period\n\n\n        # generate the return dict\n        resultdict = {\n            'bestperiod':finperiods[bestperiodind],\n            'bestlspval':finlsp[bestperiodind],\n            'nbestpeaks':nbestpeaks,\n            'nbestinds':nbestinds,\n            'nbestlspvals':nbestlspvals,\n            
'nbestperiods':nbestperiods,\n            'lspvals':lsp,\n            'frequencies':frequencies,\n            'periods':periods,\n            'durations':[x['durations'] for x in results],\n            'blsresult':[x['blsresult'] for x in results],\n            'blsmodel':[x['blsmodel'] for x in results],\n            'stepsize':stepsize,\n            'nfreq':nfreq,\n            'mintransitduration':mintransitduration,\n            'maxtransitduration':maxtransitduration,\n            'method':'bls',\n            'kwargs':{'startp':startp,\n                      'endp':endp,\n                      'stepsize':stepsize,\n                      'mintransitduration':mintransitduration,\n                      'maxtransitduration':maxtransitduration,\n                      'ndurations':ndurations,\n                      'blsobjective':blsobjective,\n                      'blsmethod':blsmethod,\n                      'blsoversample':blsoversample,\n                      'autofreq':autofreq,\n                      'periodepsilon':periodepsilon,\n                      'nbestpeaks':nbestpeaks,\n                      'sigclip':sigclip,\n                      'magsarefluxes':magsarefluxes}\n        }\n\n        return resultdict\n\n\n    else:\n\n        LOGERROR('no good detections for these times and mags, skipping...')\n        return {'bestperiod':npnan,\n                'bestlspval':npnan,\n                'nbestinds':None,\n                'nbestpeaks':nbestpeaks,\n                'nbestlspvals':None,\n                'nbestperiods':None,\n                'lspvals':None,\n                'periods':None,\n                'durations':None,\n                'blsresult':None,\n                'blsmodel':None,\n                'stepsize':stepsize,\n                'nfreq':None,\n                'nphasebins':None,\n                'mintransitduration':mintransitduration,\n                'maxtransitduration':maxtransitduration,\n                'method':'bls',\n                
'kwargs':{'startp':startp,\n                          'endp':endp,\n                          'stepsize':stepsize,\n                          'mintransitduration':mintransitduration,\n                          'maxtransitduration':maxtransitduration,\n                          'ndurations':ndurations,\n                          'blsobjective':blsobjective,\n                          'blsmethod':blsmethod,\n                          'blsoversample':blsoversample,\n                          'autofreq':autofreq,\n                          'periodepsilon':periodepsilon,\n                          'nbestpeaks':nbestpeaks,\n                          'sigclip':sigclip,\n                          'magsarefluxes':magsarefluxes}}"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef _parse_csv_header(header):\n    '''This parses a CSV header from a K2 CSV LC.\n\n    Returns a dict that can be used to update an existing lcdict with the\n    relevant metadata info needed to form a full LC.\n\n    '''\n\n    # first, break into lines\n    headerlines = header.split('\\n')\n    headerlines = [x.lstrip('# ') for x in headerlines]\n\n    # next, find the indices of the '# COLUMNS' line and '# LIGHTCURVE' line\n    metadatastart = headerlines.index('METADATA')\n    columnstart = headerlines.index('COLUMNS')\n    lcstart = headerlines.index('LIGHTCURVE')\n\n    # get the lines for the metadata and columndefs\n    metadata = headerlines[metadatastart+1:columnstart-1]\n    columndefs = headerlines[columnstart+1:lcstart-1]\n\n    # parse the metadata\n    metainfo = [x.split(',') for x in metadata][:-1]\n    aperpixradius = metadata[-1]\n\n    objectid, kepid, ucac4id, kepmag = metainfo[0]\n    objectid, kepid, ucac4id, kepmag = (objectid.split(' = ')[-1],\n                                        kepid.split(' = ')[-1],\n                                        ucac4id.split(' = ')[-1],\n                                        kepmag.split(' = ')[-1])\n    kepmag = float(kepmag) if kepmag else None\n\n    ra, decl, ndet, k2campaign = metainfo[1]\n    ra, decl, ndet, k2campaign = (ra.split(' = ')[-1],\n                                  decl.split(' = ')[-1],\n                                  int(ndet.split(' = ')[-1]),\n                                  int(k2campaign.split(' = ')[-1]))\n\n    fovccd, fovchannel, fovmodule = metainfo[2]\n    fovccd, fovchannel, fovmodule = (int(fovccd.split(' = ')[-1]),\n                                     int(fovchannel.split(' = ')[-1]),\n                                     int(fovmodule.split(' = ')[-1]))\n\n    try:\n        qualflag, bjdoffset, napertures = metainfo[3]\n        qualflag, bjdoffset, 
napertures = (int(qualflag.split(' = ')[-1]),\n                                           float(bjdoffset.split(' = ')[-1]),\n                                           int(napertures.split(' = ')[-1]))\n        kernelspec = None\n    except Exception as e:\n        qualflag, bjdoffset, napertures, kernelspec = metainfo[3]\n        qualflag, bjdoffset, napertures, kernelspec = (\n            int(qualflag.split(' = ')[-1]),\n            float(bjdoffset.split(' = ')[-1]),\n            int(napertures.split(' = ')[-1]),\n            str(kernelspec.split(' = ')[-1])\n        )\n\n    aperpixradius = aperpixradius.split(' = ')[-1].split(',')\n    aperpixradius = [float(x) for x in aperpixradius]\n\n    # parse the columndefs\n    columns = [x.split(' - ')[1] for x in columndefs]\n\n    metadict = {'objectid':objectid,\n                'objectinfo':{\n                    'objectid':objectid,\n                    'kepid':kepid,\n                    'ucac4id':ucac4id,\n                    'kepmag':kepmag,\n                    'ra':ra,\n                    'decl':decl,\n                    'ndet':ndet,\n                    'k2campaign':k2campaign,\n                    'fovccd':fovccd,\n                    'fovchannel':fovchannel,\n                    'fovmodule':fovmodule,\n                    'qualflag':qualflag,\n                    'bjdoffset':bjdoffset,\n                    'napertures':napertures,\n                    'kernelspec':kernelspec,\n                    'aperpixradius':aperpixradius,\n                },\n                'columns':columns}\n\n    return metadict", "response": "This function parses a CSV header from a K2 CSV LC."}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\nreads in a K2 light curve in CSV format. Transparently reads gzipped K2 light curve files and returns a dict with each column name as a key and that column's values as the value.", "response": "def read_csv_lightcurve(lcfile):\n    '''\n    This reads in a K2 lightcurve in CSV format. Transparently reads gzipped\n    files.\n\n    Parameters\n    ----------\n\n    lcfile : str\n        The light curve file to read.\n\n    Returns\n    -------\n\n    dict\n        Returns an lcdict.\n\n    '''\n\n    # read in the file first\n    if '.gz' in os.path.basename(lcfile):\n        LOGINFO('reading gzipped K2 LC: %s' % lcfile)\n        infd = gzip.open(lcfile,'rb')\n    else:\n        LOGINFO('reading K2 LC: %s' % lcfile)\n        infd = open(lcfile,'rb')\n\n    lctext = infd.read().decode()\n    infd.close()\n\n    # figure out the header and get the LC columns\n    lcstart = lctext.index('# LIGHTCURVE\\n')\n    lcheader = lctext[:lcstart+12]\n    lccolumns = lctext[lcstart+13:].split('\\n')\n    lccolumns = [x.split(',') for x in lccolumns if len(x) > 0]\n\n    # initialize the lcdict and parse the CSV header\n    lcdict = _parse_csv_header(lcheader)\n\n    # transpose the LC rows into columns\n    lccolumns = list(zip(*lccolumns))\n\n    # write the columns to the dict\n    for colind, col in enumerate(lcdict['columns']):\n\n        # this picks out the caster to use when reading each column using the\n        # definitions in the lcutils.COLUMNDEFS dictionary\n        lcdict[col.lower()] = np.array([COLUMNDEFS[col][2](x)\n                                        for x in lccolumns[colind]])\n\n    lcdict['columns'] = [x.lower() for x in lcdict['columns']]\n\n    return lcdict"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef get_starfeatures(lcfile,\n                     outdir,\n                     kdtree,\n                     objlist,\n                     lcflist,\n                     neighbor_radius_arcsec,\n                     deredden=True,\n                     custom_bandpasses=None,\n                     lcformat='hat-sql',\n                     lcformatdir=None):\n    '''This runs the functions from :py:func:`astrobase.varclass.starfeatures`\n    on a single light curve file.\n\n    Parameters\n    ----------\n\n    lcfile : str\n        This is the LC file to extract star features for.\n\n    outdir : str\n        This is the directory to write the output pickle to.\n\n    kdtree: scipy.spatial.cKDTree\n        This is a `scipy.spatial.KDTree` or `cKDTree` used to calculate neighbor\n        proximity features. This is for the light curve catalog this object is\n        in.\n\n    objlist : np.array\n        This is a Numpy array of object IDs in the same order as the\n        `kdtree.data` np.array. This is for the light curve catalog this object\n        is in.\n\n    lcflist : np.array\n        This is a Numpy array of light curve filenames in the same order as\n        `kdtree.data`. This is for the light curve catalog this object is in.\n\n    neighbor_radius_arcsec : float\n        This indicates the radius in arcsec to search for neighbors for this\n        object using the light curve catalog's `kdtree`, `objlist`, `lcflist`,\n        and in GAIA.\n\n    deredden : bool\n        This controls if the colors and any color classifications will be\n        dereddened using 2MASS DUST.\n\n    custom_bandpasses : dict or None\n        This is a dict used to define any custom bandpasses in the\n        `in_objectinfo` dict you want to make this function aware of and\n        generate colors for. 
Use the format below for this dict::\n\n            {\n            '<bandpass_key_1>':{'dustkey':'<twomass_dust_key_1>',\n                                'label':'<band_label_1>'\n                                'colors':[['<bandkey1>-<bandkey2>',\n                                           '<BAND1> - <BAND2>'],\n                                          ['<bandkey3>-<bandkey4>',\n                                           '<BAND3> - <BAND4>']]},\n            .\n            ...\n            .\n            '<bandpass_key_N>':{'dustkey':'<twomass_dust_key_N>',\n                                'label':'<band_label_N>'\n                                'colors':[['<bandkey1>-<bandkey2>',\n                                           '<BAND1> - <BAND2>'],\n                                          ['<bandkey3>-<bandkey4>',\n                                           '<BAND3> - <BAND4>']]},\n            }\n\n        Where:\n\n        `bandpass_key` is a key to use to refer to this bandpass in the\n        `objectinfo` dict, e.g. 'sdssg' for SDSS g band\n\n        `twomass_dust_key` is the key to use in the 2MASS DUST result table for\n        reddening per band-pass. For example, given the following DUST result\n        table (using http://irsa.ipac.caltech.edu/applications/DUST/)::\n\n            |Filter_name|LamEff |A_over_E_B_V_SandF|A_SandF|A_over_E_B_V_SFD|A_SFD|\n            |char       |float  |float             |float  |float           |float|\n            |           |microns|                  |mags   |                |mags |\n             CTIO U       0.3734              4.107   0.209            4.968 0.253\n             CTIO B       0.4309              3.641   0.186            4.325 0.221\n             CTIO V       0.5517              2.682   0.137            3.240 0.165\n            .\n            .\n            ...\n\n        The `twomass_dust_key` for 'vmag' would be 'CTIO V'. 
If you want to\n        skip DUST lookup and want to pass in a specific reddening magnitude\n        for your bandpass, use a float for the value of\n        `twomass_dust_key`. If you want to skip DUST lookup entirely for\n        this bandpass, use None for the value of `twomass_dust_key`.\n\n        `band_label` is the label to use for this bandpass, e.g. 'W1' for\n        WISE-1 band, 'u' for SDSS u, etc.\n\n        The 'colors' list contains color definitions for all colors you want\n        to generate using this bandpass. This list contains elements of the\n        form::\n\n            ['<bandkey1>-<bandkey2>','<BAND1> - <BAND2>']\n\n        where the first item gives the bandpass keys making up this color,\n        and the second item is the label for this color to be used by the\n        frontends. An example::\n\n            ['sdssu-sdssg','u - g']\n\n    lcformat : str\n        This is the `formatkey` associated with your light curve format, which\n        you previously passed in to the `lcproc.register_lcformat`\n        function. This will be used to look up how to find and read the light\n        curves specified in `basedir` or `use_list_of_filenames`.\n\n    lcformatdir : str or None\n        If this is provided, gives the path to a directory where you've stored\n        your lcformat description JSONs, other than the usual directories lcproc\n        knows to search for them in. 
Use this along with `lcformat` to specify\n        an LC format JSON file that's not currently registered with lcproc.\n\n    Returns\n    -------\n\n    str\n        Path to the output pickle containing all of the star features for this\n        object.\n\n    '''\n\n    try:\n        formatinfo = get_lcformat(lcformat,\n                                  use_lcformat_dir=lcformatdir)\n        if formatinfo:\n            (dfileglob, readerfunc,\n             dtimecols, dmagcols, derrcols,\n             magsarefluxes, normfunc) = formatinfo\n        else:\n            LOGERROR(\"can't figure out the light curve format\")\n            return None\n    except Exception as e:\n        LOGEXCEPTION(\"can't figure out the light curve format\")\n        return None\n\n    try:\n\n        # get the LC into a dict\n        lcdict = readerfunc(lcfile)\n\n        # this should handle lists/tuples being returned by readerfunc\n        # we assume that the first element is the actual lcdict\n        # FIXME: figure out how to not need this assumption\n        if ( (isinstance(lcdict, (list, tuple))) and\n             (isinstance(lcdict[0], dict)) ):\n            lcdict = lcdict[0]\n\n        resultdict = {'objectid':lcdict['objectid'],\n                      'info':lcdict['objectinfo'],\n                      'lcfbasename':os.path.basename(lcfile)}\n\n        # run the coord features first\n        coordfeat = starfeatures.coord_features(lcdict['objectinfo'])\n\n        # next, run the color features\n        colorfeat = starfeatures.color_features(\n            lcdict['objectinfo'],\n            deredden=deredden,\n            custom_bandpasses=custom_bandpasses\n        )\n\n        # run a rough color classification\n        colorclass = starfeatures.color_classification(colorfeat,\n                                                       coordfeat)\n\n        # finally, run the neighbor features\n        nbrfeat = starfeatures.neighbor_gaia_features(lcdict['objectinfo'],\n    
                                                  kdtree,\n                                                      neighbor_radius_arcsec)\n\n        # get the objectids of the neighbors found if any\n        if nbrfeat['nbrindices'].size > 0:\n            nbrfeat['nbrobjectids'] = objlist[nbrfeat['nbrindices']]\n            nbrfeat['closestnbrobjectid'] = objlist[\n                nbrfeat['closestdistnbrind']\n            ]\n            nbrfeat['closestnbrlcfname'] = lcflist[\n                nbrfeat['closestdistnbrind']\n            ]\n\n        else:\n            nbrfeat['nbrobjectids'] = np.array([])\n            nbrfeat['closestnbrobjectid'] = np.array([])\n            nbrfeat['closestnbrlcfname'] = np.array([])\n\n        # update the result dict\n        resultdict.update(coordfeat)\n        resultdict.update(colorfeat)\n        resultdict.update(colorclass)\n        resultdict.update(nbrfeat)\n\n        outfile = os.path.join(outdir,\n                               'starfeatures-%s.pkl' %\n                               squeeze(resultdict['objectid']).replace(' ','-'))\n\n        with open(outfile, 'wb') as outfd:\n            pickle.dump(resultdict, outfd, protocol=4)\n\n        return outfile\n\n    except Exception as e:\n\n        LOGEXCEPTION('failed to get star features for %s because: %s' %\n                     (os.path.basename(lcfile), e))\n        return None", "response": "This function runs the functions from astrobase. varclass. starfeatures on a single light curve file and returns the star features for that object."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef serial_starfeatures(lclist,\n                        outdir,\n                        lc_catalog_pickle,\n                        neighbor_radius_arcsec,\n                        maxobjects=None,\n                        deredden=True,\n                        custom_bandpasses=None,\n                        lcformat='hat-sql',\n                        lcformatdir=None):\n    '''This drives the `get_starfeatures` function for a collection of LCs.\n\n    Parameters\n    ----------\n\n    lclist : list of str\n        The list of light curve file names to process.\n\n    outdir : str\n        The output directory where the results will be placed.\n\n    lc_catalog_pickle : str\n        The path to a catalog containing at a dict with least:\n\n        - an object ID array accessible with `dict['objects']['objectid']`\n\n        - an LC filename array accessible with `dict['objects']['lcfname']`\n\n        - a `scipy.spatial.KDTree` or `cKDTree` object to use for finding\n          neighbors for each object accessible with `dict['kdtree']`\n\n        A catalog pickle of the form needed can be produced using\n        :py:func:`astrobase.lcproc.catalogs.make_lclist` or\n        :py:func:`astrobase.lcproc.catalogs.filter_lclist`.\n\n    neighbor_radius_arcsec : float\n        This indicates the radius in arcsec to search for neighbors for this\n        object using the light curve catalog's `kdtree`, `objlist`, `lcflist`,\n        and in GAIA.\n\n    maxobjects : int\n        The number of objects to process from `lclist`.\n\n    deredden : bool\n        This controls if the colors and any color classifications will be\n        dereddened using 2MASS DUST.\n\n    custom_bandpasses : dict or None\n        This is a dict used to define any custom bandpasses in the\n        `in_objectinfo` dict you want to make this function aware of and\n        generate colors 
for. Use the format below for this dict::\n\n            {\n            '<bandpass_key_1>':{'dustkey':'<twomass_dust_key_1>',\n                                'label':'<band_label_1>'\n                                'colors':[['<bandkey1>-<bandkey2>',\n                                           '<BAND1> - <BAND2>'],\n                                          ['<bandkey3>-<bandkey4>',\n                                           '<BAND3> - <BAND4>']]},\n            .\n            ...\n            .\n            '<bandpass_key_N>':{'dustkey':'<twomass_dust_key_N>',\n                                'label':'<band_label_N>'\n                                'colors':[['<bandkey1>-<bandkey2>',\n                                           '<BAND1> - <BAND2>'],\n                                          ['<bandkey3>-<bandkey4>',\n                                           '<BAND3> - <BAND4>']]},\n            }\n\n        Where:\n\n        `bandpass_key` is a key to use to refer to this bandpass in the\n        `objectinfo` dict, e.g. 'sdssg' for SDSS g band\n\n        `twomass_dust_key` is the key to use in the 2MASS DUST result table for\n        reddening per band-pass. For example, given the following DUST result\n        table (using http://irsa.ipac.caltech.edu/applications/DUST/)::\n\n            |Filter_name|LamEff |A_over_E_B_V_SandF|A_SandF|A_over_E_B_V_SFD|A_SFD|\n            |char       |float  |float             |float  |float           |float|\n            |           |microns|                  |mags   |                |mags |\n             CTIO U       0.3734              4.107   0.209            4.968 0.253\n             CTIO B       0.4309              3.641   0.186            4.325 0.221\n             CTIO V       0.5517              2.682   0.137            3.240 0.165\n            .\n            .\n            ...\n\n        The `twomass_dust_key` for 'vmag' would be 'CTIO V'. 
If you want to\n        skip DUST lookup and want to pass in a specific reddening magnitude\n        for your bandpass, use a float for the value of\n        `twomass_dust_key`. If you want to skip DUST lookup entirely for\n        this bandpass, use None for the value of `twomass_dust_key`.\n\n        `band_label` is the label to use for this bandpass, e.g. 'W1' for\n        WISE-1 band, 'u' for SDSS u, etc.\n\n        The 'colors' list contains color definitions for all colors you want\n        to generate using this bandpass. This list contains elements of the\n        form::\n\n            ['<bandkey1>-<bandkey2>','<BAND1> - <BAND2>']\n\n        where the first item gives the bandpass keys making up this color,\n        and the second item is the label for this color to be used by the\n        frontends. An example::\n\n            ['sdssu-sdssg','u - g']\n\n    lcformat : str\n        This is the `formatkey` associated with your light curve format, which\n        you previously passed in to the `lcproc.register_lcformat`\n        function. This will be used to look up how to find and read the light\n        curves specified in `basedir` or `use_list_of_filenames`.\n\n    lcformatdir : str or None\n        If this is provided, gives the path to a directory where you've stored\n        your lcformat description JSONs, other than the usual directories lcproc\n        knows to search for them in. 
Use this along with `lcformat` to specify\n        an LC format JSON file that's not currently registered with lcproc.\n\n    Returns\n    -------\n\n    list of str\n        A list of all star features pickles produced.\n\n    '''\n    # make sure to make the output directory if it doesn't exist\n    if not os.path.exists(outdir):\n        os.makedirs(outdir)\n\n    if maxobjects:\n        lclist = lclist[:maxobjects]\n\n    # read in the kdtree pickle\n    with open(lc_catalog_pickle, 'rb') as infd:\n        kdt_dict = pickle.load(infd)\n\n    kdt = kdt_dict['kdtree']\n    objlist = kdt_dict['objects']['objectid']\n    objlcfl = kdt_dict['objects']['lcfname']\n\n    tasks = [(x, outdir, kdt, objlist, objlcfl,\n              neighbor_radius_arcsec,\n              deredden, custom_bandpasses,\n              lcformat, lcformatdir) for x in lclist]\n\n    # collect the output of every worker call, not just the last one\n    results = []\n    for task in tqdm(tasks):\n        results.append(_starfeatures_worker(task))\n\n    return results", "response": "This drives the get_starfeatures function serially over a collection of LCs and returns the list of star features pickles produced."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef parallel_starfeatures_lcdir(lcdir,\n                                outdir,\n                                lc_catalog_pickle,\n                                neighbor_radius_arcsec,\n                                fileglob=None,\n                                maxobjects=None,\n                                deredden=True,\n                                custom_bandpasses=None,\n                                lcformat='hat-sql',\n                                lcformatdir=None,\n                                nworkers=NCPUS,\n                                recursive=True):\n    '''This runs parallel star feature extraction for a directory of LCs.\n\n    Parameters\n    ----------\n\n    lcdir : list of str\n        The directory to search for light curves.\n\n    outdir : str\n        The output directory where the results will be placed.\n\n    lc_catalog_pickle : str\n        The path to a catalog containing at a dict with least:\n\n        - an object ID array accessible with `dict['objects']['objectid']`\n\n        - an LC filename array accessible with `dict['objects']['lcfname']`\n\n        - a `scipy.spatial.KDTree` or `cKDTree` object to use for finding\n          neighbors for each object accessible with `dict['kdtree']`\n\n        A catalog pickle of the form needed can be produced using\n        :py:func:`astrobase.lcproc.catalogs.make_lclist` or\n        :py:func:`astrobase.lcproc.catalogs.filter_lclist`.\n\n    neighbor_radius_arcsec : float\n        This indicates the radius in arcsec to search for neighbors for this\n        object using the light curve catalog's `kdtree`, `objlist`, `lcflist`,\n        and in GAIA.\n\n    fileglob : str\n        The UNIX file glob to use to search for the light curves in `lcdir`. 
If\n        None, the default value for the light curve format specified will be\n        used.\n\n    maxobjects : int\n        The number of objects to process from `lclist`.\n\n    deredden : bool\n        This controls if the colors and any color classifications will be\n        dereddened using 2MASS DUST.\n\n    custom_bandpasses : dict or None\n        This is a dict used to define any custom bandpasses in the\n        `in_objectinfo` dict you want to make this function aware of and\n        generate colors for. Use the format below for this dict::\n\n            {\n            '<bandpass_key_1>':{'dustkey':'<twomass_dust_key_1>',\n                                'label':'<band_label_1>'\n                                'colors':[['<bandkey1>-<bandkey2>',\n                                           '<BAND1> - <BAND2>'],\n                                          ['<bandkey3>-<bandkey4>',\n                                           '<BAND3> - <BAND4>']]},\n            .\n            ...\n            .\n            '<bandpass_key_N>':{'dustkey':'<twomass_dust_key_N>',\n                                'label':'<band_label_N>'\n                                'colors':[['<bandkey1>-<bandkey2>',\n                                           '<BAND1> - <BAND2>'],\n                                          ['<bandkey3>-<bandkey4>',\n                                           '<BAND3> - <BAND4>']]},\n            }\n\n        Where:\n\n        `bandpass_key` is a key to use to refer to this bandpass in the\n        `objectinfo` dict, e.g. 'sdssg' for SDSS g band\n\n        `twomass_dust_key` is the key to use in the 2MASS DUST result table for\n        reddening per band-pass. 
For example, given the following DUST result\n        table (using http://irsa.ipac.caltech.edu/applications/DUST/)::\n\n            |Filter_name|LamEff |A_over_E_B_V_SandF|A_SandF|A_over_E_B_V_SFD|A_SFD|\n            |char       |float  |float             |float  |float           |float|\n            |           |microns|                  |mags   |                |mags |\n             CTIO U       0.3734              4.107   0.209            4.968 0.253\n             CTIO B       0.4309              3.641   0.186            4.325 0.221\n             CTIO V       0.5517              2.682   0.137            3.240 0.165\n            .\n            .\n            ...\n\n        The `twomass_dust_key` for 'vmag' would be 'CTIO V'. If you want to\n        skip DUST lookup and want to pass in a specific reddening magnitude\n        for your bandpass, use a float for the value of\n        `twomass_dust_key`. If you want to skip DUST lookup entirely for\n        this bandpass, use None for the value of `twomass_dust_key`.\n\n        `band_label` is the label to use for this bandpass, e.g. 'W1' for\n        WISE-1 band, 'u' for SDSS u, etc.\n\n        The 'colors' list contains color definitions for all colors you want\n        to generate using this bandpass. This list contains elements of the\n        form::\n\n            ['<bandkey1>-<bandkey2>','<BAND1> - <BAND2>']\n\n        where the first item gives the bandpass keys making up this color,\n        and the second item is the label for this color to be used by the\n        frontends. An example::\n\n            ['sdssu-sdssg','u - g']\n\n    lcformat : str\n        This is the `formatkey` associated with your light curve format, which\n        you previously passed in to the `lcproc.register_lcformat`\n        function. 
This will be used to look up how to find and read the light\n        curves specified in `basedir` or `use_list_of_filenames`.\n\n    lcformatdir : str or None\n        If this is provided, gives the path to a directory where you've stored\n        your lcformat description JSONs, other than the usual directories lcproc\n        knows to search for them in. Use this along with `lcformat` to specify\n        an LC format JSON file that's not currently registered with lcproc.\n\n    nworkers : int\n        The number of parallel workers to launch.\n\n    Returns\n    -------\n\n    dict\n        A dict with key:val pairs of the input light curve filename and the\n        output star features pickle for each LC processed.\n\n    '''\n\n    try:\n        formatinfo = get_lcformat(lcformat,\n                                  use_lcformat_dir=lcformatdir)\n        if formatinfo:\n            (dfileglob, readerfunc,\n             dtimecols, dmagcols, derrcols,\n             magsarefluxes, normfunc) = formatinfo\n        else:\n            LOGERROR(\"can't figure out the light curve format\")\n            return None\n    except Exception:\n        LOGEXCEPTION(\"can't figure out the light curve format\")\n        return None\n\n    if not fileglob:\n        fileglob = dfileglob\n\n    # now find the files\n    LOGINFO('searching for %s light curves in %s ...' 
% (lcformat, lcdir))\n\n    if recursive is False:\n        matching = glob.glob(os.path.join(lcdir, fileglob))\n\n    else:\n        # use recursive glob for Python 3.5+\n        if sys.version_info[:2] > (3,4):\n\n            matching = glob.glob(os.path.join(lcdir,\n                                              '**',\n                                              fileglob),recursive=True)\n\n        # otherwise, use os.walk and glob\n        else:\n\n            # use os.walk to go through the directories\n            walker = os.walk(lcdir)\n            matching = []\n\n            for root, dirs, _files in walker:\n                for sdir in dirs:\n                    searchpath = os.path.join(root,\n                                              sdir,\n                                              fileglob)\n                    foundfiles = glob.glob(searchpath)\n\n                    if foundfiles:\n                        matching.extend(foundfiles)\n\n\n    # now that we have all the files, process them\n    if matching and len(matching) > 0:\n\n        LOGINFO('found %s light curves, getting starfeatures...' 
%\n                len(matching))\n\n        return parallel_starfeatures(matching,\n                                     outdir,\n                                     lc_catalog_pickle,\n                                     neighbor_radius_arcsec,\n                                     deredden=deredden,\n                                     custom_bandpasses=custom_bandpasses,\n                                     maxobjects=maxobjects,\n                                     lcformat=lcformat,\n                                     lcformatdir=lcformatdir,\n                                     nworkers=nworkers)\n\n    else:\n\n        LOGERROR('no light curve files in %s format found in %s' % (lcformat,\n                                                                    lcdir))\n        return None", "response": "This function runs the parallel star features extraction for a directory of LCs."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef checkplot_pickle(\n        lspinfolist,\n        times,\n        mags,\n        errs,\n        fast_mode=False,\n        magsarefluxes=False,\n        nperiodstouse=3,\n        objectinfo=None,\n        deredden_object=True,\n        custom_bandpasses=None,\n        gaia_submit_timeout=10.0,\n        gaia_submit_tries=3,\n        gaia_max_timeout=180.0,\n        gaia_mirror=None,\n        complete_query_later=True,\n        varinfo=None,\n        getvarfeatures=True,\n        lclistpkl=None,\n        nbrradiusarcsec=60.0,\n        maxnumneighbors=5,\n        xmatchinfo=None,\n        xmatchradiusarcsec=3.0,\n        lcfitfunc=None,\n        lcfitparams=None,\n        externalplots=None,\n        findercmap='gray_r',\n        finderconvolve=None,\n        findercachedir='~/.astrobase/stamp-cache',\n        normto='globalmedian',\n        normmingap=4.0,\n        sigclip=4.0,\n        varepoch='min',\n        phasewrap=True,\n        phasesort=True,\n        phasebin=0.002,\n        minbinelems=7,\n        plotxlim=(-0.8,0.8),\n        xliminsetmode=False,\n        plotdpi=100,\n        bestperiodhighlight=None,\n        xgridlines=None,\n        mindet=99,\n        verbose=True,\n        outfile=None,\n        outgzip=False,\n        pickleprotocol=None,\n        returndict=False\n):\n\n    '''This writes a multiple lspinfo checkplot to a (gzipped) pickle file.\n\n    This function can take input from multiple lspinfo dicts (e.g. 
a list of\n    output dicts or gzipped pickles of dicts from independent runs of BLS, PDM,\n    AoV, or GLS period-finders in periodbase).\n\n    NOTE: if `lspinfolist` contains more than one lspinfo object with the same\n    lspmethod ('pdm','gls','sls','aov','bls'), the latest one in the list will\n    overwrite the earlier ones.\n\n    The output dict contains all the plots (magseries and phased magseries),\n    periodograms, object information, variability information, light curves, and\n    phased light curves. This can be written to:\n\n    - a pickle with `checkplot_pickle` below\n    - a PNG with `checkplot.pkl_png.checkplot_pickle_to_png`\n\n    Parameters\n    ----------\n\n    lspinfolist : list of dicts\n        This is a list of dicts containing period-finder results ('lspinfo'\n        dicts). These can be from any of the period-finder methods in\n        astrobase.periodbase. To incorporate external period-finder results into\n        checkplots, these dicts must be of the form below, including at least\n        the keys indicated here::\n\n            {'periods': np.array of all periods searched by the period-finder,\n             'lspvals': np.array of periodogram power value for each period,\n             'bestperiod': a float value that is the period with the highest\n                           peak in the periodogram, i.e. 
the most-likely actual\n                           period,\n             'method': a three-letter code naming the period-finder used; must\n                       be one of the keys in the\n                       `astrobase.periodbase.METHODLABELS` dict,\n             'nbestperiods': a list of the periods corresponding to periodogram\n                             peaks (`nbestlspvals` below) to annotate on the\n                             periodogram plot so they can be called out\n                             visually,\n             'nbestlspvals': a list of the power values associated with\n                             periodogram peaks to annotate on the periodogram\n                             plot so they can be called out visually; should be\n                             the same length as `nbestperiods` above}\n\n        `nbestperiods` and `nbestlspvals` in each lspinfo dict must have at\n        least as many elements as the `nperiodstouse` kwarg to this function.\n\n    times,mags,errs : np.arrays\n        The magnitude/flux time-series to process for this checkplot along with\n        their associated measurement errors.\n\n    fast_mode : bool or float\n        This runs the external catalog operations in a \"fast\" mode, with short\n        timeouts and not trying to hit external catalogs that take a long time\n        to respond.\n\n        If this is set to True, the default settings for the external requests\n        will then become::\n\n            skyview_lookup = False\n            skyview_timeout = 10.0\n            skyview_retry_failed = False\n            dust_timeout = 10.0\n            gaia_submit_timeout = 7.0\n            gaia_max_timeout = 10.0\n            gaia_submit_tries = 2\n            complete_query_later = False\n            search_simbad = False\n\n        If this is a float, will run in \"fast\" mode with the provided timeout\n        value in seconds and the following settings::\n\n            skyview_lookup = True\n          
  skyview_timeout = fast_mode\n            skyview_retry_failed = False\n            dust_timeout = fast_mode\n            gaia_submit_timeout = 0.66*fast_mode\n            gaia_max_timeout = fast_mode\n            gaia_submit_tries = 2\n            complete_query_later = False\n            search_simbad = False\n\n    magsarefluxes : bool\n        If True, indicates the input time-series is fluxes and not mags so the\n        plot y-axis direction and range can be set appropriately.\n\n    nperiodstouse : int\n        This controls how many 'best' periods to make phased LC plots for. By\n        default, this is the 3 best. If this is set to None, all 'best' periods\n        present in each lspinfo dict's 'nbestperiods' key will be processed for\n        this checkplot.\n\n    objectinfo : dict or None\n        If provided, this is a dict containing information on the object whose\n        light curve is being processed. This function will then be able to look\n        up and download a finder chart for this object and write that to the\n        output checkplotdict. External services such as GAIA, SIMBAD, TIC@MAST,\n        etc. will also be used to look up this object by its coordinates, and\n        will add in information available from those services.\n\n        The `objectinfo` dict must be of the form and contain at least the keys\n        described below::\n\n            {'objectid': the name of the object,\n             'ra': the right ascension of the object in decimal degrees,\n             'decl': the declination of the object in decimal degrees,\n             'ndet': the number of observations of this object}\n\n        You can also provide magnitudes and proper motions of the object using\n        the following keys and the appropriate values in the `objectinfo`\n        dict. These will be used to calculate colors, total and reduced proper\n        motion, etc. 
and display these in the output checkplot PNG::\n\n            'pmra' -> the proper motion in mas/yr in right ascension,\n            'pmdecl' -> the proper motion in mas/yr in the declination,\n            'umag'  -> U mag\t\t -> colors: U-B, U-V, U-g\n            'bmag'  -> B mag\t\t -> colors: U-B, B-V\n            'vmag'  -> V mag\t\t -> colors: U-V, B-V, V-R, V-I, V-K\n            'rmag'  -> R mag\t\t -> colors: V-R, R-I\n            'imag'  -> I mag\t\t -> colors: g-I, V-I, R-I, B-I\n            'jmag'  -> 2MASS J mag\t -> colors: J-H, J-K, g-J, i-J\n            'hmag'  -> 2MASS H mag\t -> colors: J-H, H-K\n            'kmag'  -> 2MASS Ks mag\t -> colors: g-Ks, H-Ks, J-Ks, V-Ks\n            'sdssu' -> SDSS u mag\t -> colors: u-g, u-V\n            'sdssg' -> SDSS g mag\t -> colors: g-r, g-i, g-K, u-g, U-g, g-J\n            'sdssr' -> SDSS r mag\t -> colors: r-i, g-r\n            'sdssi' -> SDSS i mag\t -> colors: r-i, i-z, g-i, i-J, i-W1\n            'sdssz' -> SDSS z mag\t -> colors: i-z, z-W2, g-z\n            'ujmag' -> UKIRT J mag\t -> colors: J-H, H-K, J-K, g-J, i-J\n            'uhmag' -> UKIRT H mag\t -> colors: J-H, H-K\n            'ukmag' -> UKIRT K mag\t -> colors: g-K, H-K, J-K, V-K\n            'irac1' -> Spitzer IRAC1 mag -> colors: i-I1, I1-I2\n            'irac2' -> Spitzer IRAC2 mag -> colors: I1-I2, I2-I3\n            'irac3' -> Spitzer IRAC3 mag -> colors: I2-I3\n            'irac4' -> Spitzer IRAC4 mag -> colors: I3-I4\n            'wise1' -> WISE W1 mag\t -> colors: i-W1, W1-W2\n            'wise2' -> WISE W2 mag\t -> colors: W1-W2, W2-W3\n            'wise3' -> WISE W3 mag\t -> colors: W2-W3\n            'wise4' -> WISE W4 mag\t -> colors: W3-W4\n\n        If you have magnitude measurements in other bands, use the\n        `custom_bandpasses` kwarg to pass these in.\n\n        If this is None, no object information will be incorporated into the\n        checkplot (kind of making it effectively useless for anything other than\n        
glancing at the phased light curves at various 'best' periods from the\n        period-finder results).\n\n    deredden_object : bool\n        If this is True, will use the 2MASS DUST service to get extinction\n        coefficients in various bands, and then try to deredden the magnitudes\n        and colors of the object already present in the checkplot's objectinfo\n        dict.\n\n    custom_bandpasses : dict\n        This is a dict used to provide custom bandpass definitions for any\n        magnitude measurements in the objectinfo dict that are not automatically\n        recognized by :py:func:`astrobase.varclass.starfeatures.color_features`.\n\n    gaia_submit_timeout : float\n        Sets the timeout in seconds to use when submitting a request to look up\n        the object's information to the GAIA service. Note that if `fast_mode`\n        is set, this is ignored.\n\n    gaia_submit_tries : int\n        Sets the maximum number of times the GAIA services will be contacted to\n        obtain this object's information. If `fast_mode` is set, this is\n        ignored, and the services will be contacted only once (meaning that a\n        failure to respond will be silently ignored and no GAIA data will be\n        added to the checkplot's objectinfo dict).\n\n    gaia_max_timeout : float\n        Sets the timeout in seconds to use when waiting for the GAIA service to\n        respond to our request for the object's information. Note that if\n        `fast_mode` is set, this is ignored.\n\n    gaia_mirror : str or None\n        This sets the GAIA mirror to use. This is a key in the\n        `services.gaia.GAIA_URLS` dict which defines the URLs to hit for each\n        mirror.\n\n    complete_query_later : bool\n        If this is True, saves the state of GAIA queries that are not yet\n        complete when `gaia_max_timeout` is reached while waiting for the GAIA\n        service to respond to our request. 
A later call for GAIA info on the\n        same object will attempt to pick up the results from the existing query\n        if it's completed. If `fast_mode` is True, this is ignored.\n\n    varinfo : dict\n        If this is None, a blank dict of the form below will be added to the\n        checkplotdict::\n\n            {'objectisvar': None -> variability flag (None indicates unset),\n             'vartags': CSV str containing variability type tags from review,\n             'varisperiodic': None -> periodic variability flag (None -> unset),\n             'varperiod': the period associated with the periodic variability,\n             'varepoch': the epoch associated with the periodic variability}\n\n        If you provide a dict matching this format in this kwarg, this will be\n        passed unchanged to the output checkplotdict produced.\n\n    getvarfeatures : bool\n        If this is True, several light curve variability features for this\n        object will be calculated and added to the output checkplotdict as\n        checkplotdict['varinfo']['features']. This uses the function\n        `varclass.varfeatures.all_nonperiodic_features` so see its docstring for\n        the measures that are calculated (e.g. Stetson J indices, dispersion\n        measures, etc.)\n\n    lclistpkl : dict or str\n        If this is provided, must be a dict resulting from reading a catalog\n        produced by the `lcproc.catalogs.make_lclist` function or a str path\n        pointing to the pickle file produced by that function. This catalog is\n        used to find neighbors of the current object in the current light curve\n        collection. 
Looking at neighbors of the object within the radius\n        specified by `nbrradiusarcsec` is useful for light curves produced by\n        instruments that have a large pixel scale, so are susceptible to\n        blending of variability and potential confusion of neighbor variability\n        with that of the actual object being looked at. If this is None, no\n        neighbor lookups will be performed.\n\n    nbrradiusarcsec : float\n        The radius in arcseconds to use for a search conducted around the\n        coordinates of this object to look for any potential confusion and\n        blending of variability amplitude caused by nearby objects.\n\n    maxnumneighbors : int\n        The maximum number of neighbors that will have their light curves and\n        magnitudes noted in this checkplot as potential blends with the target\n        object.\n\n    xmatchinfo : str or dict\n        This is either the xmatch dict produced by the function\n        `load_xmatch_external_catalogs` above, or the path to the xmatch info\n        pickle file produced by that function.\n\n    xmatchradiusarcsec : float\n        This is the cross-matching radius to use in arcseconds.\n\n    lcfitfunc : Python function or None\n        If provided, this should be a Python function that is used to fit a\n        model to the light curve. This fit is then overplotted for each phased\n        light curve in the checkplot. This function should have the following\n        signature:\n\n        `def lcfitfunc(times, mags, errs, period, **lcfitparams)`\n\n        where `lcfitparams` encapsulates all external parameters (i.e. 
number of\n        knots for a spline function, the degree of a Legendre polynomial fit,\n        etc., planet transit parameters) This function should return a Python\n        dict with the following structure (similar to the functions in\n        `astrobase.lcfit`) and at least the keys below::\n\n            {'fittype':<str: name of fit method>,\n             'fitchisq':<float: the chi-squared value of the fit>,\n             'fitredchisq':<float: the reduced chi-squared value of the fit>,\n             'fitinfo':{'fitmags':<ndarray: model mags/fluxes from fit func>},\n             'magseries':{'times':<ndarray: times where fitmags are evaluated>}}\n\n        Additional keys in the dict returned from this function can include\n        `fitdict['fitinfo']['finalparams']` for the final model fit parameters\n        (this will be used by the checkplotserver if present),\n        `fitdict['fitinfo']['fitepoch']` for the minimum light epoch returned by\n        the model fit, among others.\n\n        In any case, the output dict of `lcfitfunc` will be copied to the output\n        checkplotdict as\n        `checkplotdict[lspmethod][periodind]['lcfit'][<fittype>]` for each\n        phased light curve.\n\n    lcfitparams : dict\n        A dict containing the LC fit parameters to use when calling the function\n        provided in `lcfitfunc`. This contains key-val pairs corresponding to\n        parameter names and their respective initial values to be used by the\n        fit function.\n\n    externalplots : list of tuples of str\n        If provided, this is a list of 4-element tuples containing:\n\n        1. path to PNG of periodogram from an external period-finding method\n        2. path to PNG of best period phased LC from the external period-finder\n        3. path to PNG of 2nd-best phased LC from the external period-finder\n        4. 
path to PNG of 3rd-best phased LC from the external period-finder\n\n        This can be used to incorporate external period-finding method results\n        into the output checkplot pickle or exported PNG to allow for comparison\n        with astrobase results. Example of `externalplots`::\n\n            [('/path/to/external/bls-periodogram.png',\n              '/path/to/external/bls-phasedlc-plot-bestpeak.png',\n              '/path/to/external/bls-phasedlc-plot-peak2.png',\n              '/path/to/external/bls-phasedlc-plot-peak3.png'),\n             ('/path/to/external/pdm-periodogram.png',\n              '/path/to/external/pdm-phasedlc-plot-bestpeak.png',\n              '/path/to/external/pdm-phasedlc-plot-peak2.png',\n              '/path/to/external/pdm-phasedlc-plot-peak3.png'),\n             ...]\n\n        If `externalplots` is provided here, these paths will be stored in the\n        output checkplotdict. The `checkplot.pkl_png.checkplot_pickle_to_png`\n        function can then automatically retrieve these plot PNGs and put\n        them into the exported checkplot PNG.\n\n    findercmap : str or matplotlib.cm.ColorMap object\n        The Colormap object to use for the finder chart image.\n\n    finderconvolve : astropy.convolution.Kernel object or None\n        If not None, the Kernel object to use for convolving the finder image.\n\n    findercachedir : str\n        The path to the astrobase cache directory for finder chart downloads\n        from the NASA SkyView service.\n\n    normto : {'globalmedian', 'zero'} or a float\n        This specifies the normalization target::\n\n            'globalmedian' -> norms each mag to global median of the LC column\n            'zero'         -> norms each mag to zero\n            a float        -> norms each mag to this specified float value.\n\n    normmingap : float\n        This defines how much the difference between consecutive measurements is\n        allowed to be to consider them as parts of different 
timegroups. By\n        default it is set to 4.0 days.\n\n    sigclip : float or int or sequence of two floats/ints or None\n        If a single float or int, a symmetric sigma-clip will be performed using\n        the number provided as the sigma-multiplier to cut out from the input\n        time-series.\n\n        If a list of two ints/floats is provided, the function will perform an\n        'asymmetric' sigma-clip. The first element in this list is the sigma\n        value to use for fainter flux/mag values; the second element in this\n        list is the sigma value to use for brighter flux/mag values. For\n        example, `sigclip=[10., 3.]`, will sigclip out greater than 10-sigma\n        dimmings and greater than 3-sigma brightenings. Here the meaning of\n        \"dimming\" and \"brightening\" is set by *physics* (not the magnitude\n        system), which is why the `magsarefluxes` kwarg must be correctly set.\n\n        If `sigclip` is None, no sigma-clipping will be performed, and the\n        time-series (with non-finite elems removed) will be passed through to\n        the output.\n\n    varepoch : 'min' or float or list of lists or None\n        The epoch to use for this phased light curve plot tile. If this is a\n        float, will use the provided value directly. If this is 'min', will\n        automatically figure out the time-of-minimum of the phased light\n        curve. If this is None, will use the minimum value of `stimes` as the\n        epoch of the phased light curve plot. If this is a list of lists, will\n        use the provided value of `lspmethodind` to look up the current\n        period-finder method and the provided value of `periodind` to look up\n        the epoch associated with that method and the current period. 
This is\n        mostly only useful when `twolspmode` is True.\n\n    phasewrap : bool\n        If this is True, the phased time-series will be wrapped around\n        phase 0.0.\n\n    phasesort : bool\n        If True, will sort the phased light curve in order of increasing phase.\n\n    phasebin : float\n        The bin size to use to group together measurements closer than this\n        amount in phase. This is in units of phase. If this is a float, a\n        phase-binned version of the phased light curve will be overplotted on\n        top of the regular phased light curve.\n\n    minbinelems : int\n        The minimum number of elements required per phase bin to include it in\n        the phased LC plot.\n\n    plotxlim : sequence of two floats or None\n        The x-range (min, max) of the phased light curve plot. If None, will be\n        determined automatically.\n\n    xliminsetmode : bool\n        If this is True, the generated phased light curve plot will use the\n        values of `plotxlim` as the main plot x-axis limits (i.e. zoomed-in if\n        `plotxlim` is a range smaller than the full phase range), and will show\n        the full phased light curve plot as a smaller inset. Useful for\n        planetary transit light curves.\n\n    plotdpi : int\n        The resolution of the output plot PNGs in dots per inch.\n\n    bestperiodhighlight : str or None\n        If not None, this is a str with a matplotlib color specification to use\n        as the background color to highlight the phased light curve plot of the\n        'best' period and epoch combination. 
If None, no highlight will be\n        applied.\n\n    xgridlines : list of floats or None\n        If this is provided, must be a list of floats corresponding to the phase\n        values at which to draw vertical dashed lines as a means of highlighting\n        these.\n\n    mindet : int\n        The minimum number of observations the input object's mag/flux\n        time-series must have for this function to plot its light curve and\n        phased light curve. If the object has fewer than this number, no light\n        curves will be plotted, but the checkplotdict will still contain all of\n        the other information.\n\n    verbose : bool\n        If True, will indicate progress and warn about problems.\n\n    outfile : str or None\n        The name of the output checkplot pickle file. If this is None, will\n        write the checkplot pickle to a file called 'checkplot.pkl' in the\n        current working directory.\n\n    outgzip : bool\n        This controls whether to gzip the output pickle. It turns out that this\n        is the slowest bit in the output process, so if you're after speed, best\n        not to use this. This is False by default since it turns out that gzip\n        actually doesn't save that much space (29 MB vs. 35 MB for the average\n        checkplot pickle).\n\n    pickleprotocol : int or None\n        This sets the pickle file protocol to use when writing the pickle:\n\n        If None, will choose a protocol using the following rules:\n\n        - 4 -> default in Python >= 3.4 - fast but incompatible with Python 2\n        - 3 -> default in Python 3.0-3.3 - mildly fast\n        - 2 -> default in Python 2 - very slow, but compatible with Python 2/3\n\n        The default protocol kwarg is None, this will make an automatic choice\n        for pickle protocol that's best suited for the version of Python in\n        use. 
Note that this will make pickles generated by Py3 incompatible with\n        Py2.\n\n    returndict : bool\n        If this is True, will return the checkplotdict instead of returning the\n        filename of the output checkplot pickle.\n\n    Returns\n    -------\n\n    dict or str\n        If returndict is False, will return the path to the generated checkplot\n        pickle file. If returndict is True, will return the checkplotdict\n        instead.\n\n    '''\n\n    # call checkplot_dict for most of the work\n    checkplotdict = checkplot_dict(\n        lspinfolist,\n        times,\n        mags,\n        errs,\n        magsarefluxes=magsarefluxes,\n        nperiodstouse=nperiodstouse,\n        objectinfo=objectinfo,\n        deredden_object=deredden_object,\n        custom_bandpasses=custom_bandpasses,\n        gaia_submit_timeout=gaia_submit_timeout,\n        gaia_submit_tries=gaia_submit_tries,\n        gaia_max_timeout=gaia_max_timeout,\n        gaia_mirror=gaia_mirror,\n        complete_query_later=complete_query_later,\n        varinfo=varinfo,\n        getvarfeatures=getvarfeatures,\n        lclistpkl=lclistpkl,\n        nbrradiusarcsec=nbrradiusarcsec,\n        maxnumneighbors=maxnumneighbors,\n        xmatchinfo=xmatchinfo,\n        xmatchradiusarcsec=xmatchradiusarcsec,\n        lcfitfunc=lcfitfunc,\n        lcfitparams=lcfitparams,\n        externalplots=externalplots,\n        findercmap=findercmap,\n        finderconvolve=finderconvolve,\n        findercachedir=findercachedir,\n        normto=normto,\n        normmingap=normmingap,\n        sigclip=sigclip,\n        varepoch=varepoch,\n        phasewrap=phasewrap,\n        phasesort=phasesort,\n        phasebin=phasebin,\n        minbinelems=minbinelems,\n        plotxlim=plotxlim,\n        xliminsetmode=xliminsetmode,\n        plotdpi=plotdpi,\n        bestperiodhighlight=bestperiodhighlight,\n        xgridlines=xgridlines,\n        mindet=mindet,\n        verbose=verbose,\n        
fast_mode=fast_mode\n    )\n\n    # for Python >= 3.4, use v4\n    if ((sys.version_info[0:2] >= (3,4) and not pickleprotocol) or\n        (pickleprotocol > 2)):\n        pickleprotocol = 4\n\n    elif ((sys.version_info[0:2] >= (3,0) and not pickleprotocol) or\n          (pickleprotocol > 2)):\n        pickleprotocol = 3\n\n    # for Python == 2.7; use v2\n    elif sys.version_info[0:2] == (2,7) and not pickleprotocol:\n        pickleprotocol = 2\n\n    # otherwise, if left unspecified, use the slowest but most compatible\n    # protocol. this will be readable by all (most?) Pythons\n    elif not pickleprotocol:\n        pickleprotocol = 0\n\n\n    # generate the output file path\n    if outgzip:\n\n        # generate the outfile filename\n        if (not outfile and\n            len(lspinfolist) > 0 and\n            isinstance(lspinfolist[0], str)):\n            plotfpath = os.path.join(os.path.dirname(lspinfolist[0]),\n                                     'checkplot-%s.pkl.gz' %\n                                     checkplotdict['objectid'])\n        elif outfile:\n            plotfpath = outfile\n        else:\n            plotfpath = 'checkplot.pkl.gz'\n\n    else:\n\n        # generate the outfile filename\n        if (not outfile and\n            len(lspinfolist) > 0 and\n            isinstance(lspinfolist[0], str)):\n            plotfpath = os.path.join(os.path.dirname(lspinfolist[0]),\n                                     'checkplot-%s.pkl' %\n                                     checkplotdict['objectid'])\n        elif outfile:\n            plotfpath = outfile\n        else:\n            plotfpath = 'checkplot.pkl'\n\n\n    # write the completed checkplotdict to a gzipped pickle\n    picklefname = _write_checkplot_picklefile(checkplotdict,\n                                              outfile=plotfpath,\n                                              protocol=pickleprotocol,\n                                              outgzip=outgzip)\n\n    # at the 
end, return the dict and filename if asked for\n    if returndict:\n        if verbose:\n            LOGINFO('checkplot done -> %s' % picklefname)\n        return checkplotdict, picklefname\n\n    # otherwise, just return the filename\n    else:\n        # just to make sure: free up space\n        del checkplotdict\n        if verbose:\n            LOGINFO('checkplot done -> %s' % picklefname)\n        return picklefname", "response": "This function writes a checkplot built from one or more lspinfo dicts to an optionally gzipped pickle file."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef pwd_phasebin(phases, mags, binsize=0.002, minbin=9):\n    '''\n    This bins the phased mag series using the given binsize.\n\n    '''\n\n    bins = np.arange(0.0, 1.0, binsize)\n    binnedphaseinds = npdigitize(phases, bins)\n\n    binnedphases, binnedmags = [], []\n\n    for x in npunique(binnedphaseinds):\n\n        thisbin_inds = binnedphaseinds == x\n        thisbin_phases = phases[thisbin_inds]\n        thisbin_mags = mags[thisbin_inds]\n        # count the points in this bin; the boolean mask itself always has\n        # the same size as the full phases array\n        if thisbin_phases.size > minbin:\n            binnedphases.append(npmedian(thisbin_phases))\n            binnedmags.append(npmedian(thisbin_mags))\n\n    return np.array(binnedphases), np.array(binnedmags)", "response": "This function bins the phased mag series using the given binsize, keeping the median phase and mag of each bin that has more than minbin points."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef pdw_worker(task):\n    '''\n    This is the parallel worker for the function below.\n\n    task[0] = frequency for this worker\n    task[1] = times array\n    task[2] = mags array\n    task[3] = fold_time\n    task[4] = j_range\n    task[5] = keep_threshold_1\n    task[6] = keep_threshold_2\n    task[7] = phasebinsize\n\n    we don't need errs for the worker.\n\n    '''\n\n    frequency = task[0]\n    times, modmags = task[1], task[2]\n    fold_time = task[3]\n    j_range = range(task[4])\n    keep_threshold_1 = task[5]\n    keep_threshold_2 = task[6]\n    phasebinsize = task[7]\n\n\n    try:\n\n        period = 1.0/frequency\n\n        # use the common phaser to phase and sort the mag\n        phased = phase_magseries(times,\n                                 modmags,\n                                 period,\n                                 fold_time,\n                                 wrap=False,\n                                 sort=True)\n\n        # bin in phase if requested, this turns this into a sort of PDM method\n        if phasebinsize is not None and phasebinsize > 0:\n            bphased = pwd_phasebin(phased['phase'],\n                                   phased['mags'],\n                                   binsize=phasebinsize)\n            phase_sorted = bphased[0]\n            mod_mag_sorted = bphased[1]\n            j_range = range(len(mod_mag_sorted) - 1)\n        else:\n            phase_sorted = phased['phase']\n            mod_mag_sorted = phased['mags']\n\n        # now calculate the string length\n        rolledmags = nproll(mod_mag_sorted,1)\n        rolledphases = nproll(phase_sorted,1)\n        strings = (\n            (rolledmags - mod_mag_sorted)*(rolledmags - mod_mag_sorted) +\n            (rolledphases - phase_sorted)*(rolledphases - phase_sorted)\n        )\n        strings[0] = (\n            ((mod_mag_sorted[0] - 
mod_mag_sorted[-1]) *\n             (mod_mag_sorted[0] - mod_mag_sorted[-1])) +\n            ((phase_sorted[0] - phase_sorted[-1] + 1) *\n             (phase_sorted[0] - phase_sorted[-1] + 1))\n        )\n        strlen = npsum(npsqrt(strings))\n\n        if (keep_threshold_1 < strlen < keep_threshold_2):\n            p_goodflag  = True\n        else:\n            p_goodflag = False\n\n        return (period, strlen, p_goodflag)\n\n    except Exception as e:\n\n        LOGEXCEPTION('error in DWP')\n        return(period, npnan, False)", "response": "This function is the parallel worker for the string-length period finder. For one trial frequency, it phases the mag series, optionally bins it in phase, calculates the string length, and returns (period, string length, good-period flag)."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef pdw_period_find(times,\n                    mags,\n                    errs,\n                    autofreq=True,\n                    init_p=None,\n                    end_p=None,\n                    f_step=1.0e-4,\n                    phasebinsize=None,\n                    sigclip=10.0,\n                    nworkers=None,\n                    verbose=False):\n    '''This is the parallel version of the function above.\n\n    Uses the string length method in Dworetsky 1983 to calculate the period of a\n    time-series of magnitude measurements and associated magnitude errors. This\n    can optionally bin in phase to try to speed up the calculation.\n\n    PARAMETERS:\n\n    time: series of times at which mags were measured (usually some form of JD)\n    mag: timeseries of magnitudes (np.array)\n    err: associated errs per magnitude measurement (np.array)\n    init_p, end_p: interval to search for periods between (both ends inclusive)\n    f_step: step in frequency [days^-1] to use\n\n    RETURNS:\n\n    tuple of the following form:\n\n    (periods (np.array),\n     string_lengths (np.array),\n     good_period_mask (boolean array))\n\n    '''\n\n    # remove nans\n    find = npisfinite(times) & npisfinite(mags) & npisfinite(errs)\n    ftimes, fmags, ferrs = times[find], mags[find], errs[find]\n\n    mod_mags = (fmags - npmin(fmags))/(2.0*(npmax(fmags) - npmin(fmags))) - 0.25\n\n    if len(ftimes) > 9 and len(fmags) > 9 and len(ferrs) > 9:\n\n        # get the median and stdev = 1.483 x MAD\n        median_mag = np.median(fmags)\n        stddev_mag = (np.median(np.abs(fmags - median_mag))) * 1.483\n\n        # sigclip next\n        if sigclip:\n\n            sigind = (np.abs(fmags - median_mag)) < (sigclip * stddev_mag)\n\n            stimes = ftimes[sigind]\n            smags = fmags[sigind]\n            serrs = ferrs[sigind]\n\n            LOGINFO('sigclip 
= %s: before = %s observations, '\n                    'after = %s observations' %\n                    (sigclip, len(times), len(stimes)))\n\n        else:\n\n            stimes = ftimes\n            smags = fmags\n            serrs = ferrs\n\n        # make sure there are enough points to calculate a spectrum\n        if len(stimes) > 9 and len(smags) > 9 and len(serrs) > 9:\n\n            # get the frequencies to use\n            if init_p:\n                endf = 1.0/init_p\n            else:\n                # default start period is 0.1 day\n                endf = 1.0/0.1\n\n            if end_p:\n                startf = 1.0/end_p\n            else:\n                # default end period is length of time series\n                startf = 1.0/(stimes.max() - stimes.min())\n\n            # if we're not using autofreq, then use the provided frequencies\n            if not autofreq:\n                frequencies = np.arange(startf, endf, f_step)\n                LOGINFO(\n                    'using %s frequency points, start P = %.3f, end P = %.3f' %\n                    (frequencies.size, 1.0/endf, 1.0/startf)\n                )\n            else:\n                # this gets an automatic grid of frequencies to use\n                frequencies = get_frequency_grid(stimes,\n                                                 minfreq=startf,\n                                                 maxfreq=endf)\n                LOGINFO(\n                    'using autofreq with %s frequency points, '\n                    'start P = %.3f, end P = %.3f' %\n                    (frequencies.size,\n                     1.0/frequencies.max(),\n                     1.0/frequencies.min())\n                )\n\n\n            # set up some internal stuff\n            fold_time = npmin(ftimes) # fold at the first time element\n            j_range = len(fmags)-1\n            epsilon = 2.0 * npmean(ferrs)\n            delta_l = 0.34 * (epsilon - 0.5*(epsilon**2)) * (len(ftimes) -\n     
                                                        npsqrt(10.0/epsilon))\n            keep_threshold_1 = 1.6 + 1.2*delta_l\n            l = 0.212*len(ftimes)\n            sig_l = len(ftimes)/37.5\n            keep_threshold_2 = l + 4.0*sig_l\n\n            # generate the tasks\n            tasks = [(x,\n                      ftimes,\n                      mod_mags,\n                      fold_time,\n                      j_range,\n                      keep_threshold_1,\n                      keep_threshold_2,\n                      phasebinsize) for x in frequencies]\n\n            # fire up the pool and farm out the tasks\n            if (not nworkers) or (nworkers > NCPUS):\n                nworkers = NCPUS\n                LOGINFO('using %s workers...' % nworkers)\n\n            pool = Pool(nworkers)\n            strlen_results = pool.map(pdw_worker, tasks)\n            pool.close()\n            pool.join()\n            del pool\n\n            periods, strlens, goodflags = zip(*strlen_results)\n            periods, strlens, goodflags = (np.array(periods),\n                                           np.array(strlens),\n                                           np.array(goodflags))\n\n            strlensort = npargsort(strlens)\n            nbeststrlens = strlens[strlensort[:5]]\n            nbestperiods = periods[strlensort[:5]]\n            nbestflags = goodflags[strlensort[:5]]\n            bestperiod = nbestperiods[0]\n            beststrlen = nbeststrlens[0]\n            bestflag = nbestflags[0]\n\n            return {'bestperiod':bestperiod,\n                    'beststrlen':beststrlen,\n                    'bestflag':bestflag,\n                    'nbeststrlens':nbeststrlens,\n                    'nbestperiods':nbestperiods,\n                    'nbestflags':nbestflags,\n                    'strlens':strlens,\n                    'periods':periods,\n                    'goodflags':goodflags}\n\n        else:\n\n            LOGERROR('no good 
detections for these times and mags, skipping...')\n            return {'bestperiod':npnan,\n                    'beststrlen':npnan,\n                    'bestflag':npnan,\n                    'nbeststrlens':None,\n                    'nbestperiods':None,\n                    'nbestflags':None,\n                    'strlens':None,\n                    'periods':None,\n                    'goodflags':None}\n    else:\n\n        LOGERROR('no good detections for these times and mags, skipping...')\n        return {'bestperiod':npnan,\n                'beststrlen':npnan,\n                'bestflag':npnan,\n                'nbeststrlens':None,\n                'nbestperiods':None,\n                'nbestflags':None,\n                'strlens':None,\n                'periods':None,\n                'goodflags':None}", "response": "This function finds the period of a magnitude time-series using the string length method in Dworetsky 1983, evaluating the trial frequencies in parallel. It returns a dict with the best period, its string length and flag, the five best candidates, and the full arrays of periods, string lengths, and good-period flags."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef townsend_lombscargle_value(times, mags, omega):\n    '''\n    This calculates the periodogram value for each omega (= 2*pi*f). Mags must\n    be normalized to zero with variance scaled to unity.\n\n    '''\n    cos_omegat = npcos(omega*times)\n    sin_omegat = npsin(omega*times)\n\n    xc = npsum(mags*cos_omegat)\n    xs = npsum(mags*sin_omegat)\n\n    cc = npsum(cos_omegat*cos_omegat)\n    ss = npsum(sin_omegat*sin_omegat)\n\n    cs = npsum(cos_omegat*sin_omegat)\n\n    tau = nparctan(2*cs/(cc - ss))/(2*omega)\n\n    ctau = npcos(omega*tau)\n    stau = npsin(omega*tau)\n\n    leftsumtop = (ctau*xc + stau*xs)*(ctau*xc + stau*xs)\n    leftsumbot = ctau*ctau*cc + 2.0*ctau*stau*cs + stau*stau*ss\n    leftsum = leftsumtop/leftsumbot\n\n    rightsumtop = (ctau*xs - stau*xc)*(ctau*xs - stau*xc)\n    rightsumbot = ctau*ctau*ss - 2.0*ctau*stau*cs + stau*stau*cc\n    rightsum = rightsumtop/rightsumbot\n\n    pval = 0.5*(leftsum + rightsum)\n\n    return pval", "response": "Calculates the periodogram value for each omega."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef parallel_townsend_lsp(times, mags, startp, endp,\n                          stepsize=1.0e-4,\n                          nworkers=4):\n    '''\n    This calculates the Lomb-Scargle periodogram for the frequencies\n    corresponding to the period interval (startp, endp) using a frequency step\n    size of stepsize cycles/day. This uses the algorithm in Townsend 2010.\n\n    '''\n\n    # make sure there are no nans anywhere\n    finiteind = np.isfinite(times) & np.isfinite(mags)\n    ftimes, fmags = times[finiteind], mags[finiteind]\n\n    # renormalize the mags to zero and scale them so that the variance = 1\n    nmags = (fmags - np.median(fmags))/np.std(fmags)\n\n    startf = 1.0/endp\n    endf = 1.0/startp\n    omegas = 2*np.pi*np.arange(startf, endf, stepsize)\n\n    # parallel map the lsp calculations\n    if (not nworkers) or (nworkers > NCPUS):\n        nworkers = NCPUS\n        LOGINFO('using %s workers...' % nworkers)\n\n    pool = Pool(nworkers)\n\n    tasks = [(ftimes, nmags, x) for x in omegas]\n    lsp = pool.map(townsend_lombscargle_wrapper, tasks)\n\n    pool.close()\n    pool.join()\n\n    return np.array(omegas), np.array(lsp)", "response": "This function calculates the Lomb-Scargle periodogram for the frequencies corresponding to the period interval (startp, endp), using the algorithm in Townsend 2010 and a parallel pool of workers. It returns the omegas and the periodogram values."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef scipylsp_parallel(times,\n                      mags,\n                      errs, # ignored but for consistent API\n                      startp,\n                      endp,\n                      nbestpeaks=5,\n                      periodepsilon=0.1, # 0.1\n                      stepsize=1.0e-4,\n                      nworkers=4,\n                      sigclip=None,\n                      timebin=None):\n    '''\n    This uses the LSP function from the scipy library, which is fast as hell. We\n    try to make it faster by running LSP for sections of the omegas array in\n    parallel.\n\n    '''\n\n    # make sure there are no nans anywhere\n    finiteind = np.isfinite(mags) & np.isfinite(errs)\n    ftimes, fmags, ferrs = times[finiteind], mags[finiteind], errs[finiteind]\n\n    if len(ftimes) > 0 and len(fmags) > 0:\n\n        # sigclip the lightcurve if asked to do so\n        if sigclip:\n            worktimes, workmags, _ = sigclip_magseries(ftimes,\n                                                       fmags,\n                                                       ferrs,\n                                                       sigclip=sigclip)\n            LOGINFO('ndet after sigclipping = %s' % len(worktimes))\n\n        else:\n            worktimes = ftimes\n            workmags = fmags\n\n        # bin the lightcurve if asked to do so\n        if timebin:\n\n            binned = time_bin_magseries(worktimes, workmags, binsize=timebin)\n            worktimes = binned['binnedtimes']\n            workmags = binned['binnedmags']\n\n        # renormalize the working mags to zero and scale them so that the\n        # variance = 1 for use with our LSP functions\n        normmags = (workmags - np.median(workmags))/np.std(workmags)\n\n        startf = 1.0/endp\n        endf = 1.0/startp\n        omegas = 2*np.pi*np.arange(startf, endf, 
stepsize)\n\n        # partition the omegas array by nworkers\n        tasks = []\n        chunksize = int(float(len(omegas))/nworkers) + 1\n        tasks = [omegas[x*chunksize:x*chunksize+chunksize]\n                 for x in range(nworkers)]\n\n        # map to parallel workers\n        if (not nworkers) or (nworkers > NCPUS):\n            nworkers = NCPUS\n            LOGINFO('using %s workers...' % nworkers)\n\n        pool = Pool(nworkers)\n\n        tasks = [(worktimes, normmags, x) for x in tasks]\n        lsp = pool.map(parallel_scipylsp_worker, tasks)\n\n        pool.close()\n        pool.join()\n\n        lsp = np.concatenate(lsp)\n        periods = 2.0*np.pi/omegas\n\n        # find the nbestpeaks for the periodogram: 1. sort the lsp array by\n        # highest value first 2. go down the values until we find five values\n        # that are separated by at least periodepsilon in period\n\n        # make sure we only get finite lsp values\n        finitepeakind = npisfinite(lsp)\n        finlsp = lsp[finitepeakind]\n        finperiods = periods[finitepeakind]\n\n        bestperiodind = npargmax(finlsp)\n\n        sortedlspind = np.argsort(finlsp)[::-1]\n        sortedlspperiods = finperiods[sortedlspind]\n        sortedlspvals = finlsp[sortedlspind]\n\n        prevbestlspval = sortedlspvals[0]\n        # now get the nbestpeaks\n        nbestperiods, nbestlspvals, peakcount = (\n            [finperiods[bestperiodind]],\n            [finlsp[bestperiodind]],\n            1\n        )\n        prevperiod = sortedlspperiods[0]\n\n        # find the best nbestpeaks in the lsp and their periods\n        for period, lspval in zip(sortedlspperiods, sortedlspvals):\n\n            if peakcount == nbestpeaks:\n                break\n            perioddiff = abs(period - prevperiod)\n            bestperiodsdiff = [abs(period - x) for x in nbestperiods]\n\n            # print('prevperiod = %s, thisperiod = %s, '\n            #       'perioddiff = %s, peakcount = %s' %\n 
           #       (prevperiod, period, perioddiff, peakcount))\n\n            # this ensures that this period is different from the last period\n            # and from all the other existing best periods by periodepsilon to\n            # make sure we jump to an entire different peak in the periodogram\n            if (perioddiff > periodepsilon and\n                all(x > periodepsilon for x in bestperiodsdiff)):\n                nbestperiods.append(period)\n                nbestlspvals.append(lspval)\n                peakcount = peakcount + 1\n\n            prevperiod = period\n\n\n        return {'bestperiod':finperiods[bestperiodind],\n                'bestlspval':finlsp[bestperiodind],\n                'nbestpeaks':nbestpeaks,\n                'nbestlspvals':nbestlspvals,\n                'nbestperiods':nbestperiods,\n                'lspvals':lsp,\n                'omegas':omegas,\n                'periods':periods,\n                'method':'sls'}\n\n    else:\n\n        LOGERROR('no good detections for these times and mags, skipping...')\n        return {'bestperiod':npnan,\n                'bestlspval':npnan,\n                'nbestpeaks':nbestpeaks,\n                'nbestlspvals':None,\n                'nbestperiods':None,\n                'lspvals':None,\n                'periods':None,\n                'method':'sls'}", "response": "This function runs the scipy Lomb-Scargle periodogram in parallel over sections of the omegas array, then finds the nbestpeaks periodogram peaks that are separated by at least periodepsilon in period and returns the results as a dict."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef _periodicfeatures_worker(task):\n    '''\n    This is a parallel worker for the drivers below.\n\n    '''\n\n    pfpickle, lcbasedir, outdir, starfeatures, kwargs = task\n\n    try:\n\n        return get_periodicfeatures(pfpickle,\n                                    lcbasedir,\n                                    outdir,\n                                    starfeatures=starfeatures,\n                                    **kwargs)\n\n    except Exception as e:\n\n        LOGEXCEPTION('failed to get periodicfeatures for %s' % pfpickle)", "response": "This is a parallel worker for the drivers below. It calls get_periodicfeatures for a single period-finder result pickle and logs an exception if that fails."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef parallel_periodicfeatures_lcdir(\n        pfpkl_dir,\n        lcbasedir,\n        outdir,\n        pfpkl_glob='periodfinding-*.pkl*',\n        starfeaturesdir=None,\n        fourierorder=5,\n        # these are depth, duration, ingress duration\n        transitparams=(-0.01,0.1,0.1),\n        # these are depth, duration, depth ratio, secphase\n        ebparams=(-0.2,0.3,0.7,0.5),\n        pdiff_threshold=1.0e-4,\n        sidereal_threshold=1.0e-4,\n        sampling_peak_multiplier=5.0,\n        sampling_startp=None,\n        sampling_endp=None,\n        timecols=None,\n        magcols=None,\n        errcols=None,\n        lcformat='hat-sql',\n        lcformatdir=None,\n        sigclip=10.0,\n        verbose=False,\n        maxobjects=None,\n        nworkers=NCPUS,\n        recursive=True,\n):\n    '''This runs parallel periodicfeature extraction for a directory of\n    periodfinding result pickles.\n\n    Parameters\n    ----------\n\n    pfpkl_dir : str\n        The directory containing the pickles to process.\n\n    lcbasedir : str\n        The directory where all of the associated light curve files are located.\n\n    outdir : str\n        The directory where all the output will be written.\n\n    pfpkl_glob : str\n        The UNIX file glob to use to search for period-finder result pickles in\n        `pfpkl_dir`.\n\n    starfeaturesdir : str or None\n        The directory containing the `starfeatures-<objectid>.pkl` files for\n        each object to use calculate neighbor proximity light curve features.\n\n    fourierorder : int\n        The Fourier order to use to generate sinusoidal function and fit that to\n        the phased light curve.\n\n    transitparams : list of floats\n        The transit depth, duration, and ingress duration to use to generate a\n        trapezoid planet transit model fit to the phased light curve. 
The period\n        used is the one provided in `period`, while the epoch is automatically\n        obtained from a spline fit to the phased light curve.\n\n    ebparams : list of floats\n        The primary eclipse depth, eclipse duration, the primary-secondary depth\n        ratio, and the phase of the secondary eclipse to use to generate an\n        eclipsing binary model fit to the phased light curve. The period used is\n        the one provided in `period`, while the epoch is automatically obtained\n        from a spline fit to the phased light curve.\n\n    pdiff_threshold : float\n        This is the max difference between periods to consider them the same.\n\n    sidereal_threshold : float\n        This is the max difference between any of the 'best' periods and the\n        sidereal day periods to consider them the same.\n\n    sampling_peak_multiplier : float\n        This is the minimum multiplicative factor of a 'best' period's\n        normalized periodogram peak over the sampling periodogram peak at the\n        same period required to accept the 'best' period as possibly real.\n\n    sampling_startp, sampling_endp : float\n        If the `pgramlist` doesn't have a time-sampling Lomb-Scargle\n        periodogram, it will be obtained automatically. Use these kwargs to\n        control the minimum and maximum period interval to be searched when\n        generating this periodogram.\n\n    timecols : list of str or None\n        The timecol keys to use from the lcdict in calculating the features.\n\n    magcols : list of str or None\n        The magcol keys to use from the lcdict in calculating the features.\n\n    errcols : list of str or None\n        The errcol keys to use from the lcdict in calculating the features.\n\n    lcformat : str\n        This is the `formatkey` associated with your light curve format, which\n        you previously passed in to the `lcproc.register_lcformat`\n        function. 
This will be used to look up how to find and read the light\n        curves specified in `basedir` or `use_list_of_filenames`.\n\n    lcformatdir : str or None\n        If this is provided, gives the path to a directory when you've stored\n        your lcformat description JSONs, other than the usual directories lcproc\n        knows to search for them in. Use this along with `lcformat` to specify\n        an LC format JSON file that's not currently registered with lcproc.\n\n    sigclip : float or int or sequence of two floats/ints or None\n        If a single float or int, a symmetric sigma-clip will be performed using\n        the number provided as the sigma-multiplier to cut out from the input\n        time-series.\n\n        If a list of two ints/floats is provided, the function will perform an\n        'asymmetric' sigma-clip. The first element in this list is the sigma\n        value to use for fainter flux/mag values; the second element in this\n        list is the sigma value to use for brighter flux/mag values. For\n        example, `sigclip=[10., 3.]`, will sigclip out greater than 10-sigma\n        dimmings and greater than 3-sigma brightenings. 
Here the meaning of\n        \"dimming\" and \"brightening\" is set by *physics* (not the magnitude\n        system), which is why the `magsarefluxes` kwarg must be correctly set.\n\n        If `sigclip` is None, no sigma-clipping will be performed, and the\n        time-series (with non-finite elems removed) will be passed through to\n        the output.\n\n    verbose : bool\n        If True, will indicate progress while working.\n\n    maxobjects : int\n        The total number of objects to process from `pfpkl_list`.\n\n    nworkers : int\n        The number of parallel workers to launch to process the input.\n\n    Returns\n    -------\n\n    dict\n        A dict containing key: val pairs of the input period-finder result and\n        the output periodic feature result pickles for each input pickle is\n        returned.\n\n    '''\n\n    try:\n        formatinfo = get_lcformat(lcformat,\n                                  use_lcformat_dir=lcformatdir)\n        if formatinfo:\n            (dfileglob, readerfunc,\n             dtimecols, dmagcols, derrcols,\n             magsarefluxes, normfunc) = formatinfo\n        else:\n            LOGERROR(\"can't figure out the light curve format\")\n            return None\n    except Exception as e:\n        LOGEXCEPTION(\"can't figure out the light curve format\")\n        return None\n\n    fileglob = pfpkl_glob\n\n    # now find the files\n    LOGINFO('searching for periodfinding pickles in %s ...' 
% pfpkl_dir)\n\n    if recursive is False:\n        matching = glob.glob(os.path.join(pfpkl_dir, fileglob))\n\n    else:\n        # use recursive glob for Python 3.5+\n        if sys.version_info[:2] > (3,4):\n\n            matching = glob.glob(os.path.join(pfpkl_dir,\n                                              '**',\n                                              fileglob),recursive=True)\n\n        # otherwise, use os.walk and glob\n        else:\n\n            # use os.walk to go through the directories\n            walker = os.walk(pfpkl_dir)\n            matching = []\n\n            for root, dirs, _files in walker:\n                for sdir in dirs:\n                    searchpath = os.path.join(root,\n                                              sdir,\n                                              fileglob)\n                    foundfiles = glob.glob(searchpath)\n\n                    if foundfiles:\n                        matching.extend(foundfiles)\n\n\n    # now that we have all the files, process them\n    if matching and len(matching) > 0:\n\n        LOGINFO('found %s periodfinding pickles, getting periodicfeatures...' 
%\n                len(matching))\n\n        return parallel_periodicfeatures(\n            matching,\n            lcbasedir,\n            outdir,\n            starfeaturesdir=starfeaturesdir,\n            fourierorder=fourierorder,\n            transitparams=transitparams,\n            ebparams=ebparams,\n            pdiff_threshold=pdiff_threshold,\n            sidereal_threshold=sidereal_threshold,\n            sampling_peak_multiplier=sampling_peak_multiplier,\n            sampling_startp=sampling_startp,\n            sampling_endp=sampling_endp,\n            timecols=timecols,\n            magcols=magcols,\n            errcols=errcols,\n            lcformat=lcformat,\n            lcformatdir=lcformatdir,\n            sigclip=sigclip,\n            verbose=verbose,\n            maxobjects=maxobjects,\n            nworkers=nworkers,\n        )\n\n    else:\n\n        LOGERROR('no periodfinding pickles found in %s' % (pfpkl_dir))\n        return None", "response": "This function runs the parallel periodicfeature extraction for a directory of period-finder result pickles."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef angle_wrap(angle, radians=False):\n    '''Wraps the input angle to 360.0 degrees.\n\n    Parameters\n    ----------\n\n    angle : float\n        The angle to wrap around 360.0 deg.\n\n    radians : bool\n        If True, will assume that the input is in radians. The output will then\n        also be in radians.\n\n    Returns\n    -------\n\n    float\n        Wrapped angle. If radians is True: input is assumed to be in radians,\n        output is also in radians.\n\n    '''\n\n    if radians:\n        wrapped = angle % (2.0*pi_value)\n        if wrapped < 0.0:\n            wrapped = 2.0*pi_value + wrapped\n\n    else:\n\n        wrapped = angle % 360.0\n        if wrapped < 0.0:\n            wrapped = 360.0 + wrapped\n\n    return wrapped", "response": "Wraps the input angle to 360.0 degrees, or to 2*pi radians if radians is True."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\nconverting from decimal degrees to DD MM SS.", "response": "def decimal_to_dms(decimal_value):\n    '''Converts from decimal degrees (for declination coords) to DD:MM:SS.\n\n    Parameters\n    ----------\n\n    decimal_value : float\n        A decimal value to convert to degrees, minutes, seconds sexagesimal\n        format.\n\n    Returns\n    -------\n\n    tuple\n        A four element tuple is returned: (sign, DD, MM, SS.ssss...)\n\n    '''\n\n    if decimal_value < 0:\n        negative = True\n        dec_val = fabs(decimal_value)\n    else:\n        negative = False\n        dec_val = decimal_value\n\n    degrees = trunc(dec_val)\n    minutes_deg = dec_val - degrees\n\n    minutes_mm = minutes_deg * 60.0\n    minutes_out = trunc(minutes_mm)\n    seconds = (minutes_mm - minutes_out)*60.0\n\n    if negative:\n        return '-', degrees, minutes_out, seconds\n    else:\n        return '+', degrees, minutes_out, seconds"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\nconverting from decimal degrees to HHMMSS.", "response": "def decimal_to_hms(decimal_value):\n    '''Converts from decimal degrees (for RA coords) to HH:MM:SS.\n\n    Parameters\n    ----------\n\n    decimal_value : float\n        A decimal value to convert to hours, minutes, seconds. Negative values\n        will be wrapped around 360.0.\n\n    Returns\n    -------\n\n    tuple\n        A three element tuple is returned: (HH, MM, SS.ssss...)\n\n    '''\n\n    # wrap to 360.0\n    if decimal_value < 0:\n        dec_wrapped = 360.0 + decimal_value\n    else:\n        dec_wrapped = decimal_value\n\n    # convert to decimal hours first\n    dec_hours = dec_wrapped/15.0\n\n    if dec_hours < 0:\n        negative = True\n        dec_val = fabs(dec_hours)\n    else:\n        negative = False\n        dec_val = dec_hours\n\n    hours = trunc(dec_val)\n    minutes_hrs = dec_val - hours\n\n    minutes_mm = minutes_hrs * 60.0\n    minutes_out = trunc(minutes_mm)\n    seconds = (minutes_mm - minutes_out)*60.0\n\n    if negative:\n        hours = -hours\n        return hours, minutes_out, seconds\n    else:\n        return hours, minutes_out, seconds"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef hms_str_to_tuple(hms_string):\n    '''Converts a string of the form HH:MM:SS or HH MM SS to a tuple of the form\n    (HH, MM, SS).\n\n    Parameters\n    ----------\n\n    hms_string : str\n        A RA coordinate string of the form 'HH:MM:SS.sss' or 'HH MM SS.sss'.\n\n    Returns\n    -------\n\n    tuple\n        A three element tuple is returned (HH, MM, SS.ssss...)\n\n    '''\n\n    if ':' in hms_string:\n        separator = ':'\n    else:\n        separator = ' '\n\n    hh, mm, ss = hms_string.split(separator)\n\n    return int(hh), int(mm), float(ss)", "response": "Converts a string of the form HH:MM:SS or HH MM SS to a tuple of the form (HH, MM, SS.ssss...)."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef dms_str_to_tuple(dms_string):\n    '''Converts a string of the form [+-]DD:MM:SS or [+-]DD MM SS to a tuple of\n    the form (sign, DD, MM, SS).\n\n    Parameters\n    ----------\n\n    dms_string : str\n        A declination coordinate string of the form '[+-]DD:MM:SS.sss' or\n        '[+-]DD MM SS.sss'. The sign in front of DD is optional. If it's not\n        there, this function will assume that the coordinate string is a\n        positive value.\n\n    Returns\n    -------\n\n    tuple\n        A four element tuple of the form: (sign, DD, MM, SS.ssss...).\n\n    '''\n    if ':' in dms_string:\n        separator = ':'\n    else:\n        separator = ' '\n\n    sign_dd, mm, ss = dms_string.split(separator)\n    if sign_dd.startswith('+') or sign_dd.startswith('-'):\n        sign, dd = sign_dd[0], sign_dd[1:]\n    else:\n        sign, dd = '+', sign_dd\n\n    return sign, int(dd), int(mm), float(ss)", "response": "Converts a string of the form [+-]DD:MM:SS or [+-]DD MM SS to a tuple of the form (sign, DD, MM, SS.ssss...)."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef hms_to_decimal(hours, minutes, seconds, returndeg=True):\n    '''Converts from HH, MM, SS to a decimal value.\n\n    Parameters\n    ----------\n\n    hours : int\n        The HH part of a RA coordinate.\n\n    minutes : int\n        The MM part of a RA coordinate.\n\n    seconds : float\n        The SS.sss part of a RA coordinate.\n\n    returndeg : bool\n        If this is True, then will return decimal degrees as the output.\n        If this is False, then will return decimal HOURS as the output.\n        Decimal hours are sometimes used in FITS headers.\n\n    Returns\n    -------\n\n    float\n        The right ascension value in either decimal degrees or decimal hours\n        depending on `returndeg`.\n\n    '''\n\n    if hours > 24:\n\n        return None\n\n    else:\n\n        dec_hours = fabs(hours) + fabs(minutes)/60.0 + fabs(seconds)/3600.0\n\n        if returndeg:\n\n            dec_deg = dec_hours*15.0\n\n            if dec_deg < 0:\n                dec_deg = dec_deg + 360.0\n            dec_deg = dec_deg % 360.0\n            return dec_deg\n        else:\n            return dec_hours", "response": "Converts from HH, MM, SS to a decimal value, in decimal degrees if returndeg is True or in decimal hours otherwise."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\nconverting from DD:MM:SS to decimal degrees.", "response": "def dms_to_decimal(sign, degrees, minutes, seconds):\n    '''Converts from DD:MM:SS to a decimal value.\n\n    Parameters\n    ----------\n\n    sign : {'+', '-', ''}\n        The sign part of a Dec coordinate.\n\n    degrees : int\n        The DD part of a Dec coordinate.\n\n    minutes : int\n        The MM part of a Dec coordinate.\n\n    seconds : float\n        The SS.sss part of a Dec coordinate.\n\n    Returns\n    -------\n\n    float\n        The declination value in decimal degrees.\n\n    '''\n\n    dec_deg = fabs(degrees) + fabs(minutes)/60.0 + fabs(seconds)/3600.0\n\n    if sign == '-':\n        return -dec_deg\n    else:\n        return dec_deg"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef great_circle_dist(ra1, dec1, ra2, dec2):\n    '''Calculates the great circle angular distance between two coords.\n\n    This calculates the great circle angular distance in arcseconds between two\n    coordinates (ra1,dec1) and (ra2,dec2). This is basically a clone of GCIRC\n    from the IDL Astrolib.\n\n    Parameters\n    ----------\n\n    ra1,dec1 : float or array-like\n        The first coordinate's right ascension and declination value(s) in\n        decimal degrees.\n\n    ra2,dec2 : float or array-like\n        The second coordinate's right ascension and declination value(s) in\n        decimal degrees.\n\n    Returns\n    -------\n\n    float or array-like\n        Great circle distance between the two coordinates in arcseconds.\n\n    Notes\n    -----\n\n    If (`ra1`, `dec1`) is scalar and (`ra2`, `dec2`) is scalar: the result is a\n    float distance in arcseconds.\n\n    If (`ra1`, `dec1`) is scalar and (`ra2`, `dec2`) is array-like: the result\n    is an np.array with distance in arcseconds between (`ra1`, `dec1`) and each\n    element of (`ra2`, `dec2`).\n\n    If (`ra1`, `dec1`) is array-like and (`ra2`, `dec2`) is scalar: the result\n    is an np.array with distance in arcseconds between (`ra2`, `dec2`) and each\n    element of (`ra1`, `dec1`).\n\n    If (`ra1`, `dec1`) and (`ra2`, `dec2`) are both array-like: the result is an\n    np.array with the pair-wise distance in arcseconds between each element of\n    the two coordinate lists. 
In this case, if the input array-likes are not the\n    same length, then excess elements of the longer one will be ignored.\n\n    '''\n\n    # wrap RA if negative or larger than 360.0 deg\n    in_ra1 = ra1 % 360.0\n    in_ra1 = in_ra1 + 360.0*(in_ra1 < 0.0)\n    in_ra2 = ra2 % 360.0\n    in_ra2 = in_ra2 + 360.0*(in_ra2 < 0.0)\n\n    # convert to radians\n    ra1_rad, dec1_rad = np.deg2rad(in_ra1), np.deg2rad(dec1)\n    ra2_rad, dec2_rad = np.deg2rad(in_ra2), np.deg2rad(dec2)\n\n    del_dec2 = (dec2_rad - dec1_rad)/2.0\n    del_ra2 = (ra2_rad - ra1_rad)/2.0\n    sin_dist = np.sqrt(np.sin(del_dec2) * np.sin(del_dec2) +\n                       np.cos(dec1_rad) * np.cos(dec2_rad) *\n                       np.sin(del_ra2) * np.sin(del_ra2))\n\n    dist_rad = 2.0 * np.arcsin(sin_dist)\n\n    # return the distance in arcseconds\n    return np.rad2deg(dist_rad)*3600.0", "response": "Calculates the great circle angular distance in arcseconds between two coordinates."}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nfind the closest object in (`ra2`, `dec2`) to scalar coordinate pair (`ra1`, `dec1`) and return the distance in arcseconds. This is a quick matcher that uses the `great_circle_dist` function to find the closest object in (`ra2`, `dec2`) within `match_radius` arcseconds to (`ra1`, `dec1`). (`ra1`, `dec1`) must be a scalar pair, while (`ra2`, `dec2`) must be array-likes of the same lengths. Parameters ---------- ra1,dec1 : float Coordinate of the object to find matches to. In decimal degrees. ra2,dec2 : array-like The coordinates that will be searched for matches. In decimal degrees. match_radius : float The match radius in arcseconds to use for the match. Returns ------- tuple A two element tuple like the following:: (True -> found a match or False -> no match found, minimum distance between target and list in arcseconds)", "response": "def xmatch_basic(ra1, dec1, ra2, dec2, match_radius=5.0):\n    '''Finds the closest object in (`ra2`, `dec2`) to scalar coordinate pair\n    (`ra1`, `dec1`) and returns the distance in arcseconds.\n\n    This is a quick matcher that uses the `great_circle_dist` function to find\n    the closest object in (`ra2`, `dec2`) within `match_radius` arcseconds to\n    (`ra1`, `dec1`). (`ra1`, `dec1`) must be a scalar pair, while\n    (`ra2`, `dec2`) must be array-likes of the same lengths.\n\n    Parameters\n    ----------\n\n    ra1,dec1 : float\n        Coordinate of the object to find matches to. In decimal degrees.\n\n    ra2,dec2 : array-like\n        The coordinates that will be searched for matches. 
In decimal degrees.\n\n    match_radius : float\n        The match radius in arcseconds to use for the match.\n\n    Returns\n    -------\n\n    tuple\n        A two element tuple like the following::\n\n            (True -> found a match or False -> no match found,\n             minimum distance between target and list in arcseconds)\n\n    '''\n\n    min_dist_arcsec = np.min(great_circle_dist(ra1,dec1,ra2,dec2))\n\n    if (min_dist_arcsec < match_radius):\n        return (True,min_dist_arcsec)\n    else:\n        return (False,min_dist_arcsec)"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef xmatch_neighbors(ra1, dec1,\n                     ra2, dec2,\n                     match_radius=60.0,\n                     includeself=False,\n                     sortresults=True):\n    '''Finds the closest objects in (`ra2`, `dec2`) to scalar coordinate pair\n    (`ra1`, `dec1`) and returns the indices of the objects that match.\n\n    This is a quick matcher that uses the `great_circle_dist` function to find\n    the closest object in (`ra2`, `dec2`) within `match_radius` arcseconds to\n    (`ra1`, `dec1`). (`ra1`, `dec1`) must be a scalar pair, while\n    (`ra2`, `dec2`) must be array-likes of the same lengths.\n\n    Parameters\n    ----------\n\n    ra1,dec1 : float\n        Coordinate of the object to find matches to. In decimal degrees.\n\n    ra2,dec2 : array-like\n        The coordinates that will be searched for matches. In decimal degrees.\n\n    match_radius : float\n        The match radius in arcseconds to use for the match.\n\n    includeself : bool\n        If this is True, the object itself will be included in the match\n        results.\n\n    sortresults : bool\n        If this is True, the match indices will be sorted by distance.\n\n    Returns\n    -------\n\n    tuple\n        A tuple like the following is returned::\n\n            (True -> matches found or False -> no matches found,\n             minimum distance between target and list,\n             np.array of indices where list of coordinates is\n             closer than `match_radius` arcseconds from the target,\n             np.array of distances in arcseconds)\n\n    '''\n\n    dist = great_circle_dist(ra1,dec1,ra2,dec2)\n\n    if includeself:\n        match_dist_ind = np.where(dist < match_radius)\n\n    else:\n        # make sure we match only objects that are not the same as this object\n        match_dist_ind = np.where((dist < match_radius) & (dist > 0.1))\n\n    if 
match_dist_ind[0].size > 0:\n        match_dists = dist[match_dist_ind]\n        dist_sort_ind = np.argsort(match_dists)\n\n        if sortresults:\n            match_dist_ind = (match_dist_ind[0])[dist_sort_ind]\n\n        min_dist = np.min(match_dists)\n\n        return (True,min_dist,match_dist_ind,match_dists[dist_sort_ind])\n\n    else:\n        return (False,)", "response": "Finds the objects in (`ra2`, `dec2`) that lie within `match_radius` arcseconds of the scalar coordinate pair (`ra1`, `dec1`) and returns their indices and distances."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef equatorial_to_galactic(ra, decl, equinox='J2000'):\n    '''This converts from equatorial coords to galactic coords.\n\n    Parameters\n    ----------\n\n    ra : float or array-like\n        Right ascension values(s) in decimal degrees.\n\n    decl : float or array-like\n        Declination value(s) in decimal degrees.\n\n    equinox : str\n        The equinox that the coordinates are measured at. This must be\n        recognizable by Astropy's `SkyCoord` class.\n\n    Returns\n    -------\n\n    tuple of (float, float) or tuple of (np.array, np.array)\n        The galactic coordinates (l, b) for each element of the input\n        (`ra`, `decl`).\n\n    '''\n\n    # convert the ra/decl to gl, gb\n    radecl = SkyCoord(ra=ra*u.degree, dec=decl*u.degree, equinox=equinox)\n\n    gl = radecl.galactic.l.degree\n    gb = radecl.galactic.b.degree\n\n    return gl, gb", "response": "This converts from equatorial coordinates to galactic coordinates."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef generate_transit_lightcurve(\n        times,\n        mags=None,\n        errs=None,\n        paramdists={'transitperiod':sps.uniform(loc=0.1,scale=49.9),\n                    'transitdepth':sps.uniform(loc=1.0e-4,scale=2.0e-2),\n                    'transitduration':sps.uniform(loc=0.01,scale=0.29)},\n        magsarefluxes=False,\n):\n    '''This generates fake planet transit light curves.\n\n    Parameters\n    ----------\n\n    times : np.array\n        This is an array of time values that will be used as the time base.\n\n    mags,errs : np.array\n        These arrays will have the model added to them. If either is\n        None, `np.full_like(times, 0.0)` will used as a substitute and the model\n        light curve will be centered around 0.0.\n\n    paramdists : dict\n        This is a dict containing parameter distributions to use for the\n        model params, containing the following keys ::\n\n            {'transitperiod', 'transitdepth', 'transitduration'}\n\n        The values of these keys should all be 'frozen' scipy.stats distribution\n        objects, e.g.:\n\n        https://docs.scipy.org/doc/scipy/reference/stats.html#continuous-distributions\n        The variability epoch will be automatically chosen from a uniform\n        distribution between `times.min()` and `times.max()`.\n\n        The ingress duration will be automatically chosen from a uniform\n        distribution ranging from 0.05 to 0.5 of the transitduration.\n\n        The transitdepth will be flipped automatically as appropriate if\n        `magsarefluxes=True`.\n\n    magsarefluxes : bool\n        If the generated time series is meant to be a flux time-series, set this\n        to True to get the correct sign of variability amplitude.\n\n    Returns\n    -------\n\n    dict\n        A dict of the form below is returned::\n\n            {'vartype': 'planet',\n       
      'params': {'transitperiod': generated value of period,\n                        'transitepoch': generated value of epoch,\n                        'transitdepth': generated value of transit depth,\n                        'transitduration': generated value of transit duration,\n                        'ingressduration': generated value of transit ingress\n                                           duration},\n             'times': the model times,\n             'mags': the model mags,\n             'errs': the model errs,\n             'varperiod': the generated period of variability == 'transitperiod'\n             'varamplitude': the generated amplitude of\n                             variability == 'transitdepth'}\n\n    '''\n\n    if mags is None:\n        mags = np.full_like(times, 0.0)\n\n    if errs is None:\n        errs = np.full_like(times, 0.0)\n\n    # choose the epoch\n    epoch = npr.random()*(times.max() - times.min()) + times.min()\n\n    # choose the period, depth, duration\n    period = paramdists['transitperiod'].rvs(size=1)\n    depth = paramdists['transitdepth'].rvs(size=1)\n    duration = paramdists['transitduration'].rvs(size=1)\n\n    # figure out the ingress duration\n    ingduration = npr.random()*(0.5*duration - 0.05*duration) + 0.05*duration\n\n    # fix the transit depth if it needs to be flipped\n    if magsarefluxes and depth < 0.0:\n        depth = -depth\n    elif not magsarefluxes and depth > 0.0:\n        depth = -depth\n\n    # generate the model\n    modelmags, phase, ptimes, pmags, perrs = (\n        transits.trapezoid_transit_func([period, epoch, depth,\n                                         duration, ingduration],\n                                        times,\n                                        mags,\n                                        errs)\n    )\n\n    # resort in original time order\n    timeind = np.argsort(ptimes)\n    mtimes = ptimes[timeind]\n    mmags = modelmags[timeind]\n    merrs = 
perrs[timeind]\n\n    # return a dict with everything\n    modeldict = {\n        'vartype':'planet',\n        'params':{x:np.asscalar(y) for x,y in zip(['transitperiod',\n                                                   'transitepoch',\n                                                   'transitdepth',\n                                                   'transitduration',\n                                                   'ingressduration'],\n                                                  [period,\n                                                   epoch,\n                                                   depth,\n                                                   duration,\n                                                   ingduration])},\n        'times':mtimes,\n        'mags':mmags,\n        'errs':merrs,\n        # these are standard keys that help with later characterization of\n        # variability as a function of period, variability amplitude, object mag,\n        # ndet, etc.\n        'varperiod':period,\n        'varamplitude':depth\n    }\n\n    return modeldict", "response": "This function generates a fake planet transit light curve on the given time base, adding the transit model to the input mags and errs."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef generate_eb_lightcurve(\n        times,\n        mags=None,\n        errs=None,\n        paramdists={'period':sps.uniform(loc=0.2,scale=99.8),\n                    'pdepth':sps.uniform(loc=1.0e-4,scale=0.7),\n                    'pduration':sps.uniform(loc=0.01,scale=0.44),\n                    'depthratio':sps.uniform(loc=0.01,scale=0.99),\n                    'secphase':sps.norm(loc=0.5,scale=0.1)},\n        magsarefluxes=False,\n):\n    '''This generates fake EB light curves.\n\n    Parameters\n    ----------\n\n    times : np.array\n        This is an array of time values that will be used as the time base.\n\n    mags,errs : np.array\n        These arrays will have the model added to them. If either is\n        None, `np.full_like(times, 0.0)` will used as a substitute and the model\n        light curve will be centered around 0.0.\n\n    paramdists : dict\n        This is a dict containing parameter distributions to use for the\n        model params, containing the following keys ::\n\n            {'period', 'pdepth', 'pduration', 'depthratio', 'secphase'}\n\n        The values of these keys should all be 'frozen' scipy.stats distribution\n        objects, e.g.:\n\n        https://docs.scipy.org/doc/scipy/reference/stats.html#continuous-distributions\n        The variability epoch will be automatically chosen from a uniform\n        distribution between `times.min()` and `times.max()`.\n\n        The `pdepth` will be flipped automatically as appropriate if\n        `magsarefluxes=True`.\n\n    magsarefluxes : bool\n        If the generated time series is meant to be a flux time-series, set this\n        to True to get the correct sign of variability amplitude.\n\n    Returns\n    -------\n\n    dict\n        A dict of the form below is returned::\n\n            {'vartype': 'EB',\n             'params': {'period': generated value of period,\n                        
'epoch': generated value of epoch,\n                        'pdepth': generated value of primary eclipse depth,\n                        'pduration': generated value of primary eclipse duration,\n                        'depthratio': generated value of prim/sec eclipse\n                                      depth ratio},\n             'times': the model times,\n             'mags': the model mags,\n             'errs': the model errs,\n             'varperiod': the generated period of variability == 'period'\n             'varamplitude': the generated amplitude of\n                             variability == 'pdepth'}\n\n    '''\n\n    if mags is None:\n        mags = np.full_like(times, 0.0)\n\n    if errs is None:\n        errs = np.full_like(times, 0.0)\n\n    # choose the epoch\n    epoch = npr.random()*(times.max() - times.min()) + times.min()\n\n    # choose the period, pdepth, duration, depthratio\n    period = paramdists['period'].rvs(size=1)\n    pdepth = paramdists['pdepth'].rvs(size=1)\n    pduration = paramdists['pduration'].rvs(size=1)\n    depthratio = paramdists['depthratio'].rvs(size=1)\n    secphase = paramdists['secphase'].rvs(size=1)\n\n    # fix the sign of the primary eclipse depth if it needs to be flipped\n    if magsarefluxes and pdepth < 0.0:\n        pdepth = -pdepth\n    elif not magsarefluxes and pdepth > 0.0:\n        pdepth = -pdepth\n\n    # generate the model\n    modelmags, phase, ptimes, pmags, perrs = (\n        eclipses.invgauss_eclipses_func([period, epoch, pdepth,\n                                         pduration, depthratio, secphase],\n                                        times,\n                                        mags,\n                                        errs)\n    )\n\n    # resort in original time order\n    timeind = np.argsort(ptimes)\n    mtimes = ptimes[timeind]\n    mmags = modelmags[timeind]\n    merrs = perrs[timeind]\n\n    # return a dict with everything\n    modeldict = {\n        'vartype':'EB',\n        
'params':{x:np.asscalar(y) for x,y in zip(['period',\n                                                   'epoch',\n                                                   'pdepth',\n                                                   'pduration',\n                                                   'depthratio'],\n                                                  [period,\n                                                   epoch,\n                                                   pdepth,\n                                                   pduration,\n                                                   depthratio])},\n        'times':mtimes,\n        'mags':mmags,\n        'errs':merrs,\n        'varperiod':period,\n        'varamplitude':pdepth,\n    }\n\n    return modeldict", "response": "This function generates a fake eclipsing binary (EB) light curve on the given time base."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef generate_flare_lightcurve(\n        times,\n        mags=None,\n        errs=None,\n        paramdists={\n            # flare peak amplitude from 0.01 mag to 1.0 mag above median.  this\n            # is tuned for redder bands, flares are much stronger in bluer\n            # bands, so tune appropriately for your situation.\n            'amplitude':sps.uniform(loc=0.01,scale=0.99),\n            # up to 5 flares per LC and at least 1\n            'nflares':[1,5],\n            # 10 minutes to 1 hour for rise stdev\n            'risestdev':sps.uniform(loc=0.007, scale=0.04),\n            # 1 hour to 4 hours for decay time constant\n            'decayconst':sps.uniform(loc=0.04, scale=0.163)\n        },\n        magsarefluxes=False,\n):\n    '''This generates fake flare light curves.\n\n    Parameters\n    ----------\n\n    times : np.array\n        This is an array of time values that will be used as the time base.\n\n    mags,errs : np.array\n        These arrays will have the model added to them. 
If either is\n        None, `np.full_like(times, 0.0)` will used as a substitute and the model\n        light curve will be centered around 0.0.\n\n    paramdists : dict\n        This is a dict containing parameter distributions to use for the\n        model params, containing the following keys ::\n\n            {'amplitude', 'nflares', 'risestdev', 'decayconst'}\n\n        The values of these keys should all be 'frozen' scipy.stats distribution\n        objects, e.g.:\n\n        https://docs.scipy.org/doc/scipy/reference/stats.html#continuous-distributions\n        The `flare_peak_time` for each flare will be generated automatically\n        between `times.min()` and `times.max()` using a uniform distribution.\n\n        The `amplitude` will be flipped automatically as appropriate if\n        `magsarefluxes=True`.\n\n    magsarefluxes : bool\n        If the generated time series is meant to be a flux time-series, set this\n        to True to get the correct sign of variability amplitude.\n\n    Returns\n    -------\n\n    dict\n        A dict of the form below is returned::\n\n            {'vartype': 'flare',\n             'params': {'amplitude': generated value of flare amplitudes,\n                        'nflares': generated value of number of flares,\n                        'risestdev': generated value of stdev of rise time,\n                        'decayconst': generated value of decay constant,\n                        'peaktime': generated value of flare peak time},\n             'times': the model times,\n             'mags': the model mags,\n             'errs': the model errs,\n             'varamplitude': the generated amplitude of\n                             variability == 'amplitude'}\n\n    '''\n\n    if mags is None:\n        mags = np.full_like(times, 0.0)\n\n    if errs is None:\n        errs = np.full_like(times, 0.0)\n\n    nflares = npr.randint(paramdists['nflares'][0],\n                          high=paramdists['nflares'][1])\n\n    # 
generate random flare peak times based on the number of flares\n    flarepeaktimes = (\n        npr.random(\n            size=nflares\n        )*(times.max() - times.min()) + times.min()\n    )\n\n    # now add the flares to the time-series\n    params = {'nflares':nflares}\n\n    for flareind, peaktime in zip(range(nflares), flarepeaktimes):\n\n        # choose the amplitude, rise stdev and decay time constant\n        amp = paramdists['amplitude'].rvs(size=1)\n        risestdev = paramdists['risestdev'].rvs(size=1)\n        decayconst = paramdists['decayconst'].rvs(size=1)\n\n        # fix the sign of the flare amplitude if it needs to be flipped\n        if magsarefluxes and amp < 0.0:\n            amp = -amp\n        elif not magsarefluxes and amp > 0.0:\n            amp = -amp\n\n        # add this flare to the light curve\n        modelmags, ptimes, pmags, perrs = (\n            flares.flare_model(\n                [amp, peaktime, risestdev, decayconst],\n                times,\n                mags,\n                errs\n            )\n        )\n\n        # update the mags\n        mags = modelmags\n\n        # add the flare params to the modeldict\n        params[flareind] = {'peaktime':peaktime,\n                            'amplitude':amp,\n                            'risestdev':risestdev,\n                            'decayconst':decayconst}\n\n\n    #\n    # done with all flares\n    #\n\n    # return a dict with everything\n    modeldict = {\n        'vartype':'flare',\n        'params':params,\n        'times':times,\n        'mags':mags,\n        'errs':errs,\n        'varperiod':None,\n        # FIXME: this is complicated because we can have multiple flares\n        # figure out a good way to handle this upstream\n        'varamplitude':[params[x]['amplitude']\n                        for x in range(params['nflares'])],\n    }\n\n    return modeldict", "response": "This function generates a fake flare light curve by adding one or more randomly parameterized flares to the given time base."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef generate_rrab_lightcurve(\n        times,\n        mags=None,\n        errs=None,\n        paramdists={\n            'period':sps.uniform(loc=0.45,scale=0.35),\n            'fourierorder':[8,11],\n            'amplitude':sps.uniform(loc=0.4,scale=0.5),\n            'phioffset':np.pi,\n        },\n        magsarefluxes=False\n):\n    '''This generates fake RRab light curves.\n\n    Parameters\n    ----------\n\n    times : np.array\n        This is an array of time values that will be used as the time base.\n\n    mags,errs : np.array\n        These arrays will have the model added to them. If either is\n        None, `np.full_like(times, 0.0)` will used as a substitute and the model\n        light curve will be centered around 0.0.\n\n    paramdists : dict\n        This is a dict containing parameter distributions to use for the\n        model params, containing the following keys ::\n\n            {'period', 'fourierorder', 'amplitude'}\n\n        The values of these keys should all be 'frozen' scipy.stats distribution\n        objects, e.g.:\n\n        https://docs.scipy.org/doc/scipy/reference/stats.html#continuous-distributions\n        The variability epoch will be automatically chosen from a uniform\n        distribution between `times.min()` and `times.max()`.\n\n        The `amplitude` will be flipped automatically as appropriate if\n        `magsarefluxes=True`.\n\n    magsarefluxes : bool\n        If the generated time series is meant to be a flux time-series, set this\n        to True to get the correct sign of variability amplitude.\n\n    Returns\n    -------\n\n    dict\n        A dict of the form below is returned::\n\n            {'vartype': 'RRab',\n             'params': {'period': generated value of period,\n                        'epoch': generated value of epoch,\n                        'amplitude': generated value of amplitude,\n       
                 'fourierorder': generated value of fourier order,\n                        'fourieramps': generated values of fourier amplitudes,\n                        'fourierphases': generated values of fourier phases},\n             'times': the model times,\n             'mags': the model mags,\n             'errs': the model errs,\n             'varperiod': the generated period of variability == 'period'\n             'varamplitude': the generated amplitude of\n                             variability == 'amplitude'}\n\n    '''\n\n    modeldict = generate_sinusoidal_lightcurve(times,\n                                               mags=mags,\n                                               errs=errs,\n                                               paramdists=paramdists,\n                                               magsarefluxes=magsarefluxes)\n    modeldict['vartype'] = 'RRab'\n    return modeldict", "response": "This function generates a fake RRab light curve on the given time base by calling `generate_sinusoidal_lightcurve` and setting the result's `vartype` to 'RRab'."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef make_fakelc(lcfile,\n                outdir,\n                magrms=None,\n                randomizemags=True,\n                randomizecoords=False,\n                lcformat='hat-sql',\n                lcformatdir=None,\n                timecols=None,\n                magcols=None,\n                errcols=None):\n    '''This preprocesses an input real LC and sets it up to be a fake LC.\n\n    Parameters\n    ----------\n\n    lcfile : str\n        This is an input light curve file that will be used to copy over the\n        time-base. This will be used to generate the time-base for fake light\n        curves to provide a realistic simulation of the observing window\n        function.\n\n    outdir : str\n        The output directory where the fake light curve will be written.\n\n    magrms : dict\n        This is a dict containing the SDSS r mag-RMS (SDSS rmag-MAD preferably)\n        relation based on all light curves that the input lcfile is from. This\n        will be used to generate the median mag and noise corresponding to the\n        magnitude chosen for this fake LC.\n\n    randomizemags : bool\n        If this is True, then a random mag between the first and last magbin in\n        magrms will be chosen as the median mag for this light curve. This\n        choice will be weighted by the mag bin probability obtained from the\n        magrms kwarg. Otherwise, the median mag will be taken from the input\n        lcfile's lcdict['objectinfo']['sdssr'] key or a transformed SDSS r mag\n        generated from the input lcfile's lcdict['objectinfo']['jmag'],\n        ['hmag'], and ['kmag'] keys. 
The magrms relation for each magcol will be\n        used to generate Gaussian noise at the correct level for the magbin this\n        light curve's median mag falls into.\n\n    randomizecoords : bool\n        If this is True, will randomize the RA, DEC of the output fake object\n        and not copy over the RA/DEC from the real input object.\n\n    lcformat : str\n        This is the `formatkey` associated with your input real light curve\n        format, which you previously passed in to the `lcproc.register_lcformat`\n        function. This will be used to look up how to find and read the light\n        curve specified in `lcfile`.\n\n    lcformatdir : str or None\n        If this is provided, gives the path to a directory where you've stored\n        your lcformat description JSONs, other than the usual directories lcproc\n        knows to search for them in. Use this along with `lcformat` to specify\n        an LC format JSON file that's not currently registered with lcproc.\n\n    timecols : list of str or None\n        The timecol keys to use from the input lcdict in generating the fake\n        light curve. Fake LCs will be generated for each\n        timecol/magcol/errcol combination in the input light curve.\n\n    magcols : list of str or None\n        The magcol keys to use from the input lcdict in generating the fake\n        light curve. Fake LCs will be generated for each\n        timecol/magcol/errcol combination in the input light curve.\n\n    errcols : list of str or None\n        The errcol keys to use from the input lcdict in generating the fake\n        light curve. 
Fake LCs will be generated for each each\n        timecol/magcol/errcol combination in the input light curve.\n\n    Returns\n    -------\n\n    tuple\n        A tuple of the following form is returned::\n\n            (fakelc_fpath,\n             fakelc_lcdict['columns'],\n             fakelc_lcdict['objectinfo'],\n             fakelc_lcdict['moments'])\n\n    '''\n\n    try:\n        formatinfo = get_lcformat(lcformat,\n                                  use_lcformat_dir=lcformatdir)\n        if formatinfo:\n            (fileglob, readerfunc,\n             dtimecols, dmagcols, derrcols,\n             magsarefluxes, normfunc) = formatinfo\n        else:\n            LOGERROR(\"can't figure out the light curve format\")\n            return None\n    except Exception as e:\n        LOGEXCEPTION(\"can't figure out the light curve format\")\n        return None\n\n    # override the default timecols, magcols, and errcols\n    # using the ones provided to the function\n    if timecols is None:\n        timecols = dtimecols\n    if magcols is None:\n        magcols = dmagcols\n    if errcols is None:\n        errcols = derrcols\n\n    # read in the light curve\n    lcdict = readerfunc(lcfile)\n    if isinstance(lcdict, tuple) and isinstance(lcdict[0],dict):\n        lcdict = lcdict[0]\n\n    # set up the fakelcdict with a randomly assigned objectid\n    fakeobjectid = sha512(npr.bytes(12)).hexdigest()[-8:]\n    fakelcdict = {\n        'objectid':fakeobjectid,\n        'objectinfo':{'objectid':fakeobjectid},\n        'columns':[],\n        'moments':{},\n        'origformat':lcformat,\n    }\n\n\n    # now, get the actual mag of this object and other info and use that to\n    # populate the corresponding entries of the fakelcdict objectinfo\n    if ('objectinfo' in lcdict and\n        isinstance(lcdict['objectinfo'], dict)):\n\n        objectinfo = lcdict['objectinfo']\n\n        # get the RA\n        if (not randomizecoords and 'ra' in objectinfo and\n            
objectinfo['ra'] is not None and\n            np.isfinite(objectinfo['ra'])):\n\n            fakelcdict['objectinfo']['ra'] = objectinfo['ra']\n\n        else:\n\n            # if there's no RA available, we'll assign a random one between 0\n            # and 360.0\n            LOGWARNING('%s: assigning a random right ascension' % lcfile)\n            fakelcdict['objectinfo']['ra'] = npr.random()*360.0\n\n        # get the DEC\n        if (not randomizecoords and 'decl' in objectinfo and\n            objectinfo['decl'] is not None and\n            np.isfinite(objectinfo['decl'])):\n\n            fakelcdict['objectinfo']['decl'] = objectinfo['decl']\n\n        else:\n\n            # if there's no DECL available, we'll assign a random one between\n            # -90.0 and +90.0\n            LOGWARNING(' %s: assigning a random declination' % lcfile)\n            fakelcdict['objectinfo']['decl'] = npr.random()*180.0 - 90.0\n\n        # get the SDSS r mag for this object\n        # this will be used for getting the eventual mag-RMS relation later\n        if ((not randomizemags) and 'sdssr' in objectinfo and\n            objectinfo['sdssr'] is not None and\n            np.isfinite(objectinfo['sdssr'])):\n\n            fakelcdict['objectinfo']['sdssr'] = objectinfo['sdssr']\n\n        # if the SDSS r is unavailable, but we have J, H, K: use those to get\n        # the SDSS r by using transformations\n        elif ((not randomizemags) and ('jmag' in objectinfo and\n                                       objectinfo['jmag'] is not None and\n                                       np.isfinite(objectinfo['jmag'])) and\n              ('hmag' in objectinfo and\n               objectinfo['hmag'] is not None and\n               np.isfinite(objectinfo['hmag'])) and\n              ('kmag' in objectinfo and\n               objectinfo['kmag'] is not None and\n               np.isfinite(objectinfo['kmag']))):\n\n            LOGWARNING('used JHK mags to generate an SDSS r mag for %s' %\n 
                      lcfile)\n            fakelcdict['objectinfo']['sdssr'] = jhk_to_sdssr(\n                objectinfo['jmag'],\n                objectinfo['hmag'],\n                objectinfo['kmag']\n            )\n\n        # if we're specifically told to randomize the mags and have a mag-RMS\n        # relation, pick a random mag weighted by the mag bin probabilities\n        elif randomizemags and magrms:\n\n            LOGWARNING(' %s: assigning a random mag weighted by mag '\n                       'bin probabilities' % lcfile)\n\n            magbins = magrms[magcols[0]]['binned_sdssr_median']\n            binprobs = magrms[magcols[0]]['magbin_probabilities']\n\n            # this is the center of the magbin chosen\n            magbincenter = npr.choice(magbins,size=1,p=binprobs)\n\n            # in this magbin, choose between center and -+ 0.25 mag\n            chosenmag = (\n                npr.random()*((magbincenter+0.25) - (magbincenter-0.25)) +\n                (magbincenter-0.25)\n            )\n\n            fakelcdict['objectinfo']['sdssr'] = np.asscalar(chosenmag)\n\n        # if there are no mags available at all, generate a random mag\n        # between 8 and 16.0\n        else:\n\n            LOGWARNING(' %s: assigning a random mag from '\n                       'uniform distribution between 8.0 and 16.0' % lcfile)\n\n            fakelcdict['objectinfo']['sdssr'] = npr.random()*8.0 + 8.0\n\n    # if there's no info available, generate fake info\n    else:\n\n        LOGWARNING('no object information found in %s, '\n                   'generating random ra, decl, sdssr' %\n                   lcfile)\n        fakelcdict['objectinfo']['ra'] = npr.random()*360.0\n        fakelcdict['objectinfo']['decl'] = npr.random()*180.0 - 90.0\n        fakelcdict['objectinfo']['sdssr'] = npr.random()*8.0 + 8.0\n\n\n    #\n    # NOW FILL IN THE TIMES, MAGS, ERRS\n    #\n\n    # get the time columns\n    for tcind, tcol in enumerate(timecols):\n\n        if '.' 
in tcol:\n            tcolget = tcol.split('.')\n        else:\n            tcolget = [tcol]\n\n        if tcol not in fakelcdict:\n            fakelcdict[tcol] = _dict_get(lcdict, tcolget)\n            fakelcdict['columns'].append(tcol)\n\n            # update the ndet with the first time column's size. it's possible\n            # that different time columns have different lengths, but that would\n            # be weird and we won't deal with it for now\n            if tcind == 0:\n                fakelcdict['objectinfo']['ndet'] = fakelcdict[tcol].size\n\n\n    # get the mag columns\n    for mcol in magcols:\n\n        if '.' in mcol:\n            mcolget = mcol.split('.')\n        else:\n            mcolget = [mcol]\n\n        # put the mcol in only once\n        if mcol not in fakelcdict:\n\n            measuredmags = _dict_get(lcdict, mcolget)\n            measuredmags = measuredmags[np.isfinite(measuredmags)]\n\n            # if we're randomizing, get the mags from the interpolated mag-RMS\n            # relation\n            if (randomizemags and magrms and mcol in magrms and\n                'interpolated_magmad' in magrms[mcol] and\n                magrms[mcol]['interpolated_magmad'] is not None):\n\n                interpfunc = magrms[mcol]['interpolated_magmad']\n                lcmad = interpfunc(fakelcdict['objectinfo']['sdssr'])\n\n                fakelcdict['moments'][mcol] = {\n                    'median': fakelcdict['objectinfo']['sdssr'],\n                    'mad': lcmad\n                }\n\n            # if we're not randomizing, get the median and MAD from the light\n            # curve itself\n            else:\n\n                # we require at least 10 finite measurements\n                if measuredmags.size > 9:\n\n                    measuredmedian = np.median(measuredmags)\n                    measuredmad = np.median(\n                        np.abs(measuredmags - measuredmedian)\n                    )\n                    
fakelcdict['moments'][mcol] = {'median':measuredmedian,\n                                                   'mad':measuredmad}\n\n                # if there aren't enough measurements in this LC, try to get the\n                # median and RMS from the interpolated mag-RMS relation first\n                else:\n\n                    if (magrms and mcol in magrms and\n                        'interpolated_magmad' in magrms[mcol] and\n                        magrms[mcol]['interpolated_magmad'] is not None):\n\n                        LOGWARNING(\n                            'input LC %s does not have enough '\n                            'finite measurements, '\n                            'generating mag moments from '\n                            'fakelc sdssr and the mag-RMS relation' % lcfile\n                        )\n\n                        interpfunc = magrms[mcol]['interpolated_magmad']\n                        lcmad = interpfunc(fakelcdict['objectinfo']['sdssr'])\n\n                        fakelcdict['moments'][mcol] = {\n                            'median': fakelcdict['objectinfo']['sdssr'],\n                            'mad': lcmad\n                        }\n\n                    # if we don't have the mag-RMS relation either, then we\n                    # can't do anything for this light curve, generate a random\n                    # MAD between 5e-4 and 0.1\n                    else:\n\n                        LOGWARNING(\n                            'input LC %s does not have enough '\n                            'finite measurements and '\n                            'no mag-RMS relation provided '\n                            'assigning a random MAD between 5.0e-4 and 0.1'\n                            % lcfile\n                        )\n\n                        fakelcdict['moments'][mcol] = {\n                            'median':fakelcdict['objectinfo']['sdssr'],\n                            'mad':npr.random()*(0.1 - 5.0e-4) + 5.0e-4\n      
                  }\n\n            # the magnitude column is set to all zeros initially. this will be\n            # filled in by the add_fakelc_variability function below\n            fakelcdict[mcol] = np.full_like(_dict_get(lcdict, mcolget), 0.0)\n            fakelcdict['columns'].append(mcol)\n\n\n    # get the err columns\n    for mcol, ecol in zip(magcols, errcols):\n\n        if '.' in ecol:\n            ecolget = ecol.split('.')\n        else:\n            ecolget = [ecol]\n\n        if ecol not in fakelcdict:\n\n            measurederrs = _dict_get(lcdict, ecolget)\n            measurederrs = measurederrs[np.isfinite(measurederrs)]\n\n            # if we're randomizing, get the errs from the interpolated mag-RMS\n            # relation\n            if (randomizemags and magrms and mcol in magrms and\n                'interpolated_magmad' in magrms[mcol] and\n                magrms[mcol]['interpolated_magmad'] is not None):\n\n                interpfunc = magrms[mcol]['interpolated_magmad']\n                lcmad = interpfunc(fakelcdict['objectinfo']['sdssr'])\n\n                # the median of the errs = lcmad\n                # the mad of the errs is 0.1 x lcmad\n                fakelcdict['moments'][ecol] = {\n                    'median': lcmad,\n                    'mad': 0.1*lcmad\n                }\n\n            else:\n\n                # we require at least 10 finite measurements\n                # we'll calculate the median and MAD of the errs to use later on\n                if measurederrs.size > 9:\n                    measuredmedian = np.median(measurederrs)\n                    measuredmad = np.median(\n                        np.abs(measurederrs - measuredmedian)\n                    )\n                    fakelcdict['moments'][ecol] = {'median':measuredmedian,\n                                                   'mad':measuredmad}\n                else:\n\n                    if (magrms and mcol in magrms and\n                        
'interpolated_magmad' in magrms[mcol] and\n                        magrms[mcol]['interpolated_magmad'] is not None):\n\n                        LOGWARNING(\n                            'input LC %s does not have enough '\n                            'finite measurements, '\n                            'generating err moments from '\n                            'the mag-RMS relation' % lcfile\n                        )\n\n                        interpfunc = magrms[mcol]['interpolated_magmad']\n                        lcmad = interpfunc(fakelcdict['objectinfo']['sdssr'])\n\n                        fakelcdict['moments'][ecol] = {\n                            'median': lcmad,\n                            'mad': 0.1*lcmad\n                        }\n\n                    # if we don't have the mag-RMS relation either, then we\n                    # can't do anything for this light curve, generate a random\n                    # MAD between 5e-4 and 0.1\n                    else:\n\n                        LOGWARNING(\n                            'input LC %s does not have '\n                            'enough finite measurements and '\n                            'no mag-RMS relation provided, '\n                            'generating errs randomly' % lcfile\n                        )\n                        fakelcdict['moments'][ecol] = {\n                            'median':npr.random()*(0.01 - 5.0e-4) + 5.0e-4,\n                            'mad':npr.random()*(0.01 - 5.0e-4) + 5.0e-4\n                        }\n\n            # the errors column is set to all zeros initially. 
this will be\n            # filled in by the add_fakelc_variability function below.\n            fakelcdict[ecol] = np.full_like(_dict_get(lcdict, ecolget), 0.0)\n            fakelcdict['columns'].append(ecol)\n\n\n\n    # add the timecols, magcols, errcols to the lcdict\n    fakelcdict['timecols'] = timecols\n    fakelcdict['magcols'] = magcols\n    fakelcdict['errcols'] = errcols\n\n    # generate an output file name\n    fakelcfname = '%s-fakelc.pkl' % fakelcdict['objectid']\n    fakelcfpath = os.path.abspath(os.path.join(outdir, fakelcfname))\n\n    # write this out to the output directory\n    with open(fakelcfpath,'wb') as outfd:\n        pickle.dump(fakelcdict, outfd, protocol=pickle.HIGHEST_PROTOCOL)\n\n    # return the fakelc path, its columns, info, and moments so we can put them\n    # into a collection DB later on\n    LOGINFO('real LC %s -> fake LC %s OK' % (lcfile, fakelcfpath))\n\n    return (fakelcfpath, fakelcdict['columns'],\n            fakelcdict['objectinfo'], fakelcdict['moments'])", "response": "This function preprocesses an input real LC, sets it up as a fake LC with measured or randomized object info and moments, and writes the fake LC to a pickle file."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef add_fakelc_variability(fakelcfile,\n                           vartype,\n                           override_paramdists=None,\n                           magsarefluxes=False,\n                           overwrite=False):\n    '''This adds variability of the specified type to the fake LC.\n\n    The procedure is (for each `magcol`):\n\n    - read the fakelcfile, get the stored moments and vartype info\n\n    - add the periodic variability specified in vartype and varparamdists. if\n      `vartype == None`, then do nothing in this step. If `override_vartype` is\n      not None, override stored vartype with specified vartype. If\n      `override_varparamdists` provided, override with specified\n      `varparamdists`. NOTE: the varparamdists must make sense for the vartype,\n      otherwise, weird stuff will happen.\n\n    - add the median mag level stored in `fakelcfile` to the time series\n\n    - add Gaussian noise to the light curve as specified in `fakelcfile`\n\n    - add a varinfo key and dict to the lcdict with `varperiod`, `varepoch`,\n      `varparams`\n\n    - write back to fake LC pickle\n\n    - return the `varinfo` dict to the caller\n\n    Parameters\n    ----------\n\n    fakelcfile : str\n        The name of the fake LC file to process.\n\n    vartype : str\n        The type of variability to add to this fake LC file.\n\n    override_paramdists : dict\n        A parameter distribution dict as in the `generate_XX_lightcurve`\n        functions above. 
If provided, will override the distribution stored in\n        the input fake LC file itself.\n\n    magsarefluxes : bool\n        Sets if the variability amplitude is in fluxes and not magnitudes.\n\n    overwrite : bool\n        This overwrites the input fake LC file with a new variable LC even if\n        it's been processed before.\n\n    Returns\n    -------\n\n    dict\n        A dict of the following form is returned::\n\n            {'objectid':lcdict['objectid'],\n             'lcfname':fakelcfile,\n             'actual_vartype':vartype,\n             'actual_varparams':lcdict['actual_varparams']}\n\n    '''\n\n    # read in the fakelcfile\n    lcdict = _read_pklc(fakelcfile)\n\n    # make sure to bail out if this light curve already has fake variability\n    # added\n    if ('actual_vartype' in lcdict and\n        'actual_varparams' in lcdict and\n        not overwrite):\n        LOGERROR('%s has existing variability type: %s '\n                 'and params: %s and overwrite = False, '\n                 'skipping this file...' %\n                 (fakelcfile, lcdict['actual_vartype'],\n                  repr(lcdict['actual_varparams'])))\n        return None\n\n    # get the times, mags, errs from this LC\n    timecols, magcols, errcols = (lcdict['timecols'],\n                                  lcdict['magcols'],\n                                  lcdict['errcols'])\n\n\n    # get the correct function to apply variability\n    if vartype in VARTYPE_LCGEN_MAP:\n        vargenfunc = VARTYPE_LCGEN_MAP[vartype]\n    elif vartype is None:\n        vargenfunc = None\n    else:\n        LOGERROR('unknown variability type: %s, choose from: %s' %\n                 (vartype, repr(list(VARTYPE_LCGEN_MAP.keys()))))\n        return None\n\n\n    # 1. generate the variability, including the overrides if provided; we do\n    # this outside the loop below to get the period, etc. distributions once\n    # only per object. 
NOTE: in doing so, we're assuming that the difference\n    # between magcols is just additive and the timebases for each magcol are the\n    # same; this is not strictly correct\n    if vargenfunc is not None:\n        if (override_paramdists is not None and\n            isinstance(override_paramdists,dict)):\n\n            variablelc = vargenfunc(lcdict[timecols[0]],\n                                    paramdists=override_paramdists,\n                                    magsarefluxes=magsarefluxes)\n\n        else:\n\n            variablelc = vargenfunc(lcdict[timecols[0]],\n                                    magsarefluxes=magsarefluxes)\n\n    # for nonvariables, don't execute vargenfunc, but return a similar dict\n    # so we can add the required noise to it\n    else:\n        variablelc = {'vartype':None,\n                      'params':None,\n                      'times':lcdict[timecols[0]],\n                      'mags':np.full_like(lcdict[timecols[0]], 0.0),\n                      'errs':np.full_like(lcdict[timecols[0]], 0.0)}\n\n\n    # now iterate over the time, mag, err columns\n    for tcol, mcol, ecol in zip(timecols, magcols, errcols):\n\n        times, mags, errs = lcdict[tcol], lcdict[mcol], lcdict[ecol]\n\n        # 2. get the moments for this magcol\n        mag_median = lcdict['moments'][mcol]['median']\n        mag_mad = lcdict['moments'][mcol]['mad']\n\n        # add up to 5 mmag of extra RMS for systematics and red-noise\n        mag_rms = mag_mad*1.483\n\n        err_median = lcdict['moments'][ecol]['median']\n        err_mad = lcdict['moments'][ecol]['mad']\n        err_rms = err_mad*1.483\n\n        # 3. add the median level + gaussian noise\n        magnoise = npr.normal(size=variablelc['mags'].size)*mag_rms\n        errnoise = npr.normal(size=variablelc['errs'].size)*err_rms\n\n        finalmags = mag_median + (variablelc['mags'] + magnoise)\n        finalerrs = err_median + (variablelc['errs'] + errnoise)\n\n        # 4. 
update these tcol, mcol, ecol values in the lcdict\n        lcdict[mcol] = finalmags\n        lcdict[ecol] = finalerrs\n\n    #\n    # all done with updating mags and errs\n    #\n\n    # 5. update the light curve with the variability info\n    lcdict['actual_vartype'] = variablelc['vartype']\n    lcdict['actual_varparams'] = variablelc['params']\n\n    # these standard keys are set to help out later with characterizing recovery\n    # rates by magnitude, period, amplitude, ndet, etc.\n    if vartype is not None:\n        lcdict['actual_varperiod'] = variablelc['varperiod']\n        lcdict['actual_varamplitude'] = variablelc['varamplitude']\n    else:\n        lcdict['actual_varperiod'] = np.nan\n        lcdict['actual_varamplitude'] = np.nan\n\n\n    # 6. write back, making sure to do it safely\n    tempoutf = '%s.%s' % (fakelcfile, md5(npr.bytes(4)).hexdigest()[-8:])\n    with open(tempoutf, 'wb') as outfd:\n        pickle.dump(lcdict, outfd, pickle.HIGHEST_PROTOCOL)\n\n    if os.path.exists(tempoutf):\n        shutil.copy(tempoutf, fakelcfile)\n        os.remove(tempoutf)\n    else:\n        LOGEXCEPTION('could not write output light curve file to dir: %s' %\n                     os.path.dirname(tempoutf))\n        # fail here\n        raise\n\n    LOGINFO('object: %s, vartype: %s -> %s OK' % (\n        lcdict['objectid'],\n        vartype,\n        fakelcfile)\n    )\n\n    return {'objectid':lcdict['objectid'],\n            'lcfname':fakelcfile,\n            'actual_vartype':vartype,\n            'actual_varparams':lcdict['actual_varparams']}", "response": "This function adds variability of the specified type, plus the stored median level and Gaussian noise, to a fake LC file."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef add_variability_to_fakelc_collection(simbasedir,\n                                         override_paramdists=None,\n                                         overwrite_existingvar=False):\n    '''This adds variability and noise to all fake LCs in `simbasedir`.\n\n    If an object is marked as variable in the `fakelcs-info.pkl` file in\n    `simbasedir`, a variable signal will be added to its light curve based on\n    its selected type, default period and amplitude distribution, the\n    appropriate params, etc. The epochs for each variable object will be chosen\n    uniformly from its time-range (and may not necessarily fall on an actual\n    observed time). Nonvariable objects will only have noise added as determined\n    by their params, but no variable signal will be added.\n\n    Parameters\n    ----------\n\n    simbasedir : str\n        The directory containing the fake LCs to process.\n\n    override_paramdists : dict\n        This can be used to override the stored variable parameters in each fake\n        LC. It should be a dict of the following form::\n\n            {'<vartype1>': {'<param1>: a scipy.stats distribution function or\n                                       the np.random.randint function,\n                            .\n                            .\n                            .\n                            '<paramN>: a scipy.stats distribution function\n                                       or the np.random.randint function}\n\n        for any vartype in VARTYPE_LCGEN_MAP. 
These are used to override the\n        default parameter distributions for each variable type.\n\n    overwrite_existingvar : bool\n        If this is True, then will overwrite any existing variability in the\n        input fake LCs in `simbasedir`.\n\n    Returns\n    -------\n\n    dict\n        This returns a dict containing the fake LC filenames as keys and\n        variability info for each as values.\n\n    '''\n\n    # open the fakelcs-info.pkl\n    infof = os.path.join(simbasedir,'fakelcs-info.pkl')\n    with open(infof, 'rb') as infd:\n        lcinfo = pickle.load(infd)\n\n\n    lclist = lcinfo['lcfpath']\n    varflag = lcinfo['isvariable']\n    vartypes = lcinfo['vartype']\n\n    vartind = 0\n\n    varinfo = {}\n\n    # go through all the LCs and add the required type of variability\n    for lc, varf, _lcind in zip(lclist, varflag, range(len(lclist))):\n\n        # if this object is variable, add variability\n        if varf:\n\n            thisvartype = vartypes[vartind]\n\n            if (override_paramdists and\n                isinstance(override_paramdists, dict) and\n                thisvartype in override_paramdists and\n                isinstance(override_paramdists[thisvartype], dict)):\n\n                thisoverride_paramdists = override_paramdists[thisvartype]\n            else:\n                thisoverride_paramdists = None\n\n\n            varlc = add_fakelc_variability(\n                lc, thisvartype,\n                override_paramdists=thisoverride_paramdists,\n                overwrite=overwrite_existingvar\n            )\n            varinfo[varlc['objectid']] = {'params': varlc['actual_varparams'],\n                                          'vartype': varlc['actual_vartype']}\n\n            # update vartind\n            vartind = vartind + 1\n\n        else:\n\n            varlc = add_fakelc_variability(\n                lc, None,\n                overwrite=overwrite_existingvar\n            )\n            
varinfo[varlc['objectid']] = {'params': varlc['actual_varparams'],\n                                          'vartype': varlc['actual_vartype']}\n\n\n    #\n    # done with all objects\n    #\n\n    # write the varinfo back to the dict and fakelcs-info.pkl\n    lcinfo['varinfo'] = varinfo\n\n    tempoutf = '%s.%s' % (infof, md5(npr.bytes(4)).hexdigest()[-8:])\n    with open(tempoutf, 'wb') as outfd:\n        pickle.dump(lcinfo, outfd, pickle.HIGHEST_PROTOCOL)\n\n    if os.path.exists(tempoutf):\n        shutil.copy(tempoutf, infof)\n        os.remove(tempoutf)\n    else:\n        LOGEXCEPTION('could not write output light curve file to dir: %s' %\n                     os.path.dirname(tempoutf))\n        # fail here\n        raise\n\n    return lcinfo", "response": "This function adds variability and noise to all fake LCs in the specified directory."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef add_flare_model(flareparams,\n                    times,\n                    mags,\n                    errs):\n    '''This adds a flare model function to the input magnitude/flux time-series.\n\n    Parameters\n    ----------\n\n    flareparams : list of float\n        This defines the flare model::\n\n            [amplitude,\n             flare_peak_time,\n             rise_gaussian_stdev,\n             decay_time_constant]\n\n        where:\n\n        `amplitude`: the maximum flare amplitude in mags or flux. If flux, then\n        amplitude should be positive. If mags, amplitude should be negative.\n\n        `flare_peak_time`: time at which the flare maximum happens.\n\n        `rise_gaussian_stdev`: the stdev of the gaussian describing the rise of\n        the flare.\n\n        `decay_time_constant`: the time constant of the exponential fall of the\n        flare.\n\n    times,mags,errs : np.array\n        The input time-series of measurements and associated errors for which\n        the model will be generated. 
The times will be used to generate\n        model mags.\n\n    magsarefluxes : bool\n        Sets the correct direction of the flare amplitude (+ve) for fluxes if\n        True and for mags (-ve) if False.\n\n    Returns\n    -------\n\n    dict\n        A dict of the form below is returned::\n\n        {'times': the original times array\n         'mags': the original mags + the flare model mags evaluated at times,\n         'errs': the original errs array,\n         'flareparams': the input list of flare params}\n\n    '''\n\n    modelmags, ftimes, fmags, ferrs = flares.flare_model(\n        flareparams,\n        times,\n        mags,\n        errs\n    )\n\n    return {'times':times,\n            'mags':mags + modelmags,\n            'errs':errs,\n            'flareparams':flareparams}", "response": "This function adds a flare model function to the input magnitude/flux time-series."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef _smooth_acf(acf, windowfwhm=7, windowsize=21):\n    '''This returns a smoothed version of the ACF.\n\n    Convolves the ACF with a Gaussian of given `windowsize` and `windowfwhm`.\n\n    Parameters\n    ----------\n\n    acf : np.array\n        The auto-correlation function array to smooth.\n\n    windowfwhm : int\n        The smoothing window Gaussian kernel's FWHM.\n\n    windowsize : int\n        The number of input points to apply the smoothing over.\n\n    Returns\n    -------\n\n    np.array\n        Smoothed version of the input ACF array.\n\n    '''\n\n    convkernel = Gaussian1DKernel(windowfwhm, x_size=windowsize)\n    smoothed = convolve(acf, convkernel, boundary='extend')\n\n    return smoothed", "response": "This function returns a smoothed version of the ACF, obtained by convolving it with a Gaussian kernel."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef _smooth_acf_savgol(acf, windowsize=21, polyorder=2):\n    '''\n    This returns a smoothed version of the ACF.\n\n    This version uses the Savitzky-Golay smoothing filter.\n\n    Parameters\n    ----------\n\n    acf : np.array\n        The auto-correlation function array to smooth.\n\n    windowsize : int\n        The number of input points to apply the smoothing over.\n\n    polyorder : int\n        The order of the polynomial to use in the Savitzky-Golay filter.\n\n    Returns\n    -------\n\n    np.array\n        Smoothed version of the input ACF array.\n\n    '''\n\n    smoothed = savgol_filter(acf, windowsize, polyorder)\n\n    return smoothed", "response": "This function returns a smoothed version of the ACF array using the Savitzky-Golay smoothing filter."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef _get_acf_peakheights(lags, acf, npeaks=20, searchinterval=1):\n    '''This calculates the relative peak heights for first npeaks in ACF.\n\n    Usually, the first peak or the second peak (if its peak height > first peak)\n    corresponds to the correct lag. When we know the correct lag, the period is\n    then::\n\n        bestperiod = time[lags == bestlag] - time[0]\n\n    Parameters\n    ----------\n\n    lags : np.array\n        An array of lags that the ACF is calculated at.\n\n    acf : np.array\n        The array containing the ACF values.\n\n    npeaks : int\n        The maximum number of peaks to consider when finding peak heights.\n\n    searchinterval : int\n        From `scipy.signal.argrelmax`: \"How many points on each side to use for\n        the comparison to consider comparator(n, n+x) to be True.\" This\n        effectively sets how many points on each side of the current peak will\n        be used to check if the current peak is the local maximum.\n\n    Returns\n    -------\n\n    dict\n        This returns a dict of the following form::\n\n            {'maxinds':the indices of the lag array where maxes are,\n             'maxacfs':the ACF values at each max,\n             'maxlags':the lag values at each max,\n             'mininds':the indices of the lag array where mins are,\n             'minacfs':the ACF values at each min,\n             'minlags':the lag values at each min,\n             'relpeakheights':the relative peak heights of each rel. ACF peak,\n             'relpeaklags':the lags at each rel. ACF peak found,\n             'peakindices':the indices of arrays where each rel. ACF peak is,\n             'bestlag':the lag value with the largest rel. ACF peak height,\n             'bestpeakheight':the largest rel. ACF peak height,\n             'bestpeakindex':the largest rel. 
ACF peak's number in all peaks}\n\n    '''\n\n    maxinds = argrelmax(acf, order=searchinterval)[0]\n    maxacfs = acf[maxinds]\n    maxlags = lags[maxinds]\n    mininds = argrelmin(acf, order=searchinterval)[0]\n    minacfs = acf[mininds]\n    minlags = lags[mininds]\n\n    relpeakheights = npzeros(npeaks)\n    relpeaklags = npzeros(npeaks,dtype=npint64)\n    peakindices = npzeros(npeaks,dtype=npint64)\n\n    for peakind, mxi in enumerate(maxinds[:npeaks]):\n\n        # check if there are no mins to the left\n        # throw away this peak because it's probably spurious\n        # (FIXME: is this OK?)\n        if npall(mxi < mininds):\n            continue\n\n        leftminind = mininds[mininds < mxi][-1]  # the last index to the left\n        rightminind = mininds[mininds > mxi][0]  # the first index to the right\n        relpeakheights[peakind] = (\n            acf[mxi] - (acf[leftminind] + acf[rightminind])/2.0\n        )\n        relpeaklags[peakind] = lags[mxi]\n        peakindices[peakind] = peakind\n\n    # figure out the bestperiod if possible\n    if relpeakheights[0] > relpeakheights[1]:\n        bestlag = relpeaklags[0]\n        bestpeakheight = relpeakheights[0]\n        bestpeakindex = peakindices[0]\n    else:\n        bestlag = relpeaklags[1]\n        bestpeakheight = relpeakheights[1]\n        bestpeakindex = peakindices[1]\n\n    return {'maxinds':maxinds,\n            'maxacfs':maxacfs,\n            'maxlags':maxlags,\n            'mininds':mininds,\n            'minacfs':minacfs,\n            'minlags':minlags,\n            'relpeakheights':relpeakheights,\n            'relpeaklags':relpeaklags,\n            'peakindices':peakindices,\n            'bestlag':bestlag,\n            'bestpeakheight':bestpeakheight,\n            'bestpeakindex':bestpeakindex}", "response": "This function calculates the relative peak heights for the first npeaks in an ACF."}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nplot the unsmoothed and smoothed ACF vs. lag plot.", "response": "def plot_acf_results(acfp, outfile, maxlags=5000, yrange=(-0.4,0.4)):\n    '''\n    This plots the unsmoothed/smoothed ACF vs lag.\n\n    Parameters\n    ----------\n\n    acfp : dict\n        This is the dict returned from `macf_period_find` below.\n\n    outfile : str\n        The output file the plot will be written to.\n\n    maxlags: int\n        The maximum number of lags to include in the plot.\n\n    yrange : sequence of two floats\n        The y-range of the ACF vs. lag plot to use.\n\n    '''\n\n    import matplotlib.pyplot as plt\n\n    lags = acfp['acfresults']['lags'][:maxlags]\n    smoothedacf = acfp['acf'][:maxlags]\n    unsmoothedacf = acfp['acfresults']['acf'][:maxlags]\n\n    acfparams = acfp['kwargs']['smoothfunckwargs'].copy()\n    acfparams.update({'peakinterval': int(acfp['kwargs']['smoothacf']/2.0)})\n\n    # plot the ACFs\n    fig, ax1 = plt.subplots()\n\n    # this is lags vs acf\n    ax1.plot(lags, unsmoothedacf, label='unsmoothed ACF',color='#1f77b4')\n    ax1.plot(lags, smoothedacf, label='smoothed ACF', color='#ff7f0e')\n\n    ax1.set_xlim((0,maxlags))\n\n    ax1.set_xlabel('lags')\n\n    # overplot the identified peaks\n    acfmaxinds = acfp['acfpeaks']['maxinds']\n\n    for i, maxind in enumerate(acfmaxinds):\n        if i == 0:\n            ax1.axvline(maxind,\n                        linewidth=2.0,\n                        color='red',\n                        ymin=0.2, ymax=0.3,\n                        label='identified ACF peaks')\n        else:\n            ax1.axvline(maxind,\n                        linewidth=2.0,\n                        color='red',\n                        ymin=0.2, ymax=0.3)\n\n    plt.ylabel('ACF')\n    plt.ylim(yrange)\n    ax1.legend()\n    plt.title('%s' % repr(acfparams))\n    plt.tight_layout()\n    plt.savefig(outfile)\n    plt.close('all')\n\n    return 
outfile"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef _autocorr_func1(mags, lag, maglen, magmed, magstd):\n    '''Calculates the autocorr of mag series for specific lag.\n\n    This version of the function is taken from: Kim et al. (`2011\n    <https://dx.doi.org/10.1088/0004-637X/735/2/68>`_)\n\n    Parameters\n    ----------\n\n    mags : np.array\n        This is the magnitudes array. MUST NOT have any nans.\n\n    lag : float\n        The specific lag value to calculate the auto-correlation for. This MUST\n        be less than total number of observations in `mags`.\n\n    maglen : int\n        The number of elements in the `mags` array.\n\n    magmed : float\n        The median of the `mags` array.\n\n    magstd : float\n        The standard deviation of the `mags` array.\n\n    Returns\n    -------\n\n    float\n        The auto-correlation at this specific `lag` value.\n\n    '''\n\n    lagindex = nparange(1,maglen-lag)\n    products = (mags[lagindex] - magmed) * (mags[lagindex+lag] - magmed)\n    acorr = (1.0/((maglen - lag)*magstd)) * npsum(products)\n\n    return acorr", "response": "Calculates the autocorr of mag series for specific lag."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef _autocorr_func3(mags, lag, maglen, magmed, magstd):\n    '''\n    This is yet another alternative to calculate the autocorrelation.\n\n    Taken from: `Bayesian Methods for Hackers by Cameron Davidson-Pilon <http://nbviewer.jupyter.org/github/CamDavidsonPilon/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers/blob/master/Chapter3_MCMC/Chapter3.ipynb#Autocorrelation>`_\n\n    (This should be the fastest method to calculate ACFs.)\n\n    Parameters\n    ----------\n\n    mags : np.array\n        This is the magnitudes array. MUST NOT have any nans.\n\n    lag : float\n        The specific lag value to calculate the auto-correlation for. This MUST\n        be less than total number of observations in `mags`.\n\n    maglen : int\n        The number of elements in the `mags` array.\n\n    magmed : float\n        The median of the `mags` array.\n\n    magstd : float\n        The standard deviation of the `mags` array.\n\n    Returns\n    -------\n\n    np.array\n        The normalized auto-correlation at all non-negative lags (the `lag`\n        argument is not used by this function).\n\n    '''\n\n    # from http://tinyurl.com/afz57c4\n    result = npcorrelate(mags, mags, mode='full')\n    result = result / npmax(result)\n\n    return result[int(result.size / 2):]", "response": "This function is the fastest alternative for calculating the autocorrelation of a given set of magnitudes; it returns the normalized ACF at all non-negative lags."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef autocorr_magseries(times, mags, errs,\n                       maxlags=1000,\n                       func=_autocorr_func3,\n                       fillgaps=0.0,\n                       filterwindow=11,\n                       forcetimebin=None,\n                       sigclip=3.0,\n                       magsarefluxes=False,\n                       verbose=True):\n    '''This calculates the ACF of a light curve.\n\n    This will pre-process the light curve to fill in all the gaps and normalize\n    everything to zero. If `fillgaps = 'noiselevel'`, fills the gaps with the\n    noise level obtained via the procedure above. If `fillgaps = 'nan'`, fills\n    the gaps with `np.nan`.\n\n    Parameters\n    ----------\n\n    times,mags,errs : np.array\n        The measurement time-series and associated errors.\n\n    maxlags : int\n        The maximum number of lags to calculate.\n\n    func : Python function\n        This is a function to calculate the lags.\n\n    fillgaps : 'noiselevel' or float\n        This sets what to use to fill in gaps in the time series. If this is\n        'noiselevel', will smooth the light curve using a point window size of\n        `filterwindow` (this should be an odd integer), subtract the smoothed LC\n        from the actual LC and estimate the RMS. This RMS will be used to fill\n        in the gaps. Other useful values here are 0.0, and npnan.\n\n    filterwindow : int\n        The light curve's smoothing filter window size to use if\n        `fillgaps='noiselevel`'.\n\n    forcetimebin : None or float\n        This is used to force a particular cadence in the light curve other than\n        the automatically determined cadence. This effectively rebins the light\n        curve to this cadence. 
This should be in the same time units as `times`.\n\n    sigclip : float or int or sequence of two floats/ints or None\n        If a single float or int, a symmetric sigma-clip will be performed using\n        the number provided as the sigma-multiplier to cut out from the input\n        time-series.\n\n        If a list of two ints/floats is provided, the function will perform an\n        'asymmetric' sigma-clip. The first element in this list is the sigma\n        value to use for fainter flux/mag values; the second element in this\n        list is the sigma value to use for brighter flux/mag values. For\n        example, `sigclip=[10., 3.]`, will sigclip out greater than 10-sigma\n        dimmings and greater than 3-sigma brightenings. Here the meaning of\n        \"dimming\" and \"brightening\" is set by *physics* (not the magnitude\n        system), which is why the `magsarefluxes` kwarg must be correctly set.\n\n        If `sigclip` is None, no sigma-clipping will be performed, and the\n        time-series (with non-finite elems removed) will be passed through to\n        the output.\n\n    magsarefluxes : bool\n        If your input measurements in `mags` are actually fluxes instead of\n        mags, set this is True.\n\n    verbose : bool\n        If True, will indicate progress and report errors.\n\n    Returns\n    -------\n\n    dict\n        A dict of the following form is returned::\n\n            {'itimes': the interpolated time values after gap-filling,\n             'imags': the interpolated mag/flux values after gap-filling,\n             'ierrs': the interpolated mag/flux values after gap-filling,\n             'cadence': the cadence of the output mag/flux time-series,\n             'minitime': the minimum value of the interpolated times array,\n             'lags': the lags used to calculate the auto-correlation function,\n             'acf': the value of the ACF at each lag used}\n\n    '''\n\n    # get the gap-filled timeseries\n    
interpolated = fill_magseries_gaps(times, mags, errs,\n                                       fillgaps=fillgaps,\n                                       forcetimebin=forcetimebin,\n                                       sigclip=sigclip,\n                                       magsarefluxes=magsarefluxes,\n                                       filterwindow=filterwindow,\n                                       verbose=verbose)\n\n    if not interpolated:\n        print('failed to interpolate light curve to minimum cadence!')\n        return None\n\n    itimes, imags = interpolated['itimes'], interpolated['imags'],\n\n    # calculate the lags up to maxlags\n    if maxlags:\n        lags = nparange(0, maxlags)\n    else:\n        lags = nparange(itimes.size)\n\n    series_stdev = 1.483*npmedian(npabs(imags))\n\n    if func != _autocorr_func3:\n\n        # get the autocorrelation as a function of the lag of the mag series\n        autocorr = nparray([func(imags, x, imags.size, 0.0, series_stdev)\n                            for x in lags])\n\n    # this doesn't need a lags array\n    else:\n\n        autocorr = _autocorr_func3(imags, lags[0], imags.size,\n                                   0.0, series_stdev)\n        # return only the maximum number of lags\n        if maxlags is not None:\n            autocorr = autocorr[:maxlags]\n\n    interpolated.update({'minitime':itimes.min(),\n                         'lags':lags,\n                         'acf':autocorr})\n\n    return interpolated", "response": "This function calculates the ACF of a light curve with the given time - series and associated errors."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef aovhm_theta(times, mags, errs, frequency,\n                nharmonics, magvariance):\n    '''This calculates the harmonic AoV theta statistic for a frequency.\n\n    This is a mostly faithful translation of the inner loop in `aovper.f90`. See\n    the following for details:\n\n    - http://users.camk.edu.pl/alex/\n    - Schwarzenberg-Czerny (`1996\n      <http://iopscience.iop.org/article/10.1086/309985/meta>`_)\n\n    Schwarzenberg-Czerny (1996) equation 11::\n\n        theta_prefactor = (K - 2N - 1)/(2N)\n        theta_top = sum(c_n*c_n) (from n=0 to n=2N)\n        theta_bot = variance(timeseries) - sum(c_n*c_n) (from n=0 to n=2N)\n\n        theta = theta_prefactor * (theta_top/theta_bot)\n\n        N = number of harmonics (nharmonics)\n        K = length of time series (times.size)\n\n    Parameters\n    ----------\n\n    times,mags,errs : np.array\n        The input time-series to calculate the test statistic for. These should\n        all be free of nans/infs and be normalized to zero.\n\n    frequency : float\n        The test frequency to calculate the statistic for.\n\n    nharmonics : int\n        The number of harmonics to calculate up to. The recommended range is 4\n        to 8.\n\n    magvariance : float\n        This is the (weighted by errors) variance of the magnitude time\n        series. 
We provide it as a pre-calculated value here so we don't have to\n        re-calculate it for every worker.\n\n    Returns\n    -------\n\n    aov_harmonic_theta : float\n        THe value of the harmonic AoV theta for the specified test `frequency`.\n\n    '''\n\n    period = 1.0/frequency\n\n    ndet = times.size\n    two_nharmonics = nharmonics + nharmonics\n\n    # phase with test period\n    phasedseries = phase_magseries_with_errs(\n        times, mags, errs, period, times[0],\n        sort=True, wrap=False\n    )\n\n    # get the phased quantities\n    phase = phasedseries['phase']\n    pmags = phasedseries['mags']\n    perrs = phasedseries['errs']\n\n    # this is sqrt(1.0/errs^2) -> the weights\n    pweights = 1.0/perrs\n\n    # multiply by 2.0*PI (for omega*time)\n    phase = phase * 2.0 * pi_value\n\n    # this is the z complex vector\n    z = npcos(phase) + 1.0j*npsin(phase)\n\n    # multiply phase with N\n    phase = nharmonics * phase\n\n    # this is the psi complex vector\n    psi = pmags * pweights * (npcos(phase) + 1j*npsin(phase))\n\n    # this is the initial value of z^n\n    zn = 1.0 + 0.0j\n\n    # this is the initial value of phi\n    phi = pweights + 0.0j\n\n    # initialize theta to zero\n    theta_aov = 0.0\n\n    # go through all the harmonics now up to 2N\n    for _ in range(two_nharmonics):\n\n        # this is <phi, phi>\n        phi_dot_phi = npsum(phi * phi.conjugate())\n\n        # this is the alpha_n numerator\n        alpha = npsum(pweights * z * phi)\n\n        # this is <phi, psi>. 
make sure to use npvdot and NOT npdot to get\n        # complex conjugate of first vector as expected for complex vectors\n        phi_dot_psi = npvdot(phi, psi)\n\n        # make sure phi_dot_phi is not zero\n        phi_dot_phi = npmax([phi_dot_phi, 10.0e-9])\n\n        # this is the expression for alpha_n\n        alpha = alpha / phi_dot_phi\n\n        # update theta_aov for this harmonic\n        theta_aov = (theta_aov +\n                     npabs(phi_dot_psi) * npabs(phi_dot_psi) / phi_dot_phi)\n\n        # use the recurrence relation to find the next phi\n        phi = phi * z - alpha * zn * phi.conjugate()\n\n        # update z^n\n        zn = zn * z\n\n\n    # done with all harmonics, calculate the theta_aov for this freq\n    # the max below makes sure that magvariance - theta_aov > zero\n    theta_aov = ( (ndet - two_nharmonics - 1.0) * theta_aov /\n                  (two_nharmonics * npmax([magvariance - theta_aov,\n                                           1.0e-9])) )\n\n    return theta_aov", "response": "This function calculates the harmonic AoV theta statistic for a specific test frequency."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef open_default(self):\n        '''\n        This opens the database connection using the default database parameters\n        given in the ~/.astrobase/astrobase.conf file.\n\n        '''\n\n        if HAVECONF:\n            self.open(DBDATA, DBUSER, DBPASS, DBHOST)\n        else:\n            LOGERROR(\"no default DB connection config found in lcdb.conf, \"\n                     \"this function won't work otherwise\")", "response": "This opens the database connection using the default parameters given in the ~/.astrobase/astrobase.conf file, and logs an error if no default config was found."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\nset the database connection to autocommit.", "response": "def autocommit(self):\n        '''\n        This sets the database connection to autocommit. Must be called before\n        any cursors have been instantiated.\n\n        '''\n\n        if len(self.cursors.keys()) == 0:\n            self.connection.autocommit = True\n        else:\n            raise AttributeError('database cursors are already active, '\n                                 'cannot switch to autocommit now')"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\ncreate a new DB cursor for the current DB connection using a randomly generated handle.", "response": "def newcursor(self, dictcursor=False):\n        '''\n        This creates a DB cursor for the current DB connection using a\n        randomly generated handle. Returns a tuple with cursor and handle.\n\n        Parameters\n        ----------\n\n        dictcursor : bool\n            If True, returns a cursor where each returned row can be addressed\n            as a dictionary by column name.\n\n        Returns\n        -------\n\n        tuple\n            The tuple is of the form (psycopg2.Cursor instance, handle).\n\n        '''\n\n        handle = hashlib.sha256(os.urandom(12)).hexdigest()\n\n        if dictcursor:\n            self.cursors[handle] = self.connection.cursor(\n                cursor_factory=psycopg2.extras.DictCursor\n            )\n        else:\n            self.cursors[handle] = self.connection.cursor()\n\n        # return for both cursor types, not only the non-dict case\n        return (self.cursors[handle], handle)"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef commit(self):\n        '''\n        This just calls the connection's commit method.\n\n        '''\n        if not self.connection.closed:\n            self.connection.commit()\n        else:\n            raise AttributeError('postgres connection to %s is closed' %\n                                 self.database)", "response": "Commits the current transaction by calling the connection's commit method; raises AttributeError if the connection is closed."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef rollback(self):\n        '''\n        This just calls the connection's rollback method.\n\n        '''\n        if not self.connection.closed:\n            self.connection.rollback()\n        else:\n            raise AttributeError('postgres connection to %s is closed' %\n                                 self.database)", "response": "Rolls back the current transaction by calling the connection's rollback method; raises AttributeError if the connection is closed."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nclosing the specified cursor and removing it from the self.cursors dictionary.", "response": "def close_cursor(self, handle):\n        '''\n        Closes the cursor specified and removes it from the `self.cursors`\n        dictionary.\n\n        '''\n\n        if handle in self.cursors:\n            self.cursors[handle].close()\n            # drop the closed cursor so the handle can't be reused\n            del self.cursors[handle]\n        else:\n            raise KeyError('cursor with handle %s was not found' % handle)"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef tap_query(querystr,\n              simbad_mirror='simbad',\n              returnformat='csv',\n              forcefetch=False,\n              cachedir='~/.astrobase/simbad-cache',\n              verbose=True,\n              timeout=10.0,\n              refresh=2.0,\n              maxtimeout=90.0,\n              maxtries=3,\n              complete_query_later=False,\n              jitter=5.0):\n    '''This queries the SIMBAD TAP service using the ADQL query string provided.\n\n    Parameters\n    ----------\n\n    querystr : str\n        This is the ADQL query string. See:\n        http://www.ivoa.net/documents/ADQL/2.0 for the specification.\n\n    simbad_mirror : str\n        This is the key used to select a SIMBAD mirror from the\n        `SIMBAD_URLS` dict above. If set, the specified mirror will be used. If\n        None, a random mirror chosen from that dict will be used.\n\n    returnformat : {'csv','votable','json'}\n        The returned file format to request from the SIMBAD TAP service.\n\n    forcefetch : bool\n        If this is True, the query will be retried even if cached results for\n        it exist.\n\n    cachedir : str\n        This points to the directory where results will be downloaded.\n\n    verbose : bool\n        If True, will indicate progress and warn of any issues.\n\n    timeout : float\n        This sets the amount of time in seconds to wait for the service to\n        respond to our initial request.\n\n    refresh : float\n        This sets the amount of time in seconds to wait before checking if the\n        result file is available. 
If the results file isn't available after\n        `refresh` seconds have elapsed, the function will wait for `refresh`\n        seconds continuously, until `maxtimeout` is reached or the results file\n        becomes available.\n\n    maxtimeout : float\n        The maximum amount of time in seconds to wait for a result to become\n        available after submitting our query request.\n\n    maxtries : int\n        The maximum number of tries (across all mirrors tried) to make to either\n        submit the request or download the results, before giving up.\n\n    complete_query_later : bool\n        If set to True, a submitted query that does not return a result before\n        `maxtimeout` has passed will be cancelled but its input request\n        parameters and the result URL provided by the service will be saved. If\n        this function is then called later with these same input request\n        parameters, it will check if the query finally finished and a result is\n        available. If so, will download the results instead of submitting a new\n        query. If it's not done yet, will start waiting for results again. To\n        force launch a new query with the same request parameters, set the\n        `forcefetch` kwarg to True.\n\n    jitter : float\n        This is used to control the scale of the random wait in seconds before\n        starting the query. 
Useful in parallelized situations.\n\n    Returns\n    -------\n\n    dict\n        This returns a dict of the following form::\n\n            {'params':dict of the input params used for the query,\n             'provenance':'cache' or 'new download',\n             'result':path to the file on disk with the downloaded data table}\n\n    '''\n\n    # get the default params\n    inputparams = TAP_PARAMS.copy()\n\n    # update them with our input params\n\n    inputparams['QUERY'] = querystr[::]\n\n    if returnformat in RETURN_FORMATS:\n        inputparams['FORMAT'] = returnformat\n    else:\n        LOGWARNING('unknown result format: %s requested, using CSV' %\n                   returnformat)\n        inputparams['FORMAT'] = 'csv'\n\n    # see if the cachedir exists\n    if '~' in cachedir:\n        cachedir = os.path.expanduser(cachedir)\n    if not os.path.exists(cachedir):\n        os.makedirs(cachedir)\n\n    # generate the cachefname and look for it\n    xcachekey = '-'.join([repr(inputparams[x])\n                         for x in sorted(inputparams.keys())])\n    cachekey = hashlib.sha256(xcachekey.encode()).hexdigest()\n    cachefname = os.path.join(\n        cachedir,\n        '%s.%s' % (cachekey, RETURN_FORMATS[returnformat])\n    )\n    provenance = 'cache'\n\n    incomplete_qpklf = os.path.join(\n        cachedir,\n        'incomplete-query-%s' % cachekey\n    )\n\n    ##########################################\n    ## COMPLETE A QUERY THAT MAY BE RUNNING ##\n    ##########################################\n\n    # first, check if this query can be resurrected\n    if (not forcefetch and\n        complete_query_later and\n        os.path.exists(incomplete_qpklf)):\n\n        with open(incomplete_qpklf, 'rb') as infd:\n            incomplete_qinfo = pickle.load(infd)\n\n        LOGWARNING('complete_query_later = True, and '\n                   'this query was not completed on a '\n                   'previous run, will check if it is done now...')\n\n      
  # get the status URL and go into a loop to see if the query completed\n        waitdone = False\n        timeelapsed = 0.0\n\n        simbad_mirror = incomplete_qinfo['simbad_mirror']\n        status_url = incomplete_qinfo['status_url']\n        phasekeyword = incomplete_qinfo['phase_keyword']\n        resultkeyword = incomplete_qinfo['result_keyword']\n\n        while not waitdone:\n\n            if timeelapsed > maxtimeout:\n\n                LOGERROR('SIMBAD TAP query still not done '\n                         'after waiting %s seconds for results.\\n'\n                         'status URL is: %s' %\n                         (maxtimeout,\n                          repr(inputparams),\n                          status_url))\n\n                return None\n\n            try:\n\n                resreq = requests.get(status_url,\n                                      timeout=timeout)\n\n                resreq.raise_for_status()\n\n                # parse the response XML and get the job status\n                resxml = parseString(resreq.text)\n\n                jobstatuselem = (\n                    resxml.getElementsByTagName(phasekeyword)[0]\n                )\n                jobstatus = jobstatuselem.firstChild.toxml()\n\n                if jobstatus == 'COMPLETED':\n\n                    if verbose:\n\n                        LOGINFO('SIMBAD query completed, '\n                                'retrieving results...')\n                    waitdone = True\n\n                # if we're not done yet, then wait some more\n                elif jobstatus != 'ERROR':\n\n                    if verbose:\n                        LOGINFO('elapsed time: %.1f, '\n                                'current status: %s, '\n                                'status URL: %s, waiting...'\n                                % (timeelapsed, jobstatus, status_url))\n\n                    time.sleep(refresh)\n                    timeelapsed = timeelapsed + refresh\n\n                # if 
the JOB failed, then bail out immediately\n                else:\n\n                    LOGERROR('SIMBAD TAP query failed due to a server error.\\n'\n                             'status URL: %s\\n'\n                             'status contents: %s' %\n                             (status_url,\n                              resreq.text))\n\n                    # since this job failed, remove the incomplete query pickle\n                    # so we can try this from scratch\n                    os.remove(incomplete_qpklf)\n\n                    return None\n\n            except requests.exceptions.Timeout as e:\n\n                LOGEXCEPTION(\n                    'SIMBAD query timed out while waiting for status '\n                    'download results.\\n'\n                    'query: %s\\n'\n                    'status URL: %s' %\n                    (repr(inputparams), status_url)\n                )\n\n                return None\n\n\n            except Exception as e:\n\n                LOGEXCEPTION(\n                    'SIMBAD query failed while waiting for status\\n'\n                    'query: %s\\n'\n                    'status URL: %s\\n'\n                    'status contents: %s' %\n                    (repr(inputparams),\n                     status_url,\n                     resreq.text)\n                )\n\n                # if the query fails completely, then either the status URL\n                # doesn't exist any more or something else went wrong. 
we'll\n                # remove the incomplete query pickle so we can try this from\n                # scratch\n                os.remove(incomplete_qpklf)\n\n                return None\n\n        #\n        # at this point, we should be ready to get the query results\n        #\n        LOGINFO('query completed, retrieving results...')\n        result_url_elem = resxml.getElementsByTagName(resultkeyword)[0]\n        result_url = result_url_elem.getAttribute('xlink:href')\n        result_nrows = result_url_elem.getAttribute('rows')\n\n        try:\n\n            resreq = requests.get(result_url, timeout=timeout)\n            resreq.raise_for_status()\n\n            if cachefname.endswith('.gz'):\n\n                with gzip.open(cachefname,'wb') as outfd:\n                    for chunk in resreq.iter_content(chunk_size=65536):\n                        outfd.write(chunk)\n\n            else:\n\n                with open(cachefname,'wb') as outfd:\n                    for chunk in resreq.iter_content(chunk_size=65536):\n                        outfd.write(chunk)\n\n            if verbose:\n                LOGINFO('done. 
rows in result: %s' % result_nrows)\n            tablefname = cachefname\n\n            provenance = 'cache'\n\n            # return a dict pointing to the result file\n            # we'll parse this later\n            resdict = {'params':inputparams,\n                       'provenance':provenance,\n                       'result':tablefname}\n\n            # all went well, so we'll remove the incomplete query pickle\n            os.remove(incomplete_qpklf)\n\n            return resdict\n\n        except requests.exceptions.Timeout as e:\n\n            LOGEXCEPTION(\n                'SIMBAD query timed out while trying to '\n                'download results.\\n'\n                'query: %s\\n'\n                'result URL: %s' %\n                (repr(inputparams), result_url)\n            )\n            return None\n\n        except Exception as e:\n\n            LOGEXCEPTION(\n                'SIMBAD query failed because of an error '\n                'while trying to download results.\\n'\n                'query: %s\\n'\n                'result URL: %s\\n'\n                'response status code: %s' %\n                (repr(inputparams),\n                 result_url,\n                 resreq.status_code)\n            )\n\n            # if the result download fails, then either the result URL doesn't\n            # exist any more or something else went wrong. 
we'll remove the\n            # incomplete query pickle so we can try this from scratch\n            os.remove(incomplete_qpklf)\n\n            return None\n\n\n    #####################\n    ## RUN A NEW QUERY ##\n    #####################\n\n    # otherwise, we check the cache if it's done already, or run it again if not\n    if forcefetch or (not os.path.exists(cachefname)):\n\n        provenance = 'new download'\n        time.sleep(random.randint(1,jitter))\n\n        # generate a jobid here and update the input params\n        jobid = 'ab-simbad-%i' % time.time()\n        inputparams['JOBNAME'] = jobid\n        inputparams['JOBDESCRIPTION'] = 'astrobase-simbad-tap-ADQL-query'\n\n        try:\n\n            waitdone = False\n            timeelapsed = 0.0\n\n            # set the simbad mirror to use\n            if simbad_mirror is not None and simbad_mirror in SIMBAD_URLS:\n\n                tapurl = SIMBAD_URLS[simbad_mirror]['url']\n                resultkeyword = SIMBAD_URLS[simbad_mirror]['resultkeyword']\n                phasekeyword = SIMBAD_URLS[simbad_mirror]['phasekeyword']\n                randkey = simbad_mirror\n\n                # sub in a table name if this is left unresolved in the input\n                # query\n                if '{table}' in querystr:\n                    inputparams['QUERY'] = (\n                        querystr.format(\n                            table=SIMBAD_URLS[simbad_mirror]['table']\n                        )\n                    )\n\n            else:\n\n                randkey = random.choice(list(SIMBAD_URLS.keys()))\n                tapurl = SIMBAD_URLS[randkey]['url']\n                resultkeyword = SIMBAD_URLS[randkey]['resultkeyword']\n                phasekeyword = SIMBAD_URLS[randkey]['phasekeyword']\n\n                # sub in a table name if this is left unresolved in the input\n                # query\n                if '{table}' in querystr:\n                    inputparams['QUERY'] = (\n                
        querystr.format(\n                            table=SIMBAD_URLS[randkey]['table']\n                        )\n                    )\n\n                if verbose:\n                    LOGINFO('using SIMBAD mirror TAP URL: %s' % tapurl)\n\n            # send the query and get status\n            if verbose:\n                LOGINFO(\n                    'submitting SIMBAD TAP query request for input params: %s'\n                    % repr(inputparams)\n                )\n\n\n            # here, we'll make sure the SIMBAD mirror works before doing\n            # anything else\n            mirrorok = False\n            ntries = 1\n\n            while (not mirrorok):\n\n                if ntries > maxtries:\n\n                    LOGERROR('maximum number of allowed SIMBAD query '\n                             'submission tries (%s) reached, bailing out...' %\n                             maxtries)\n                    return None\n\n                try:\n\n                    req = requests.post(tapurl,\n                                        data=inputparams,\n                                        timeout=timeout)\n                    resp_status = req.status_code\n                    req.raise_for_status()\n\n                    mirrorok = True\n\n                # this handles immediate 503s\n                except requests.exceptions.HTTPError as e:\n\n                    LOGWARNING(\n                        'SIMBAD TAP server: %s not responding, '\n                        'trying another mirror...'\n                        % tapurl\n                    )\n                    mirrorok = False\n\n                    # for now, we have only one SIMBAD mirror to hit, so we'll\n                    # wait a random time between 1 and 5 seconds to hit it again\n                    remainingmirrors = list(SIMBAD_URLS.keys())\n                    waittime = random.choice(range(1,6))\n                    time.sleep(waittime)\n\n                    randkey = 
remainingmirrors[0]\n                    tapurl = SIMBAD_URLS[randkey]['url']\n                    resultkeyword = SIMBAD_URLS[randkey]['resultkeyword']\n                    phasekeyword = SIMBAD_URLS[randkey]['phasekeyword']\n                    if '{table}' in querystr:\n                        inputparams['QUERY'] = (\n                            querystr.format(\n                                table=SIMBAD_URLS[randkey]['table']\n                            )\n                        )\n\n                # this handles initial query submission timeouts\n                except requests.exceptions.Timeout as e:\n\n                    LOGWARNING(\n                        'SIMBAD TAP query submission timed out, '\n                        'mirror is probably down. Trying another mirror...'\n                    )\n                    mirrorok = False\n\n                    # for now, we have only one SIMBAD mirror to hit, so we'll\n                    # wait a random time between 1 and 5 seconds to hit it again\n                    remainingmirrors = list(SIMBAD_URLS.keys())\n                    waittime = random.choice(range(1,6))\n                    time.sleep(waittime)\n\n                    randkey = remainingmirrors[0]\n                    tapurl = SIMBAD_URLS[randkey]['url']\n                    resultkeyword = SIMBAD_URLS[randkey]['resultkeyword']\n                    phasekeyword = SIMBAD_URLS[randkey]['phasekeyword']\n                    if '{table}' in querystr:\n                        inputparams['QUERY'] = (\n                            querystr.format(\n                                table=SIMBAD_URLS[randkey]['table']\n                            )\n                        )\n\n                # update the number of submission tries\n                ntries = ntries + 1\n\n\n\n            # NOTE: python-requests follows the \"303 See Other\" redirect\n            # automatically, so we get the XML status doc immediately. 
We don't\n            # need to look up the location of it in the initial response's\n            # header as in the SIMBAD example.\n            status_url = req.url\n\n            # parse the response XML and get the job status\n            resxml = parseString(req.text)\n\n\n            jobstatuselem = resxml.getElementsByTagName(phasekeyword)\n\n            if jobstatuselem:\n\n                jobstatuselem = jobstatuselem[0]\n\n            else:\n                LOGERROR('could not parse job phase using '\n                         'keyword %s in result XML' % phasekeyword)\n                LOGERROR('%s' % req.text)\n\n                req.close()\n                return None\n\n            jobstatus = jobstatuselem.firstChild.toxml()\n\n            # if the job completed already, jump down to retrieving results\n            if jobstatus == 'COMPLETED':\n\n                if verbose:\n\n                    LOGINFO('SIMBAD query completed, '\n                            'retrieving results...')\n\n                    waitdone = True\n\n            elif jobstatus == 'ERROR':\n\n                if verbose:\n\n                    LOGERROR(\n                        'SIMBAD query failed immediately '\n                        '(probably an ADQL error): %s, '\n                        'status URL: %s, status contents: %s' %\n                        (repr(inputparams),\n                         status_url,\n                         req.text)\n                    )\n                return None\n\n            # we wait for the job to complete if it's not done already\n            else:\n\n                if verbose:\n                    LOGINFO(\n                        'request submitted successfully, '\n                        'current status is: %s. '\n                        'waiting for results...' 
% jobstatus\n                    )\n\n                while not waitdone:\n\n                    if timeelapsed > maxtimeout:\n\n                        LOGERROR('SIMBAD TAP query timed out '\n                                 'after waiting %s seconds for results.\\n'\n                                 'request was: %s\\n'\n                                 'status URL is: %s\\n'\n                                 'last status was: %s' %\n                                 (maxtimeout,\n                                  repr(inputparams),\n                                  status_url,\n                                  jobstatus))\n\n                        # here, we'll check if we're allowed to sleep on a query\n                        # for a bit and return to it later if the last status\n                        # was QUEUED or EXECUTING\n                        if complete_query_later and jobstatus in ('EXECUTING',\n                                                                  'QUEUED'):\n\n                            # write a pickle with the query params that we can\n                            # pick up later to finish this query\n                            incomplete_qpklf = os.path.join(\n                                cachedir,\n                                'incomplete-query-%s' % cachekey\n                            )\n                            with open(incomplete_qpklf, 'wb') as outfd:\n\n                                savedict = inputparams.copy()\n\n                                savedict['status_url'] = status_url\n                                savedict['last_status'] = jobstatus\n                                savedict['simbad_mirror'] = simbad_mirror\n                                savedict['phase_keyword'] = phasekeyword\n                                savedict['result_keyword'] = resultkeyword\n\n                                pickle.dump(savedict,\n                                            outfd,\n                               
             pickle.HIGHEST_PROTOCOL)\n\n                            LOGINFO('complete_query_later = True, '\n                                    'last state of query was: %s, '\n                                    'will resume later if this function '\n                                    'is called again with the same query' %\n                                    jobstatus)\n\n                        return None\n\n                    time.sleep(refresh)\n                    timeelapsed = timeelapsed + refresh\n\n                    try:\n\n                        resreq = requests.get(status_url, timeout=timeout)\n                        resreq.raise_for_status()\n\n                        # parse the response XML and get the job status\n                        resxml = parseString(resreq.text)\n\n                        jobstatuselem = (\n                            resxml.getElementsByTagName(phasekeyword)[0]\n                        )\n                        jobstatus = jobstatuselem.firstChild.toxml()\n\n                        if jobstatus == 'COMPLETED':\n\n                            if verbose:\n\n                                LOGINFO('SIMBAD query completed, '\n                                        'retrieving results...')\n                            waitdone = True\n\n                        else:\n                            if verbose:\n                                LOGINFO('elapsed time: %.1f, '\n                                        'current status: %s, '\n                                        'status URL: %s, waiting...'\n                                        % (timeelapsed, jobstatus, status_url))\n                            continue\n\n                    except requests.exceptions.Timeout as e:\n\n                        LOGEXCEPTION(\n                            'SIMBAD query timed out while waiting for results '\n                            'download results.\\n'\n                            'query: %s\\n'\n                      
      'status URL: %s' %\n                            (repr(inputparams), status_url)\n                        )\n                        return None\n\n\n                    except Exception as e:\n\n                        LOGEXCEPTION(\n                            'SIMBAD query failed while waiting for results\\n'\n                            'query: %s\\n'\n                            'status URL: %s\\n'\n                            'status contents: %s' %\n                            (repr(inputparams),\n                             status_url,\n                             resreq.text)\n                        )\n                        return None\n\n            #\n            # at this point, we should be ready to get the query results\n            #\n            result_url_elem = resxml.getElementsByTagName(resultkeyword)[0]\n            result_url = result_url_elem.getAttribute('xlink:href')\n            result_nrows = result_url_elem.getAttribute('rows')\n\n            try:\n\n                resreq = requests.get(result_url, timeout=timeout)\n                resreq.raise_for_status()\n\n                if cachefname.endswith('.gz'):\n\n                    with gzip.open(cachefname,'wb') as outfd:\n                        for chunk in resreq.iter_content(chunk_size=65536):\n                            outfd.write(chunk)\n\n                else:\n\n                    with open(cachefname,'wb') as outfd:\n                        for chunk in resreq.iter_content(chunk_size=65536):\n                            outfd.write(chunk)\n\n                if verbose:\n                    LOGINFO('done. 
rows in result: %s' % result_nrows)\n                tablefname = cachefname\n\n            except requests.exceptions.Timeout as e:\n\n                LOGEXCEPTION(\n                    'SIMBAD query timed out while trying to '\n                    'download results.\\n'\n                    'query: %s\\n'\n                    'result URL: %s' %\n                    (repr(inputparams), result_url)\n                )\n                return None\n\n            except Exception as e:\n\n                LOGEXCEPTION(\n                    'SIMBAD query failed because of an error '\n                    'while trying to download results.\\n'\n                    'query: %s\\n'\n                    'result URL: %s\\n'\n                    'response status code: %s' %\n                    (repr(inputparams),\n                     result_url,\n                     resreq.status_code)\n                )\n                return None\n\n        except requests.exceptions.HTTPError as e:\n            LOGEXCEPTION('SIMBAD TAP query failed.\\nrequest status was: '\n                         '%s.\\nquery was: %s' % (resp_status,\n                                                 repr(inputparams)))\n            return None\n\n\n        except requests.exceptions.Timeout as e:\n            LOGERROR('SIMBAD TAP query submission timed out, '\n                     'site is probably down. 
Request was: '\n                     '%s' % repr(inputparams))\n            return None\n\n        except Exception as e:\n            LOGEXCEPTION('SIMBAD TAP query request failed for '\n                         '%s' % repr(inputparams))\n\n            if 'resxml' in locals():\n                LOGERROR('HTTP response from service:\\n%s' % req.text)\n\n            return None\n\n    ############################\n    ## GET RESULTS FROM CACHE ##\n    ############################\n\n    else:\n\n        if verbose:\n            LOGINFO('getting cached SIMBAD query result for '\n                    'request: %s' %\n                    (repr(inputparams)))\n\n        tablefname = cachefname\n\n        # try to open the cached file to make sure it's OK\n        try:\n            infd = gzip.open(cachefname,'rb')\n            simbad_objectnames = np.genfromtxt(\n                infd,\n                names=True,\n                delimiter=',',\n                dtype='U20,f8,f8,U20,U20,U20,i8,U600,f8',\n                usecols=(0,1,2,3,4,5,6,7,8),\n                comments='?',  # object names can have '#' in them\n            )\n            infd.close()\n\n        except Exception as e:\n\n            LOGEXCEPTION('could not read cached SIMBAD result file: %s, '\n                         'fetching from server again' % cachefname)\n\n            return tap_query(querystr,\n                             simbad_mirror=simbad_mirror,\n                             returnformat=returnformat,\n                             forcefetch=True,\n                             cachedir=cachedir,\n                             verbose=verbose,\n                             timeout=timeout,\n                             refresh=refresh,\n                             maxtimeout=maxtimeout)\n\n    #\n    # all done with retrieval, now return the result dict\n    #\n\n    # return a dict pointing to the result file\n    # we'll parse this later\n    resdict = {'params':inputparams,\n           
    'provenance':provenance,\n               'result':tablefname}\n\n    return resdict", "response": "This function queries the SIMBAD TAP service using the provided ADQL query string."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef xmatch_cplist_external_catalogs(cplist,\n                                    xmatchpkl,\n                                    xmatchradiusarcsec=2.0,\n                                    updateexisting=True,\n                                    resultstodir=None):\n    '''This xmatches external catalogs to a collection of checkplots.\n\n    Parameters\n    ----------\n\n    cplist : list of str\n        This is the list of checkplot pickle files to process.\n\n    xmatchpkl : str\n        The filename of a pickle prepared beforehand with the\n        `checkplot.pkl_xmatch.load_xmatch_external_catalogs` function,\n        containing collected external catalogs to cross-match the objects in the\n        input `cplist` against.\n\n    xmatchradiusarcsec : float\n        The match radius to use for the cross-match in arcseconds.\n\n    updateexisting : bool\n        If this is True, will only update the `xmatch` dict in each checkplot\n        pickle with any new cross-matches to the external catalogs. If False,\n        will overwrite the `xmatch` dict with results from the current run.\n\n    resultstodir : str or None\n        If this is provided, then it must be a directory to write the resulting\n        checkplots to after xmatch is done. This can be used to keep the\n        original checkplots in pristine condition for some reason.\n\n    Returns\n    -------\n\n    dict\n        Returns a dict with keys = input checkplot pickle filenames and vals =\n        xmatch status dict for each checkplot pickle.\n\n    '''\n\n    # load the external catalog\n    with open(xmatchpkl,'rb') as infd:\n        xmd = pickle.load(infd)\n\n    # match each object. 
this is fairly fast, so this is not parallelized at the\n    # moment\n\n    status_dict = {}\n\n    for cpf in cplist:\n\n        cpd = _read_checkplot_picklefile(cpf)\n\n        try:\n\n            # match in place\n            xmatch_external_catalogs(cpd, xmd,\n                                     xmatchradiusarcsec=xmatchradiusarcsec,\n                                     updatexmatch=updateexisting)\n\n            for xmi in cpd['xmatch']:\n\n                if cpd['xmatch'][xmi]['found']:\n                    LOGINFO('checkplot %s: %s matched to %s, '\n                            'match dist: %s arcsec' %\n                            (os.path.basename(cpf),\n                             cpd['objectid'],\n                             cpd['xmatch'][xmi]['name'],\n                             cpd['xmatch'][xmi]['distarcsec']))\n\n                if not resultstodir:\n                    outcpf = _write_checkplot_picklefile(cpd,\n                                                         outfile=cpf)\n                else:\n                    xcpf = os.path.join(resultstodir, os.path.basename(cpf))\n                    outcpf = _write_checkplot_picklefile(cpd,\n                                                         outfile=xcpf)\n\n            status_dict[cpf] = outcpf\n\n        except Exception as e:\n\n            LOGEXCEPTION('failed to match objects for %s' % cpf)\n            status_dict[cpf] = None\n\n    return status_dict", "response": "This function matches external catalogs to a collection of checkplots."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef xmatch_cpdir_external_catalogs(cpdir,\n                                   xmatchpkl,\n                                   cpfileglob='checkplot-*.pkl*',\n                                   xmatchradiusarcsec=2.0,\n                                   updateexisting=True,\n                                   resultstodir=None):\n    '''This xmatches external catalogs to all checkplots in a directory.\n\n    Parameters\n    -----------\n\n    cpdir : str\n        This is the directory to search in for checkplots.\n\n    xmatchpkl : str\n        The filename of a pickle prepared beforehand with the\n        `checkplot.pkl_xmatch.load_xmatch_external_catalogs` function,\n        containing collected external catalogs to cross-match the objects in the\n        input `cplist` against.\n\n    cpfileglob : str\n        This is the UNIX fileglob to use in searching for checkplots.\n\n    xmatchradiusarcsec : float\n        The match radius to use for the cross-match in arcseconds.\n\n    updateexisting : bool\n        If this is True, will only update the `xmatch` dict in each checkplot\n        pickle with any new cross-matches to the external catalogs. If False,\n        will overwrite the `xmatch` dict with results from the current run.\n\n    resultstodir : str or None\n        If this is provided, then it must be a directory to write the resulting\n        checkplots to after xmatch is done. 
This can be used to keep the\n        original checkplots in pristine condition for some reason.\n\n    Returns\n    -------\n\n    dict\n        Returns a dict with keys = input checkplot pickle filenames and vals =\n        xmatch status dict for each checkplot pickle.\n\n    '''\n\n    cplist = glob.glob(os.path.join(cpdir, cpfileglob))\n\n    return xmatch_cplist_external_catalogs(\n        cplist,\n        xmatchpkl,\n        xmatchradiusarcsec=xmatchradiusarcsec,\n        updateexisting=updateexisting,\n        resultstodir=resultstodir\n    )", "response": "This function cross-matches external catalogs against all checkplot pickles found in a directory and returns a dict with the xmatch status of each checkplot pickle."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef colormagdiagram_cplist(cplist,\n                           outpkl,\n                           color_mag1=['gaiamag','sdssg'],\n                           color_mag2=['kmag','kmag'],\n                           yaxis_mag=['gaia_absmag','rpmj']):\n    '''This makes color-mag diagrams for all checkplot pickles in the provided\n    list.\n\n    Can make an arbitrary number of CMDs given lists of x-axis colors and y-axis\n    mags to use.\n\n    Parameters\n    ----------\n\n    cplist : list of str\n        This is the list of checkplot pickles to process.\n\n    outpkl : str\n        The filename of the output pickle that will contain the color-mag\n        information for all objects in the checkplots specified in `cplist`.\n\n    color_mag1 : list of str\n        This a list of the keys in each checkplot's `objectinfo` dict that will\n        be used as color_1 in the equation::\n\n                x-axis color = color_mag1 - color_mag2\n\n    color_mag2 : list of str\n        This a list of the keys in each checkplot's `objectinfo` dict that will\n        be used as color_2 in the equation::\n\n                x-axis color = color_mag1 - color_mag2\n\n    yaxis_mag : list of str\n        This is a list of the keys in each checkplot's `objectinfo` dict that\n        will be used as the (absolute) magnitude y-axis of the color-mag\n        diagrams.\n\n    Returns\n    -------\n\n    str\n        The path to the generated CMD pickle file for the collection of objects\n        in the input checkplot list.\n\n    Notes\n    -----\n\n    This can make many CMDs in one go. 
For example, the default kwargs for\n    `color_mag`, `color_mag2`, and `yaxis_mag` result in two CMDs generated and\n    written to the output pickle file:\n\n    - CMD1 -> gaiamag - kmag on the x-axis vs gaia_absmag on the y-axis\n    - CMD2 -> sdssg - kmag on the x-axis vs rpmj (J reduced PM) on the y-axis\n\n    '''\n\n    # first, we'll collect all of the info\n    cplist_objectids = []\n    cplist_mags = []\n    cplist_colors = []\n\n    for cpf in cplist:\n\n        cpd = _read_checkplot_picklefile(cpf)\n        cplist_objectids.append(cpd['objectid'])\n\n        thiscp_mags = []\n        thiscp_colors = []\n\n        for cm1, cm2, ym in zip(color_mag1, color_mag2, yaxis_mag):\n\n            if (ym in cpd['objectinfo'] and\n                cpd['objectinfo'][ym] is not None):\n                thiscp_mags.append(cpd['objectinfo'][ym])\n            else:\n                thiscp_mags.append(np.nan)\n\n            if (cm1 in cpd['objectinfo'] and\n                cpd['objectinfo'][cm1] is not None and\n                cm2 in cpd['objectinfo'] and\n                cpd['objectinfo'][cm2] is not None):\n                thiscp_colors.append(cpd['objectinfo'][cm1] -\n                                     cpd['objectinfo'][cm2])\n            else:\n                thiscp_colors.append(np.nan)\n\n        cplist_mags.append(thiscp_mags)\n        cplist_colors.append(thiscp_colors)\n\n\n    # convert these to arrays\n    cplist_objectids = np.array(cplist_objectids)\n    cplist_mags = np.array(cplist_mags)\n    cplist_colors = np.array(cplist_colors)\n\n    # prepare the outdict\n    cmddict = {'objectids':cplist_objectids,\n               'mags':cplist_mags,\n               'colors':cplist_colors,\n               'color_mag1':color_mag1,\n               'color_mag2':color_mag2,\n               'yaxis_mag':yaxis_mag}\n\n    # save the pickled figure and dict for fast retrieval later\n    with open(outpkl,'wb') as outfd:\n        pickle.dump(cmddict, outfd, 
pickle.HIGHEST_PROTOCOL)\n\n    plt.close('all')\n\n    return cmddict", "response": "This function collects color and magnitude data from a list of checkplot pickles, writes them to an output pickle for making color-magnitude diagrams, and returns the resulting dict."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef add_cmd_to_checkplot(\n        cpx,\n        cmdpkl,\n        require_cmd_magcolor=True,\n        save_cmd_pngs=False\n):\n    '''This adds CMD figures to a checkplot dict or pickle.\n\n    Looks up the CMDs in `cmdpkl`, adds the object from `cpx` as a gold(-ish)\n    star in the plot, and then saves the figure to a base64 encoded PNG, which\n    can then be read and used by the `checkplotserver`.\n\n    Parameters\n    ----------\n\n    cpx : str or dict\n        This is the input checkplot pickle or dict to add the CMD to.\n\n    cmdpkl : str or dict\n        The CMD pickle generated by the `colormagdiagram_cplist` or\n        `colormagdiagram_cpdir` functions above, or the dict produced by reading\n        this pickle in.\n\n    require_cmd_magcolor : bool\n        If this is True, a CMD plot will not be made if the color and mag keys\n        required by the CMD are not present or are nan in this checkplot's\n        objectinfo dict.\n\n    save_cmd_pngs : bool\n        If this is True, then will save the CMD plots that were generated and\n        added back to the checkplotdict as PNGs to the same directory as\n        `cpx`. If `cpx` is a dict, will save them to the current working\n        directory.\n\n    Returns\n    -------\n\n    str or dict\n        If `cpx` was a str filename of checkplot pickle, this will return that\n        filename to indicate that the CMD was added to the file. 
If `cpx` was a\n        checkplotdict, this will return the checkplotdict with a new key called\n        'colormagdiagram' containing the base64 encoded PNG binary streams of\n        all CMDs generated.\n\n    '''\n\n    # get the checkplot\n    if isinstance(cpx, str) and os.path.exists(cpx):\n        cpdict = _read_checkplot_picklefile(cpx)\n    elif isinstance(cpx, dict):\n        cpdict = cpx\n    else:\n        LOGERROR('unknown type of checkplot provided as the cpx arg')\n        return None\n\n    # get the CMD\n    if isinstance(cmdpkl, str) and os.path.exists(cmdpkl):\n        with open(cmdpkl, 'rb') as infd:\n            cmd = pickle.load(infd)\n    elif isinstance(cmdpkl, dict):\n        cmd = cmdpkl\n\n\n    cpdict['colormagdiagram'] = {}\n\n    # get the mags and colors from the CMD dict\n    cplist_mags = cmd['mags']\n    cplist_colors = cmd['colors']\n\n    # now make the CMD plots for each color-mag combination in the CMD\n    for c1, c2, ym, ind in zip(cmd['color_mag1'],\n                               cmd['color_mag2'],\n                               cmd['yaxis_mag'],\n                               range(len(cmd['color_mag1']))):\n\n        # get these from the checkplot for this object\n        if (c1 in cpdict['objectinfo'] and\n            cpdict['objectinfo'][c1] is not None):\n            c1mag = cpdict['objectinfo'][c1]\n        else:\n            c1mag = np.nan\n\n        if (c2 in cpdict['objectinfo'] and\n            cpdict['objectinfo'][c2] is not None):\n            c2mag = cpdict['objectinfo'][c2]\n        else:\n            c2mag = np.nan\n\n        if (ym in cpdict['objectinfo'] and\n            cpdict['objectinfo'][ym] is not None):\n            ymmag = cpdict['objectinfo'][ym]\n        else:\n            ymmag = np.nan\n\n        if (require_cmd_magcolor and\n            not (np.isfinite(c1mag) and\n                 np.isfinite(c2mag) and\n                 np.isfinite(ymmag))):\n\n            LOGWARNING(\"required color: %s-%s 
or mag: %s are not \"\n                       \"in this checkplot's objectinfo dict \"\n                       \"(objectid: %s), skipping CMD...\" %\n                       (c1, c2, ym, cpdict['objectid']))\n            continue\n\n        # make the CMD for this color-mag combination\n        try:\n\n            thiscmd_title = r'%s-%s/%s' % (CMD_LABELS[c1],\n                                           CMD_LABELS[c2],\n                                           CMD_LABELS[ym])\n\n            # make the scatter plot\n            fig = plt.figure(figsize=(10,8))\n            plt.plot(cplist_colors[:,ind],\n                     cplist_mags[:,ind],\n                     rasterized=True,\n                     marker='o',\n                     linestyle='none',\n                     mew=0,\n                     ms=3)\n\n            # put this object on the plot\n            plt.plot([c1mag - c2mag], [ymmag],\n                     ms=20,\n                     color='#b0ff05',\n                     marker='*',\n                     mew=0)\n\n            plt.xlabel(r'$%s - %s$' % (CMD_LABELS[c1], CMD_LABELS[c2]))\n            plt.ylabel(r'$%s$' % CMD_LABELS[ym])\n            plt.title('%s - $%s$ CMD' % (cpdict['objectid'], thiscmd_title))\n            plt.gca().invert_yaxis()\n\n            # now save the figure to StrIO and put it back in the checkplot\n            cmdpng = StrIO()\n            plt.savefig(cmdpng, bbox_inches='tight',\n                        pad_inches=0.0, format='png')\n            cmdpng.seek(0)\n            cmdb64 = base64.b64encode(cmdpng.read())\n            cmdpng.close()\n\n            plt.close('all')\n            plt.gcf().clear()\n\n            cpdict['colormagdiagram']['%s-%s/%s' % (c1,c2,ym)] = cmdb64\n\n            # if we're supposed to export to PNG, do so\n            if save_cmd_pngs:\n\n                if isinstance(cpx, str):\n                    outpng = os.path.join(os.path.dirname(cpx),\n                                          
'cmd-%s-%s-%s.%s.png' %\n                                          (cpdict['objectid'],\n                                           c1,c2,ym))\n                else:\n                    outpng = 'cmd-%s-%s-%s.%s.png' % (cpdict['objectid'],\n                                                      c1,c2,ym)\n\n                _base64_to_file(cmdb64, outpng)\n\n        except Exception as e:\n            LOGEXCEPTION('CMD for %s-%s/%s does not exist in %s, skipping...' %\n                         (c1, c2, ym, cmdpkl))\n            continue\n\n\n    #\n    # end of making CMDs\n    #\n\n    if isinstance(cpx, str):\n        cpf = _write_checkplot_picklefile(cpdict, outfile=cpx, protocol=4)\n        return cpf\n    elif isinstance(cpx, dict):\n        return cpdict", "response": "This function adds a CMD to a checkplot dict or pickle."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef add_cmds_cplist(cplist, cmdpkl,\n                    require_cmd_magcolor=True,\n                    save_cmd_pngs=False):\n    '''This adds CMDs for each object in cplist.\n\n    Parameters\n    ----------\n\n    cplist : list of str\n        This is the input list of checkplot pickles to add the CMDs to.\n\n    cmdpkl : str\n        This is the filename of the CMD pickle created previously.\n\n    require_cmd_magcolor : bool\n        If this is True, a CMD plot will not be made if the color and mag keys\n        required by the CMD are not present or are nan in each checkplot's\n        objectinfo dict.\n\n    save_cmd_pngs : bool\n        If this is True, then will save the CMD plots that were generated and\n        added back to the checkplotdict as PNGs to the same directory as\n        `cpx`.\n\n    Returns\n    -------\n\n    Nothing.\n\n\n    '''\n\n    # load the CMD first to save on IO\n    with open(cmdpkl,'rb') as infd:\n        cmd = pickle.load(infd)\n\n    for cpf in cplist:\n\n        add_cmd_to_checkplot(cpf, cmd,\n                             require_cmd_magcolor=require_cmd_magcolor,\n                             save_cmd_pngs=save_cmd_pngs)", "response": "This function adds the CMDs for each object in cplist to the checkplotdict."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef add_cmds_cpdir(cpdir,\n                   cmdpkl,\n                   cpfileglob='checkplot*.pkl*',\n                   require_cmd_magcolor=True,\n                   save_cmd_pngs=False):\n    '''This adds CMDs for each object in cpdir.\n\n    Parameters\n    ----------\n\n    cpdir : str\n        This is the directory to search for checkplot pickles.\n\n    cmdpkl : str\n        This is the filename of the CMD pickle created previously.\n\n    cpfileglob : str\n        The UNIX fileglob to use when searching for checkplot pickles to operate\n        on.\n\n    require_cmd_magcolor : bool\n        If this is True, a CMD plot will not be made if the color and mag keys\n        required by the CMD are not present or are nan in each checkplot's\n        objectinfo dict.\n\n    save_cmd_pngs : bool\n        If this is True, then will save the CMD plots that were generated and\n        added back to the checkplotdict as PNGs to the same directory as\n        each checkplot pickle.\n\n    Returns\n    -------\n\n    Nothing.\n\n    '''\n\n    cplist = glob.glob(os.path.join(cpdir, cpfileglob))\n\n    return add_cmds_cplist(cplist,\n                           cmdpkl,\n                           require_cmd_magcolor=require_cmd_magcolor,\n                           save_cmd_pngs=save_cmd_pngs)", "response": "This function adds CMDs to each checkplot pickle found in the directory cpdir."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef parallel_update_objectinfo_cplist(\n        cplist,\n        liststartindex=None,\n        maxobjects=None,\n        nworkers=NCPUS,\n        fast_mode=False,\n        findercmap='gray_r',\n        finderconvolve=None,\n        deredden_object=True,\n        custom_bandpasses=None,\n        gaia_submit_timeout=10.0,\n        gaia_submit_tries=3,\n        gaia_max_timeout=180.0,\n        gaia_mirror=None,\n        complete_query_later=True,\n        lclistpkl=None,\n        nbrradiusarcsec=60.0,\n        maxnumneighbors=5,\n        plotdpi=100,\n        findercachedir='~/.astrobase/stamp-cache',\n        verbose=True\n):\n    '''\n    This updates objectinfo for a list of checkplots.\n\n    Useful in cases where a previous round of GAIA/finderchart/external catalog\n    acquisition failed. This will preserve the following keys in the checkplots\n    if they exist:\n\n    comments\n    varinfo\n    objectinfo.objecttags\n\n    Parameters\n    ----------\n\n    cplist : list of str\n        A list of checkplot pickle file names to update.\n\n    liststartindex : int\n        The index of the input list to start working at.\n\n    maxobjects : int\n        The maximum number of objects to process in this run. Use this with\n        `liststartindex` to effectively distribute working on a large list of\n        input checkplot pickles over several sessions or machines.\n\n    nworkers : int\n        The number of parallel workers that will work on the checkplot\n        update process.\n\n    fast_mode : bool or float\n        This runs the external catalog operations in a \"fast\" mode, with short\n        timeouts and not trying to hit external catalogs that take a long time\n        to respond. See the docstring for\n        `checkplot.pkl_utils._pkl_finder_objectinfo` for details on how this\n        works. 
If this is True, will run in \"fast\" mode with default timeouts (5\n        seconds in most cases). If this is a float, will run in \"fast\" mode with\n        the provided timeout value in seconds.\n\n    findercmap : str or matplotlib.cm.Colormap object\n\n    findercmap : str or matplotlib.cm.ColorMap object\n        The Colormap object to use for the finder chart image.\n\n    finderconvolve : astropy.convolution.Kernel object or None\n        If not None, the Kernel object to use for convolving the finder image.\n\n    deredden_objects : bool\n        If this is True, will use the 2MASS DUST service to get extinction\n        coefficients in various bands, and then try to deredden the magnitudes\n        and colors of the object already present in the checkplot's objectinfo\n        dict.\n\n    custom_bandpasses : dict\n        This is a dict used to provide custom bandpass definitions for any\n        magnitude measurements in the objectinfo dict that are not automatically\n        recognized by the `varclass.starfeatures.color_features` function. See\n        its docstring for details on the required format.\n\n    gaia_submit_timeout : float\n        Sets the timeout in seconds to use when submitting a request to look up\n        the object's information to the GAIA service. Note that if `fast_mode`\n        is set, this is ignored.\n\n    gaia_submit_tries : int\n        Sets the maximum number of times the GAIA services will be contacted to\n        obtain this object's information. If `fast_mode` is set, this is\n        ignored, and the services will be contacted only once (meaning that a\n        failure to respond will be silently ignored and no GAIA data will be\n        added to the checkplot's objectinfo dict).\n\n    gaia_max_timeout : float\n        Sets the timeout in seconds to use when waiting for the GAIA service to\n        respond to our request for the object's information. 
Note that if\n        `fast_mode` is set, this is ignored.\n\n    gaia_mirror : str\n        This sets the GAIA mirror to use. This is a key in the\n        `services.gaia.GAIA_URLS` dict which defines the URLs to hit for each\n        mirror.\n\n    complete_query_later : bool\n        If this is True, saves the state of GAIA queries that are not yet\n        complete when `gaia_max_timeout` is reached while waiting for the GAIA\n        service to respond to our request. A later call for GAIA info on the\n        same object will attempt to pick up the results from the existing query\n        if it's completed. If `fast_mode` is True, this is ignored.\n\n    lclistpkl : dict or str\n        If this is provided, must be a dict resulting from reading a catalog\n        produced by the `lcproc.catalogs.make_lclist` function or a str path\n        pointing to the pickle file produced by that function. This catalog is\n        used to find neighbors of the current object in the current light curve\n        collection. Looking at neighbors of the object within the radius\n        specified by `nbrradiusarcsec` is useful for light curves produced by\n        instruments that have a large pixel scale, so are susceptible to\n        blending of variability and potential confusion of neighbor variability\n        with that of the actual object being looked at. 
If this is None, no\n        neighbor lookups will be performed.\n\n    nbrradiusarcsec : float\n        The radius in arcseconds to use for a search conducted around the\n        coordinates of this object to look for any potential confusion and\n        blending of variability amplitude caused by their proximity.\n\n    maxnumneighbors : int\n        The maximum number of neighbors that will have their light curves and\n        magnitudes noted in this checkplot as potential blends with the target\n        object.\n\n    plotdpi : int\n        The resolution in DPI of the plots to generate in this function\n        (e.g. the finder chart, etc.)\n\n    findercachedir : str\n        The path to the astrobase cache directory for finder chart downloads\n        from the NASA SkyView service.\n\n    verbose : bool\n        If True, will indicate progress and warn about potential problems.\n\n    Returns\n    -------\n\n    list of str\n        Paths to the updated checkplot pickle file.\n\n    '''\n\n    # work around the Darwin segfault after fork if no network activity in\n    # main thread bug: https://bugs.python.org/issue30385#msg293958\n    if sys.platform == 'darwin':\n        import requests\n        requests.get('http://captive.apple.com/hotspot-detect.html')\n\n    # handle the start and end indices\n    if (liststartindex is not None) and (maxobjects is None):\n        cplist = cplist[liststartindex:]\n\n    elif (liststartindex is None) and (maxobjects is not None):\n        cplist = cplist[:maxobjects]\n\n    elif (liststartindex is not None) and (maxobjects is not None):\n        cplist = (\n            cplist[liststartindex:liststartindex+maxobjects]\n        )\n\n    tasks = [(x, {'fast_mode':fast_mode,\n                  'findercmap':findercmap,\n                  'finderconvolve':finderconvolve,\n                  'deredden_object':deredden_object,\n                  'custom_bandpasses':custom_bandpasses,\n                  
'gaia_submit_timeout':gaia_submit_timeout,\n                  'gaia_submit_tries':gaia_submit_tries,\n                  'gaia_max_timeout':gaia_max_timeout,\n                  'gaia_mirror':gaia_mirror,\n                  'complete_query_later':complete_query_later,\n                  'lclistpkl':lclistpkl,\n                  'nbrradiusarcsec':nbrradiusarcsec,\n                  'maxnumneighbors':maxnumneighbors,\n                  'plotdpi':plotdpi,\n                  'findercachedir':findercachedir,\n                  'verbose':verbose}) for x in cplist]\n\n    resultfutures = []\n    results = []\n\n    with ProcessPoolExecutor(max_workers=nworkers) as executor:\n        resultfutures = executor.map(cp_objectinfo_worker, tasks)\n\n    results = [x for x in resultfutures]\n\n    executor.shutdown()\n    return results", "response": "This function updates objectinfo for a list of checkplot pickle files."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef parallel_update_objectinfo_cpdir(cpdir,\n                                     cpglob='checkplot-*.pkl*',\n                                     liststartindex=None,\n                                     maxobjects=None,\n                                     nworkers=NCPUS,\n                                     fast_mode=False,\n                                     findercmap='gray_r',\n                                     finderconvolve=None,\n                                     deredden_object=True,\n                                     custom_bandpasses=None,\n                                     gaia_submit_timeout=10.0,\n                                     gaia_submit_tries=3,\n                                     gaia_max_timeout=180.0,\n                                     gaia_mirror=None,\n                                     complete_query_later=True,\n                                     lclistpkl=None,\n                                     nbrradiusarcsec=60.0,\n                                     maxnumneighbors=5,\n                                     plotdpi=100,\n                                     findercachedir='~/.astrobase/stamp-cache',\n                                     verbose=True):\n    '''This updates the objectinfo for a directory of checkplot pickles.\n\n    Useful in cases where a previous round of GAIA/finderchart/external catalog\n    acquisition failed. 
This will preserve the following keys in the checkplots\n    if they exist:\n\n    comments\n    varinfo\n    objectinfo.objecttags\n\n    Parameters\n    ----------\n\n    cpdir : str\n        The directory to look for checkplot pickles in.\n\n    cpglob : str\n        The UNIX fileglob to use when searching for checkplot pickle files.\n\n    liststartindex : int\n        The index of the input list to start working at.\n\n    maxobjects : int\n        The maximum number of objects to process in this run. Use this with\n        `liststartindex` to effectively distribute working on a large list of\n        input checkplot pickles over several sessions or machines.\n\n    nworkers : int\n        The number of parallel workers that will work on the checkplot\n        update process.\n\n    fast_mode : bool or float\n        This runs the external catalog operations in a \"fast\" mode, with short\n        timeouts and not trying to hit external catalogs that take a long time\n        to respond. See the docstring for\n        `checkplot.pkl_utils._pkl_finder_objectinfo` for details on how this\n        works. If this is True, will run in \"fast\" mode with default timeouts (5\n        seconds in most cases). 
If this is a float, will run in \"fast\" mode with\n        the provided timeout value in seconds.\n\n    findercmap : str or matplotlib.cm.Colormap object\n        The Colormap object to use for the finder chart image.\n\n    finderconvolve : astropy.convolution.Kernel object or None\n        If not None, the Kernel object to use for convolving the finder image.\n\n    deredden_object : bool\n        If this is True, will use the 2MASS DUST service to get extinction\n        coefficients in various bands, and then try to deredden the magnitudes\n        and colors of the object already present in the checkplot's objectinfo\n        dict.\n\n    custom_bandpasses : dict\n        This is a dict used to provide custom bandpass definitions for any\n        magnitude measurements in the objectinfo dict that are not automatically\n        recognized by the `varclass.starfeatures.color_features` function. See\n        its docstring for details on the required format.\n\n    gaia_submit_timeout : float\n        Sets the timeout in seconds to use when submitting a request to look up\n        the object's information to the GAIA service. Note that if `fast_mode`\n        is set, this is ignored.\n\n    gaia_submit_tries : int\n        Sets the maximum number of times the GAIA services will be contacted to\n        obtain this object's information. If `fast_mode` is set, this is\n        ignored, and the services will be contacted only once (meaning that a\n        failure to respond will be silently ignored and no GAIA data will be\n        added to the checkplot's objectinfo dict).\n\n    gaia_max_timeout : float\n        Sets the timeout in seconds to use when waiting for the GAIA service to\n        respond to our request for the object's information. Note that if\n        `fast_mode` is set, this is ignored.\n\n    gaia_mirror : str\n        This sets the GAIA mirror to use. 
This is a key in the\n        `services.gaia.GAIA_URLS` dict which defines the URLs to hit for each\n        mirror.\n\n    complete_query_later : bool\n        If this is True, saves the state of GAIA queries that are not yet\n        complete when `gaia_max_timeout` is reached while waiting for the GAIA\n        service to respond to our request. A later call for GAIA info on the\n        same object will attempt to pick up the results from the existing query\n        if it's completed. If `fast_mode` is True, this is ignored.\n\n    lclistpkl : dict or str\n        If this is provided, must be a dict resulting from reading a catalog\n        produced by the `lcproc.catalogs.make_lclist` function or a str path\n        pointing to the pickle file produced by that function. This catalog is\n        used to find neighbors of the current object in the current light curve\n        collection. Looking at neighbors of the object within the radius\n        specified by `nbrradiusarcsec` is useful for light curves produced by\n        instruments that have a large pixel scale, so are susceptible to\n        blending of variability and potential confusion of neighbor variability\n        with that of the actual object being looked at. If this is None, no\n        neighbor lookups will be performed.\n\n    nbrradiusarcsec : float\n        The radius in arcseconds to use for a search conducted around the\n        coordinates of this object to look for any potential confusion and\n        blending of variability amplitude caused by their proximity.\n\n    maxnumneighbors : int\n        The maximum number of neighbors that will have their light curves and\n        magnitudes noted in this checkplot as potential blends with the target\n        object.\n\n    plotdpi : int\n        The resolution in DPI of the plots to generate in this function\n        (e.g. 
the finder chart, etc.)\n\n    findercachedir : str\n        The path to the astrobase cache directory for finder chart downloads\n        from the NASA SkyView service.\n\n    verbose : bool\n        If True, will indicate progress and warn about potential problems.\n\n    Returns\n    -------\n\n    list of str\n        Paths to the updated checkplot pickle file.\n\n    '''\n\n    cplist = sorted(glob.glob(os.path.join(cpdir, cpglob)))\n\n    return parallel_update_objectinfo_cplist(\n        cplist,\n        liststartindex=liststartindex,\n        maxobjects=maxobjects,\n        nworkers=nworkers,\n        fast_mode=fast_mode,\n        findercmap=findercmap,\n        finderconvolve=finderconvolve,\n        deredden_object=deredden_object,\n        custom_bandpasses=custom_bandpasses,\n        gaia_submit_timeout=gaia_submit_timeout,\n        gaia_submit_tries=gaia_submit_tries,\n        gaia_max_timeout=gaia_max_timeout,\n        gaia_mirror=gaia_mirror,\n        complete_query_later=complete_query_later,\n        lclistpkl=lclistpkl,\n        nbrradiusarcsec=nbrradiusarcsec,\n        maxnumneighbors=maxnumneighbors,\n        plotdpi=plotdpi,\n        findercachedir=findercachedir,\n        verbose=verbose\n    )", "response": "This function updates the objectinfo for a directory of checkplot pickles."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef _gaussian(x, amp, loc, std):\n    '''This is a simple gaussian.\n\n    Parameters\n    ----------\n\n    x : np.array\n        The items at which the Gaussian is evaluated.\n\n    amp : float\n        The amplitude of the Gaussian.\n\n    loc : float\n        The central value of the Gaussian.\n\n    std : float\n        The standard deviation of the Gaussian.\n\n    Returns\n    -------\n\n    np.array\n        Returns the Gaussian evaluated at the items in `x`, using the provided\n        parameters of `amp`, `loc`, and `std`.\n\n    '''\n\n    return amp * np.exp(-((x - loc)*(x - loc))/(2.0*std*std))", "response": "Evaluates a simple Gaussian with amplitude `amp`, center `loc`, and standard deviation `std` at the values in `x`."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef invgauss_eclipses_func(ebparams, times, mags, errs):\n    '''This returns a double eclipse shaped function.\n\n    Suitable for first order modeling of eclipsing binaries.\n\n    Parameters\n    ----------\n\n    ebparams : list of float\n        This contains the parameters for the eclipsing binary::\n\n            ebparams = [period (time),\n                        epoch (time),\n                        pdepth: primary eclipse depth (mags),\n                        pduration: primary eclipse duration (phase),\n                        psdepthratio: primary-secondary eclipse depth ratio,\n                        secondaryphase: center phase of the secondary eclipse]\n\n        `period` is the period in days.\n\n        `epoch` is the time of minimum in JD.\n\n        `pdepth` is the depth of the primary eclipse.\n\n        - for magnitudes -> pdepth should be < 0\n        - for fluxes     -> pdepth should be > 0\n\n        `pduration` is the length of the primary eclipse in phase.\n\n        `psdepthratio` is the ratio in the eclipse depths:\n        `depth_secondary/depth_primary`. This is generally the same as the ratio\n        of the `T_effs` of the two stars.\n\n        `secondaryphase` is the phase at which the minimum of the secondary\n        eclipse is located. This effectively parameterizes eccentricity.\n\n        All of these will then have fitted values after the fit is done.\n\n    times,mags,errs : np.array\n        The input time-series of measurements and associated errors for which\n        the eclipse model will be generated. The times will be used to generate\n        model mags, and the input `times`, `mags`, and `errs` will be resorted\n        by model phase and returned.\n\n    Returns\n    -------\n\n    (modelmags, phase, ptimes, pmags, perrs) : tuple\n        Returns the model mags and phase values. 
Also returns the input `times`,\n        `mags`, and `errs` sorted by the model's phase.\n\n    '''\n\n    (period, epoch, pdepth, pduration, depthratio, secondaryphase) = ebparams\n\n    # generate the phases\n    iphase = (times - epoch)/period\n    iphase = iphase - np.floor(iphase)\n\n    phasesortind = np.argsort(iphase)\n    phase = iphase[phasesortind]\n    ptimes = times[phasesortind]\n    pmags = mags[phasesortind]\n    perrs = errs[phasesortind]\n\n    zerolevel = np.median(pmags)\n    modelmags = np.full_like(phase, zerolevel)\n\n    primaryecl_amp = -pdepth\n    secondaryecl_amp = -pdepth * depthratio\n\n    primaryecl_std = pduration/5.0    # we use 5-sigma as full-width -> duration\n    secondaryecl_std = pduration/5.0  # secondary eclipse has the same duration\n\n    halfduration = pduration/2.0\n\n\n    # phase indices\n    primary_eclipse_ingress = (\n        (phase >= (1.0 - halfduration)) & (phase <= 1.0)\n    )\n    primary_eclipse_egress = (\n        (phase >= 0.0) & (phase <= halfduration)\n    )\n\n    secondary_eclipse_phase = (\n        (phase >= (secondaryphase - halfduration)) &\n        (phase <= (secondaryphase + halfduration))\n    )\n\n    # put in the eclipses\n    modelmags[primary_eclipse_ingress] = (\n        zerolevel + _gaussian(phase[primary_eclipse_ingress],\n                              primaryecl_amp,\n                              1.0,\n                              primaryecl_std)\n    )\n    modelmags[primary_eclipse_egress] = (\n        zerolevel + _gaussian(phase[primary_eclipse_egress],\n                              primaryecl_amp,\n                              0.0,\n                              primaryecl_std)\n    )\n    modelmags[secondary_eclipse_phase] = (\n        zerolevel + _gaussian(phase[secondary_eclipse_phase],\n                              secondaryecl_amp,\n                              secondaryphase,\n                              secondaryecl_std)\n    )\n\n    return modelmags, phase, ptimes, 
pmags, perrs", "response": "This function returns a double eclipse shaped function that can be used to generate the model mags and phase values for the eclipsing binaries."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef invgauss_eclipses_residual(ebparams, times, mags, errs):\n    '''This returns the residual between the modelmags and the actual mags.\n\n    Parameters\n    ----------\n\n    ebparams : list of float\n        This contains the parameters for the eclipsing binary::\n\n            ebparams = [period (time),\n                        epoch (time),\n                        pdepth: primary eclipse depth (mags),\n                        pduration: primary eclipse duration (phase),\n                        psdepthratio: primary-secondary eclipse depth ratio,\n                        secondaryphase: center phase of the secondary eclipse]\n\n        `period` is the period in days.\n\n        `epoch` is the time of minimum in JD.\n\n        `pdepth` is the depth of the primary eclipse.\n\n        - for magnitudes -> `pdepth` should be < 0\n        - for fluxes     -> `pdepth` should be > 0\n\n        `pduration` is the length of the primary eclipse in phase.\n\n        `psdepthratio` is the ratio in the eclipse depths:\n        `depth_secondary/depth_primary`. This is generally the same as the ratio\n        of the `T_effs` of the two stars.\n\n        `secondaryphase` is the phase at which the minimum of the secondary\n        eclipse is located. This effectively parameterizes eccentricity.\n\n        All of these will then have fitted values after the fit is done.\n\n    times,mags,errs : np.array\n        The input time-series of measurements and associated errors for which\n        the eclipse model will be generated. 
The times will be used to generate\n        model mags, and the input `times`, `mags`, and `errs` will be resorted\n        by model phase and returned.\n\n    Returns\n    -------\n\n    np.array\n        The residuals between the input `mags` and generated `modelmags`,\n        weighted by the measurement errors in `errs`.\n\n    '''\n\n    modelmags, phase, ptimes, pmags, perrs = (\n        invgauss_eclipses_func(ebparams, times, mags, errs)\n    )\n\n    # this is now a weighted residual taking into account the measurement err\n    return (pmags - modelmags)/perrs", "response": "This function returns the residual between the modelmags and the actual mags."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef jhk_to_bmag(jmag, hmag, kmag):\n    '''Converts given J, H, Ks mags to a B magnitude value.\n\n    Parameters\n    ----------\n\n    jmag,hmag,kmag : float\n        2MASS J, H, Ks mags of the object.\n\n    Returns\n    -------\n\n    float\n        The converted B band magnitude.\n\n    '''\n\n    return convert_constants(jmag,hmag,kmag,\n                             BJHK,\n                             BJH, BJK, BHK,\n                             BJ, BH, BK)", "response": "Converts given J H Ks mags to a B magnitude value."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef jhk_to_vmag(jmag,hmag,kmag):\n    '''Converts given J, H, Ks mags to a V magnitude value.\n\n    Parameters\n    ----------\n\n    jmag,hmag,kmag : float\n        2MASS J, H, Ks mags of the object.\n\n    Returns\n    -------\n\n    float\n        The converted V band magnitude.\n\n    '''\n\n    return convert_constants(jmag,hmag,kmag,\n                             VJHK,\n                             VJH, VJK, VHK,\n                             VJ, VH, VK)", "response": "Converts given J H Ks mags to V magnitude value."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\nconverting given J H Ks mags to an R band magnitude value.", "response": "def jhk_to_rmag(jmag,hmag,kmag):\n    '''Converts given J, H, Ks mags to an R magnitude value.\n\n    Parameters\n    ----------\n\n    jmag,hmag,kmag : float\n        2MASS J, H, Ks mags of the object.\n\n    Returns\n    -------\n\n    float\n        The converted R band magnitude.\n\n    '''\n\n    return convert_constants(jmag,hmag,kmag,\n                             RJHK,\n                             RJH, RJK, RHK,\n                             RJ, RH, RK)"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\nconvert given J H Ks mags to an I band magnitude value.", "response": "def jhk_to_imag(jmag,hmag,kmag):\n    '''Converts given J, H, Ks mags to an I magnitude value.\n\n    Parameters\n    ----------\n\n    jmag,hmag,kmag : float\n        2MASS J, H, Ks mags of the object.\n\n    Returns\n    -------\n\n    float\n        The converted I band magnitude.\n\n    '''\n\n    return convert_constants(jmag,hmag,kmag,\n                             IJHK,\n                             IJH, IJK, IHK,\n                             IJ, IH, IK)"}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nconverts given J H Ks mags to SDSS u band magnitude value.", "response": "def jhk_to_sdssu(jmag,hmag,kmag):\n    '''Converts given J, H, Ks mags to an SDSS u magnitude value.\n\n    Parameters\n    ----------\n\n    jmag,hmag,kmag : float\n        2MASS J, H, Ks mags of the object.\n\n    Returns\n    -------\n\n    float\n        The converted SDSS u band magnitude.\n\n    '''\n\n    return convert_constants(jmag,hmag,kmag,\n                             SDSSU_JHK,\n                             SDSSU_JH, SDSSU_JK, SDSSU_HK,\n                             SDSSU_J, SDSSU_H, SDSSU_K)"}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\nconverts given J H Ks mags to SDSS g magnitude value.", "response": "def jhk_to_sdssg(jmag,hmag,kmag):\n    '''Converts given J, H, Ks mags to an SDSS g magnitude value.\n\n    Parameters\n    ----------\n\n    jmag,hmag,kmag : float\n        2MASS J, H, Ks mags of the object.\n\n    Returns\n    -------\n\n    float\n        The converted SDSS g band magnitude.\n\n    '''\n\n    return convert_constants(jmag,hmag,kmag,\n                             SDSSG_JHK,\n                             SDSSG_JH, SDSSG_JK, SDSSG_HK,\n                             SDSSG_J, SDSSG_H, SDSSG_K)"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef jhk_to_sdssr(jmag,hmag,kmag):\n    '''Converts given J, H, Ks mags to an SDSS r magnitude value.\n\n    Parameters\n    ----------\n\n    jmag,hmag,kmag : float\n        2MASS J, H, Ks mags of the object.\n\n    Returns\n    -------\n\n    float\n        The converted SDSS r band magnitude.\n\n    '''\n\n    return convert_constants(jmag,hmag,kmag,\n                             SDSSR_JHK,\n                             SDSSR_JH, SDSSR_JK, SDSSR_HK,\n                             SDSSR_J, SDSSR_H, SDSSR_K)", "response": "Converts given J H Ks mags to SDSS r magnitude value."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef jhk_to_sdssi(jmag,hmag,kmag):\n    '''Converts given J, H, Ks mags to an SDSS i magnitude value.\n\n    Parameters\n    ----------\n\n    jmag,hmag,kmag : float\n        2MASS J, H, Ks mags of the object.\n\n    Returns\n    -------\n\n    float\n        The converted SDSS i band magnitude.\n\n    '''\n\n    return convert_constants(jmag,hmag,kmag,\n                             SDSSI_JHK,\n                             SDSSI_JH, SDSSI_JK, SDSSI_HK,\n                             SDSSI_J, SDSSI_H, SDSSI_K)", "response": "Converts given J H Ks mags to SDSS i band magnitude value."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\nconverting given J H Ks mags to SDSS z magnitude value.", "response": "def jhk_to_sdssz(jmag,hmag,kmag):\n    '''Converts given J, H, Ks mags to an SDSS z magnitude value.\n\n    Parameters\n    ----------\n\n    jmag,hmag,kmag : float\n        2MASS J, H, Ks mags of the object.\n\n    Returns\n    -------\n\n    float\n        The converted SDSS z band magnitude.\n\n    '''\n\n    return convert_constants(jmag,hmag,kmag,\n                             SDSSZ_JHK,\n                             SDSSZ_JH, SDSSZ_JK, SDSSZ_HK,\n                             SDSSZ_J, SDSSZ_H, SDSSZ_K)"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\ncalculate the absolute magnitude for a GAIA object.", "response": "def absolute_gaia_magnitude(gaia_mag,\n                            gaia_parallax_mas,\n                            gaia_mag_err=None,\n                            gaia_parallax_err_mas=None):\n    '''Calculates the GAIA absolute magnitude for an object (or array of objects).\n\n    Given a G mag and the parallax measured by GAIA, gets the absolute mag using\n    the usual equation::\n\n        G - M_G = 5 x log10(d_pc) - 5\n        M_G = 5 - 5log10(d_pc) + G\n\n    Parameters\n    ----------\n\n    gaia_mag : float or array-like\n        The measured GAIA G magnitude.\n\n    gaia_parallax_mas : float or array-like\n        The measured parallax of the object in mas.\n\n    gaia_mag_err : float or array-like or None\n        The measurement error in GAIA G magnitude.\n\n    gaia_parallax_err_mas : float or array-like or None\n        The measurement error in GAIA parallax in mas.\n\n    Returns\n    -------\n\n    float or array-like\n        The absolute magnitude M_G of the object(s).\n\n    If both `_err` input kwargs are provided, will return a tuple of the form::\n\n        (M_G float or array-like, M_G_err float or array-like)\n\n    '''\n\n    # get the distance\n    # we're using the naive calculation of d. this is inaccurate as stated in\n    # Bailer-Jones 2015 (http://arxiv.org/abs/1507.02105) if the error in\n    # parallax is a significant fraction of parallax\n    d_pc = np.abs(1000.0/gaia_parallax_mas)\n\n    # get the distance error\n    if gaia_parallax_err_mas is not None:\n        d_pc_err = (\n            (1000.0/(gaia_parallax_mas*gaia_parallax_mas)) *\n            gaia_parallax_err_mas\n        )\n    else:\n        d_pc_err = None\n\n    # calculate the absolute mag from the relation\n    # FIXME: this is NOT corrected for extinction in G mag. 
see Jordi+ 2010\n    # (http://adsabs.harvard.edu/abs/2010A%26A...523A..48J) to figure out\n    # A_G/A_V as a function of (V-I)_0, then apply it here\n    M_G = 5 - 5.0*np.log10(d_pc) + gaia_mag\n\n    # calculate the err in M_G\n    if d_pc_err is not None and gaia_mag_err is not None:\n\n        M_G_err = np.sqrt(\n            ((5.0/(d_pc * np.log(10.0)))**2 * (d_pc_err)**2) +\n            gaia_mag_err*gaia_mag_err\n        )\n    else:\n        M_G_err = None\n\n\n    if M_G_err is not None:\n        return M_G, M_G_err\n    else:\n        return M_G"}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\ncalculating the Schwarzenberg - Czerny AoV statistic at a test frequency.", "response": "def aov_theta(times, mags, errs, frequency,\n              binsize=0.05, minbin=9):\n    '''Calculates the Schwarzenberg-Czerny AoV statistic at a test frequency.\n\n    Parameters\n    ----------\n\n    times,mags,errs : np.array\n        The input time-series and associated errors.\n\n    frequency : float\n        The test frequency to calculate the theta statistic at.\n\n    binsize : float\n        The phase bin size to use.\n\n    minbin : int\n        The minimum number of items in a phase bin to consider in the\n        calculation of the statistic.\n\n    Returns\n    -------\n\n    theta_aov : float\n        The value of the AoV statistic at the specified `frequency`.\n\n    '''\n\n    period = 1.0/frequency\n    fold_time = times[0]\n\n    phased = phase_magseries(times,\n                             mags,\n                             period,\n                             fold_time,\n                             wrap=False,\n                             sort=True)\n\n    phases = phased['phase']\n    pmags = phased['mags']\n    bins = nparange(0.0, 1.0, binsize)\n    ndets = phases.size\n\n    binnedphaseinds = npdigitize(phases, bins)\n\n    bin_s1_tops = []\n    bin_s2_tops = []\n    binndets = []\n    goodbins = 0\n\n    all_xbar = npmedian(pmags)\n\n    for x in npunique(binnedphaseinds):\n\n        thisbin_inds = binnedphaseinds == x\n        thisbin_mags = pmags[thisbin_inds]\n\n        if thisbin_mags.size > minbin:\n\n            thisbin_ndet = thisbin_mags.size\n            thisbin_xbar = npmedian(thisbin_mags)\n\n            # get s1\n            thisbin_s1_top = (\n                thisbin_ndet *\n                (thisbin_xbar - all_xbar) *\n                (thisbin_xbar - all_xbar)\n            )\n\n            # get s2\n            thisbin_s2_top = npsum((thisbin_mags - 
all_xbar) *\n                                   (thisbin_mags - all_xbar))\n\n            bin_s1_tops.append(thisbin_s1_top)\n            bin_s2_tops.append(thisbin_s2_top)\n            binndets.append(thisbin_ndet)\n            goodbins = goodbins + 1\n\n\n    # turn the quantities into arrays\n    bin_s1_tops = nparray(bin_s1_tops)\n    bin_s2_tops = nparray(bin_s2_tops)\n    binndets = nparray(binndets)\n\n    # calculate s1 first\n    s1 = npsum(bin_s1_tops)/(goodbins - 1.0)\n\n    # then calculate s2\n    s2 = npsum(bin_s2_tops)/(ndets - goodbins)\n\n    theta_aov = s1/s2\n\n    return theta_aov"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef _aov_worker(task):\n    '''This is a parallel worker for the function below.\n\n    Parameters\n    ----------\n\n    task : tuple\n        This is of the form below::\n\n            task[0] = times\n            task[1] = mags\n            task[2] = errs\n            task[3] = frequency\n            task[4] = binsize\n            task[5] = minbin\n\n    Returns\n    -------\n\n    theta_aov : float\n        The theta value at the specified frequency. nan if the calculation\n        fails.\n\n    '''\n\n    times, mags, errs, frequency, binsize, minbin = task\n\n    try:\n\n        theta = aov_theta(times, mags, errs, frequency,\n                          binsize=binsize, minbin=minbin)\n\n        return theta\n\n    except Exception as e:\n\n        return npnan", "response": "This is a parallel worker that unpacks a task tuple and calls `aov_theta` to compute the AoV theta statistic at a single test frequency, returning nan if the calculation fails."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef get_frequency_grid(times,\n                       samplesperpeak=5,\n                       nyquistfactor=5,\n                       minfreq=None,\n                       maxfreq=None,\n                       returnf0dfnf=False):\n    '''This calculates a frequency grid for the period finding functions in this\n    module.\n\n    Based on the autofrequency function in astropy.stats.lombscargle.\n\n    http://docs.astropy.org/en/stable/_modules/astropy/stats/lombscargle/core.html#LombScargle.autofrequency\n\n    Parameters\n    ----------\n\n    times : np.array\n        The times to use to generate the frequency grid over.\n\n    samplesperpeak : int\n        The minimum sample coverage each frequency point in the grid will get.\n\n    nyquistfactor : int\n        The multiplier over the Nyquist rate to use.\n\n    minfreq,maxfreq : float or None\n        If not None, these will be the limits of the frequency grid generated.\n\n    returnf0dfnf : bool\n        If this is True, will return the values of `f0`, `df`, and `Nf`\n        generated for this grid.\n\n    Returns\n    -------\n\n    np.array\n        A grid of frequencies.\n\n    '''\n\n    baseline = times.max() - times.min()\n    nsamples = times.size\n\n    df = 1. / baseline / samplesperpeak\n\n    if minfreq is not None:\n        f0 = minfreq\n    else:\n        f0 = 0.5 * df\n\n    if maxfreq is not None:\n        Nf = int(np.ceil((maxfreq - f0) / df))\n    else:\n        Nf = int(0.5 * samplesperpeak * nyquistfactor * nsamples)\n\n\n    if returnf0dfnf:\n        return f0, df, Nf, f0 + df * np.arange(Nf)\n    else:\n        return f0 + df * np.arange(Nf)", "response": "This function calculates a frequency grid for the period finding functions in this module."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef bootstrap_falsealarmprob(lspinfo,\n                             times,\n                             mags,\n                             errs,\n                             nbootstrap=250,\n                             magsarefluxes=False,\n                             sigclip=10.0,\n                             npeaks=None):\n    '''Calculates the false alarm probabilities of periodogram peaks using\n    bootstrap resampling of the magnitude time series.\n\n    The false alarm probability here is defined as::\n\n        (1.0 + sum(trialbestpeaks[i] > peak[j]))/(ntrialbestpeaks + 1)\n\n    for each best periodogram peak j. The index i is for each bootstrap\n    trial. This effectively gives us a significance for the peak. Smaller FAP\n    means a better chance that the peak is real.\n\n    The basic idea is to get the number of trial best peaks that are larger than\n    the current best peak and divide this by the total number of trials. The\n    distribution of these trial best peaks is obtained after scrambling the mag\n    values and rerunning the specified periodogram method for a bunch of trials.\n\n    `lspinfo` is the output dict from a periodbase periodogram function and MUST\n    contain a 'method' key that corresponds to one of the keys in the LSPMETHODS\n    dict above. This will let this function know which periodogram function to\n    run to generate the bootstrap samples. The lspinfo SHOULD also have a\n    'kwargs' key that corresponds to the input keyword arguments for the\n    periodogram function as it was run originally, to keep everything the same\n    during the bootstrap runs. If this is missing, default values will be used.\n\n    FIXME: this may not be strictly correct; must look more into bootstrap\n    significance testing. Also look into if we're doing resampling correctly for\n    time series because the samples are not iid. 
Look into moving block\n    bootstrap.\n\n    Parameters\n    ----------\n\n    lspinfo : dict\n        A dict of period-finder results from one of the period-finders in\n        periodbase, or your own functions, provided it's of the form and\n        contains at least the keys listed below::\n\n            {'periods': np.array of all periods searched by the period-finder,\n             'lspvals': np.array of periodogram power value for each period,\n             'bestperiod': a float value that is the period with the highest\n                           peak in the periodogram, i.e. the most-likely actual\n                           period,\n             'method': a three-letter code naming the period-finder used; must\n                       be one of the keys in the\n                       `astrobase.periodbase.METHODLABELS` dict,\n             'nbestperiods': a list of the periods corresponding to periodogram\n                             peaks (`nbestlspvals` below) to annotate on the\n                             periodogram plot so they can be called out\n                             visually,\n             'nbestlspvals': a list of the power values associated with\n                             periodogram peaks to annotate on the periodogram\n                             plot so they can be called out visually; should be\n                             the same length as `nbestperiods` above,\n             'kwargs': dict of kwargs passed to your own period-finder function}\n\n        If you provide your own function's period-finder results, you should add\n        a corresponding key for it to the LSPMETHODS dict above so the bootstrap\n        function can use it correctly. Your period-finder function should take\n        `times`, `mags`, errs and any extra parameters as kwargs and return a\n        dict of the form described above. 
A small worked example::\n\n            from your_module import your_periodfinder_func\n            from astrobase import periodbase\n\n            periodbase.LSPMETHODS['your-finder'] = your_periodfinder_func\n\n            # run a period-finder session\n            your_pfresults = your_periodfinder_func(times, mags, errs,\n                                                    **extra_kwargs)\n\n            # run bootstrap to find FAP\n            falsealarm_info = periodbase.bootstrap_falsealarmprob(\n                your_pfresults,\n                times, mags, errs,\n                nbootstrap=250,\n                magsarefluxes=False,\n            )\n\n    times,mags,errs : np.arrays\n        The magnitude/flux time-series to process along with their associated\n        measurement errors.\n\n    nbootstrap : int\n        The total number of bootstrap trials to run. This is set to 250 by\n        default, but should probably be around 1000 for realistic results.\n\n    magsarefluxes : bool\n        If True, indicates the input time-series is fluxes and not mags.\n\n    sigclip : float or int or sequence of two floats/ints or None\n        If a single float or int, a symmetric sigma-clip will be performed using\n        the number provided as the sigma-multiplier to cut out from the input\n        time-series.\n\n        If a list of two ints/floats is provided, the function will perform an\n        'asymmetric' sigma-clip. The first element in this list is the sigma\n        value to use for fainter flux/mag values; the second element in this\n        list is the sigma value to use for brighter flux/mag values. For\n        example, `sigclip=[10., 3.]`, will sigclip out greater than 10-sigma\n        dimmings and greater than 3-sigma brightenings. 
Here the meaning of\n        \"dimming\" and \"brightening\" is set by *physics* (not the magnitude\n        system), which is why the `magsarefluxes` kwarg must be correctly set.\n\n        If `sigclip` is None, no sigma-clipping will be performed, and the\n        time-series (with non-finite elems removed) will be passed through to\n        the output.\n\n    npeaks : int or None\n        The number of peaks from the list of 'nbestlspvals' in the period-finder\n        result dict to run the bootstrap for. If None, all of the peaks in this\n        list will have their FAP calculated.\n\n    Returns\n    -------\n\n    dict\n        Returns a dict of the form::\n\n            {'peaks':allpeaks,\n             'periods':allperiods,\n             'probabilities':allfaps,\n             'alltrialbestpeaks':alltrialbestpeaks}\n\n    '''\n\n    # figure out how many periods to work on\n    if (npeaks and (0 < npeaks < len(lspinfo['nbestperiods']))):\n        nperiods = npeaks\n    else:\n        LOGWARNING('npeaks not specified or invalid, '\n                   'getting FAP for all %s periodogram peaks' %\n                   len(lspinfo['nbestperiods']))\n        nperiods = len(lspinfo['nbestperiods'])\n\n    nbestperiods = lspinfo['nbestperiods'][:nperiods]\n    nbestpeaks = lspinfo['nbestlspvals'][:nperiods]\n\n    # get rid of nans first and sigclip\n    stimes, smags, serrs = sigclip_magseries(times,\n                                             mags,\n                                             errs,\n                                             magsarefluxes=magsarefluxes,\n                                             sigclip=sigclip)\n\n    allpeaks = []\n    allperiods = []\n    allfaps = []\n    alltrialbestpeaks = []\n\n    # make sure there are enough points to calculate a spectrum\n    if len(stimes) > 9 and len(smags) > 9 and len(serrs) > 9:\n\n        for ind, period, peak in zip(range(len(nbestperiods)),\n                                     
nbestperiods,\n                                     nbestpeaks):\n\n            LOGINFO('peak %s: running %s trials...' % (ind+1, nbootstrap))\n\n            trialbestpeaks = []\n\n            for _trial in range(nbootstrap):\n\n                # get a scrambled index\n                tindex = np.random.randint(0,\n                                           high=mags.size,\n                                           size=mags.size)\n\n\n                # get the kwargs dict out of the lspinfo\n                if 'kwargs' in lspinfo:\n\n                    kwargs = lspinfo['kwargs']\n\n                    # update the kwargs with some local stuff\n                    kwargs.update({'magsarefluxes':magsarefluxes,\n                                   'sigclip':sigclip,\n                                   'verbose':False})\n                else:\n                    kwargs = {'magsarefluxes':magsarefluxes,\n                              'sigclip':sigclip,\n                              'verbose':False}\n\n\n                # run the periodogram with scrambled mags and errs\n                # and the appropriate keyword arguments\n                lspres = LSPMETHODS[lspinfo['method']](\n                    times, mags[tindex], errs[tindex],\n                    **kwargs\n                )\n                trialbestpeaks.append(lspres['bestlspval'])\n\n            trialbestpeaks = np.array(trialbestpeaks)\n            alltrialbestpeaks.append(trialbestpeaks)\n\n            # calculate the FAP for a trial peak j = FAP[j] =\n            # (1.0 + sum(trialbestpeaks[i] > peak[j]))/(ntrialbestpeaks + 1)\n            if lspinfo['method'] != 'pdm':\n                falsealarmprob = (\n                    (1.0 + trialbestpeaks[trialbestpeaks > peak].size) /\n                    (trialbestpeaks.size + 1.0)\n                )\n            # for PDM, we're looking for a peak smaller than the best peak\n            # because values closer to 0.0 are more significant\n            
else:\n                falsealarmprob = (\n                    (1.0 + trialbestpeaks[trialbestpeaks < peak].size) /\n                    (trialbestpeaks.size + 1.0)\n                )\n\n            LOGINFO('FAP for peak %s, period: %.6f = %.3g' % (ind+1,\n                                                              period,\n                                                              falsealarmprob))\n\n            allpeaks.append(peak)\n            allperiods.append(period)\n            allfaps.append(falsealarmprob)\n\n        return {'peaks':allpeaks,\n                'periods':allperiods,\n                'probabilities':allfaps,\n                'alltrialbestpeaks':alltrialbestpeaks}\n\n    else:\n        LOGERROR('not enough mag series points to calculate periodogram')\n        return None", "response": "Calculates the false alarm probabilities of periodogram peaks using the bootstrap resampling of the magnitude time series."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef read_original_textlc(lcpath):\n    '''\n    Read .epdlc and .tfalc light curves and return a corresponding labelled\n    dict (if LC from <2012) or astropy table (if >=2012). Each has different\n    keys that can be accessed via .keys()\n\n    Input:\n    lcpath: path (string) to light curve data, which is a textfile with HAT\n    LC data.\n\n    Example:\n    dat = read_original_textlc('HAT-115-0003266.epdlc')\n    '''\n\n    LOGINFO('reading original HAT text LC: {:s}'.format(lcpath))\n\n    N_lines_to_parse_comments = 50\n    with open(lcpath, 'rb') as file:\n        head = [next(file) for ind in range(N_lines_to_parse_comments)]\n\n    N_comment_lines = len([l for l in head if l.decode('UTF-8')[0] == '#'])\n\n    # if there are too many comment lines, fail out\n    if N_comment_lines >= N_lines_to_parse_comments:\n        LOGERROR(\n            'LC file {fpath} has too many comment lines'.format(fpath=lcpath)\n        )\n        return None\n\n    first_data_line = list(\n        filter(None, head[N_comment_lines].decode('UTF-8').split())\n    )\n    N_cols = len(first_data_line)\n\n    # There are different column formats depending on when HAT pipeline was run\n    # also different formats for different types of LCs:\n    # pre-2012: .epdlc -> 17 columns\n    # pre-2012: .tfalc -> 20 columns\n    # post-2012: .epdlc or .tfalc -> 32 columns\n\n    if N_cols == 17:\n        colformat = 'pre2012-epdlc'\n    elif N_cols == 20:\n        colformat = 'pre2012-tfalc'\n    elif N_cols == 32:\n        colformat = 'post2012-hatlc'\n    else:\n        LOGERROR(\"can't handle this column format yet, \"\n                 \"file: {fpath}, ncols: {ncols}\".format(fpath=lcpath,\n                                                        ncols=N_cols))\n        return None\n\n\n    # deal with pre-2012 column format\n    if colformat == 'pre2012-epdlc':\n\n        col_names = 
['framekey','rjd',\n                     'aim_000','aie_000','aiq_000',\n                     'aim_001','aie_001','aiq_001',\n                     'aim_002','aie_002','aiq_002',\n                     'arm_000','arm_001','arm_002',\n                     'aep_000','aep_001','aep_002']\n        col_dtypes = ['U8',float,\n                      float,float,'U1',\n                      float,float,'U1',\n                      float,float,'U1',\n                      float,float,float,\n                      float,float,float]\n        dtype_pairs = [el for el in zip(col_names, col_dtypes)]\n        data = np.genfromtxt(lcpath, names=col_names, dtype=col_dtypes,\n                             skip_header=N_comment_lines, delimiter=None)\n        out = {}\n        for ix in range(len(data.dtype.names)):\n            out[data.dtype.names[ix]] = data[data.dtype.names[ix]]\n\n    elif colformat == 'pre2012-tfalc':\n\n        col_names = ['framekey','rjd',\n                     'aim_000','aie_000','aiq_000',\n                     'aim_001','aie_001','aiq_001',\n                     'aim_002','aie_002','aiq_002',\n                     'arm_000','arm_001','arm_002',\n                     'aep_000','aep_001','aep_002',\n                     'atf_000','atf_001','atf_002']\n        col_dtypes = ['U8',float,\n                      float,float,'U1',\n                      float,float,'U1',\n                      float,float,'U1',\n                      float,float,float,\n                      float,float,float,\n                      float,float,float]\n        dtype_pairs = [el for el in zip(col_names, col_dtypes)]\n        data = np.genfromtxt(lcpath, names=col_names, dtype=col_dtypes,\n                             skip_header=N_comment_lines, delimiter=None)\n        out = {}\n        for ix in range(len(data.dtype.names)):\n            out[data.dtype.names[ix]] = data[data.dtype.names[ix]]\n\n    elif colformat == 'post2012-hatlc':\n\n        col_names = ['hatid', 'framekey', 
'fld', 'bjd',\n                     'aim_000', 'aie_000', 'aiq_000',\n                     'aim_001', 'aie_001', 'aiq_001',\n                     'aim_002', 'aie_002', 'aiq_002',\n                     'arm_000', 'arm_001', 'arm_002',\n                     'aep_000', 'aep_001', 'aep_002',\n                     'atf_000', 'atf_001', 'atf_002',\n                     'xcc', 'ycc', 'bgv', 'bge',\n                     'fsv', 'fdv', 'fkv',\n                     'iha', 'izd', 'rjd']\n\n        out = astascii.read(lcpath, names=col_names, comment='#')\n\n    return out", "response": "Read original HAT .epdlc and .tfalc text light curves and return a labelled dict (pre-2012 formats) or an astropy table (post-2012 format)."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef traptransit_fit_magseries(times, mags, errs,\n                              transitparams,\n                              sigclip=10.0,\n                              plotfit=False,\n                              magsarefluxes=False,\n                              verbose=True):\n    '''This fits a trapezoid transit model to a magnitude time series.\n\n    Parameters\n    ----------\n\n    times,mags,errs : np.array\n        The input mag/flux time-series to fit a trapezoid planet-transit model\n        to.\n\n    period : float\n        The period to use for the model fit.\n\n    transitparams : list of floats\n        These are initial parameters for the transit model fit. A list of the\n        following form is required::\n\n            transitparams = [transitperiod (time),\n                             transitepoch (time),\n                             transitdepth (flux or mags),\n                             transitduration (phase),\n                             ingressduration (phase)]\n\n        - for magnitudes -> `transitdepth` should be < 0\n        - for fluxes     -> `transitdepth` should be > 0\n\n        If `transitepoch` is None, this function will do an initial spline fit\n        to find an approximate minimum of the phased light curve using the given\n        period.\n\n        The `transitdepth` provided is checked against the value of\n        `magsarefluxes`. 
if `magsarefluxes = True`, the `transitdepth` is forced\n        to be > 0; if `magsarefluxes` = False, the `transitdepth` is forced to\n        be < 0.\n\n    sigclip : float or int or sequence of two floats/ints or None\n        If a single float or int, a symmetric sigma-clip will be performed using\n        the number provided as the sigma-multiplier to cut out from the input\n        time-series.\n\n        If a list of two ints/floats is provided, the function will perform an\n        'asymmetric' sigma-clip. The first element in this list is the sigma\n        value to use for fainter flux/mag values; the second element in this\n        list is the sigma value to use for brighter flux/mag values. For\n        example, `sigclip=[10., 3.]`, will sigclip out greater than 10-sigma\n        dimmings and greater than 3-sigma brightenings. Here the meaning of\n        \"dimming\" and \"brightening\" is set by *physics* (not the magnitude\n        system), which is why the `magsarefluxes` kwarg must be correctly set.\n\n        If `sigclip` is None, no sigma-clipping will be performed, and the\n        time-series (with non-finite elems removed) will be passed through to\n        the output.\n\n    magsarefluxes : bool\n        If True, will treat the input values of `mags` as fluxes for purposes of\n        plotting the fit and sig-clipping.\n\n    plotfit : str or False\n        If this is a string, this function will make a plot for the fit to the\n        mag/flux time-series and writes the plot to the path specified here.\n\n    ignoreinitfail : bool\n        If this is True, ignores the initial failure to find a set of optimized\n        Fourier parameters using the global optimization function and proceeds\n        to do a least-squares fit anyway.\n\n    verbose : bool\n        If True, will indicate progress and warn of any problems.\n\n    Returns\n    -------\n\n    dict\n        This function returns a dict containing the model fit parameters, the\n      
  minimized chi-sq value and the reduced chi-sq value. The form of this\n        dict is mostly standardized across all functions in this module::\n\n            {\n                'fittype':'traptransit',\n                'fitinfo':{\n                    'initialparams':the initial transit params provided,\n                    'finalparams':the final model fit transit params ,\n                    'finalparamerrs':formal errors in the params,\n                    'leastsqfit':the full tuple returned by scipy.leastsq,\n                    'fitmags': the model fit mags,\n                    'fitepoch': the epoch of minimum light for the fit,\n                    'ntransitpoints': the number of LC points in transit phase\n                },\n                'fitchisq': the minimized value of the fit's chi-sq,\n                'fitredchisq':the reduced chi-sq value,\n                'fitplotfile': the output fit plot if fitplot is not None,\n                'magseries':{\n                    'times':input times in phase order of the model,\n                    'phase':the phases of the model mags,\n                    'mags':input mags/fluxes in the phase order of the model,\n                    'errs':errs in the phase order of the model,\n                    'magsarefluxes':input value of magsarefluxes kwarg\n                }\n            }\n\n    '''\n\n    stimes, smags, serrs = sigclip_magseries(times, mags, errs,\n                                             sigclip=sigclip,\n                                             magsarefluxes=magsarefluxes)\n\n    # get rid of zero errs\n    nzind = np.nonzero(serrs)\n    stimes, smags, serrs = stimes[nzind], smags[nzind], serrs[nzind]\n\n    # check the transitparams\n    transitperiod, transitepoch, transitdepth = transitparams[0:3]\n\n    # check if we have a transitepoch to use\n    if transitepoch is None:\n\n        if verbose:\n            LOGWARNING('no transitepoch given in transitparams, '\n                    
   'trying to figure it out automatically...')\n        # do a spline fit to figure out the approximate min of the LC\n        try:\n            spfit = spline_fit_magseries(times, mags, errs, transitperiod,\n                                         sigclip=sigclip,\n                                         magsarefluxes=magsarefluxes,\n                                         verbose=verbose)\n            transitepoch = spfit['fitinfo']['fitepoch']\n\n        # if the spline-fit fails, try a savgol fit instead\n        except Exception as e:\n            sgfit = savgol_fit_magseries(times, mags, errs, transitperiod,\n                                         sigclip=sigclip,\n                                         magsarefluxes=magsarefluxes,\n                                         verbose=verbose)\n            transitepoch = sgfit['fitinfo']['fitepoch']\n\n        # if everything failed, then bail out and ask for the transitepoch\n        finally:\n\n            if transitepoch is None:\n                LOGERROR(\"couldn't automatically figure out the transit epoch, \"\n                         \"can't continue. 
please provide it in transitparams.\")\n\n                # assemble the returndict\n                returndict = {\n                    'fittype':'traptransit',\n                    'fitinfo':{\n                        'initialparams':transitparams,\n                        'finalparams':None,\n                        'leastsqfit':None,\n                        'fitmags':None,\n                        'fitepoch':None,\n                    },\n                    'fitchisq':np.nan,\n                    'fitredchisq':np.nan,\n                    'fitplotfile':None,\n                    'magseries':{\n                        'phase':None,\n                        'times':None,\n                        'mags':None,\n                        'errs':None,\n                        'magsarefluxes':magsarefluxes,\n                    },\n                }\n\n                return returndict\n\n            else:\n\n                # check the case when there are more than one transitepochs\n                # returned\n                if transitepoch.size > 1:\n                    if verbose:\n                        LOGWARNING(\n                            \"could not auto-find a single minimum in LC for \"\n                            \"transitepoch, using the first one returned\"\n                        )\n                    transitparams[1] = transitepoch[0]\n\n                else:\n\n                    if verbose:\n                        LOGWARNING(\n                            'using automatically determined transitepoch = %.5f'\n                            % transitepoch\n                        )\n                    transitparams[1] = transitepoch.item()\n\n    # next, check the transitdepth and fix it to the form required\n    if magsarefluxes:\n        if transitdepth < 0.0:\n            transitparams[2] = -transitdepth\n\n    else:\n        if transitdepth > 0.0:\n            transitparams[2] = -transitdepth\n\n    # finally, do the fit\n    try:\n        
leastsqfit = spleastsq(transits.trapezoid_transit_residual,\n                               transitparams,\n                               args=(stimes, smags, serrs),\n                               full_output=True)\n    except Exception as e:\n        leastsqfit = None\n\n    # if the fit succeeded, then we can return the final parameters\n    if leastsqfit and leastsqfit[-1] in (1,2,3,4):\n\n        finalparams = leastsqfit[0]\n        covxmatrix = leastsqfit[1]\n\n        # calculate the chisq and reduced chisq\n        fitmags, phase, ptimes, pmags, perrs, n_transitpoints = (\n            transits.trapezoid_transit_func(\n                finalparams,\n                stimes, smags, serrs,\n                get_ntransitpoints=True\n            )\n        )\n        fitchisq = np.sum(\n            ((fitmags - pmags)*(fitmags - pmags)) / (perrs*perrs)\n        )\n        fitredchisq = fitchisq/(len(pmags) - len(finalparams) - 1)\n\n        # get the residual variance and calculate the formal 1-sigma errs on the\n        # final parameters\n        residuals = leastsqfit[2]['fvec']\n        residualvariance = (\n            np.sum(residuals*residuals)/(pmags.size - finalparams.size)\n        )\n        if covxmatrix is not None:\n            covmatrix = residualvariance*covxmatrix\n            stderrs = np.sqrt(np.diag(covmatrix))\n        else:\n            LOGERROR('covxmatrix not available, fit probably failed!')\n            stderrs = None\n\n        if verbose:\n            LOGINFO(\n                'final fit done. 
chisq = %.5f, reduced chisq = %.5f' %\n                (fitchisq, fitredchisq)\n            )\n\n        # get the fit epoch\n        fperiod, fepoch = finalparams[:2]\n\n        # assemble the returndict\n        returndict = {\n            'fittype':'traptransit',\n            'fitinfo':{\n                'initialparams':transitparams,\n                'finalparams':finalparams,\n                'finalparamerrs':stderrs,\n                'leastsqfit':leastsqfit,\n                'fitmags':fitmags,\n                'fitepoch':fepoch,\n                'ntransitpoints':n_transitpoints\n            },\n            'fitchisq':fitchisq,\n            'fitredchisq':fitredchisq,\n            'fitplotfile':None,\n            'magseries':{\n                'phase':phase,\n                'times':ptimes,\n                'mags':pmags,\n                'errs':perrs,\n                'magsarefluxes':magsarefluxes,\n            },\n        }\n\n        # make the fit plot if required\n        if plotfit and isinstance(plotfit, str):\n\n            make_fit_plot(phase, pmags, perrs, fitmags,\n                          fperiod, ptimes.min(), fepoch,\n                          plotfit,\n                          magsarefluxes=magsarefluxes)\n\n            returndict['fitplotfile'] = plotfit\n\n        return returndict\n\n    # if the leastsq fit failed, return nothing\n    else:\n\n        LOGERROR('trapezoid-fit: least-squared fit to the light curve failed!')\n\n        # assemble the returndict\n        returndict = {\n            'fittype':'traptransit',\n            'fitinfo':{\n                'initialparams':transitparams,\n                'finalparams':None,\n                'finalparamerrs':None,\n                'leastsqfit':leastsqfit,\n                'fitmags':None,\n                'fitepoch':None,\n                'ntransitpoints':0\n            },\n            'fitchisq':np.nan,\n            'fitredchisq':np.nan,\n            'fitplotfile':None,\n            
'magseries':{\n                'phase':None,\n                'times':None,\n                'mags':None,\n                'errs':None,\n                'magsarefluxes':magsarefluxes,\n            },\n        }\n\n        return returndict", "response": "This function fits a trapezoid transit model to a magnitude time series."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef _get_value(quantitystr, fitparams, fixedparams):\n\n    # for Mandel-Agol fitting, sometimes we want to fix some parameters,\n    # and fit others. this function allows that flexibility.\n    fitparamskeys, fixedparamskeys = fitparams.keys(), fixedparams.keys()\n    if quantitystr in fitparamskeys:\n        quantity = fitparams[quantitystr]\n    elif quantitystr in fixedparamskeys:\n        quantity = fixedparams[quantitystr]\n    return quantity", "response": "Return the value of the named quantity, taken from fitparams if it is being fit and otherwise from fixedparams."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef _transit_model(times, t0, per, rp, a, inc, ecc, w, u, limb_dark,\n                   exp_time_minutes=2, supersample_factor=7):\n    '''This returns a BATMAN planetary transit model.\n\n    Parameters\n    ----------\n\n    times : np.array\n        The times at which the model will be evaluated.\n\n    t0 : float\n        The time of inferior conjunction (mid-transit time).\n\n    per : float\n        The orbital period of the planet.\n\n    rp : float\n        The radius of the planet (in units of stellar radii).\n\n    a : float\n        The semi-major axis of the planet's orbit (in units of stellar radii).\n\n    inc : float\n        The orbital inclination (in degrees).\n\n    ecc : float\n        The eccentricity of the orbit.\n\n    w : float\n        The longitude of periastron (in degrees).\n\n    u : list of floats\n        The limb darkening coefficients specific to the limb darkening model\n        used.\n\n    limb_dark : {\"uniform\", \"linear\", \"quadratic\", \"square-root\", \"logarithmic\", \"exponential\", \"power2\", \"custom\"}\n        The type of limb darkening model to use. 
See the full list here:\n\n        https://www.cfa.harvard.edu/~lkreidberg/batman/tutorial.html#limb-darkening-options\n\n    exp_time_minutes : float\n        The amount of time to 'smear' the transit LC points over to simulate a\n        long exposure time.\n\n    supersample_factor: int\n        The number of supersampled time data points to average the lightcurve\n        model over.\n\n    Returns\n    -------\n\n    (params, batman_model) : tuple\n        The returned tuple contains the `batman.TransitParams` object and the\n        generated `batman.TransitModel` object.\n\n    '''\n    params = batman.TransitParams()  # object to store transit parameters\n    params.t0 = t0                   # time of inferior conjunction\n    params.per = per                 # orbital period\n    params.rp = rp                   # planet radius (in stellar radii)\n    params.a = a                     # semi-major axis (in stellar radii)\n    params.inc = inc                 # orbital inclination (in degrees)\n    params.ecc = ecc                 # the eccentricity of the orbit\n    params.w = w                     # longitude of periastron (in degrees)\n    params.u = u                     # limb darkening coefficient list\n    params.limb_dark = limb_dark     # limb darkening model to use\n\n    t = times\n    m = batman.TransitModel(params, t, exp_time=exp_time_minutes/60./24.,\n                            supersample_factor=supersample_factor)\n\n    return params, m", "response": "This function returns a BATMAN planetary transit model."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef _log_prior_transit(theta, priorbounds):\n    '''\n    Assume priors on all parameters have uniform probability.\n    '''\n    # priorbounds contains the input priors, and because of how we previously\n    # sorted theta, its sorted keys tell us which parts of theta correspond to\n    # which physical quantities.\n\n    allowed = True\n    for ix, key in enumerate(np.sort(list(priorbounds.keys()))):\n        if priorbounds[key][0] < theta[ix] < priorbounds[key][1]:\n            allowed = True and allowed\n        else:\n            allowed = False\n\n    if allowed:\n        return 0.\n\n    return -np.inf", "response": "Return the log of a uniform prior on the transit parameters: 0 if every element of theta lies strictly within its bounds in priorbounds, otherwise -inf."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\ncompute the Gaussian log likelihood of a batman transit model for a proposed parameter vector.", "response": "def _log_likelihood_transit(theta, params, model, t, flux, err_flux,\n                            priorbounds):\n    '''\n    Given a batman TransitModel and its proposed parameters (theta), update the\n    batman params object with the proposed parameters and evaluate the gaussian\n    likelihood.\n\n    Note: the priorbounds are only needed to parse theta.\n    '''\n\n    u = []\n\n    for ix, key in enumerate(sorted(priorbounds.keys())):\n\n        if key == 'rp':\n            params.rp = theta[ix]\n        elif key == 't0':\n            params.t0 = theta[ix]\n        elif key == 'sma':\n            params.a = theta[ix]\n        elif key == 'incl':\n            params.inc = theta[ix]\n        elif key == 'period':\n            params.per = theta[ix]\n        elif key == 'ecc':\n            params.ecc = theta[ix]\n        elif key == 'omega':\n            params.w = theta[ix]\n        elif key == 'u_linear':\n            u.append(theta[ix])\n        elif key == 'u_quadratic':\n            u.append(theta[ix])\n            params.u = u\n\n    lc = model.light_curve(params)\n    residuals = flux - lc\n    log_likelihood = -0.5*(\n        np.sum((residuals/err_flux)**2 + np.log(2*np.pi*(err_flux)**2))\n    )\n\n    return log_likelihood"}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\ncomputing the log likelihood of a transit model plus a line.", "response": "def _log_likelihood_transit_plus_line(theta, params, model, t, data_flux,\n                                      err_flux, priorbounds):\n    '''\n    Given a batman TransitModel and its proposed parameters (theta), update the\n    batman params object with the proposed parameters and evaluate the gaussian\n    likelihood.\n\n    Note: the priorbounds are only needed to parse theta.\n    '''\n\n    u = []\n    # default to a flat zero line when the polynomial terms are not in theta\n    poly_order0, poly_order1 = 0., 0.\n\n    for ix, key in enumerate(sorted(priorbounds.keys())):\n\n        if key == 'rp':\n            params.rp = theta[ix]\n        elif key == 't0':\n            params.t0 = theta[ix]\n        elif key == 'sma':\n            params.a = theta[ix]\n        elif key == 'incl':\n            params.inc = theta[ix]\n        elif key == 'period':\n            params.per = theta[ix]\n        elif key == 'ecc':\n            params.ecc = theta[ix]\n        elif key == 'omega':\n            params.w = theta[ix]\n        elif key == 'u_linear':\n            u.append(theta[ix])\n        elif key == 'u_quadratic':\n            u.append(theta[ix])\n            params.u = u\n        elif key == 'poly_order0':\n            poly_order0 = theta[ix]\n        elif key == 'poly_order1':\n            poly_order1 = theta[ix]\n\n    # model flux is the transit light curve plus the linear trend\n    transit = model.light_curve(params)\n    line = poly_order0 + t*poly_order1\n    model_flux = transit + line\n\n    residuals = data_flux - model_flux\n\n    log_likelihood = -0.5*(\n        np.sum((residuals/err_flux)**2 + np.log(2*np.pi*(err_flux)**2))\n    )\n\n    return log_likelihood"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef log_posterior_transit(theta, params, model, t, flux, err_flux, priorbounds):\n    '''\n    Evaluate posterior probability given proposed model parameters and\n    the observed flux timeseries.\n    '''\n    lp = _log_prior_transit(theta, priorbounds)\n    if not np.isfinite(lp):\n        return -np.inf\n    else:\n        return lp + _log_likelihood_transit(theta, params, model, t, flux,\n                                            err_flux, priorbounds)", "response": "Evaluate posterior probability given proposed model parameters and the observed flux timeseries."}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nevaluate posterior probability given proposed model parameters and the observed flux timeseries.", "response": "def log_posterior_transit_plus_line(theta, params, model, t, flux, err_flux,\n                                    priorbounds):\n    '''\n    Evaluate posterior probability given proposed model parameters and\n    the observed flux timeseries.\n    '''\n    lp = _log_prior_transit_plus_line(theta, priorbounds)\n    if not np.isfinite(lp):\n        return -np.inf\n    else:\n        return (\n            lp + _log_likelihood_transit_plus_line(\n                theta, params, model, t, flux, err_flux, priorbounds)\n        )"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef mandelagol_fit_magseries(\n        times, mags, errs,\n        fitparams,\n        priorbounds,\n        fixedparams,\n        trueparams=None,\n        burninpercent=0.3,\n        plotcorner=False,\n        samplesavpath=False,\n        n_walkers=50,\n        n_mcmc_steps=400,\n        eps=1e-4,\n        skipsampling=False,\n        overwriteexistingsamples=False,\n        mcmcprogressbar=False,\n        plotfit=False,\n        magsarefluxes=False,\n        sigclip=10.0,\n        verbose=True,\n        nworkers=4\n):\n    '''This fits a Mandel & Agol (2002) planetary transit model to a flux time\n    series. You can fit and fix whatever parameters you want.\n\n    It relies on Kreidberg (2015)'s BATMAN implementation for the transit model,\n    emcee to sample the posterior (Foreman-Mackey et al 2013), `corner` to plot\n    it, and `h5py` to save the samples. See e.g., Claret's work for good guesses\n    of star-appropriate limb-darkening parameters.\n\n    NOTE: this only works for flux time-series at the moment.\n\n    NOTE: Between the `fitparams`, `priorbounds`, and `fixedparams` dicts, you\n    must specify all of the planetary transit parameters required by BATMAN:\n    `['t0', 'rp', 'sma', 'incl', 'u', 'ecc', 'omega', 'period']`, or the\n    BATMAN model will fail to initialize.\n\n    Parameters\n    ----------\n\n    times,mags,errs : np.array\n        The input flux time-series to fit the transit model to.\n\n    fitparams : dict\n        These are the initial parameter guesses for MCMC, found e.g., by\n        BLS. The key string format must not be changed, but any parameter can be\n        either \"fit\" or \"fixed\". If it is \"fit\", it must have a corresponding\n        prior.
For example::\n\n            fitparams = {'t0':1325.9, 'rp':np.sqrt(fitd['transitdepth']),\n                         'sma':6.17, 'incl':85, 'u':[0.3, 0.2]}\n\n        where 'u' is a list of the limb darkening parameters, Linear first, then\n        quadratic. Quadratic limb darkening is the only form implemented.\n\n    priorbounds : dict\n        This sets the lower & upper bounds on uniform prior, e.g.::\n\n            priorbounds = {'rp':(0.135, 0.145), 'u_linear':(0.3-1, 0.3+1),\n                           'u_quad':(0.2-1, 0.2+1), 't0':(np.min(time),\n                           np.max(time)), 'sma':(6,6.4), 'incl':(80,90)}\n\n    fixedparams : dict\n        This sets which parameters are fixed, and their values. For example::\n\n            fixedparams = {'ecc':0.,\n                           'omega':90.,\n                           'limb_dark':'quadratic',\n                           'period':fitd['period'] }\n\n        `limb_dark` must be \"quadratic\".  It's \"fixed\", because once you\n        choose your limb-darkening model, it's fixed.\n\n    trueparams : list of floats\n        The true parameter values you're fitting for, if they're known (e.g., a\n        known planet, or fake data). Only for plotting purposes.\n\n    burninpercent : float\n        The percent of MCMC samples to discard as burn-in.\n\n    plotcorner : str or False\n        If this is a str, points to the path of output corner plot that will be\n        generated for this MCMC run.\n\n    samplesavpath : str\n        This must be provided so `emcee` can save its MCMC samples to disk as\n        HDF5 files. 
This will set the path of the output HDF5file written.\n\n    n_walkers : int\n        The number of MCMC walkers to use.\n\n    n_mcmc_steps : int\n        The number of MCMC steps to take.\n\n    eps : float\n        The radius of the `n_walkers-dimensional` Gaussian ball used to\n        initialize the MCMC.\n\n    skipsampling : bool\n        If you've already collected MCMC samples, and you do not want any more\n        sampling (e.g., just make the plots), set this to be True.\n\n    overwriteexistingsamples : bool\n        If you've collected samples, but you want to overwrite them, set this to\n        True. Usually, it should be False, which appends samples to\n        `samplesavpath` HDF5 file.\n\n    mcmcprogressbar : bool\n        If True, will show a progress bar for the MCMC process.\n\n    plotfit: str or bool\n        If a str, indicates the path of the output fit plot file. If False, no\n        fit plot will be made.\n\n    magsarefluxes : bool\n        This indicates if the input measurements in `mags` are actually fluxes.\n\n    sigclip : float or int or sequence of two floats/ints or None\n        If a single float or int, a symmetric sigma-clip will be performed using\n        the number provided as the sigma-multiplier to cut out from the input\n        time-series.\n\n        If a list of two ints/floats is provided, the function will perform an\n        'asymmetric' sigma-clip. The first element in this list is the sigma\n        value to use for fainter flux/mag values; the second element in this\n        list is the sigma value to use for brighter flux/mag values. For\n        example, `sigclip=[10., 3.]`, will sigclip out greater than 10-sigma\n        dimmings and greater than 3-sigma brightenings. 
Here the meaning of\n        \"dimming\" and \"brightening\" is set by *physics* (not the magnitude\n        system), which is why the `magsarefluxes` kwarg must be correctly set.\n\n        If `sigclip` is None, no sigma-clipping will be performed, and the\n        time-series (with non-finite elems removed) will be passed through to\n        the output.\n\n    verbose : bool\n        If True, will indicate MCMC progress.\n\n    nworkers : int\n        The number of parallel workers to launch for MCMC.\n\n    Returns\n    -------\n\n    dict\n        This function returns a dict containing the model fit parameters and\n        other fit information. The form of this dict is mostly standardized\n        across all functions in this module::\n\n            {\n                'fittype':'mandelagol',\n                'fitinfo':{\n                    'initialparams':the initial transit params provided,\n                    'fixedparams':the fixed transit params provided,\n                    'finalparams':the final model fit transit params,\n                    'finalparamerrs':formal errors in the params,\n                    'fitmags': the model fit mags,\n                    'fitepoch': the epoch of minimum light for the fit,\n                },\n                'fitplotfile': the output fit plot if fitplot is not None,\n                'magseries':{\n                    'times':input times in phase order of the model,\n                    'phase':the phases of the model mags,\n                    'mags':input mags/fluxes in the phase order of the model,\n                    'errs':errs in the phase order of the model,\n                    'magsarefluxes':input value of magsarefluxes kwarg\n                }\n            }\n\n    '''\n\n    from multiprocessing import Pool\n    fittype = 'mandelagol'\n\n    if not magsarefluxes:\n        raise NotImplementedError('magsarefluxes is not implemented yet.')\n    if not samplesavpath:\n        raise ValueError(\n         
   'This function requires that you save the samples somewhere'\n        )\n    if not mandel_agol_dependencies:\n        raise ImportError(\n            'This function depends on BATMAN, emcee>3.0, corner, and h5py.'\n        )\n\n    # sigma clip and get rid of zero errs\n    stimes, smags, serrs = sigclip_magseries(times, mags, errs,\n                                             sigclip=sigclip,\n                                             magsarefluxes=magsarefluxes)\n    nzind = np.nonzero(serrs)\n    stimes, smags, serrs = stimes[nzind], smags[nzind], serrs[nzind]\n\n    init_period = _get_value('period', fitparams, fixedparams)\n    init_epoch = _get_value('t0', fitparams, fixedparams)\n    init_rp = _get_value('rp', fitparams, fixedparams)\n    init_sma = _get_value('sma', fitparams, fixedparams)\n    init_incl = _get_value('incl', fitparams, fixedparams)\n    init_ecc = _get_value('ecc', fitparams, fixedparams)\n    init_omega = _get_value('omega', fitparams, fixedparams)\n    limb_dark = _get_value('limb_dark', fitparams, fixedparams)\n    init_u = _get_value('u', fitparams, fixedparams)\n\n    if not limb_dark == 'quadratic':\n        raise ValueError(\n            'only quadratic limb-darkening is supported at the moment'\n        )\n\n    # initialize the model and calculate the initial model light-curve\n    init_params, init_m = _transit_model(stimes, init_epoch, init_period,\n                                         init_rp, init_sma, init_incl, init_ecc,\n                                         init_omega, init_u, limb_dark)\n    init_flux = init_m.light_curve(init_params)\n\n    # guessed initial params. 
give nice guesses, or else emcee struggles.\n    theta, fitparamnames = [], []\n    for k in np.sort(list(fitparams.keys())):\n        if isinstance(fitparams[k], float) or isinstance(fitparams[k], int):\n            theta.append(fitparams[k])\n            fitparamnames.append(fitparams[k])\n        elif isinstance(fitparams[k], list):\n            if not len(fitparams[k]) == 2:\n                raise ValueError('should only be quadratic LD coeffs')\n            theta.append(fitparams[k][0])\n            theta.append(fitparams[k][1])\n            fitparamnames.append(fitparams[k][0])\n            fitparamnames.append(fitparams[k][1])\n\n    # initialize sampler\n    n_dim = len(theta)\n    initial_position_vec = [theta + eps*np.random.randn(n_dim)\n                            for i in range(n_walkers)]\n\n    # run the MCMC, unless you just want to load the available samples\n    if not skipsampling:\n\n        backend = emcee.backends.HDFBackend(samplesavpath)\n        if overwriteexistingsamples:\n            LOGWARNING(\n                'erased samples previously at {:s}'.format(samplesavpath)\n            )\n            backend.reset(n_walkers, n_dim)\n\n        # if this is the first run, then start from a gaussian ball.\n        # otherwise, resume from the previous samples.\n        starting_positions = initial_position_vec\n        isfirstrun = True\n        if os.path.exists(backend.filename):\n            if backend.iteration > 1:\n                starting_positions = None\n                isfirstrun = False\n\n        if verbose and isfirstrun:\n            LOGINFO(\n                'start {:s} MCMC with {:d} dims, {:d} steps, {:d} walkers,'.\n                format(fittype, n_dim, n_mcmc_steps, n_walkers) +\n                ' {:d} threads'.format(nworkers)\n            )\n        elif verbose and not isfirstrun:\n            LOGINFO(\n                'continue {:s} with {:d} dims, {:d} steps, {:d} walkers, '.\n                format(fittype, n_dim, 
n_mcmc_steps, n_walkers) +\n                '{:d} threads'.format(nworkers)\n            )\n\n        import sys\n\n        if sys.version_info >= (3, 3):\n            with Pool(nworkers) as pool:\n                sampler = emcee.EnsembleSampler(\n                    n_walkers, n_dim, log_posterior_transit,\n                    args=(init_params, init_m, stimes,\n                          smags, serrs, priorbounds),\n                    pool=pool,\n                    backend=backend\n                )\n                sampler.run_mcmc(starting_positions, n_mcmc_steps,\n                                 progress=mcmcprogressbar)\n\n        elif sys.version_info < (3, 3):\n\n            sampler = emcee.EnsembleSampler(\n                n_walkers, n_dim, log_posterior_transit,\n                args=(init_params, init_m, stimes, smags, serrs, priorbounds),\n                threads=nworkers,\n                backend=backend\n            )\n            sampler.run_mcmc(starting_positions, n_mcmc_steps,\n                             progress=mcmcprogressbar)\n\n        if verbose:\n            LOGINFO(\n                'ended {:s} MCMC run with {:d} steps, {:d} walkers, '.format(\n                    fittype, n_mcmc_steps, n_walkers\n                ) + '{:d} threads'.format(nworkers)\n            )\n\n    reader = emcee.backends.HDFBackend(samplesavpath)\n\n    n_to_discard = int(burninpercent*n_mcmc_steps)\n\n    samples = reader.get_chain(discard=n_to_discard, flat=True)\n    log_prob_samples = reader.get_log_prob(discard=n_to_discard, flat=True)\n    log_prior_samples = reader.get_blobs(discard=n_to_discard, flat=True)\n\n    # Get best-fit parameters and their 1-sigma error bars\n    fit_statistics = list(\n        map(lambda v: (v[1], v[2]-v[1], v[1]-v[0]),\n            list(zip( *np.percentile(samples, [15.85, 50, 84.15], axis=0))))\n    )\n\n    medianparams, std_perrs, std_merrs = {}, {}, {}\n    for ix, k in enumerate(np.sort(list(priorbounds.keys()))):\n        
medianparams[k] = fit_statistics[ix][0]\n        std_perrs[k] = fit_statistics[ix][1]\n        std_merrs[k] = fit_statistics[ix][2]\n\n    stderrs = {'std_perrs':std_perrs, 'std_merrs':std_merrs}\n\n    per = _get_value('period', medianparams, fixedparams)\n    t0 = _get_value('t0', medianparams, fixedparams)\n    rp = _get_value('rp', medianparams, fixedparams)\n    sma = _get_value('sma', medianparams, fixedparams)\n    incl = _get_value('incl', medianparams, fixedparams)\n    ecc = _get_value('ecc', medianparams, fixedparams)\n    omega = _get_value('omega', medianparams, fixedparams)\n    limb_dark = _get_value('limb_dark', medianparams, fixedparams)\n    try:\n        u = fixedparams['u']\n    except Exception as e:\n        u = [medianparams['u_linear'], medianparams['u_quad']]\n\n    fit_params, fit_m = _transit_model(stimes, t0, per, rp, sma, incl, ecc,\n                                       omega, u, limb_dark)\n    fitmags = fit_m.light_curve(fit_params)\n    fepoch = t0\n\n    # assemble the return dictionary\n    returndict = {\n        'fittype':fittype,\n        'fitinfo':{\n            'initialparams':fitparams,\n            'initialmags':init_flux,\n            'fixedparams':fixedparams,\n            'finalparams':medianparams,\n            'finalparamerrs':stderrs,\n            'fitmags':fitmags,\n            'fitepoch':fepoch,\n        },\n        'fitplotfile':None,\n        'magseries':{\n            'times':stimes,\n            'mags':smags,\n            'errs':serrs,\n            'magsarefluxes':magsarefluxes,\n        },\n    }\n\n    # make the output corner plot, and lightcurve plot if desired\n    if plotcorner:\n        if isinstance(trueparams,dict):\n            trueparamkeys = np.sort(list(trueparams.keys()))\n            truelist = [trueparams[k] for k in trueparamkeys]\n            fig = corner.corner(\n                samples,\n                labels=trueparamkeys,\n                truths=truelist,\n                
quantiles=[0.1585, 0.5, .8415], show_titles=True\n            )\n        else:\n            fig = corner.corner(samples,\n                                labels=fitparamnames,\n                                quantiles=[0.1585, 0.5, .8415],\n                                show_titles=True)\n\n        plt.savefig(plotcorner, dpi=300)\n        if verbose:\n            LOGINFO('saved {:s}'.format(plotcorner))\n\n    if plotfit and isinstance(plotfit, str):\n\n        f, ax = plt.subplots(figsize=(8,6))\n        ax.scatter(stimes, smags, c='k', alpha=0.5, label='observed',\n                   zorder=1, s=1.5, rasterized=True, linewidths=0)\n        ax.scatter(stimes, init_flux, c='r', alpha=1,\n                   s=3.5, zorder=2, rasterized=True, linewidths=0,\n                   label='initial guess')\n        ax.scatter(\n            stimes, fitmags, c='b', alpha=1,\n            s=1.5, zorder=3, rasterized=True, linewidths=0,\n            label='fit {:d} dims'.format(\n                len(fitparamnames))\n        )\n        ax.legend(loc='best')\n        ax.set(xlabel='time [days]', ylabel='relative flux')\n        f.savefig(plotfit, dpi=300, bbox_inches='tight')\n        if verbose:\n            LOGINFO('saved {:s}'.format(plotfit))\n\n        returndict['fitplotfile'] = plotfit\n\n    return returndict", "response": "Fit a Mandel & Agol (2002) planetary transit model to a flux time series by sampling the posterior with emcee, using BATMAN for the transit model, and return a dict of fit results."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef mandelagol_and_line_fit_magseries(\n        times, mags, errs,\n        fitparams,\n        priorbounds,\n        fixedparams,\n        trueparams=None,\n        burninpercent=0.3,\n        plotcorner=False,\n        timeoffset=0,\n        samplesavpath=False,\n        n_walkers=50,\n        n_mcmc_steps=400,\n        eps=1e-4,\n        skipsampling=False,\n        overwriteexistingsamples=False,\n        mcmcprogressbar=False,\n        plotfit=False,\n        scatterxdata=None,\n        scatteryaxes=None,\n        magsarefluxes=True,\n        sigclip=10.0,\n        verbose=True,\n        nworkers=4\n):\n    '''The model fit by this function is: a Mandel & Agol (2002) transit, PLUS a\n    line. You can fit and fix whatever parameters you want.\n\n    A typical use case: you want to measure transit times of individual SNR >~\n    50 transits. You fix all the transit parameters except for the mid-time,\n    and also fit for a line locally.\n\n    NOTE: this only works for flux time-series at the moment.\n\n    NOTE: Between the `fitparams`, `priorbounds`, and `fixedparams` dicts, you\n    must specify all of the planetary transit parameters required by BATMAN and\n    the parameters for the line fit: `['t0', 'rp', 'sma', 'incl', 'u',\n    'ecc', 'omega', 'period', 'poly_order0', 'poly_order1']`, or the BATMAN model\n    will fail to initialize.\n\n    Parameters\n    ----------\n\n    times,mags,errs : np.array\n        The input flux time-series to fit the transit-plus-line model to.\n\n    fitparams : dict\n        These are the initial parameter guesses for MCMC, found e.g., by\n        BLS. The key string format must not be changed, but any parameter can be\n        either \"fit\" or \"fixed\". If it is \"fit\", it must have a corresponding\n        prior.
For example::\n\n            fitparams = {'t0':1325.9,\n                         'poly_order0':1,\n                         'poly_order1':0.}\n\n        where `t0` is the time of transit-center for a reference transit.\n        `poly_order0` corresponds to the intercept of the line, `poly_order1` is\n        the slope.\n\n    priorbounds : dict\n        This sets the lower & upper bounds on uniform prior, e.g.::\n\n            priorbounds = {'t0':(np.min(time), np.max(time)),\n                            'poly_order0':(0.5,1.5),\n                            'poly_order1':(-0.5,0.5) }\n\n    fixedparams : dict\n        This sets which parameters are fixed, and their values. For example::\n\n            fixedparams = {'ecc':0.,\n                           'omega':90.,\n                           'limb_dark':'quadratic',\n                           'period':fitd['period'],\n                           'rp':np.sqrt(fitd['transitdepth']),\n                           'sma':6.17, 'incl':85, 'u':[0.3, 0.2]}\n\n        `limb_dark` must be \"quadratic\".  It's \"fixed\", because once you\n        choose your limb-darkening model, it's fixed.\n\n    trueparams : list of floats\n        The true parameter values you're fitting for, if they're known (e.g., a\n        known planet, or fake data). Only for plotting purposes.\n\n    burninpercent : float\n        The percent of MCMC samples to discard as burn-in.\n\n    plotcorner : str or False\n        If this is a str, points to the path of output corner plot that will be\n        generated for this MCMC run.\n\n    timeoffset : float\n        If input times are offset by some constant, and you want saved pickles\n        to fix that.\n\n    samplesavpath : str\n        This must be provided so `emcee` can save its MCMC samples to disk as\n        HDF5 files. 
This will set the path of the output HDF5file written.\n\n    n_walkers : int\n        The number of MCMC walkers to use.\n\n    n_mcmc_steps : int\n        The number of MCMC steps to take.\n\n    eps : float\n        The radius of the `n_walkers-dimensional` Gaussian ball used to\n        initialize the MCMC.\n\n    skipsampling : bool\n        If you've already collected MCMC samples, and you do not want any more\n        sampling (e.g., just make the plots), set this to be True.\n\n    overwriteexistingsamples : bool\n        If you've collected samples, but you want to overwrite them, set this to\n        True. Usually, it should be False, which appends samples to\n        `samplesavpath` HDF5 file.\n\n    mcmcprogressbar : bool\n        If True, will show a progress bar for the MCMC process.\n\n    plotfit: str or bool\n        If a str, indicates the path of the output fit plot file. If False, no\n        fit plot will be made.\n\n    scatterxdata : np.array or None\n        Use this to overplot x,y scatter points on the output model/data\n        lightcurve (e.g., to highlight bad data, or to indicate an ephemeris),\n        this can take a `np.ndarray` with the same units as `times`.\n\n    scatteryaxes : np.array or None\n        Use this to provide the y-values for scatterxdata, in units of fraction\n        of an axis.\n\n    magsarefluxes : bool\n        This indicates if the input measurements in `mags` are actually fluxes.\n\n    sigclip : float or int or sequence of two floats/ints or None\n        If a single float or int, a symmetric sigma-clip will be performed using\n        the number provided as the sigma-multiplier to cut out from the input\n        time-series.\n\n        If a list of two ints/floats is provided, the function will perform an\n        'asymmetric' sigma-clip. 
The first element in this list is the sigma\n        value to use for fainter flux/mag values; the second element in this\n        list is the sigma value to use for brighter flux/mag values. For\n        example, `sigclip=[10., 3.]`, will sigclip out greater than 10-sigma\n        dimmings and greater than 3-sigma brightenings. Here the meaning of\n        \"dimming\" and \"brightening\" is set by *physics* (not the magnitude\n        system), which is why the `magsarefluxes` kwarg must be correctly set.\n\n        If `sigclip` is None, no sigma-clipping will be performed, and the\n        time-series (with non-finite elems removed) will be passed through to\n        the output.\n\n    verbose : bool\n        If True, will indicate MCMC progress.\n\n    nworkers : int\n        The number of parallel workers to launch for MCMC.\n\n    Returns\n    -------\n\n    dict\n        This function returns a dict containing the model fit parameters and\n        other fit information. The form of this dict is mostly standardized\n        across all functions in this module::\n\n            {\n                'fittype':'mandelagol_and_line',\n                'fitinfo':{\n                    'initialparams':the initial transit params provided,\n                    'fixedparams':the fixed transit params provided,\n                    'finalparams':the final model fit transit params,\n                    'finalparamerrs':formal errors in the params,\n                    'fitmags': the model fit mags,\n                    'fitepoch': the epoch of minimum light for the fit,\n                },\n                'fitplotfile': the output fit plot if fitplot is not None,\n                'magseries':{\n                    'times':input times in phase order of the model,\n                    'phase':the phases of the model mags,\n                    'mags':input mags/fluxes in the phase order of the model,\n                    'errs':errs in the phase order of the model,\n             
       'magsarefluxes':input value of magsarefluxes kwarg\n                }\n            }\n\n    '''\n\n    from multiprocessing import Pool\n    fittype = 'mandelagol_and_line'\n\n    if not magsarefluxes:\n        raise NotImplementedError('magsarefluxes is not implemented yet.')\n    if not samplesavpath:\n        raise ValueError(\n            'This function requires that you save the samples somewhere'\n        )\n    if not mandel_agol_dependencies:\n        raise ImportError(\n            'This function depends on BATMAN, emcee>3.0, corner, and h5py.'\n        )\n\n    # sigma clip and get rid of zero errs\n    stimes, smags, serrs = sigclip_magseries(times, mags, errs,\n                                             sigclip=sigclip,\n                                             magsarefluxes=magsarefluxes)\n    nzind = np.nonzero(serrs)\n    stimes, smags, serrs = stimes[nzind], smags[nzind], serrs[nzind]\n\n    init_period = _get_value('period', fitparams, fixedparams)\n    init_epoch = _get_value('t0', fitparams, fixedparams)\n    init_rp = _get_value('rp', fitparams, fixedparams)\n    init_sma = _get_value('sma', fitparams, fixedparams)\n    init_incl = _get_value('incl', fitparams, fixedparams)\n    init_ecc = _get_value('ecc', fitparams, fixedparams)\n    init_omega = _get_value('omega', fitparams, fixedparams)\n    limb_dark = _get_value('limb_dark', fitparams, fixedparams)\n    init_u = _get_value('u', fitparams, fixedparams)\n\n    init_poly_order0 = _get_value('poly_order0', fitparams, fixedparams)\n    init_poly_order1 = _get_value('poly_order1', fitparams, fixedparams)\n\n    if not limb_dark == 'quadratic':\n        raise ValueError(\n            'only quadratic limb-darkening is supported at the moment'\n        )\n\n    # initialize the model and calculate the initial model light-curve\n    init_params, init_m = _transit_model(\n        stimes, init_epoch, init_period, init_rp, init_sma, init_incl,\n        init_ecc, init_omega, init_u, 
limb_dark)\n\n    init_flux = (\n        init_m.light_curve(init_params) +\n        init_poly_order0 + init_poly_order1*stimes\n    )\n\n    # guessed initial params. give nice guesses, or else emcee struggles.\n    theta, fitparamnames = [], []\n    for k in np.sort(list(fitparams.keys())):\n        if isinstance(fitparams[k], float) or isinstance(fitparams[k], int):\n            theta.append(fitparams[k])\n            fitparamnames.append(fitparams[k])\n        elif isinstance(fitparams[k], list):\n            if not len(fitparams[k]) == 2:\n                raise ValueError('should only be quadratic LD coeffs')\n            theta.append(fitparams[k][0])\n            theta.append(fitparams[k][1])\n            fitparamnames.append(fitparams[k][0])\n            fitparamnames.append(fitparams[k][1])\n\n    # initialize sampler\n    n_dim = len(theta)\n\n    # run the MCMC, unless you just want to load the available samples\n    if not skipsampling:\n\n        backend = emcee.backends.HDFBackend(samplesavpath)\n        if overwriteexistingsamples:\n            LOGWARNING(\n                'erased samples previously at {:s}'.format(samplesavpath)\n            )\n            backend.reset(n_walkers, n_dim)\n\n        # if this is the first run, then start from a gaussian ball, centered\n        # on the maximum likelihood solution.  
otherwise, resume from the\n        # previous samples.\n        def nll(*args):\n            return -_log_likelihood_transit_plus_line(*args)\n\n        soln = spminimize(\n            nll, theta, method='BFGS',\n            args=(init_params, init_m, stimes, smags, serrs, priorbounds)\n        )\n        theta_ml = soln.x\n        ml_poly_order0 = theta_ml[0]\n        ml_poly_order1 = theta_ml[1]\n        ml_rp = theta_ml[2]\n        ml_t0 = theta_ml[3]\n\n        ml_params, ml_m = _transit_model(stimes, ml_t0, init_period,\n                                         ml_rp, init_sma, init_incl,\n                                         init_ecc, init_omega, init_u,\n                                         limb_dark)\n        ml_mags = (\n            ml_m.light_curve(ml_params) +\n            ml_poly_order0 + ml_poly_order1*stimes\n        )\n\n        initial_position_vec = [theta_ml + eps*np.random.randn(n_dim)\n                                for i in range(n_walkers)]\n        starting_positions = initial_position_vec\n        isfirstrun = True\n        if os.path.exists(backend.filename):\n            if backend.iteration > 1:\n                starting_positions = None\n                isfirstrun = False\n\n        if verbose and isfirstrun:\n            LOGINFO(\n                'start {:s} MCMC with {:d} dims, {:d} steps, {:d} walkers,'.\n                format(fittype, n_dim, n_mcmc_steps, n_walkers) +\n                ' {:d} threads'.format(nworkers)\n            )\n        elif verbose and not isfirstrun:\n            LOGINFO(\n                'continue {:s} with {:d} dims, {:d} steps, {:d} walkers, '.\n                format(fittype, n_dim, n_mcmc_steps, n_walkers) +\n                '{:d} threads'.format(nworkers)\n            )\n\n        with Pool(nworkers) as pool:\n            sampler = emcee.EnsembleSampler(\n                n_walkers, n_dim, log_posterior_transit_plus_line,\n                args=(init_params, init_m, stimes, smags, serrs, 
priorbounds),\n                pool=pool,\n                backend=backend\n            )\n            sampler.run_mcmc(starting_positions, n_mcmc_steps,\n                             progress=mcmcprogressbar)\n\n        if verbose:\n            LOGINFO(\n                'ended {:s} MCMC run with {:d} steps, {:d} walkers, '.format(\n                    fittype, n_mcmc_steps, n_walkers\n                ) + '{:d} threads'.format(nworkers)\n            )\n\n    reader = emcee.backends.HDFBackend(samplesavpath)\n\n    n_to_discard = int(burninpercent*n_mcmc_steps)\n\n    samples = reader.get_chain(discard=n_to_discard, flat=True)\n    log_prob_samples = reader.get_log_prob(discard=n_to_discard, flat=True)\n    log_prior_samples = reader.get_blobs(discard=n_to_discard, flat=True)\n\n    # Get best-fit parameters and their 1-sigma error bars\n    fit_statistics = list(\n        map(lambda v: (v[1], v[2]-v[1], v[1]-v[0]),\n            list(zip( *np.percentile(samples, [15.85, 50, 84.15], axis=0))))\n    )\n\n    medianparams, std_perrs, std_merrs = {}, {}, {}\n    for ix, k in enumerate(np.sort(list(priorbounds.keys()))):\n        medianparams[k] = fit_statistics[ix][0]\n        std_perrs[k] = fit_statistics[ix][1]\n        std_merrs[k] = fit_statistics[ix][2]\n\n    stderrs = {'std_perrs':std_perrs, 'std_merrs':std_merrs}\n\n    per = _get_value('period', medianparams, fixedparams)\n    t0 = _get_value('t0', medianparams, fixedparams)\n    rp = _get_value('rp', medianparams, fixedparams)\n    sma = _get_value('sma', medianparams, fixedparams)\n    incl = _get_value('incl', medianparams, fixedparams)\n    ecc = _get_value('ecc', medianparams, fixedparams)\n    omega = _get_value('omega', medianparams, fixedparams)\n    limb_dark = _get_value('limb_dark', medianparams, fixedparams)\n    try:\n        u = fixedparams['u']\n    except Exception as e:\n        u = [medianparams['u_linear'], medianparams['u_quad']]\n\n    poly_order0 = _get_value('poly_order0', medianparams, 
fixedparams)\n    poly_order1 = _get_value('poly_order1', medianparams, fixedparams)\n\n    # initialize the model and calculate the initial model light-curve\n    fit_params, fit_m = _transit_model(stimes, t0, per, rp, sma, incl, ecc,\n                                       omega, u, limb_dark)\n    fitmags = (\n        fit_m.light_curve(fit_params) +\n        poly_order0 + poly_order1*stimes\n    )\n    fepoch = t0\n\n    # assemble the return dictionary\n    medianparams['t0'] += timeoffset\n    returndict = {\n        'fittype':fittype,\n        'fitinfo':{\n            'initialparams':fitparams,\n            'initialmags':init_flux,\n            'fixedparams':fixedparams,\n            'finalparams':medianparams,\n            'finalparamerrs':stderrs,\n            'fitmags':fitmags,\n            'fitepoch':fepoch+timeoffset,\n        },\n        'fitplotfile':None,\n        'magseries':{\n            'times':stimes+timeoffset,\n            'mags':smags,\n            'errs':serrs,\n            'magsarefluxes':magsarefluxes,\n        },\n    }\n\n    # make the output corner plot, and lightcurve plot if desired\n    if plotcorner:\n        fig = corner.corner(\n            samples,\n            labels=['line intercept-1', 'line slope',\n                    'rp','t0-{:.4f}'.format(timeoffset)],\n            truths=[ml_poly_order0, ml_poly_order1, ml_rp, ml_t0],\n            quantiles=[0.1585, 0.5, .8415], show_titles=True\n        )\n        plt.savefig(plotcorner, dpi=300)\n        if verbose:\n            LOGINFO('saved {:s}'.format(plotcorner))\n\n    if plotfit and isinstance(plotfit, str):\n\n        plt.close('all')\n        f, (a0, a1) = plt.subplots(nrows=2, ncols=1, sharex=True,\n                                   figsize=(8,5),\n                                   gridspec_kw={'height_ratios':[3, 1]})\n\n        a0.scatter(stimes, smags, c='k', alpha=0.9, label='data', zorder=1,\n                   s=10, rasterized=True, linewidths=0)\n\n        DEBUGGING 
= False\n        if DEBUGGING:\n            a0.scatter(stimes, init_flux, c='r', alpha=1, s=3.5, zorder=2,\n                       rasterized=True, linewidths=0,\n                       label='initial guess for ml')\n            a0.scatter(stimes, ml_mags, c='g', alpha=1, s=3.5, zorder=2,\n                       rasterized=True, linewidths=0, label='max likelihood')\n\n        a0.plot(\n            stimes, fitmags, c='b',\n            zorder=0, rasterized=True, lw=2, alpha=0.4,\n            label='{:s} fit, {:d} dims'.format(fittype, len(fitparamnames))\n        )\n\n        a1.scatter(\n            stimes, smags-fitmags, c='k', alpha=0.9,\n            rasterized=True, s=10, linewidths=0\n        )\n\n        if scatterxdata and scatteryaxes:\n            import matplotlib.transforms as transforms\n            for a in [a0, a1]:\n                transform = transforms.blended_transform_factory(\n                    a.transData, a.transAxes\n                )\n                a.scatter(scatterxdata, scatteryaxes, c='r', alpha=0.9,\n                          zorder=2, s=10, rasterized=True, linewidths=0,\n                          marker=\"^\", transform=transform)\n\n\n        a1.set_xlabel('time-t0 [days]')\n        a0.set_ylabel('relative flux')\n        a1.set_ylabel('residual')\n        a0.legend(loc='best', fontsize='x-small')\n        for a in [a0, a1]:\n            a.get_yaxis().set_tick_params(which='both', direction='in')\n            a.get_xaxis().set_tick_params(which='both', direction='in')\n\n        f.tight_layout(h_pad=0, w_pad=0)\n        f.savefig(plotfit, dpi=300, bbox_inches='tight')\n        if verbose:\n            LOGINFO('saved {:s}'.format(plotfit))\n\n        returndict['fitplotfile'] = plotfit\n\n    return returndict", "response": "Fits a Mandel-Agol transit model plus a line to a magnitude/flux time series using MCMC sampling with emcee, and returns a dict containing the median fit parameters, their errors, the fit model values, and the path to the fit plot if one was made."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef list_trilegal_filtersystems():\n    '''\n    This just lists all the filter systems available for TRILEGAL.\n\n    '''\n\n    print('%-40s %s' % ('FILTER SYSTEM NAME','DESCRIPTION'))\n    print('%-40s %s' % ('------------------','-----------'))\n    for key in sorted(TRILEGAL_FILTER_SYSTEMS.keys()):\n        print('%-40s %s' % (key, TRILEGAL_FILTER_SYSTEMS[key]['desc']))", "response": "This function lists all the filter systems available for TRILEGAL."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef read_model_table(modelfile):\n    '''\n    This reads a downloaded TRILEGAL model file.\n\n    Parameters\n    ----------\n\n    modelfile : str\n        Path to the downloaded model file to read.\n\n    Returns\n    -------\n\n    np.recarray\n        Returns the model table as a Numpy record array.\n\n    '''\n\n    infd = gzip.open(modelfile)\n    model = np.genfromtxt(infd,names=True)\n    infd.close()\n\n    return model", "response": "Reads a downloaded TRILEGAL model file and returns a Numpy record array."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef _time_independent_equals(a, b):\n    '''\n    This compares two values in constant time.\n\n    Taken from tornado:\n\n    https://github.com/tornadoweb/tornado/blob/\n    d4eb8eb4eb5cc9a6677e9116ef84ded8efba8859/tornado/web.py#L3060\n\n    '''\n    if len(a) != len(b):\n        return False\n    result = 0\n    if isinstance(a[0], int):  # python3 byte strings\n        for x, y in zip(a, b):\n            result |= x ^ y\n    else:  # python2\n        for x, y in zip(a, b):\n            result |= ord(x) ^ ord(y)\n    return result == 0", "response": "Compare two values in constant time."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef default(self, obj):\n        '''Overrides the default serializer for `JSONEncoder`.\n\n        This can serialize the following objects in addition to what\n        `JSONEncoder` can already do.\n\n        - `np.array`\n        - `bytes`\n        - `complex`\n        - `np.float64` and other `np.dtype` objects\n\n        Parameters\n        ----------\n\n        obj : object\n            A Python object to serialize to JSON.\n\n        Returns\n        -------\n\n        str\n            A JSON encoded representation of the input object.\n\n        '''\n\n        if isinstance(obj, np.ndarray):\n            return obj.tolist()\n        elif isinstance(obj, bytes):\n            return obj.decode()\n        elif isinstance(obj, complex):\n            return (obj.real, obj.imag)\n        elif (isinstance(obj, (float, np.float64, np.float_)) and\n              not np.isfinite(obj)):\n            return None\n        elif isinstance(obj, (np.int8, np.int16, np.int32, np.int64)):\n            return int(obj)\n        else:\n            return json.JSONEncoder.default(self, obj)", "response": "Overrides the default serializer for JSONEncoder."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef initialize(self, currentdir, assetpath, cplist,\n                   cplistfile, executor, readonly, baseurl):\n        '''\n        handles initial setup.\n\n        '''\n\n        self.currentdir = currentdir\n        self.assetpath = assetpath\n        self.currentproject = cplist\n        self.cplistfile = cplistfile\n        self.executor = executor\n        self.readonly = readonly\n        self.baseurl = baseurl", "response": "Handles initial setup by storing currentdir, assetpath, cplist (as self.currentproject), cplistfile, executor, readonly, and baseurl as attributes on the handler."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef get(self):\n        '''\n        This handles GET requests for the current checkplot-list.json file.\n\n        Used with AJAX from frontend.\n\n        '''\n\n        # add the reviewed key to the current dict if it doesn't exist\n        # this will hold all the reviewed objects for the frontend\n        if 'reviewed' not in self.currentproject:\n            self.currentproject['reviewed'] = {}\n\n        # just returns the current project as JSON\n        self.write(self.currentproject)", "response": "Handles GET requests for the current checkplot-list.json file. It adds an empty 'reviewed' dict to the current project if that key is missing, then returns the current project as JSON."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef post(self):\n        '''This handles POST requests.\n\n        Saves the changes made by the user on the frontend back to the current\n        checkplot-list.json file.\n\n        '''\n\n        # if self.readonly is set, then don't accept any changes\n        # return immediately with a 400\n        if self.readonly:\n\n            msg = \"checkplotserver is in readonly mode. no updates allowed.\"\n            resultdict = {'status':'error',\n                          'message':msg,\n                          'readonly':self.readonly,\n                          'result':None}\n\n            self.write(resultdict)\n            raise tornado.web.Finish()\n\n\n        objectid = self.get_argument('objectid', None)\n        changes = self.get_argument('changes',None)\n\n        # if either of the above is invalid, return nothing\n        if not objectid or not changes:\n\n            msg = (\"could not parse changes to the checkplot filelist \"\n                   \"from the frontend\")\n            LOGGER.error(msg)\n            resultdict = {'status':'error',\n                          'message':msg,\n                          'readonly':self.readonly,\n                          'result':None}\n\n            self.write(resultdict)\n            raise tornado.web.Finish()\n\n\n        # otherwise, update the checkplot list JSON\n        objectid = xhtml_escape(objectid)\n        changes = json.loads(changes)\n\n        # update the dictionary\n        if 'reviewed' not in self.currentproject:\n            self.currentproject['reviewed'] = {}\n\n        self.currentproject['reviewed'][objectid] = changes\n\n        # update the JSON file\n        with open(self.cplistfile,'w') as outfd:\n            json.dump(self.currentproject, outfd)\n\n        # return status\n        msg = (\"wrote all changes to the checkplot filelist \"\n               \"from the 
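frontend for object: %s\" % objectid)\n        LOGGER.info(msg)\n        resultdict = {'status':'success',\n                      'message':msg,\n                      'readonly':self.readonly,\n                      'result':{'objectid':objectid,\n                                'changes':changes}}\n\n        self.write(resultdict)\n        self.finish()", "response": "Handles POST requests by saving the changes made by the user on the frontend back to the current checkplot-list.json file. Returns an error result without saving if the server is in readonly mode or if the objectid or changes arguments are missing."}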
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef get(self, cpfile):\n        '''This handles a GET request to run a specified LC tool.\n\n        Parameters\n        ----------\n\n        cpfile : str\n            This is the checkplot file to run the tool on.\n\n        Returns\n        -------\n\n        str\n            Returns a JSON response.\n\n        Notes\n        -----\n\n        The URI structure is::\n\n            /tools/<cpfile>?[args]\n\n        where args are::\n\n            ?lctool=<lctool>&argkey1=argval1&argkey2=argval2&...\n\n            &forcereload=true <- if this is present, then reload values from\n            original checkplot.\n\n            &objectid=<objectid>\n\n        `lctool` is one of the strings below\n\n        Period search functions::\n\n            psearch-gls: run Lomb-Scargle with given params\n            psearch-bls: run BLS with given params\n            psearch-pdm: run phase dispersion minimization with given params\n            psearch-aov: run analysis-of-variance with given params\n            psearch-mav: run analysis-of-variance (multi-harm) with given params\n            psearch-acf: run ACF period search with given params\n            psearch-win: run spectral window function search with given params\n\n        Arguments recognized by all period-search functions are::\n\n            startp=XX\n            endp=XX\n            magsarefluxes=True|False\n            autofreq=True|False\n            stepsize=XX\n\n        Variability characterization functions::\n\n            var-varfeatures: gets the variability features from the checkplot or\n                             recalculates if they're not present\n\n            var-prewhiten: pre-whitens the light curve with a sinusoidal signal\n\n            var-masksig: masks a given phase location with given width from the\n                         light curve\n\n        Light curve manipulation 
functions ::\n\n            phasedlc-newplot: make phased LC with new provided period/epoch\n            lcfit-fourier: fit a Fourier function to the phased LC\n            lcfit-spline: fit a spline function to the phased LC\n            lcfit-legendre: fit a Legendre polynomial to the phased LC\n            lcfit-savgol: fit a Savitsky-Golay polynomial to the phased LC\n\n        FIXME: figure out how to cache the results of these functions\n        temporarily and save them back to the checkplot after we click on save\n        in the frontend.\n\n        TODO: look for a checkplot-blah-blah.pkl-cps-processing file in the same\n        place as the usual pickle file. if this exists and is newer than the pkl\n        file, load it instead. Or have a checkplotdict['cpservertemp'] item.\n\n        '''\n\n        if cpfile:\n\n            self.cpfile = (\n                xhtml_escape(base64.b64decode(url_unescape(cpfile)))\n            )\n\n            # see if this plot is in the current project\n            if self.cpfile in self.currentproject['checkplots']:\n\n                # make sure this file exists\n                cpfpath = os.path.join(\n                    os.path.abspath(os.path.dirname(self.cplistfile)),\n                    self.cpfile\n                )\n\n                # if we can't find the pickle, quit immediately\n                if not os.path.exists(cpfpath):\n\n                    msg = \"couldn't find checkplot %s\" % cpfpath\n                    LOGGER.error(msg)\n                    resultdict = {'status':'error',\n                                  'message':msg,\n                                  'readonly':self.readonly,\n                                  'result':None}\n\n                    self.write(resultdict)\n                    raise tornado.web.Finish()\n\n                ###########################\n                # now parse the arguments #\n                ###########################\n\n                # check if we have to 
force-reload\n                forcereload = self.get_argument('forcereload',False)\n                if forcereload and xhtml_escape(forcereload):\n                    forcereload = True if forcereload == 'true' else False\n\n                # get the objectid\n                cpobjectid = self.get_argument('objectid',None)\n\n                # get the light curve tool to use\n                lctool = self.get_argument('lctool', None)\n\n                # preemptive dict to fill out\n                resultdict = {'status':None,\n                              'message':None,\n                              'readonly':self.readonly,\n                              'result':None}\n\n                # check if the lctool arg is provided\n                if lctool:\n\n                    lctool = xhtml_escape(lctool)\n                    lctoolargs = []\n                    lctoolkwargs = {}\n\n                    # check if this lctool is OK and has all the required args\n                    if lctool in CPTOOLMAP:\n\n                        try:\n\n                            # all args should have been parsed\n                            # successfully. 
parse the kwargs now\n                            for xkwarg, xkwargtype, xkwargdef in zip(\n                                    CPTOOLMAP[lctool]['kwargs'],\n                                    CPTOOLMAP[lctool]['kwargtypes'],\n                                    CPTOOLMAP[lctool]['kwargdefs']\n                            ):\n\n                                # get the kwarg\n                                if xkwargtype is list:\n                                    wbkwarg = self.get_arguments(xkwarg)\n                                    if len(wbkwarg) > 0:\n                                        wbkwarg = [url_unescape(xhtml_escape(x))\n                                                   for x in wbkwarg]\n                                    else:\n                                        wbkwarg = None\n\n                                else:\n                                    wbkwarg = self.get_argument(xkwarg, None)\n                                    if wbkwarg is not None:\n                                        wbkwarg = url_unescape(\n                                            xhtml_escape(wbkwarg)\n                                        )\n\n                                LOGGER.info('xkwarg = %s, wbkwarg = %s' %\n                                            (xkwarg, repr(wbkwarg)))\n\n                                # if it's None, sub with the default\n                                if wbkwarg is None:\n\n                                    wbkwarg = xkwargdef\n\n                                # otherwise, cast it to the required type\n                                else:\n\n                                    # special handling for lists of floats\n                                    if xkwargtype is list:\n                                        wbkwarg = [float(x) for x in wbkwarg]\n\n                                    # special handling for booleans\n                                    elif xkwargtype is bool:\n\n                          
              if wbkwarg == 'false':\n                                            wbkwarg = False\n                                        elif wbkwarg == 'true':\n                                            wbkwarg = True\n                                        else:\n                                            wbkwarg = xkwargdef\n\n                                    # usual casting for other types\n                                    else:\n\n                                        wbkwarg = xkwargtype(wbkwarg)\n\n                                # update the lctools kwarg dict\n\n                                # make sure to remove any [] from the kwargs\n                                # this was needed to parse the input query\n                                # string correctly\n                                if xkwarg.endswith('[]'):\n                                    xkwarg = xkwarg.rstrip('[]')\n\n                                lctoolkwargs.update({xkwarg:wbkwarg})\n\n                        except Exception as e:\n\n                            LOGGER.exception('lctool %s, kwarg %s '\n                                             'will not work' %\n                                             (lctool, xkwarg))\n                            resultdict['status'] = 'error'\n                            resultdict['message'] = (\n                                'lctool %s, kwarg %s '\n                                'will not work' %\n                                (lctool, xkwarg)\n                            )\n                            resultdict['result'] = {'objectid':cpobjectid}\n\n                            self.write(resultdict)\n                            raise tornado.web.Finish()\n\n                    # if the tool is not in the CPTOOLSMAP\n                    else:\n                        LOGGER.error('lctool %s, does not exist' % lctool)\n                        resultdict['status'] = 'error'\n                        
resultdict['message'] = (\n                            'lctool %s does not exist' % lctool\n                        )\n                        resultdict['result'] = {'objectid':cpobjectid}\n\n                        self.write(resultdict)\n                        raise tornado.web.Finish()\n\n                # if no lctool arg is provided\n                else:\n\n                    LOGGER.error('lctool argument not provided')\n                    resultdict['status'] = 'error'\n                    resultdict['message'] = (\n                        'lctool argument not provided'\n                    )\n                    resultdict['result'] = {'objectid':cpobjectid}\n\n                    self.write(resultdict)\n                    raise tornado.web.Finish()\n\n\n                ##############################################\n                ## NOW WE'RE READY TO ACTUALLY DO SOMETHING ##\n                ##############################################\n\n                LOGGER.info('loading %s...' % cpfpath)\n\n                # this loads the actual checkplot pickle\n                cpdict = yield self.executor.submit(\n                    _read_checkplot_picklefile, cpfpath\n                )\n\n                # we check for the existence of a cpfpath + '-cpserver-temp'\n                # file first. 
this is where we store stuff before we write it\n                # back to the actual checkplot.\n                tempfpath = cpfpath + '-cpserver-temp'\n\n                # load the temp checkplot if it exists\n                if os.path.exists(tempfpath):\n\n                    tempcpdict = yield self.executor.submit(\n                        _read_checkplot_picklefile, tempfpath\n                    )\n\n                # if it doesn't exist, read the times, mags, errs from the\n                # actual checkplot in prep for working on it\n                else:\n\n                    tempcpdict = {\n                        'objectid':cpdict['objectid'],\n                        'magseries':{\n                            'times':cpdict['magseries']['times'],\n                            'mags':cpdict['magseries']['mags'],\n                            'errs':cpdict['magseries']['errs'],\n                        }\n                    }\n\n\n                # if we're not forcing a rerun from the original checkplot dict\n                if not forcereload:\n\n                    cptimes, cpmags, cperrs = (\n                        tempcpdict['magseries']['times'],\n                        tempcpdict['magseries']['mags'],\n                        tempcpdict['magseries']['errs'],\n                    )\n                    LOGGER.info('forcereload = False')\n\n                # otherwise, reload the original times, mags, errs\n                else:\n\n                    cptimes, cpmags, cperrs = (cpdict['magseries']['times'],\n                                               cpdict['magseries']['mags'],\n                                               cpdict['magseries']['errs'])\n                    LOGGER.info('forcereload = True')\n\n\n                # collect the args\n                for xarg, xargtype in zip(CPTOOLMAP[lctool]['args'],\n                                          CPTOOLMAP[lctool]['argtypes']):\n\n                    # handle special args\n         
           if xarg is None:\n                        lctoolargs.append(None)\n                    elif xarg == 'times':\n                        lctoolargs.append(cptimes)\n                    elif xarg == 'mags':\n                        lctoolargs.append(cpmags)\n                    elif xarg == 'errs':\n                        lctoolargs.append(cperrs)\n\n                    # handle other args\n                    else:\n\n                        try:\n\n                            if xargtype is list:\n\n                                wbarg = self.get_arguments(xarg)\n\n                            else:\n\n                                wbarg = url_unescape(\n                                    xhtml_escape(\n                                        self.get_argument(xarg, None)\n                                    )\n                                )\n\n                            # cast the arg to the required type\n\n                            # special handling for lists\n                            if xargtype is list:\n                                wbarg = [float(x) for x in wbarg]\n                            # special handling for epochs that can be optional\n                            elif xargtype is float and xarg == 'varepoch':\n                                try:\n                                    wbarg = xargtype(wbarg)\n                                except Exception as e:\n                                    wbarg = None\n                            # usual casting for other types\n                            else:\n                                wbarg = xargtype(wbarg)\n\n                            lctoolargs.append(wbarg)\n\n                        except Exception as e:\n\n                            LOGGER.exception('lctool %s, arg %s '\n                                             'will not work' %\n                                             (lctool, xarg))\n                            resultdict['status'] = 'error'\n          
                  resultdict['message'] = (\n                                'lctool %s, arg %s '\n                                'will not work' %\n                                (lctool, xarg)\n                            )\n                            resultdict['result'] = {'objectid':cpobjectid}\n\n                            self.write(resultdict)\n                            raise tornado.web.Finish()\n\n\n                LOGGER.info(lctool)\n                LOGGER.info(lctoolargs)\n                LOGGER.info(lctoolkwargs)\n\n                ############################\n                ## handle the lctools now ##\n                ############################\n\n                # make sure the results aren't there already.\n                # if they are and force-reload is not True,\n                # just return them instead.\n                resloc = CPTOOLMAP[lctool]['resloc']\n\n                # TODO: figure out a way to make the dispatched tasks\n                # cancellable. This can probably be done by having a global\n                # TOOLQUEUE object that gets imported on initialize(). In this\n                # object, we could put in key:vals like so:\n                #\n                # TOOLQUEUE['lctool-<toolname>-cpfpath'] = (\n                #    yield self.executor.submit(blah, *blah_args, **blah_kwargs)\n                # )\n                #\n                # then we probably need some sort of frontend AJAX call that\n                # enqueues things and can then cancel stuff from the queue. see\n                # stuff we need to figure out:\n                # - if the above scheme actually yields so we remain async\n                # - if the Future object supports cancellation\n                # - if the Future object that isn't resolved actually works\n\n\n                # get the objectid. we'll send this along with every\n                # result. 
this should handle the case of the current objectid\n                # not being the same as the objectid being looked at by the\n                # user. in effect, this will allow the user to launch a\n                # long-running process and come back to it later since the\n                # frontend will load the older results when they are complete.\n                objectid = cpdict['objectid']\n\n                # if lctool is a periodogram method\n                if lctool in ('psearch-gls',\n                              'psearch-bls',\n                              'psearch-pdm',\n                              'psearch-aov',\n                              'psearch-mav',\n                              'psearch-acf',\n                              'psearch-win'):\n\n                    lspmethod = resloc[0]\n\n                    # if we can return the results from a previous run\n                    if (lspmethod in tempcpdict and\n                        isinstance(tempcpdict[lspmethod], dict) and\n                        (not forcereload)):\n\n                        # for a periodogram method, we need the\n                        # following items\n                        bestperiod = (\n                            tempcpdict[lspmethod]['bestperiod']\n                        )\n                        nbestperiods = (\n                            tempcpdict[lspmethod]['nbestperiods']\n                        )\n                        nbestlspvals = (\n                            tempcpdict[lspmethod]['nbestlspvals']\n                        )\n                        periodogram = (\n                            tempcpdict[lspmethod]['periodogram']\n                        )\n\n                        # get the first phased LC plot and its period\n                        # and epoch\n                        phasedlc0plot = (\n                            tempcpdict[lspmethod][0]['plot']\n                        )\n                        phasedlc0period 
= float(\n                            tempcpdict[lspmethod][0]['period']\n                        )\n                        phasedlc0epoch = float(\n                            tempcpdict[lspmethod][0]['epoch']\n                        )\n\n                        LOGGER.warning(\n                            'returning previously unsaved '\n                            'results for lctool %s from %s' %\n                            (lctool, tempfpath)\n                        )\n\n                        #\n                        # assemble the returndict\n                        #\n\n                        resultdict['status'] = 'warning'\n                        resultdict['message'] = (\n                            'previous '\n                            'unsaved results from %s' %\n                            lctool\n                        )\n                        resultdict['result'] = {\n                            'objectid':objectid,\n                            lspmethod:{\n                                'nbestperiods':nbestperiods,\n                                'periodogram':periodogram,\n                                'bestperiod':bestperiod,\n                                'nbestpeaks':nbestlspvals,\n                                'phasedlc0':{\n                                    'plot':phasedlc0plot,\n                                    'period':phasedlc0period,\n                                    'epoch':phasedlc0epoch,\n                                }\n                            }\n                        }\n\n                        self.write(resultdict)\n                        self.finish()\n\n                    # otherwise, we have to rerun the periodogram method\n                    else:\n\n                        # see if sigclip is set. 
if so, then do the sigclip on\n                        # the times, mags, errs\n                        if lctoolkwargs['sigclip'] is not None:\n\n                            wtimes, wmags, werrs = lcmath.sigclip_magseries(\n                                lctoolargs[0],\n                                lctoolargs[1],\n                                lctoolargs[2],\n                                sigclip=lctoolkwargs['sigclip'],\n                                magsarefluxes=lctoolkwargs['magsarefluxes']\n                            )\n\n                            lctoolargs[0] = wtimes\n                            lctoolargs[1] = wmags\n                            lctoolargs[2] = werrs\n\n                        #\n                        # process the LC filters now\n                        #\n\n                        # see if the lctimefilters are set\n                        if lctoolkwargs['lctimefilters']:\n\n                            wtimes, wmags, werrs = (lctoolargs[0],\n                                                    lctoolargs[1],\n                                                    lctoolargs[2])\n                            filtermasks = [\n                                np.full_like(wtimes, False, dtype=np.bool_)\n                            ]\n\n                            # parse the time filter strings\n                            filterstr = lctoolkwargs['lctimefilters']\n\n                            filters = filterstr.split(',')\n                            filters = [\n                                x.strip().lstrip('(').rstrip(')').strip()\n                                for x in filters\n                            ]\n\n                            for filt in filters:\n\n                                try:\n\n                                    thisfilt = filt.split(':')\n                                    if len(thisfilt) == 2:\n\n                                        filt_lo = float(thisfilt[0])\n                            
            filt_hi = float(thisfilt[1])\n\n                                        filtermasks.append(\n                                            ((wtimes -\n                                              cptimes.min()) < filt_hi) &\n                                            ((wtimes -\n                                              cptimes.min()) > filt_lo)\n                                        )\n\n                                    elif (len(thisfilt) == 3 and\n                                          thisfilt[0].strip() == 'not'):\n\n\n                                        filt_lo = float(thisfilt[1])\n                                        filt_hi = float(thisfilt[2])\n\n                                        filtermasks.append(np.logical_not(\n                                            (((wtimes -\n                                               cptimes.min()) < filt_hi) &\n                                             ((wtimes -\n                                               cptimes.min()) > filt_lo))\n                                        ))\n\n                                    else:\n                                        continue\n\n                                except Exception as e:\n                                    continue\n\n                            # finally, apply the filters if applicable\n                            if len(filtermasks) > 0:\n\n                                # apply the filters using an OR\n                                filterind = np.column_stack(filtermasks)\n                                filterind = np.any(filterind, axis=1)\n\n                                lctoolargs[0] = wtimes[filterind]\n                                lctoolargs[1] = wmags[filterind]\n                                lctoolargs[2] = werrs[filterind]\n\n\n                        # see if the lcmagfilters are set\n                        if lctoolkwargs['lcmagfilters']:\n\n                            wtimes, wmags, werrs = 
(lctoolargs[0],\n                                                    lctoolargs[1],\n                                                    lctoolargs[2])\n                            filtermasks = [\n                                np.full_like(wtimes, False, dtype=np.bool_)\n                            ]\n\n                            # parse the mag filter strings\n                            filterstr = lctoolkwargs['lcmagfilters']\n\n                            filters = filterstr.split(',')\n                            filters = [\n                                x.strip()\n                                for x in filters\n                            ]\n\n                            for filt in filters:\n\n                                try:\n\n                                    thisfilt = filt.split(':')\n                                    if len(thisfilt) == 2:\n\n                                        filt_lo = float(thisfilt[0])\n                                        filt_hi = float(thisfilt[1])\n\n                                        filtermasks.append(\n                                            (wmags < filt_hi) &\n                                            (wmags > filt_lo)\n                                        )\n\n                                    elif (len(thisfilt) == 3 and\n                                          thisfilt[0].strip() == 'not'):\n\n\n                                        filt_lo = float(thisfilt[1])\n                                        filt_hi = float(thisfilt[2])\n\n                                        filtermasks.append(np.logical_not(\n                                            ((wmags < filt_hi) &\n                                             (wmags > filt_lo))\n                                        ))\n\n                                    else:\n                                        continue\n\n                                except Exception as e:\n                                    
continue\n\n                            # finally, apply the filters if applicable\n                            if len(filtermasks) > 0:\n\n                                # apply the filters using an OR\n                                filterind = np.column_stack(filtermasks)\n                                filterind = np.any(filterind, axis=1)\n\n                                lctoolargs[0] = wtimes[filterind]\n                                lctoolargs[1] = wmags[filterind]\n                                lctoolargs[2] = werrs[filterind]\n\n                        # after filtering, remove these keys from lctoolkwargs\n                        # since the period-finder function doesn't accept them\n                        del lctoolkwargs['lctimefilters']\n                        del lctoolkwargs['lcmagfilters']\n\n                        #\n                        # now run the period finder and get results\n                        #\n\n                        lctoolfunction = CPTOOLMAP[lctool]['func']\n\n                        # run the period finder\n                        funcresults = yield self.executor.submit(\n                            lctoolfunction,\n                            *lctoolargs,\n                            **lctoolkwargs\n                        )\n\n                        # get what we need out of funcresults when it\n                        # returns.\n                        nbestperiods = funcresults['nbestperiods']\n                        nbestlspvals = funcresults['nbestlspvals']\n                        bestperiod = funcresults['bestperiod']\n\n                        # generate the periodogram png\n                        pgramres = yield self.executor.submit(\n                            _pkl_periodogram,\n                            funcresults,\n                        )\n\n                        # generate the phased LCs. 
we show these in the frontend\n                        # along with the periodogram.\n\n\n                        phasedlcargs0 = (None,\n                                         lspmethod,\n                                         -1,\n                                         lctoolargs[0],\n                                         lctoolargs[1],\n                                         lctoolargs[2],\n                                         nbestperiods[0],\n                                         'min')\n\n                        if len(nbestperiods) > 1:\n                            phasedlcargs1 = (None,\n                                             lspmethod,\n                                             -1,\n                                             lctoolargs[0],\n                                             lctoolargs[1],\n                                             lctoolargs[2],\n                                             nbestperiods[1],\n                                             'min')\n                        else:\n                            phasedlcargs1 = None\n\n\n                        if len(nbestperiods) > 2:\n                            phasedlcargs2 = (None,\n                                             lspmethod,\n                                             -1,\n                                             lctoolargs[0],\n                                             lctoolargs[1],\n                                             lctoolargs[2],\n                                             nbestperiods[2],\n                                             'min')\n                        else:\n                            phasedlcargs2 = None\n\n                        # here, we set a bestperiodhighlight to distinguish this\n                        # plot from the ones existing in the checkplot already\n                        phasedlckwargs = {\n                            'xliminsetmode':False,\n                            
'magsarefluxes':lctoolkwargs['magsarefluxes'],\n                            'bestperiodhighlight':'#defa75',\n                        }\n\n                        # dispatch the plot functions\n                        phasedlc0 = yield self.executor.submit(\n                            _pkl_phased_magseries_plot,\n                            *phasedlcargs0,\n                            **phasedlckwargs\n                        )\n\n                        if phasedlcargs1 is not None:\n                            phasedlc1 = yield self.executor.submit(\n                                _pkl_phased_magseries_plot,\n                                *phasedlcargs1,\n                                **phasedlckwargs\n                            )\n                        else:\n                            phasedlc1 = None\n\n                        if phasedlcargs2 is not None:\n                            phasedlc2 = yield self.executor.submit(\n                                _pkl_phased_magseries_plot,\n                                *phasedlcargs2,\n                                **phasedlckwargs\n                            )\n                        else:\n                            phasedlc2 = None\n\n\n                        # save these to the tempcpdict\n                        # save the pickle only if readonly is not true\n                        if not self.readonly:\n\n                            tempcpdict[lspmethod] = {\n                                'periods':funcresults['periods'],\n                                'lspvals':funcresults['lspvals'],\n                                'bestperiod':funcresults['bestperiod'],\n                                'nbestperiods':funcresults['nbestperiods'],\n                                'nbestlspvals':funcresults['nbestlspvals'],\n                                'periodogram':(\n                                    pgramres[lspmethod]['periodogram']\n                                ),\n                        
        0:phasedlc0,\n                            }\n\n                            if phasedlc1 is not None:\n                                tempcpdict[lspmethod][1] = phasedlc1\n\n                            if phasedlc2 is not None:\n                                tempcpdict[lspmethod][2] = phasedlc2\n\n\n                            savekwargs = {\n                                'outfile':tempfpath,\n                                'protocol':pickle.HIGHEST_PROTOCOL\n                            }\n                            savedcpf = yield self.executor.submit(\n                                _write_checkplot_picklefile,\n                                tempcpdict,\n                                **savekwargs\n                            )\n\n                            LOGGER.info(\n                                'saved temp results from '\n                                '%s to checkplot: %s' %\n                                (lctool, savedcpf)\n                            )\n\n                        else:\n\n                            LOGGER.warning(\n                                'not saving temp results to checkplot '\n                                ' because readonly = True'\n                            )\n\n                        #\n                        # assemble the return dict\n                        #\n\n                        # the periodogram\n                        periodogram = pgramres[lspmethod]['periodogram']\n\n                        # phasedlc plot, period, and epoch for best 3 peaks\n                        phasedlc0plot = phasedlc0['plot']\n                        phasedlc0period = float(phasedlc0['period'])\n                        phasedlc0epoch = float(phasedlc0['epoch'])\n\n                        if phasedlc1 is not None:\n\n                            phasedlc1plot = phasedlc1['plot']\n                            phasedlc1period = float(phasedlc1['period'])\n                            phasedlc1epoch = 
float(phasedlc1['epoch'])\n\n                        if phasedlc2 is not None:\n\n                            phasedlc2plot = phasedlc2['plot']\n                            phasedlc2period = float(phasedlc2['period'])\n                            phasedlc2epoch = float(phasedlc2['epoch'])\n\n                        resultdict['status'] = 'success'\n                        resultdict['message'] = (\n                            'new results for %s' %\n                            lctool\n                        )\n                        resultdict['result'] = {\n                            'objectid':objectid,\n                            lspmethod:{\n                                'nbestperiods':nbestperiods,\n                                'nbestpeaks':nbestlspvals,\n                                'periodogram':periodogram,\n                                'bestperiod':bestperiod,\n                                'phasedlc0':{\n                                    'plot':phasedlc0plot,\n                                    'period':phasedlc0period,\n                                    'epoch':phasedlc0epoch,\n                                },\n                            }\n                        }\n\n                        if phasedlc1 is not None:\n                            resultdict['result'][lspmethod]['phasedlc1'] = {\n                                'plot':phasedlc1plot,\n                                'period':phasedlc1period,\n                                'epoch':phasedlc1epoch,\n                            }\n\n                        if phasedlc2 is not None:\n                            resultdict['result'][lspmethod]['phasedlc2'] = {\n                                'plot':phasedlc2plot,\n                                'period':phasedlc2period,\n                                'epoch':phasedlc2epoch,\n                            }\n\n\n                        # return to frontend\n                        self.write(resultdict)\n              
          self.finish()\n\n\n                # if the lctool is a call to the phased LC plot itself\n                # this requires lots of parameters\n                # these should all be present in the frontend\n                elif lctool == 'phasedlc-newplot':\n\n                    lspmethod = lctoolargs[1]\n                    periodind = lctoolargs[2]\n\n                    # if we can return the results from a previous run\n                    if (not forcereload and lspmethod in tempcpdict and\n                        isinstance(tempcpdict[lspmethod], dict) and\n                        periodind in tempcpdict[lspmethod] and\n                        isinstance(tempcpdict[lspmethod][periodind], dict)):\n\n                        # we get phased LC at periodind from a previous run\n                        phasedlc = tempcpdict[lspmethod][periodind]\n\n                        LOGGER.warning(\n                            'returning previously unsaved '\n                            'results for lctool %s from %s' %\n                            (lctool, tempfpath)\n                        )\n\n                        #\n                        # assemble the returndict\n                        #\n\n                        resultdict['status'] = 'warning'\n                        resultdict['message'] = (\n                            'previous '\n                            'unsaved results from %s' %\n                            lctool\n                        )\n                        retkey = 'phasedlc%s' % periodind\n                        resultdict['result'] = {\n                            'objectid':objectid,\n                            lspmethod:{\n                                retkey:phasedlc\n                            }\n                        }\n\n                        self.write(resultdict)\n                        self.finish()\n\n                    # otherwise, we need to dispatch the function\n                    else:\n\n              
          # add the highlight to distinguish this plot from usual\n                        # checkplot plots\n                        # full disclosure: http://c0ffee.surge.sh/\n                        lctoolkwargs['bestperiodhighlight'] = '#defa75'\n\n                        # set the input periodind to -1 to make sure we still\n                        # have the highlight on the plot. we use the correct\n                        # periodind when returning\n                        lctoolargs[2] = -1\n\n                        # see if sigclip is set. if so, then do the sigclip on\n                        # the times, mags, errs\n                        if lctoolkwargs['sigclip'] is not None:\n                            stimes, smags, serrs = lcmath.sigclip_magseries(\n                                lctoolargs[3],\n                                lctoolargs[4],\n                                lctoolargs[5],\n                                sigclip=lctoolkwargs['sigclip'],\n                                magsarefluxes=lctoolkwargs['magsarefluxes']\n                            )\n                        else:\n                            stimes, smags, serrs = (lctoolargs[3],\n                                                    lctoolargs[4],\n                                                    lctoolargs[5])\n\n\n                        #\n                        # process the LC filters now\n                        #\n\n                        # see if the lctimefilters are set\n                        if lctoolkwargs['lctimefilters']:\n\n                            wtimes, wmags, werrs = stimes, smags, serrs\n\n                            filtermasks = [\n                                np.full_like(wtimes, False, dtype=np.bool_)\n                            ]\n\n                            # parse the time filter strings\n                            filterstr = lctoolkwargs['lctimefilters']\n\n                            filters = filterstr.split(',')\n        
                    filters = [\n                                x.strip().lstrip('(').rstrip(')').strip()\n                                for x in filters\n                            ]\n\n                            for filt in filters:\n\n                                try:\n\n                                    thisfilt = filt.split(':')\n                                    if len(thisfilt) == 2:\n\n                                        filt_lo = float(thisfilt[0])\n                                        filt_hi = float(thisfilt[1])\n\n                                        filtermasks.append(\n                                            ((wtimes -\n                                              cptimes.min()) < filt_hi) &\n                                            ((wtimes -\n                                              cptimes.min()) > filt_lo)\n                                        )\n\n                                    elif (len(thisfilt) == 3 and\n                                          thisfilt[0].strip() == 'not'):\n\n\n                                        filt_lo = float(thisfilt[1])\n                                        filt_hi = float(thisfilt[2])\n\n                                        filtermasks.append(np.logical_not(\n                                            (((wtimes -\n                                               cptimes.min()) < filt_hi) &\n                                             ((wtimes -\n                                               cptimes.min()) > filt_lo))\n                                        ))\n\n                                    else:\n                                        continue\n\n                                except Exception as e:\n                                    continue\n\n                            # finally, apply the filters if applicable\n                            if len(filtermasks) > 0:\n\n                                # apply the filters using an OR\n                  
              filterind = np.column_stack(filtermasks)\n                                filterind = np.any(filterind, axis=1)\n\n                                stimes = wtimes[filterind]\n                                smags = wmags[filterind]\n                                serrs = werrs[filterind]\n\n\n                        # see if the lcmagfilters are set\n                        if lctoolkwargs['lcmagfilters']:\n\n                            wtimes, wmags, werrs = stimes, smags, serrs\n                            filtermasks = [\n                                np.full_like(wtimes, False, dtype=np.bool_)\n                            ]\n\n                            # parse the mag filter strings\n                            filterstr = lctoolkwargs['lcmagfilters']\n\n                            filters = filterstr.split(',')\n                            filters = [\n                                x.strip()\n                                for x in filters\n                            ]\n\n                            for filt in filters:\n\n                                try:\n\n                                    thisfilt = filt.split(':')\n                                    if len(thisfilt) == 2:\n\n                                        filt_lo = float(thisfilt[0])\n                                        filt_hi = float(thisfilt[1])\n\n                                        filtermasks.append(\n                                            (wmags < filt_hi) &\n                                            (wmags > filt_lo)\n                                        )\n\n                                    elif (len(thisfilt) == 3 and\n                                          thisfilt[0].strip() == 'not'):\n\n\n                                        filt_lo = float(thisfilt[1])\n                                        filt_hi = float(thisfilt[2])\n\n                                        filtermasks.append(np.logical_not(\n                    
                        ((wmags < filt_hi) &\n                                             (wmags > filt_lo))\n                                        ))\n\n                                    else:\n                                        continue\n\n                                except Exception as e:\n                                    continue\n\n                            # finally, apply the filters if applicable\n                            if len(filtermasks) > 0:\n\n                                # apply the filters using an OR\n                                filterind = np.column_stack(filtermasks)\n                                filterind = np.any(filterind, axis=1)\n\n                                stimes = wtimes[filterind]\n                                smags = wmags[filterind]\n                                serrs = werrs[filterind]\n\n                        # after filtering, remove these keys from lctoolkwargs\n                        # since the phased LC plot function doesn't accept them\n                        del lctoolkwargs['lctimefilters']\n                        del lctoolkwargs['lcmagfilters']\n\n\n                        # if the varepoch is set to None, try to get the\n                        # minimum-light epoch using a spline fit\n                        if lctoolargs[-1] is None:\n                            LOGGER.warning(\n                                'automatically getting min epoch '\n                                'for phased LC plot'\n                            )\n                            try:\n                                spfit = lcfit.spline_fit_magseries(\n                                    stimes,         # times\n                                    smags,          # mags\n                                    serrs,          # errs\n                                    lctoolargs[6],  # period\n                                    magsarefluxes=lctoolkwargs['magsarefluxes'],\n                                    
sigclip=None,\n                                    verbose=True\n                                )\n\n                                # set the epoch correctly now for the plot\n                                lctoolargs[-1] = spfit['fitinfo']['fitepoch']\n\n                                if len(spfit['fitinfo']['fitepoch']) != 1:\n                                    lctoolargs[-1] = (\n                                        spfit['fitinfo']['fitepoch'][0]\n                                    )\n\n                        # if the spline fit fails, use the minimum of times as\n                        # epoch as usual\n                            except Exception as e:\n\n                                LOGGER.exception(\n                                    'spline fit failed, '\n                                    'using min(times) as epoch'\n                                )\n\n                                lctoolargs[-1] = np.min(stimes)\n\n                        # now run the phased LC function with provided args,\n                        # kwargs\n\n                        # final times, mags, errs\n                        lctoolargs[3] = stimes\n                        lctoolargs[4] = smags\n                        lctoolargs[5] = serrs\n\n                        # the sigclip kwarg isn't used here since we did this\n                        # already earlier\n                        del lctoolkwargs['sigclip']\n\n                        lctoolfunction = CPTOOLMAP[lctool]['func']\n\n                        funcresults = yield self.executor.submit(\n                            lctoolfunction,\n                            *lctoolargs,\n                            **lctoolkwargs\n                        )\n\n                        # save these to the tempcpdict\n                        # save the pickle only if readonly is not true\n                        if not self.readonly:\n\n                            if (lspmethod in tempcpdict and\n                   
             isinstance(tempcpdict[lspmethod], dict)):\n\n                                if periodind in tempcpdict[lspmethod]:\n\n                                    tempcpdict[lspmethod][periodind] = (\n                                        funcresults\n                                    )\n\n                                else:\n\n                                    tempcpdict[lspmethod].update(\n                                        {periodind: funcresults}\n                                    )\n\n                            else:\n\n                                tempcpdict[lspmethod] = {periodind: funcresults}\n\n\n                            savekwargs = {\n                                'outfile':tempfpath,\n                                'protocol':pickle.HIGHEST_PROTOCOL\n                            }\n                            savedcpf = yield self.executor.submit(\n                                _write_checkplot_picklefile,\n                                tempcpdict,\n                                **savekwargs\n                            )\n\n                            LOGGER.info(\n                                'saved temp results from '\n                                '%s to checkplot: %s' %\n                                (lctool, savedcpf)\n                            )\n\n                        else:\n\n                            LOGGER.warning(\n                                'not saving temp results to checkplot '\n                                ' because readonly = True'\n                            )\n\n                        #\n                        # assemble the return dict\n                        #\n                        resultdict['status'] = 'success'\n                        resultdict['message'] = (\n                            'new results for %s' %\n                            lctool\n                        )\n                        retkey = 'phasedlc%s' % periodind\n                        
resultdict['result'] = {\n                            'objectid':objectid,\n                            lspmethod:{\n                                retkey:funcresults\n                            }\n                        }\n\n                        self.write(resultdict)\n                        self.finish()\n\n\n                # if the lctool is var-varfeatures\n                elif lctool == 'var-varfeatures':\n\n                    # see if we can return results from a previous iteration of\n                    # this tool\n                    if (not forcereload and\n                        'varinfo' in tempcpdict and\n                        isinstance(tempcpdict['varinfo'], dict) and\n                        'varfeatures' in tempcpdict['varinfo'] and\n                        isinstance(tempcpdict['varinfo']['varfeatures'], dict)):\n\n                        LOGGER.warning(\n                            'returning previously unsaved '\n                            'results for lctool %s from %s' %\n                            (lctool, tempfpath)\n                        )\n\n                        #\n                        # assemble the returndict\n                        #\n\n                        resultdict['status'] = 'warning'\n                        resultdict['message'] = (\n                            'previous '\n                            'unsaved results from %s' %\n                            lctool\n                        )\n                        resultdict['result'] = {\n                            'objectid':objectid,\n                            'varinfo': {\n                                'varfeatures': (\n                                    tempcpdict['varinfo']['varfeatures']\n                                )\n                            }\n                        }\n\n                        self.write(resultdict)\n                        self.finish()\n\n                    # otherwise, we need to dispatch the function\n     
               else:\n\n                        lctoolfunction = CPTOOLMAP[lctool]['func']\n                        funcresults = yield self.executor.submit(\n                            lctoolfunction,\n                            *lctoolargs,\n                            **lctoolkwargs\n                        )\n\n                        # save these to the tempcpdict\n                        # save the pickle only if readonly is not true\n                        if not self.readonly:\n\n                            if ('varinfo' in tempcpdict and\n                                isinstance(tempcpdict['varinfo'], dict)):\n\n                                if 'varfeatures' in tempcpdict['varinfo']:\n\n                                    tempcpdict['varinfo']['varfeatures'] = (\n                                        funcresults\n                                    )\n\n                                else:\n\n                                    tempcpdict['varinfo'].update(\n                                        {'varfeatures': funcresults}\n                                    )\n\n                            else:\n\n                                tempcpdict['varinfo'] = {'varfeatures':\n                                                         funcresults}\n\n\n                            savekwargs = {\n                                'outfile':tempfpath,\n                                'protocol':pickle.HIGHEST_PROTOCOL\n                            }\n                            savedcpf = yield self.executor.submit(\n                                _write_checkplot_picklefile,\n                                tempcpdict,\n                                **savekwargs\n                            )\n\n                            LOGGER.info(\n                                'saved temp results from '\n                                '%s to checkplot: %s' %\n                                (lctool, savedcpf)\n                            )\n\n                 
       else:\n\n                            LOGGER.warning(\n                                'not saving temp results to checkplot '\n                                ' because readonly = True'\n                            )\n\n                        #\n                        # assemble the return dict\n                        #\n                        resultdict['status'] = 'success'\n                        resultdict['message'] = (\n                            'new results for %s' %\n                            lctool\n                        )\n                        resultdict['result'] = {\n                            'objectid':objectid,\n                            'varinfo':{\n                                'varfeatures':funcresults\n                            }\n                        }\n\n                        self.write(resultdict)\n                        self.finish()\n\n\n                # if the lctool is var-prewhiten or var-masksig\n                elif lctool in ('var-prewhiten','var-masksig'):\n\n                    key1, key2 = resloc\n\n                    # see if we can return results from a previous iteration of\n                    # this tool\n                    if (not forcereload and\n                        key1 in tempcpdict and\n                        isinstance(tempcpdict[key1], dict) and\n                        key2 in tempcpdict[key1] and\n                        isinstance(tempcpdict[key1][key2], dict)):\n\n                        LOGGER.warning(\n                            'returning previously unsaved '\n                            'results for lctool %s from %s' %\n                            (lctool, tempfpath)\n                        )\n\n                        #\n                        # assemble the returndict\n                        #\n                        resultdict['status'] = 'warning'\n                        resultdict['message'] = (\n                            'previous '\n                        
    'unsaved results from %s' %\n                            lctool\n                        )\n                        resultdict['result'] = {\n                            'objectid':objectid,\n                            key1: {\n                                key2: (\n                                    tempcpdict[key1][key2]\n                                )\n                            }\n                        }\n\n                        self.write(resultdict)\n                        self.finish()\n\n                    # otherwise, we need to dispatch the function\n                    else:\n\n                        lctoolfunction = CPTOOLMAP[lctool]['func']\n\n                        # send in a stringio object for the fitplot kwarg\n                        lctoolkwargs['plotfit'] = StrIO()\n\n                        funcresults = yield self.executor.submit(\n                            lctoolfunction,\n                            *lctoolargs,\n                            **lctoolkwargs\n                        )\n\n                        # we turn the returned fitplotfile fd into a base64\n                        # encoded string after reading it\n                        fitfd = funcresults['fitplotfile']\n                        fitfd.seek(0)\n                        fitbin = fitfd.read()\n                        fitb64 = base64.b64encode(fitbin)\n                        fitfd.close()\n                        funcresults['fitplotfile'] = fitb64\n\n                        # save these to the tempcpdict\n                        # save the pickle only if readonly is not true\n                        if not self.readonly:\n\n                            if (key1 in tempcpdict and\n                                isinstance(tempcpdict[key1], dict)):\n\n                                if key2 in tempcpdict[key1]:\n\n                                    tempcpdict[key1][key2] = (\n                                        funcresults\n                        
            )\n\n                                else:\n\n                                    tempcpdict[key1].update(\n                                        {key2: funcresults}\n                                    )\n\n                            else:\n\n                                tempcpdict[key1] = {key2: funcresults}\n\n\n                            savekwargs = {\n                                'outfile':tempfpath,\n                                'protocol':pickle.HIGHEST_PROTOCOL\n                            }\n                            savedcpf = yield self.executor.submit(\n                                _write_checkplot_picklefile,\n                                tempcpdict,\n                                **savekwargs\n                            )\n\n                            LOGGER.info(\n                                'saved temp results from '\n                                '%s to checkplot: %s' %\n                                (lctool, savedcpf)\n                            )\n\n                        else:\n\n                            LOGGER.warning(\n                                'not saving temp results to checkplot '\n                                ' because readonly = True'\n                            )\n\n                        #\n                        # assemble the return dict\n                        #\n                        # for this operation, we'll return:\n                        # - fitplotfile\n                        fitreturndict = {'fitplotfile':fitb64}\n\n                        resultdict['status'] = 'success'\n                        resultdict['message'] = (\n                            'new results for %s' %\n                            lctool\n                        )\n                        resultdict['result'] = {\n                            'objectid':objectid,\n                            key1:{\n                                key2:fitreturndict\n                            }\n         
               }\n\n                        self.write(resultdict)\n                        self.finish()\n\n\n                # if the lctool is a lcfit method\n                elif lctool in ('lcfit-fourier',\n                                'lcfit-spline',\n                                'lcfit-legendre',\n                                'lcfit-savgol'):\n\n                    key1, key2 = resloc\n\n                    # see if we can return results from a previous iteration of\n                    # this tool\n                    if (not forcereload and\n                        key1 in tempcpdict and\n                        isinstance(tempcpdict[key1], dict) and\n                        key2 in tempcpdict[key1] and\n                        isinstance(tempcpdict[key1][key2], dict)):\n\n                        LOGGER.warning(\n                            'returning previously unsaved '\n                            'results for lctool %s from %s' %\n                            (lctool, tempfpath)\n                        )\n\n                        #\n                        # assemble the returndict\n                        #\n\n                        resultdict['status'] = 'warning'\n                        resultdict['message'] = (\n                            'previous '\n                            'unsaved results from %s' %\n                            lctool\n                        )\n\n                        # these are the full results\n                        phasedfitlc = tempcpdict[key1][key2]\n\n                        # we only want a few things from them\n                        fitresults = {\n                            'method':phasedfitlc['lcfit']['fittype'],\n                            'chisq':phasedfitlc['lcfit']['fitchisq'],\n                            'redchisq':phasedfitlc['lcfit']['fitredchisq'],\n                            'period':phasedfitlc['period'],\n                            'epoch':phasedfitlc['epoch'],\n                
            'plot':phasedfitlc['plot'],\n                        }\n\n                        # add fitparams if there are any\n                        if ('finalparams' in phasedfitlc['lcfit']['fitinfo'] and\n                            phasedfitlc['lcfit']['fitinfo']['finalparams']\n                            is not None):\n                            fitresults['fitparams'] = (\n                                phasedfitlc['lcfit']['fitinfo']['finalparams']\n                            )\n\n                        # this is the final result object\n                        resultdict['result'] = {\n                            'objectid':objectid,\n                            key1: {\n                                key2: (\n                                    fitresults\n                                )\n                            }\n                        }\n\n                        self.write(resultdict)\n                        self.finish()\n\n                    # otherwise, we need to dispatch the function\n                    else:\n\n                        lctoolfunction = CPTOOLMAP[lctool]['func']\n\n                        funcresults = yield self.executor.submit(\n                            lctoolfunction,\n                            *lctoolargs,\n                            **lctoolkwargs\n                        )\n\n                        # now that we have the fit results, generate a fitplot.\n                        # these args are for the special fitplot mode of\n                        # _pkl_phased_magseries_plot\n                        phasedlcargs = (None,\n                                        'lcfit',\n                                        -1,\n                                        cptimes,\n                                        cpmags,\n                                        cperrs,\n                                        lctoolargs[3],  # this is the fit period\n                                        'min')\n\n        
                # here, we set a bestperiodhighlight to distinguish this\n                        # plot from the ones existing in the checkplot already\n                        # also add the overplotfit information\n                        phasedlckwargs = {\n                            'xliminsetmode':False,\n                            'magsarefluxes':lctoolkwargs['magsarefluxes'],\n                            'bestperiodhighlight':'#defa75',\n                            'overplotfit':funcresults\n                        }\n\n                        # dispatch the plot function\n                        phasedlc = yield self.executor.submit(\n                            _pkl_phased_magseries_plot,\n                            *phasedlcargs,\n                            **phasedlckwargs\n                        )\n\n                        # save these to the tempcpdict\n                        # save the pickle only if readonly is not true\n                        if not self.readonly:\n\n                            if (key1 in tempcpdict and\n                                isinstance(tempcpdict[key1], dict)):\n\n                                if key2 in tempcpdict[key1]:\n\n                                    tempcpdict[key1][key2] = (\n                                        phasedlc\n                                    )\n\n                                else:\n\n                                    tempcpdict[key1].update(\n                                        {key2: phasedlc}\n                                    )\n\n                            else:\n\n                                tempcpdict[key1] = {key2: phasedlc}\n\n\n                            savekwargs = {\n                                'outfile':tempfpath,\n                                'protocol':pickle.HIGHEST_PROTOCOL\n                            }\n                            savedcpf = yield self.executor.submit(\n                                _write_checkplot_picklefile,\n       
                         tempcpdict,\n                                **savekwargs\n                            )\n\n                            LOGGER.info(\n                                'saved temp results from '\n                                '%s to checkplot: %s' %\n                                (lctool, savedcpf)\n                            )\n\n                        else:\n\n                            LOGGER.warning(\n                                'not saving temp results to checkplot '\n                                ' because readonly = True'\n                            )\n\n                        #\n                        # assemble the return dict\n                        #\n                        fitresults = {\n                            'method':phasedlc['lcfit']['fittype'],\n                            'chisq':phasedlc['lcfit']['fitchisq'],\n                            'redchisq':phasedlc['lcfit']['fitredchisq'],\n                            'period':phasedlc['period'],\n                            'epoch':phasedlc['epoch'],\n                            'plot':phasedlc['plot'],\n                        }\n\n                        # add fitparams if there are any\n                        if ('finalparams' in funcresults['fitinfo'] and\n                            funcresults['fitinfo']['finalparams'] is not None):\n                            fitresults['fitparams'] = (\n                                funcresults['fitinfo']['finalparams']\n                            )\n\n\n                        resultdict['status'] = 'success'\n                        resultdict['message'] = (\n                            'new results for %s' %\n                            lctool\n                        )\n                        resultdict['result'] = {\n                            'objectid':objectid,\n                            key1:{\n                                key2:fitresults\n                            }\n                        
}\n\n                        self.write(resultdict)\n                        self.finish()\n\n\n                # if this is the special lcfit subtract tool\n                elif lctool == 'lcfit-subtract':\n\n                    fitmethod, periodind = lctoolargs\n\n                    # find the fit requested\n\n                    # subtract it from the cptimes, cpmags, cperrs\n\n                    # if not readonly, write back to cptimes, cpmags, cperrs\n\n                    # make a new phasedlc plot for the current periodind using\n                    # these new cptimes, cpmags, cperrs\n\n                    # return this plot\n\n\n                # if this is the special full reset tool\n                elif lctool == 'lctool-reset':\n\n                    if os.path.exists(tempfpath):\n                        os.remove(tempfpath)\n                        LOGGER.warning('reset all LC tool results '\n                                       'for %s by removing %s' %\n                                       (cpfpath, tempfpath))\n                        resultdict['status'] = 'success'\n                    else:\n                        resultdict['status'] = 'error'\n                        LOGGER.warning('tried to reset LC tool results for %s, '\n                                       'but temp checkplot result pickle %s '\n                                       'does not exist' %\n                                       (cpfpath, tempfpath))\n\n                    resultdict['message'] = (\n                        'all unsynced results for this object have been purged'\n                    )\n                    resultdict['result'] = {'objectid':cpobjectid}\n\n                    self.write(resultdict)\n                    self.finish()\n\n\n                # if this is the special load results tool\n                elif lctool == 'lctool-results':\n\n                    target = self.get_argument('resultsfor',None)\n\n                    if target is not 
None:\n\n                        target = xhtml_escape(target)\n\n                        # get rid of invalid targets\n                        if (target not in CPTOOLMAP or\n                            target == 'lctool-reset' or\n                            target == 'lctool-results' or\n                            target == 'phasedlc-newplot' or\n                            target == 'lcfit-subtract'):\n\n                            LOGGER.error(\"can't get results for %s\" % target)\n                            resultdict['status'] = 'error'\n                            resultdict['message'] = (\n                                \"can't get results for %s\" % target\n                            )\n                            resultdict['result'] = {'objectid':cpobjectid}\n\n                            self.write(resultdict)\n                            raise tornado.web.Finish()\n\n                        # if we're good to go, get the target location\n                        targetloc = CPTOOLMAP[target]['resloc']\n\n                        # first, search the cptempdict for this target\n                        # if found, return it\n\n                        # second, search the actual cpdict for this target\n                        # if found, return it\n\n                    # otherwise, we're being asked for everything\n                    # return the whole\n                    else:\n\n                        pass\n\n\n                # otherwise, this is an unrecognized lctool\n                else:\n\n                    LOGGER.error('lctool %s, does not exist' % lctool)\n                    resultdict['status'] = 'error'\n                    resultdict['message'] = (\n                        'lctool %s does not exist' % lctool\n                    )\n                    resultdict['result'] = {'objectid':cpobjectid}\n\n                    self.write(resultdict)\n                    raise tornado.web.Finish()\n\n            # if the cpfile doesn't 
exist\n            else:\n\n                LOGGER.error('could not find %s' % self.cpfile)\n\n                resultdict = {'status':'error',\n                              'message':\"This checkplot doesn't exist.\",\n                              'readonly':self.readonly,\n                              'result':None}\n\n                self.write(resultdict)\n                raise tornado.web.Finish()\n\n        # if no checkplot was provided to load\n        else:\n\n            resultdict = {'status':'error',\n                          'message':'No checkplot provided to load.',\n                          'readonly':self.readonly,\n                          'result':None}\n\n            self.write(resultdict)\n            raise tornado.web.Finish()", "response": "This is the method that handles a GET request to run a specific LC tool on a specific checkplot file."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef _old_epd_diffmags(coeff, fsv, fdv, fkv, xcc, ycc, bgv, bge, mag):\n    '''\n    This calculates the difference in mags after EPD coefficients are\n    calculated.\n\n    final EPD mags = median(magseries) + epd_diffmags()\n\n    '''\n\n    return -(coeff[0]*fsv**2. +\n             coeff[1]*fsv +\n             coeff[2]*fdv**2. +\n             coeff[3]*fdv +\n             coeff[4]*fkv**2. +\n             coeff[5]*fkv +\n             coeff[6] +\n             coeff[7]*fsv*fdv +\n             coeff[8]*fsv*fkv +\n             coeff[9]*fdv*fkv +\n             coeff[10]*np.sin(2*np.pi*xcc) +\n             coeff[11]*np.cos(2*np.pi*xcc) +\n             coeff[12]*np.sin(2*np.pi*ycc) +\n             coeff[13]*np.cos(2*np.pi*ycc) +\n             coeff[14]*np.sin(4*np.pi*xcc) +\n             coeff[15]*np.cos(4*np.pi*xcc) +\n             coeff[16]*np.sin(4*np.pi*ycc) +\n             coeff[17]*np.cos(4*np.pi*ycc) +\n             coeff[18]*bgv +\n             coeff[19]*bge -\n             mag)", "response": "Calculates the difference in mags after EPD coefficients are calculated."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef _old_epd_magseries(times, mags, errs,\n                       fsv, fdv, fkv, xcc, ycc, bgv, bge,\n                       epdsmooth_windowsize=21,\n                       epdsmooth_sigclip=3.0,\n                       epdsmooth_func=smooth_magseries_signal_medfilt,\n                       epdsmooth_extraparams=None):\n    '''\n    Detrends a magnitude series given in mag using accompanying values of S in\n    fsv, D in fdv, K in fkv, x coords in xcc, y coords in ycc, background in\n    bgv, and background error in bge. smooth is used to set a smoothing\n    parameter for the fit function. Does EPD voodoo.\n\n    '''\n\n    # find all the finite values of the magnitudes\n    finiteind = np.isfinite(mags)\n\n    # calculate median and stdev\n    mags_median = np.median(mags[finiteind])\n    mags_stdev = np.nanstd(mags)\n\n    # if we're supposed to sigma clip, do so\n    if epdsmooth_sigclip:\n        excludeind = abs(mags - mags_median) < epdsmooth_sigclip*mags_stdev\n        finalind = finiteind & excludeind\n    else:\n        finalind = finiteind\n\n    final_mags = mags[finalind]\n    final_len = len(final_mags)\n\n    # smooth the signal\n    if isinstance(epdsmooth_extraparams, dict):\n        smoothedmags = epdsmooth_func(final_mags,\n                                      epdsmooth_windowsize,\n                                      **epdsmooth_extraparams)\n    else:\n        smoothedmags = epdsmooth_func(final_mags, epdsmooth_windowsize)\n\n    # make the linear equation matrix\n    epdmatrix = np.c_[fsv[finalind]**2.0,\n                      fsv[finalind],\n                      fdv[finalind]**2.0,\n                      fdv[finalind],\n                      fkv[finalind]**2.0,\n                      fkv[finalind],\n                      np.ones(final_len),\n                      fsv[finalind]*fdv[finalind],\n                      
fsv[finalind]*fkv[finalind],\n                      fdv[finalind]*fkv[finalind],\n                      np.sin(2*np.pi*xcc[finalind]),\n                      np.cos(2*np.pi*xcc[finalind]),\n                      np.sin(2*np.pi*ycc[finalind]),\n                      np.cos(2*np.pi*ycc[finalind]),\n                      np.sin(4*np.pi*xcc[finalind]),\n                      np.cos(4*np.pi*xcc[finalind]),\n                      np.sin(4*np.pi*ycc[finalind]),\n                      np.cos(4*np.pi*ycc[finalind]),\n                      bgv[finalind],\n                      bge[finalind]]\n\n    # solve the matrix equation [epdmatrix] . [x] = [smoothedmags]\n    # return the EPD differential mags if the solution succeeds\n    try:\n\n        coeffs, residuals, rank, singulars = lstsq(epdmatrix, smoothedmags,\n                                                   rcond=None)\n\n        if DEBUG:\n            print('coeffs = %s, residuals = %s' % (coeffs, residuals))\n\n\n        retdict = {'times':times,\n                   'mags':(mags_median +\n                           _old_epd_diffmags(coeffs, fsv, fdv,\n                                             fkv, xcc, ycc, bgv, bge, mags)),\n                   'errs':errs,\n                   'fitcoeffs':coeffs,\n                   'residuals':residuals}\n\n        return retdict\n\n    # if the solution fails, return NaN mags and no fit info\n    except Exception:\n\n        LOGEXCEPTION('EPD solution did not converge')\n\n        # coeffs and residuals may be unbound if lstsq itself raised,\n        # so don't reference them here\n        retdict = {'times':times,\n                   'mags':np.full_like(mags, np.nan),\n                   'errs':errs,\n                   'fitcoeffs':None,\n                   'residuals':None}\n\n        return retdict", "response": "Detrends a magnitude series with the older EPD method: fits a 20-term function of the external parameters (S, D, K, x/y coordinates, background, and background error) to a smoothed, sigma-clipped copy of the light curve by linear least squares and returns a dict with the corrected mags, fit coefficients, and residuals."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef _epd_function(coeffs, fsv, fdv, fkv, xcc, ycc, bgv, bge, iha, izd):\n    '''\n    This is the EPD function to fit using a smoothed mag-series.\n\n    '''\n\n    return (coeffs[0]*fsv*fsv +\n            coeffs[1]*fsv +\n            coeffs[2]*fdv*fdv +\n            coeffs[3]*fdv +\n            coeffs[4]*fkv*fkv +\n            coeffs[5]*fkv +\n            coeffs[6] +\n            coeffs[7]*fsv*fdv +\n            coeffs[8]*fsv*fkv +\n            coeffs[9]*fdv*fkv +\n            coeffs[10]*np.sin(2*pi_value*xcc) +\n            coeffs[11]*np.cos(2*pi_value*xcc) +\n            coeffs[12]*np.sin(2*pi_value*ycc) +\n            coeffs[13]*np.cos(2*pi_value*ycc) +\n            coeffs[14]*np.sin(4*pi_value*xcc) +\n            coeffs[15]*np.cos(4*pi_value*xcc) +\n            coeffs[16]*np.sin(4*pi_value*ycc) +\n            coeffs[17]*np.cos(4*pi_value*ycc) +\n            coeffs[18]*bgv +\n            coeffs[19]*bge +\n            coeffs[20]*iha +\n            coeffs[21]*izd)", "response": "Evaluates the EPD model that is fit to a smoothed mag-series: a 22-coefficient function with quadratic, linear, and cross terms in fsv, fdv, and fkv, a constant offset, sine and cosine terms in xcc and ycc, and linear terms in bgv, bge, iha, and izd."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef _epd_residual(coeffs, mags, fsv, fdv, fkv, xcc, ycc, bgv, bge, iha, izd):\n    '''\n    This is the residual function to minimize using scipy.optimize.leastsq.\n\n    '''\n\n    f = _epd_function(coeffs, fsv, fdv, fkv, xcc, ycc, bgv, bge, iha, izd)\n    residual = mags - f\n    return residual", "response": "Returns the residual between the observed mags and the EPD model evaluated with the given coefficients; this is the function minimized with scipy.optimize.leastsq."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef rfepd_magseries(times, mags, errs,\n                    externalparam_arrs,\n                    magsarefluxes=False,\n                    epdsmooth=True,\n                    epdsmooth_sigclip=3.0,\n                    epdsmooth_windowsize=21,\n                    epdsmooth_func=smooth_magseries_savgol,\n                    epdsmooth_extraparams=None,\n                    rf_subsample=1.0,\n                    rf_ntrees=300,\n                    rf_extraparams={'criterion':'mse',\n                                    'oob_score':False,\n                                    'n_jobs':-1}):\n    '''This uses a `RandomForestRegressor` to de-correlate the given magseries.\n\n    Parameters\n    ----------\n\n    times,mags,errs : np.array\n        The input mag/flux time-series to run EPD on.\n\n    externalparam_arrs : list of np.arrays\n        This is a list of ndarrays of external parameters to decorrelate\n        against. These should all be the same size as `times`, `mags`, `errs`.\n\n    epdsmooth : bool\n        If True, sets the training LC for the RandomForestRegressor to be a\n        smoothed version of the sigma-clipped light curve provided in `times`,\n        `mags`, `errs`.\n\n    epdsmooth_sigclip : float or int or sequence of two floats/ints or None\n        This specifies how to sigma-clip the input LC before smoothing it and\n        fitting the EPD function to it. The actual LC will not be sigma-clipped.\n\n        If a single float or int, a symmetric sigma-clip will be performed using\n        the number provided as the sigma-multiplier to cut out from the input\n        time-series.\n\n        If a list of two ints/floats is provided, the function will perform an\n        'asymmetric' sigma-clip. 
The first element in this list is the sigma\n        value to use for fainter flux/mag values; the second element in this\n        list is the sigma value to use for brighter flux/mag values. For\n        example, `sigclip=[10., 3.]`, will sigclip out greater than 10-sigma\n        dimmings and greater than 3-sigma brightenings. Here the meaning of\n        \"dimming\" and \"brightening\" is set by *physics* (not the magnitude\n        system), which is why the `magsarefluxes` kwarg must be correctly set.\n\n        If `sigclip` is None, no sigma-clipping will be performed, and the\n        time-series (with non-finite elems removed) will be passed through to\n        the output.\n\n    epdsmooth_windowsize : int\n        This is the number of LC points to smooth over to generate a smoothed\n        light curve that will be used to fit the EPD function.\n\n    epdsmooth_func : Python function\n        This sets the smoothing filter function to use. A Savitzky-Golay filter\n        is used to smooth the light curve by default. The functions that can be\n        used with this kwarg are listed in `varbase.trends`. If you want to use\n        your own function, it MUST have the following signature::\n\n                def smoothfunc(mags_array, window_size, **extraparams)\n\n        and return a numpy array of the same size as `mags_array` with the\n        smoothed time-series. 
Any extra params can be provided using the\n        `extraparams` dict.\n\n    epdsmooth_extraparams : dict\n        This is a dict of any extra filter params to supply to the smoothing\n        function.\n\n    rf_subsample : float\n        Defines the fraction of the size of the `mags` array to use for\n        training the random forest regressor.\n\n    rf_ntrees : int\n        This is the number of trees to use for the `RandomForestRegressor`.\n\n    rf_extraparams : dict\n        This is a dict of any extra kwargs to provide to the\n        `RandomForestRegressor` instance used.\n\n    Returns\n    -------\n\n    dict\n        Returns a dict with decorrelated mags and the usual info from the\n        `RandomForestRegressor`: variable importances, etc.\n\n    '''\n\n    # get finite times, mags, errs\n    finind = np.isfinite(times) & np.isfinite(mags) & np.isfinite(errs)\n    ftimes, fmags, ferrs = times[::][finind], mags[::][finind], errs[::][finind]\n    finalparam_arrs = []\n    for ep in externalparam_arrs:\n        finalparam_arrs.append(ep[::][finind])\n\n    stimes, smags, serrs, eparams = sigclip_magseries_with_extparams(\n        times, mags, errs,\n        externalparam_arrs,\n        sigclip=epdsmooth_sigclip,\n        magsarefluxes=magsarefluxes\n    )\n\n    # smoothing is optional for RFR because we train on a fraction of the mag\n    # series and so should not require a smoothed input to fit a function to\n    if epdsmooth:\n\n        # smooth the signal\n        if isinstance(epdsmooth_extraparams, dict):\n            smoothedmags = epdsmooth_func(smags,\n                                          epdsmooth_windowsize,\n                                          **epdsmooth_extraparams)\n        else:\n            smoothedmags = epdsmooth_func(smags,\n                                          epdsmooth_windowsize)\n\n    else:\n\n        smoothedmags = smags\n\n\n    # set up the regressor\n    if isinstance(rf_extraparams, dict):\n        RFR = 
RandomForestRegressor(n_estimators=rf_ntrees,\n                                    **rf_extraparams)\n    else:\n        RFR = RandomForestRegressor(n_estimators=rf_ntrees)\n\n    # collect the features\n    features = np.column_stack(eparams)\n\n    # fit, then generate the predicted values, then get corrected values\n\n    # we fit on a randomly selected subsample of all the mags\n    if rf_subsample < 1.0:\n        featureindices = np.arange(smoothedmags.size)\n\n        # these are sorted because time-order should be important\n        training_indices = np.sort(\n            npr.choice(featureindices,\n                       size=int(rf_subsample*smoothedmags.size),\n                       replace=False)\n        )\n    else:\n        training_indices = np.arange(smoothedmags.size)\n\n    RFR.fit(features[training_indices,:], smoothedmags[training_indices])\n\n    # predict on the full feature set\n    flux_corrections = RFR.predict(np.column_stack(finalparam_arrs))\n    corrected_fmags = npmedian(fmags) + fmags - flux_corrections\n\n    retdict = {'times':ftimes,\n               'mags':corrected_fmags,\n               'errs':ferrs,\n               'feature_importances':RFR.feature_importances_,\n               'regressor':RFR,\n               'mags_median':npmedian(corrected_fmags),\n               'mags_mad':npmedian(npabs(corrected_fmags -\n                                         npmedian(corrected_fmags)))}\n\n    return retdict", "response": "This uses a RandomForestRegressor to decorrelate a mag/flux time-series against a list of external parameter arrays, and returns a dict with the corrected mags, their median and MAD, the feature importances, and the fitted regressor."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef lcfit_features(times, mags, errs, period,\n                   fourierorder=5,\n                   # these are depth, duration, ingress duration\n                   transitparams=(-0.01,0.1,0.1),\n                   # these are depth, duration, depth ratio, secphase\n                   ebparams=(-0.2,0.3,0.7,0.5),\n                   sigclip=10.0,\n                   magsarefluxes=False,\n                   fitfailure_means_featurenan=False,\n                   verbose=True):\n    '''This calculates various features related to fitting models to light\n    curves.\n\n    This function:\n\n    - calculates `R_ij` and `phi_ij` ratios for Fourier fit amplitudes and\n      phases.\n\n    - calculates the reduced chi-sq for fourier, EB, and planet transit fits.\n\n    - calculates the reduced chi-sq for fourier, EB, planet transit fits w/2 x\n      period.\n\n    Parameters\n    ----------\n\n    times,mags,errs : np.array\n        The input mag/flux time-series to calculate periodic features for.\n\n    period : float\n        The period of variabiity to use to phase the light curve.\n\n    fourierorder : int\n        The Fourier order to use to generate sinusoidal function and fit that to\n        the phased light curve.\n\n    transitparams : list of floats\n        The transit depth, duration, and ingress duration to use to generate a\n        trapezoid planet transit model fit to the phased light curve. The period\n        used is the one provided in `period`, while the epoch is automatically\n        obtained from a spline fit to the phased light curve.\n\n    ebparams : list of floats\n        The primary eclipse depth, eclipse duration, the primary-secondary depth\n        ratio, and the phase of the secondary eclipse to use to generate an\n        eclipsing binary model fit to the phased light curve. 
The period used is\n        the one provided in `period`, while the epoch is automatically obtained\n        from a spline fit to the phased light curve.\n\n    sigclip : float or int or sequence of two floats/ints or None\n        If a single float or int, a symmetric sigma-clip will be performed using\n        the number provided as the sigma-multiplier to cut out from the input\n        time-series.\n\n        If a list of two ints/floats is provided, the function will perform an\n        'asymmetric' sigma-clip. The first element in this list is the sigma\n        value to use for fainter flux/mag values; the second element in this\n        list is the sigma value to use for brighter flux/mag values. For\n        example, `sigclip=[10., 3.]`, will sigclip out greater than 10-sigma\n        dimmings and greater than 3-sigma brightenings. Here the meaning of\n        \"dimming\" and \"brightening\" is set by *physics* (not the magnitude\n        system), which is why the `magsarefluxes` kwarg must be correctly set.\n\n        If `sigclip` is None, no sigma-clipping will be performed, and the\n        time-series (with non-finite elems removed) will be passed through to\n        the output.\n\n    magsarefluxes : bool\n        Set this to True if the input measurements in `mags` are actually\n        fluxes.\n\n    fitfailure_means_featurenan : bool\n        If the planet, EB and EBx2 fits don't return standard errors because the\n        covariance matrix could not be generated, then the fit is suspicious and\n        the features calculated can't be trusted. 
If\n        `fitfailure_means_featurenan` is True, then the output features for\n        these fits will be set to nan.\n\n    verbose : bool\n        If True, will indicate progress while working.\n\n    Returns\n    -------\n\n    dict\n        A dict of all the features calculated is returned.\n\n    '''\n\n    # get the finite values\n    finind = np.isfinite(times) & np.isfinite(mags) & np.isfinite(errs)\n    ftimes, fmags, ferrs = times[finind], mags[finind], errs[finind]\n\n    # get nonzero errors\n    nzind = np.nonzero(ferrs)\n    ftimes, fmags, ferrs = ftimes[nzind], fmags[nzind], ferrs[nzind]\n\n    # get the MAD of the unphased light curve\n    lightcurve_median = np.median(fmags)\n    lightcurve_mad = np.median(np.abs(fmags - lightcurve_median))\n\n    #\n    # fourier fit\n    #\n\n    # we fit a Fourier series to the light curve using the best period and\n    # extract the amplitudes and phases up to the 8th order to fit the LC. the\n    # various ratios of the amplitudes A_ij and the differences in the phases\n    # phi_ij are also used as periodic variability features\n\n    # do the fit\n    ffit = lcfit.fourier_fit_magseries(ftimes, fmags, ferrs, period,\n                                       fourierorder=fourierorder,\n                                       sigclip=sigclip,\n                                       magsarefluxes=magsarefluxes,\n                                       verbose=verbose)\n\n    # get the coeffs and redchisq\n    fourier_fitcoeffs = ffit['fitinfo']['finalparams']\n    fourier_chisq = ffit['fitchisq']\n    fourier_redchisq = ffit['fitredchisq']\n\n    if fourier_fitcoeffs is not None:\n\n        fourier_modelmags, _, _, fpmags, _ = sinusoidal.fourier_sinusoidal_func(\n            [period,\n             ffit['fitinfo']['fitepoch'],\n             ffit['fitinfo']['finalparams'][:fourierorder],\n             ffit['fitinfo']['finalparams'][fourierorder:]],\n            ftimes,\n            fmags,\n            ferrs\n        
)\n\n        fourier_residuals = fourier_modelmags - fpmags\n        fourier_residual_median = np.median(fourier_residuals)\n        fourier_residual_mad = np.median(np.abs(fourier_residuals -\n                                                fourier_residual_median))\n\n\n        # break them out into amps and phases\n        famplitudes = fourier_fitcoeffs[:fourierorder]\n        fphases = fourier_fitcoeffs[fourierorder:]\n\n        famp_combos = combinations(famplitudes,2)\n        famp_cinds = combinations(range(len(famplitudes)),2)\n\n        fpha_combos = combinations(fphases,2)\n        fpha_cinds = combinations(range(len(fphases)),2)\n\n    else:\n\n        LOGERROR('LC fit to sinusoidal series model failed, '\n                 'using initial params')\n\n        initfourieramps = [0.6] + [0.2]*(fourierorder - 1)\n        initfourierphas = [0.1] + [0.1]*(fourierorder - 1)\n\n        fourier_modelmags, _, _, fpmags, _ = sinusoidal.fourier_sinusoidal_func(\n            [period,\n             ffit['fitinfo']['fitepoch'],\n             initfourieramps,\n             initfourierphas],\n            ftimes,\n            fmags,\n            ferrs\n        )\n\n        fourier_residuals = fourier_modelmags - fpmags\n        fourier_residual_median = np.median(fourier_residuals)\n        fourier_residual_mad = np.median(np.abs(fourier_residuals -\n                                                fourier_residual_median))\n\n        # break them out into amps and phases\n        famplitudes = initfourieramps\n        fphases = initfourierphas\n\n        famp_combos = combinations(famplitudes,2)\n        famp_cinds = combinations(range(len(famplitudes)),2)\n\n        fpha_combos = combinations(fphases,2)\n        fpha_cinds = combinations(range(len(fphases)),2)\n\n\n    fampratios = {}\n    fphadiffs = {}\n\n    # get the ratios for all fourier coeff combinations\n    for ampi, ampc, phai, phac in zip(famp_cinds,\n                                      famp_combos,\n       
                               fpha_cinds,\n                                      fpha_combos):\n\n        ampratind = 'R_%s%s' % (ampi[1]+1, ampi[0]+1)\n        # this is R_ij\n        amprat = ampc[1]/ampc[0]\n        phadiffind = 'phi_%s%s' % (phai[1]+1, phai[0]+1)\n        # this is phi_ij\n        phadiff = phac[1] - phai[0]*phac[0]\n\n        fampratios[ampratind] = amprat\n        fphadiffs[phadiffind] = phadiff\n\n    # update the outdict for the Fourier fit results\n    outdict = {\n        'fourier_ampratios':fampratios,\n        'fourier_phadiffs':fphadiffs,\n        'fourier_fitparams':fourier_fitcoeffs,\n        'fourier_redchisq':fourier_redchisq,\n        'fourier_chisq':fourier_chisq,\n        'fourier_residual_median':fourier_residual_median,\n        'fourier_residual_mad':fourier_residual_mad,\n        'fourier_residual_mad_over_lcmad':fourier_residual_mad/lightcurve_mad\n    }\n\n    # EB and planet fits will find the epoch automatically\n    planetfitparams = [period,\n                       None,\n                       transitparams[0],\n                       transitparams[1],\n                       transitparams[2]]\n\n    ebfitparams = [period,\n                   None,\n                   ebparams[0],\n                   ebparams[1],\n                   ebparams[2],\n                   ebparams[3]]\n\n    # do the planet and EB fit with this period\n    planet_fit = lcfit.traptransit_fit_magseries(ftimes, fmags, ferrs,\n                                                 planetfitparams,\n                                                 sigclip=sigclip,\n                                                 magsarefluxes=magsarefluxes,\n                                                 verbose=verbose)\n\n    planetfit_finalparams = planet_fit['fitinfo']['finalparams']\n\n    planetfit_finalparamerrs = planet_fit['fitinfo']['finalparamerrs']\n\n    if planetfit_finalparamerrs is None and fitfailure_means_featurenan:\n\n        LOGWARNING('planet 
fit: no standard errors available '\n                   'for fit parameters, fit is bad, '\n                   'setting fit chisq and red-chisq to np.nan')\n        planetfit_chisq = np.nan\n        planetfit_redchisq = np.nan\n        planet_residual_median = np.nan\n        planet_residual_mad = np.nan\n        planet_residual_mad_over_lcmad = np.nan\n\n    else:\n\n        planetfit_chisq = planet_fit['fitchisq']\n        planetfit_redchisq = planet_fit['fitredchisq']\n\n        if planetfit_finalparams is not None:\n\n            planet_modelmags, _, _, ppmags, _ = transits.trapezoid_transit_func(\n                planetfit_finalparams,\n                ftimes,\n                fmags,\n                ferrs\n            )\n\n        else:\n\n            LOGERROR('LC fit to transit planet model '\n                     'failed, using initial params')\n            planet_modelmags, _, _, ppmags, _ = transits.trapezoid_transit_func(\n                planetfitparams,\n                ftimes,\n                fmags,\n                ferrs\n            )\n\n        planet_residuals = planet_modelmags - ppmags\n        planet_residual_median = np.median(planet_residuals)\n        planet_residual_mad = np.median(np.abs(planet_residuals -\n                                               planet_residual_median))\n        planet_residual_mad_over_lcmad = planet_residual_mad/lightcurve_mad\n\n\n    eb_fit = lcfit.gaussianeb_fit_magseries(ftimes, fmags, ferrs,\n                                            ebfitparams,\n                                            sigclip=sigclip,\n                                            magsarefluxes=magsarefluxes,\n                                            verbose=verbose)\n\n    ebfit_finalparams = eb_fit['fitinfo']['finalparams']\n    ebfit_finalparamerrs = eb_fit['fitinfo']['finalparamerrs']\n\n    if ebfit_finalparamerrs is None and fitfailure_means_featurenan:\n\n        LOGWARNING('EB fit: no standard errors available '\n           
        'for fit parameters, fit is bad, '\n                   'setting fit chisq and red-chisq to np.nan')\n        ebfit_chisq = np.nan\n        ebfit_redchisq = np.nan\n        eb_residual_median = np.nan\n        eb_residual_mad = np.nan\n        eb_residual_mad_over_lcmad = np.nan\n\n    else:\n\n        ebfit_chisq = eb_fit['fitchisq']\n        ebfit_redchisq = eb_fit['fitredchisq']\n\n        if ebfit_finalparams is not None:\n\n            eb_modelmags, _, _, ebpmags, _ = eclipses.invgauss_eclipses_func(\n                ebfit_finalparams,\n                ftimes,\n                fmags,\n                ferrs\n            )\n\n        else:\n\n            LOGERROR('LC fit to EB model failed, using initial params')\n\n            eb_modelmags, _, _, ebpmags, _ = eclipses.invgauss_eclipses_func(\n                ebfitparams,\n                ftimes,\n                fmags,\n                ferrs\n            )\n\n        eb_residuals = eb_modelmags - ebpmags\n        eb_residual_median = np.median(eb_residuals)\n        eb_residual_mad = np.median(np.abs(eb_residuals - eb_residual_median))\n        eb_residual_mad_over_lcmad = eb_residual_mad/lightcurve_mad\n\n\n    # do the EB fit with 2 x period\n    ebfitparams[0] = ebfitparams[0]*2.0\n    eb_fitx2 = lcfit.gaussianeb_fit_magseries(ftimes, fmags, ferrs,\n                                              ebfitparams,\n                                              sigclip=sigclip,\n                                              magsarefluxes=magsarefluxes,\n                                              verbose=verbose)\n\n    ebfitx2_finalparams = eb_fitx2['fitinfo']['finalparams']\n    ebfitx2_finalparamerrs = eb_fitx2['fitinfo']['finalparamerrs']\n\n    if ebfitx2_finalparamerrs is None and fitfailure_means_featurenan:\n\n        LOGWARNING('EB x2 period fit: no standard errors available '\n                   'for fit parameters, fit is bad, '\n                   'setting fit chisq and red-chisq to np.nan')\n   
     ebfitx2_chisq = np.nan\n        ebfitx2_redchisq = np.nan\n        ebx2_residual_median = np.nan\n        ebx2_residual_mad = np.nan\n        ebx2_residual_mad_over_lcmad = np.nan\n\n    else:\n\n        ebfitx2_chisq = eb_fitx2['fitchisq']\n        ebfitx2_redchisq = eb_fitx2['fitredchisq']\n\n        if ebfitx2_finalparams is not None:\n\n            ebx2_modelmags, _, _, ebx2pmags, _ = (\n                eclipses.invgauss_eclipses_func(\n                    ebfitx2_finalparams,\n                    ftimes,\n                    fmags,\n                    ferrs\n                )\n            )\n\n        else:\n\n            LOGERROR('LC fit to EB model with 2xP failed, using initial params')\n\n            ebx2_modelmags, _, _, ebx2pmags, _ = (\n                eclipses.invgauss_eclipses_func(\n                    ebfitparams,\n                    ftimes,\n                    fmags,\n                    ferrs\n                )\n            )\n\n        ebx2_residuals = ebx2_modelmags - ebx2pmags\n        ebx2_residual_median = np.median(ebx2_residuals)\n        ebx2_residual_mad = np.median(np.abs(ebx2_residuals -\n                                             ebx2_residual_median))\n\n        ebx2_residual_mad_over_lcmad = ebx2_residual_mad/lightcurve_mad\n\n\n    # update the outdict\n    outdict.update({\n        'planet_fitparams':planetfit_finalparams,\n        'planet_chisq':planetfit_chisq,\n        'planet_redchisq':planetfit_redchisq,\n        'planet_residual_median':planet_residual_median,\n        'planet_residual_mad':planet_residual_mad,\n        'planet_residual_mad_over_lcmad':(\n            planet_residual_mad_over_lcmad,\n        ),\n        'eb_fitparams':ebfit_finalparams,\n        'eb_chisq':ebfit_chisq,\n        'eb_redchisq':ebfit_redchisq,\n        'eb_residual_median':eb_residual_median,\n        'eb_residual_mad':eb_residual_mad,\n        'eb_residual_mad_over_lcmad':(\n            eb_residual_mad_over_lcmad,\n        ),\n        
'ebx2_fitparams':ebfitx2_finalparams,\n        'ebx2_chisq':ebfitx2_chisq,\n        'ebx2_redchisq':ebfitx2_redchisq,\n        'ebx2_residual_median':ebx2_residual_median,\n        'ebx2_residual_mad':ebx2_residual_mad,\n        'ebx2_residual_mad_over_lcmad':(\n            ebx2_residual_mad_over_lcmad,\n        ),\n    })\n\n    return outdict", "response": "This function calculates the set of features related to fitting models to light curves."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef phasedlc_features(times,\n                      mags,\n                      errs,\n                      period,\n                      nbrtimes=None,\n                      nbrmags=None,\n                      nbrerrs=None):\n    '''This calculates various phased LC features for the object.\n\n    Some of the features calculated here come from:\n\n    Kim, D.-W., Protopapas, P., Bailer-Jones, C. A. L., et al. 2014, Astronomy\n    and Astrophysics, 566, A43, and references therein (especially Richards, et\n    al. 2011).\n\n    Parameters\n    ----------\n\n    times,mags,errs : np.array\n        The input mag/flux time-series to calculate the phased LC features for.\n\n    period : float\n        The period used to phase the input mag/flux time-series.\n\n    nbrtimes,nbrmags,nbrerrs : np.array or None\n        If `nbrtimes`, `nbrmags`, and `nbrerrs` are all provided, they should be\n        ndarrays with `times`, `mags`, `errs` of this object's closest neighbor\n        (close within some small number x FWHM of telescope to check for\n        blending). 
This function will then also calculate extra features based\n        on the neighbor's phased LC using the `period` provided for the target\n        object.\n\n    Returns\n    -------\n\n    dict\n        Returns a dict with phased LC features.\n\n    '''\n    # get the finite values\n    finind = np.isfinite(times) & np.isfinite(mags) & np.isfinite(errs)\n    ftimes, fmags, ferrs = times[finind], mags[finind], errs[finind]\n\n    # get nonzero errors\n    nzind = np.nonzero(ferrs)\n    ftimes, fmags, ferrs = ftimes[nzind], fmags[nzind], ferrs[nzind]\n\n    # only operate on LC if enough points\n    if ftimes.size > 49:\n\n        # get the MAD of the unphased light curve\n        lightcurve_median = np.median(fmags)\n        lightcurve_mad = np.median(np.abs(fmags - lightcurve_median))\n\n        # get p2p for raw lightcurve\n        p2p_unphasedlc = lightcurve_ptp_measures(ftimes, fmags, ferrs)\n        inveta_unphasedlc = 1.0/p2p_unphasedlc['eta_normal']\n\n        # phase the light curve with the given period, assume epoch is\n        # times.min()\n        phasedlc = lcmath.phase_magseries_with_errs(ftimes, fmags, ferrs,\n                                                    period, ftimes.min(),\n                                                    wrap=False)\n\n        phase = phasedlc['phase']\n        pmags = phasedlc['mags']\n        perrs = phasedlc['errs']\n\n        # get ptp measures for best period\n        ptp_bestperiod = lightcurve_ptp_measures(phase,pmags,perrs)\n\n        # phase the light curve with the given periodx2, assume epoch is\n        # times.min()\n        phasedlc = lcmath.phase_magseries_with_errs(ftimes, fmags, ferrs,\n                                                    period*2.0, ftimes.min(),\n                                                    wrap=False)\n\n        phasex2 = phasedlc['phase']\n        pmagsx2 = phasedlc['mags']\n        perrsx2 = phasedlc['errs']\n\n\n        # get ptp measures for best periodx2\n        
ptp_bestperiodx2 = lightcurve_ptp_measures(phasex2,pmagsx2,perrsx2)\n\n        # eta_phasedlc_bestperiod - calculate eta for the phased LC with best\n        # period\n        inveta_bestperiod = 1.0/ptp_bestperiod['eta_normal']\n\n        # eta_phasedlc_bestperiodx2 - calculate eta for the phased LC with best\n        #                             period x 2\n        inveta_bestperiodx2 = 1.0/ptp_bestperiodx2['eta_normal']\n\n\n        # eta_phased_ratio_eta_raw - eta for best period phased LC / eta for raw\n        # LC\n        inveta_ratio_phased_unphased = inveta_bestperiod/inveta_unphasedlc\n\n        # eta_phasedx2_ratio_eta_raw - eta for best periodx2 phased LC/eta for\n        # raw LC\n        inveta_ratio_phasedx2_unphased = inveta_bestperiodx2/inveta_unphasedlc\n\n\n        # freq_model_max_delta_mags - absval of magdiff btw model phased LC\n        #                             maxima using period x 2. look at points\n        #                             more than 10 points away for maxima\n        phasedx2_maxval_ind = argrelmax(pmagsx2, order=10)\n        if phasedx2_maxval_ind[0].size > 1:\n            phasedx2_magdiff_maxval = (\n                np.max(np.abs(np.diff(pmagsx2[phasedx2_maxval_ind[0]])))\n            )\n        else:\n            phasedx2_magdiff_maxval = np.nan\n\n        # freq_model_min_delta_mags - absval of magdiff btw model phased LC\n        #                             minima using period x 2. 
look at points\n        #                             more than 10 points away for minima\n        phasedx2_minval_ind = argrelmin(pmagsx2, order=10)\n        if phasedx2_minval_ind[0].size > 1:\n            phasedx2_magdiff_minval = (\n                np.max(np.abs(np.diff(pmagsx2[phasedx2_minval_ind[0]])))\n            )\n        else:\n            phasedx2_magdiff_minval = np.nan\n\n        # p2p_scatter_pfold_over_mad - MAD of successive absolute mag diffs of\n        #                              the phased LC using best period divided\n        #                              by the MAD of the unphased LC\n        phased_magdiff = np.diff(pmags)\n        phased_magdiff_median = np.median(phased_magdiff)\n        phased_magdiff_mad = np.median(np.abs(phased_magdiff -\n                                              phased_magdiff_median))\n\n        phasedx2_magdiff = np.diff(pmagsx2)\n        phasedx2_magdiff_median = np.median(phasedx2_magdiff)\n        phasedx2_magdiff_mad = np.median(np.abs(phasedx2_magdiff -\n                                                phasedx2_magdiff_median))\n\n        phased_magdiffmad_unphased_mad_ratio = phased_magdiff_mad/lightcurve_mad\n        phasedx2_magdiffmad_unphased_mad_ratio = (\n            phasedx2_magdiff_mad/lightcurve_mad\n        )\n\n        # get the percentiles of the slopes of the adjacent mags for phasedx2\n        phasedx2_slopes = np.diff(pmagsx2)/np.diff(phasex2)\n        phasedx2_slope_percentiles = np.ravel(np.nanpercentile(phasedx2_slopes,\n                                                               [10.0,90.0]))\n        phasedx2_slope_10percentile = phasedx2_slope_percentiles[0]\n        phasedx2_slope_90percentile = phasedx2_slope_percentiles[1]\n\n        # check if nbrtimes, _mags, _errs are available\n        if ((nbrtimes is not None) and\n            (nbrmags is not None) and\n            (nbrerrs is not None)):\n\n            # get the finite values\n            nfinind = 
(np.isfinite(nbrtimes) &\n                       np.isfinite(nbrmags) &\n                       np.isfinite(nbrerrs))\n            nftimes, nfmags, nferrs = (nbrtimes[nfinind],\n                                       nbrmags[nfinind],\n                                       nbrerrs[nfinind])\n\n            # get nonzero errors\n            nnzind = np.nonzero(nferrs)\n            nftimes, nfmags, nferrs = (nftimes[nnzind],\n                                       nfmags[nnzind],\n                                       nferrs[nnzind])\n\n            # only operate on LC if enough points\n            if nftimes.size > 49:\n\n                # get the phased light curve using the same period and epoch as\n                # the actual object\n                nphasedlc = lcmath.phase_magseries_with_errs(\n                    nftimes, nfmags, nferrs,\n                    period, ftimes.min(),\n                    wrap=False\n                )\n\n                # normalize the object and neighbor phased mags\n                norm_pmags = pmags - np.median(pmags)\n                norm_npmags = nphasedlc['mags'] - np.median(nphasedlc['mags'])\n\n                # phase bin them both so we can compare LCs easily\n                phabinned_objectlc = lcmath.phase_bin_magseries(phase,\n                                                                norm_pmags,\n                                                                minbinelems=1)\n                phabinned_nbrlc = lcmath.phase_bin_magseries(nphasedlc['phase'],\n                                                             norm_npmags,\n                                                             minbinelems=1)\n\n                absdiffs = []\n\n                for pha, phamag in zip(phabinned_objectlc['binnedphases'],\n                                       phabinned_objectlc['binnedmags']):\n\n                    try:\n\n                        # get the matching phase from the neighbor phased LC\n                    
    phadiffs = np.abs(pha - phabinned_nbrlc['binnedphases'])\n                        minphadiffind = np.where(\n                            (phadiffs < 1.0e-4) &\n                            (phadiffs == np.min(phadiffs))\n                        )\n                        absmagdiff = np.abs(\n                            phamag - phabinned_nbrlc['binnedmags'][\n                                minphadiffind\n                            ]\n                        )\n                        if absmagdiff.size > 0:\n                            absdiffs.append(absmagdiff.min())\n\n                    except Exception as e:\n                        continue\n\n                # sum of absdiff between the normalized to 0.0 phased LC of this\n                # object and that of the closest neighbor phased with the same\n                # period and epoch\n                if len(absdiffs) > 0:\n                    sum_nbr_phasedlc_magdiff = sum(absdiffs)\n                else:\n                    sum_nbr_phasedlc_magdiff = np.nan\n\n            else:\n\n                sum_nbr_phasedlc_magdiff = np.nan\n\n        else:\n            sum_nbr_phasedlc_magdiff = np.nan\n\n        return {\n            'inveta_unphasedlc':inveta_unphasedlc,\n            'inveta_bestperiod':inveta_bestperiod,\n            'inveta_bestperiodx2':inveta_bestperiodx2,\n            'inveta_ratio_phased_unphased':inveta_ratio_phased_unphased,\n            'inveta_ratio_phasedx2_unphased':inveta_ratio_phasedx2_unphased,\n            'phasedx2_magdiff_maxima':phasedx2_magdiff_maxval,\n            'phasedx2_magdiff_minina':phasedx2_magdiff_minval,\n            'phased_unphased_magdiff_mad_ratio':(\n                phased_magdiffmad_unphased_mad_ratio\n            ),\n            'phasedx2_unphased_magdiff_mad_ratio':(\n                phasedx2_magdiffmad_unphased_mad_ratio\n            ),\n            'phasedx2_slope_10percentile':phasedx2_slope_10percentile,\n            
'phasedx2_slope_90percentile':phasedx2_slope_90percentile,\n            'sum_nbr_phasedlc_magdiff':sum_nbr_phasedlc_magdiff,\n        }\n\n    else:\n\n        return {\n            'inveta_unphasedlc':np.nan,\n            'inveta_bestperiod':np.nan,\n            'inveta_bestperiodx2':np.nan,\n            'inveta_ratio_phased_unphased':np.nan,\n            'inveta_ratio_phasedx2_unphased':np.nan,\n            'phasedx2_magdiff_maxima':np.nan,\n            'phasedx2_magdiff_minina':np.nan,\n            'phased_unphased_magdiff_mad_ratio':np.nan,\n            'phasedx2_unphased_magdiff_mad_ratio':np.nan,\n            'phasedx2_slope_10percentile':np.nan,\n            'phasedx2_slope_90percentile':np.nan,\n            'sum_nbr_phasedlc_magdiff':np.nan,\n        }", "response": "This function calculates the phased LC features for a given object."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\ncalculate the Stellingwerf PDM theta value at a test frequency.", "response": "def stellingwerf_pdm_theta(times, mags, errs, frequency,\n                           binsize=0.05, minbin=9):\n    '''\n    This calculates the Stellingwerf PDM theta value at a test frequency.\n\n    Parameters\n    ----------\n\n    times,mags,errs : np.array\n        The input time-series and associated errors.\n\n    frequency : float\n        The test frequency to calculate the theta statistic at.\n\n    binsize : float\n        The phase bin size to use.\n\n    minbin : int\n        The minimum number of items in a phase bin to consider in the\n        calculation of the statistic.\n\n    Returns\n    -------\n\n    theta_pdm : float\n        The value of the theta statistic at the specified `frequency`.\n\n\n    '''\n\n    period = 1.0/frequency\n    fold_time = times[0]\n\n    phased = phase_magseries(times,\n                             mags,\n                             period,\n                             fold_time,\n                             wrap=False,\n                             sort=True)\n\n    phases = phased['phase']\n    pmags = phased['mags']\n    bins = nparange(0.0, 1.0, binsize)\n\n    binnedphaseinds = npdigitize(phases, bins)\n\n    binvariances = []\n    binndets = []\n    goodbins = 0\n\n    for x in npunique(binnedphaseinds):\n\n        thisbin_inds = binnedphaseinds == x\n        thisbin_mags = pmags[thisbin_inds]\n\n        if thisbin_mags.size > minbin:\n            thisbin_variance = npvar(thisbin_mags,ddof=1)\n            binvariances.append(thisbin_variance)\n            binndets.append(thisbin_mags.size)\n            goodbins = goodbins + 1\n\n    # now calculate theta\n    binvariances = nparray(binvariances)\n    binndets = nparray(binndets)\n\n    theta_top = npsum(binvariances*(binndets - 1)) / (npsum(binndets) -\n                                                      
goodbins)\n    theta_bot = npvar(pmags,ddof=1)\n    theta = theta_top/theta_bot\n\n    return theta"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef analytic_false_alarm_probability(lspinfo,\n                                     times,\n                                     conservative_nfreq_eff=True,\n                                     peakvals=None,\n                                     inplace=True):\n    '''This returns the analytic false alarm probabilities for periodogram\n    peak values.\n\n    FIXME: this doesn't actually work. Fix later.\n\n    The calculation follows that on page 3 of Zechmeister & Kurster (2009)::\n\n        FAP = 1 \u2212 [1 \u2212 Prob(z > z0)]**M\n\n    where::\n\n        M is the number of independent frequencies\n        Prob(z > z0) is the probability of peak with value > z0\n        z0 is the peak value we're evaluating\n\n    For PDM, the Prob(z > z0) is described by the beta distribution, according\n    to:\n\n    - Schwarzenberg-Czerny (1997;\n      https://ui.adsabs.harvard.edu/#abs/1997ApJ...489..941S)\n\n    - Zalian, Chadid, and Stellingwerf (2013;\n      http://adsabs.harvard.edu/abs/2014MNRAS.440...68Z)\n\n    This is given by::\n\n        beta( (N-B)/2, (B-1)/2; ((N-B)/(B-1))*theta_pdm )\n\n    Where::\n\n        N = number of observations\n        B = number of phase bins\n\n    This translates to a scipy.stats call to the beta distribution CDF::\n\n        x = ((N-B)/(B-1))*theta_pdm_best\n        prob_exceeds_val = scipy.stats.beta.cdf(x, (N-B)/2.0, (B-1.0)/2.0)\n\n    Which we can then plug into the false alarm prob eqn above with the\n    calculation of M.\n\n    Parameters\n    ----------\n\n    lspinfo : dict\n        The dict returned by the\n        :py:func:`~astrobase.periodbase.spdm.stellingwerf_pdm` function.\n\n    times : np.array\n        The times for which the periodogram result in ``lspinfo`` was\n        calculated.\n\n    conservative_nfreq_eff : bool\n        If True, will follow the prescription given in 
Schwarzenberg-Czerny\n        (2003):\n\n        http://adsabs.harvard.edu/abs/2003ASPC..292..383S\n\n        and estimate the effective number of independent frequencies M_eff as::\n\n            min(N_obs, N_freq, DELTA_f/delta_f)\n\n    peakvals : sequence or None\n        The peak values for which to evaluate the false-alarm probability. If\n        None, will calculate this for each of the peak values in the\n        ``nbestpeaks`` key of the ``lspinfo`` dict.\n\n    inplace : bool\n        If True, puts the results of the FAP calculation into the ``lspinfo``\n        dict as a list available as ``lspinfo['falsealarmprob']``.\n\n    Returns\n    -------\n\n    list\n        The calculated false alarm probabilities for each of the peak values in\n        ``peakvals``.\n\n    '''\n\n    from scipy.stats import beta\n\n    frequencies = 1.0/lspinfo['periods']\n\n    M = independent_freq_count(frequencies,\n                               times,\n                               conservative=conservative_nfreq_eff)\n\n    if peakvals is None:\n        peakvals = lspinfo['nbestlspvals']\n\n    nphasebins = nparange(0.0, 1.0, lspinfo['kwargs']['phasebinsize']).size\n    ndet = times.size\n\n    false_alarm_probs = []\n\n    for peakval in peakvals:\n\n        prob_xval = ((ndet-nphasebins)/(nphasebins-1.0))*peakval\n        prob_exceeds_val = beta.cdf(prob_xval,\n                                    (ndet-nphasebins)/2.0,\n                                    (nphasebins-1.0)/2.0)\n\n        false_alarm_probs.append(1.0 - (1.0 - prob_exceeds_val)**M)\n\n    if inplace:\n        lspinfo['falsealarmprob'] = false_alarm_probs\n\n    return false_alarm_probs", "response": "This function returns the analytic false alarm probabilities for the peak values of a PDM periodogram."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef keplermag_to_sdssr(keplermag, kic_sdssg, kic_sdssr):\n    '''Converts magnitude measurements in Kepler band to SDSS r band.\n\n    Parameters\n    ----------\n\n    keplermag : float or array-like\n        The Kepler magnitude value(s) to convert to fluxes.\n\n    kic_sdssg,kic_sdssr : float or array-like\n        The SDSS g and r magnitudes of the object(s) from the Kepler Input\n        Catalog. The .llc.fits MAST light curve file for a Kepler object\n        contains these values in the FITS extension 0 header.\n\n    Returns\n    -------\n\n    float or array-like\n        SDSS r band magnitude(s) converted from the Kepler band magnitude.\n\n    '''\n    kic_sdssgr = kic_sdssg - kic_sdssr\n\n    if kic_sdssgr < 0.8:\n        kepsdssr = (keplermag - 0.2*kic_sdssg)/0.8\n    else:\n        kepsdssr = (keplermag - 0.1*kic_sdssg)/0.9\n    return kepsdssr", "response": "Converts magnitude measurements in Kepler band to SDSS r band."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef consolidate_kepler_fitslc(keplerid,\n                              lcfitsdir,\n                              normalize=True,\n                              headerkeys=LCHEADERKEYS,\n                              datakeys=LCDATAKEYS,\n                              sapkeys=LCSAPKEYS,\n                              pdckeys=LCPDCKEYS,\n                              topkeys=LCTOPKEYS,\n                              apkeys=LCAPERTUREKEYS):\n    '''This gets all Kepler/K2 light curves for the given `keplerid`\n    in `lcfitsdir`.\n\n    Searches recursively in `lcfitsdir` for all of the files belonging to the\n    specified `keplerid`. Sorts the light curves by time. Returns an\n    `lcdict`. This is meant to be used to consolidate light curves for a single\n    object across Kepler quarters.\n\n    NOTE: `keplerid` is an integer (without the leading zeros). This is usually\n    the KIC ID.\n\n    NOTE: if light curve time arrays contain `nans`, these and their associated\n    measurements will be sorted to the end of the final combined arrays.\n\n    Parameters\n    ----------\n\n    keplerid : int\n        The Kepler ID of the object to consolidate LCs for, as an integer\n        without any leading zeros. This is usually the KIC or EPIC ID.\n\n    lcfitsdir : str\n        The directory to look in for LCs of the specified object.\n\n    normalize : bool\n        If True, then each component light curve's SAP_FLUX and PDCSAP_FLUX\n        measurements will be normalized to 1.0 by dividing out the median flux\n        for the component light curve.\n\n    headerkeys : list\n        A list of FITS header keys that will be extracted from the FITS light\n        curve file. These describe the observations. 
The default value for this\n        is given in `LCHEADERKEYS` above.\n\n    datakeys : list\n        A list of FITS column names that correspond to the auxiliary\n        measurements in the light curve. The default is `LCDATAKEYS` above.\n\n    sapkeys : list\n        A list of FITS column names that correspond to the SAP flux\n        measurements in the light curve. The default is `LCSAPKEYS` above.\n\n    pdckeys : list\n        A list of FITS column names that correspond to the PDC flux\n        measurements in the light curve. The default is `LCPDCKEYS` above.\n\n    topkeys : list\n        A list of FITS header keys that describe the object in the light\n        curve. The default is `LCTOPKEYS` above.\n\n    apkeys : list\n        A list of FITS header keys that describe the flux measurement apertures\n        used by the Kepler/K2 pipeline. The default is `LCAPERTUREKEYS` above.\n\n    Returns\n    -------\n\n    lcdict\n        Returns an `lcdict` (this is useable by most astrobase functions for LC\n        processing).\n\n    '''\n\n    LOGINFO('looking for Kepler light curve FITS in %s for %s...' 
% (lcfitsdir,\n                                                                     keplerid))\n\n    # for Python 3.5 and up, use recursive glob, it appears to be absurdly\n    # faster than os.walk\n    if sys.version_info[:2] > (3,4):\n\n        matching = glob.glob(os.path.join(lcfitsdir,\n                                          '**',\n                                          'kplr%09i-*_llc.fits' % keplerid),\n                             recursive=True)\n        LOGINFO('found %s files: %s' % (len(matching), repr(matching)))\n\n    # for Python < 3.5, use os.walk and glob\n    else:\n\n        # use the os.walk function to start looking for files in lcfitsdir\n        walker = os.walk(lcfitsdir)\n        matching = []\n        for root, dirs, _files in walker:\n            for sdir in dirs:\n                searchpath = os.path.join(root,\n                                          sdir,\n                                          'kplr%09i-*_llc.fits' % keplerid)\n                foundfiles = glob.glob(searchpath)\n\n                if foundfiles:\n                    matching.extend(foundfiles)\n                    LOGINFO('found %s in dir: %s' % (repr(foundfiles),\n                                                     os.path.join(root,sdir)))\n\n    # now that we've found everything, read them all in\n    if len(matching) > 0:\n\n        LOGINFO('consolidating...')\n\n        # the first file\n        consolidated = read_kepler_fitslc(matching[0],\n                                          headerkeys=headerkeys,\n                                          datakeys=datakeys,\n                                          sapkeys=sapkeys,\n                                          pdckeys=pdckeys,\n                                          topkeys=topkeys,\n                                          apkeys=apkeys,\n                                          normalize=normalize)\n\n\n        # get the rest of the files\n        for lcf in matching:\n            
consolidated = read_kepler_fitslc(lcf,\n                                              appendto=consolidated,\n                                              headerkeys=headerkeys,\n                                              datakeys=datakeys,\n                                              sapkeys=sapkeys,\n                                              pdckeys=pdckeys,\n                                              topkeys=topkeys,\n                                              apkeys=apkeys,\n                                              normalize=normalize)\n\n        # get the sort indices\n        # we use time for the columns and quarters for the headers\n        LOGINFO('sorting by time...')\n\n        # NOTE: nans in time will be sorted to the end of the array\n        finiteind = npisfinite(consolidated['time'])\n        if npsum(finiteind) < consolidated['time'].size:\n            LOGWARNING('some time values are nan! '\n                       'measurements at these times will be '\n                       'sorted to the end of the column arrays.')\n\n        # get the sort index\n        column_sort_ind = npargsort(consolidated['time'])\n\n        # sort the columns by time\n        for col in consolidated['columns']:\n            if '.' 
in col:\n                key, subkey = col.split('.')\n                consolidated[key][subkey] = (\n                    consolidated[key][subkey][column_sort_ind]\n                )\n            else:\n                consolidated[col] = consolidated[col][column_sort_ind]\n\n        # now sort the headers by quarters\n        header_sort_ind = npargsort(consolidated['quarter']).tolist()\n\n        # this is a bit convoluted, but whatever: list -> array -> list\n\n        for key in ('quarter', 'season', 'datarelease', 'obsmode'):\n            consolidated[key] = (\n                nparray(consolidated[key])[header_sort_ind].tolist()\n            )\n\n        for key in ('timesys','bjdoffset','exptime','lcaperture',\n                    'aperpixused','aperpixunused','pixarcsec',\n                    'channel','skygroup','module','output','ndet'):\n            consolidated['lcinfo'][key] = (\n                nparray(consolidated['lcinfo'][key])[header_sort_ind].tolist()\n            )\n\n        for key in ('cdpp3_0','cdpp6_0','cdpp12_0','pdcvar','pdcmethod',\n                    'aper_target_total_ratio','aper_target_frac'):\n            consolidated['varinfo'][key] = (\n                nparray(consolidated['varinfo'][key])[header_sort_ind].tolist()\n            )\n\n        # finally, return the consolidated lcdict\n        return consolidated\n\n    # if we didn't find anything, complain\n    else:\n\n        LOGERROR('could not find any light curves '\n                 'for %s in %s or its subdirectories' % (keplerid,\n                                                         lcfitsdir))\n        return None", "response": "This function finds all Kepler/K2 light curves for the given keplerid in lcfitsdir, consolidates them into a single lcdict sorted by time, and returns it, or returns None if no light curves are found."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef _epd_residual(coeffs, fluxes, xcc, ycc, bgv, bge):\n    '''This is the residual function to minimize using scipy.optimize.leastsq.\n\n    Parameters\n    ----------\n\n    coeffs : array-like of floats\n        Contains the EPD coefficients that will be used to generate the EPD fit\n        function.\n\n    fluxes : array-like\n        The flux measurement array being used.\n\n    xcc,ycc : array-like\n        Arrays of the x and y coordinates associated with each measurement in\n        `fluxes`.\n\n    bgv,bge : array-like\n        Arrays of the flux background value and the flux background error\n        associated with each measurement in `fluxes`.\n\n    Returns\n    -------\n\n    np.array\n        Contains the fit function residual evaluated at each flux measurement\n        value.\n\n    '''\n\n    f = _epd_function(coeffs, fluxes, xcc, ycc, bgv, bge)\n    residual = fluxes - f\n    return residual", "response": "This function computes the residual to minimize using scipy.optimize.leastsq."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef rfepd_kepler_lightcurve(\n        lcdict,\n        xccol='mom_centr1',\n        yccol='mom_centr2',\n        timestoignore=None,\n        filterflags=True,\n        writetodict=True,\n        epdsmooth=23,\n        decorr='xcc,ycc',\n        nrftrees=200\n):\n    '''This uses a `RandomForestRegressor` to fit and decorrelate Kepler light\n    curves.\n\n    Fits the X and Y positions, the background, and background error.\n\n    By default, this function removes points in the Kepler LC that have ANY\n    quality flags set.\n\n    Parameters\n    ----------\n\n    lcdict : lcdict\n        An `lcdict` produced by `consolidate_kepler_fitslc` or\n        `read_kepler_fitslc`.\n\n    xcol,ycol : str\n        Indicates the x and y coordinate column names to use from the Kepler LC\n        in the EPD fit.\n\n    timestoignore : list of tuples\n        This is of the form::\n\n            [(time1_start, time1_end), (time2_start, time2_end), ...]\n\n        and indicates the start and end times to mask out of the final\n        lcdict. Use this to remove anything that wasn't caught by the quality\n        flags.\n\n    filterflags : bool\n        If True, will remove any measurements that have non-zero quality flags\n        present. 
This usually indicates an issue with the instrument or\n        spacecraft.\n\n    writetodict : bool\n        If writetodict is True, adds the following columns to the lcdict::\n\n            rfepd_time = time array\n            rfepd_sapflux = uncorrected flux before EPD\n            rfepd_epdsapflux = corrected flux after EPD\n            rfepd_epdsapcorr = EPD flux corrections\n            rfepd_bkg = background array\n            rfepd_bkg_err = background errors array\n            rfepd_xcc = xcoord array\n            rfepd_ycc = ycoord array\n            rfepd_quality = quality flag array\n\n        and updates the 'columns' list in the lcdict as well.\n\n    epdsmooth : int\n        Sets the number of light curve points to smooth over when generating the\n        EPD fit function.\n\n    decorr : {'xcc,ycc','bgv,bge','xcc,ycc,bgv,bge'}\n        Indicates whether to use the x,y coords alone; background value and\n        error alone; or x,y coords and background value, error in combination as\n        the features to training the `RandomForestRegressor` on and perform the\n        fit.\n\n    nrftrees : int\n        The number of trees to use in the `RandomForestRegressor`.\n\n    Returns\n    -------\n\n    tuple\n        Returns a tuple of the form: (times, corrected_fluxes, flux_corrections)\n\n    '''\n\n    times, fluxes, background, background_err = (\n        lcdict['time'],\n        lcdict['sap']['sap_flux'],\n        lcdict['sap']['sap_bkg'],\n        lcdict['sap']['sap_bkg_err']\n    )\n    xcc = lcdict[xccol]\n    ycc = lcdict[yccol]\n    flags = lcdict['sap_quality']\n\n    # filter all bad LC points as noted by quality flags\n    if filterflags:\n\n        nbefore = times.size\n\n        filterind = flags == 0\n\n        times = times[filterind]\n        fluxes = fluxes[filterind]\n        background = background[filterind]\n        background_err = background_err[filterind]\n        xcc = xcc[filterind]\n        ycc = ycc[filterind]\n        
flags = flags[filterind]\n\n        nafter = times.size\n        LOGINFO('applied quality flag filter, ndet before = %s, '\n                'ndet after = %s'\n                % (nbefore, nafter))\n\n\n    # remove nans\n    find = (npisfinite(xcc) & npisfinite(ycc) &\n            npisfinite(times) & npisfinite(fluxes) &\n            npisfinite(background) & npisfinite(background_err))\n\n    nbefore = times.size\n\n    times = times[find]\n    fluxes = fluxes[find]\n    background = background[find]\n    background_err = background_err[find]\n    xcc = xcc[find]\n    ycc = ycc[find]\n    flags = flags[find]\n\n    nafter = times.size\n    LOGINFO('removed nans, ndet before = %s, ndet after = %s'\n            % (nbefore, nafter))\n\n\n    # exclude all times in timestoignore\n    if (timestoignore and\n        isinstance(timestoignore, list) and\n        len(timestoignore) > 0):\n\n        exclind = npfull_like(times,True)\n\n        nbefore = times.size\n\n        # apply all the masks\n        for ignoretime in timestoignore:\n            time0, time1 = ignoretime[0], ignoretime[1]\n            thismask = (times > time0) & (times < time1)\n            exclind = exclind & thismask\n\n        # quantities after masks have been applied\n        times = times[exclind]\n        fluxes = fluxes[exclind]\n        background = background[exclind]\n        background_err = background_err[exclind]\n        xcc = xcc[exclind]\n        ycc = ycc[exclind]\n        flags = flags[exclind]\n\n        nafter = times.size\n        LOGINFO('removed timestoignore, ndet before = %s, ndet after = %s'\n                % (nbefore, nafter))\n\n\n    # now that we're all done, we can do EPD\n\n    # set up the regressor\n    RFR = RandomForestRegressor(n_estimators=nrftrees)\n\n    if decorr == 'xcc,ycc,bgv,bge':\n        # collect the features and target variable\n        features = npcolumn_stack((xcc,ycc,background,background_err))\n    elif decorr == 'xcc,ycc':\n        # collect the 
features and target variable\n        features = npcolumn_stack((xcc,ycc))\n    elif decorr == 'bgv,bge':\n        # collect the features and target variable\n        features = npcolumn_stack((background,background_err))\n    else:\n        LOGERROR(\"couldn't understand decorr, not decorrelating...\")\n        return None\n\n    # smooth the light curve\n    if epdsmooth:\n        smoothedfluxes = median_filter(fluxes, size=epdsmooth)\n    else:\n        smoothedfluxes = fluxes\n\n    # fit, then generate the predicted values, then get corrected values\n    RFR.fit(features, smoothedfluxes)\n    flux_corrections = RFR.predict(features)\n    corrected_fluxes = npmedian(fluxes) + fluxes - flux_corrections\n\n    # remove the random forest to save RAM\n    del RFR\n\n    # write these to the dictionary if requested\n    if writetodict:\n\n        lcdict['rfepd'] = {}\n        lcdict['rfepd']['time'] = times\n        lcdict['rfepd']['sapflux'] = fluxes\n        lcdict['rfepd']['epdsapflux'] = corrected_fluxes\n        lcdict['rfepd']['epdsapcorr'] = flux_corrections\n        lcdict['rfepd']['bkg'] = background\n        lcdict['rfepd']['bkg_err'] = background_err\n        lcdict['rfepd']['xcc'] = xcc\n        lcdict['rfepd']['ycc'] = ycc\n        lcdict['rfepd']['quality'] = flags\n\n        for newcol in ['rfepd.time','rfepd.sapflux',\n                       'rfepd.epdsapflux','rfepd.epdsapcorr',\n                       'rfepd.bkg','rfepd.bkg.err',\n                       'rfepd.xcc','rfepd.ycc',\n                       'rfepd.quality']:\n\n            if newcol not in lcdict['columns']:\n                lcdict['columns'].append(newcol)\n\n    return times, corrected_fluxes, flux_corrections", "response": "This function uses a RandomForestRegressor to fit and decorrelate Kepler light curves."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef _get_legendre_deg_ctd(npts):\n    '''This is a helper function for centroid detrending.\n\n    '''\n\n    from scipy.interpolate import interp1d\n\n    degs = nparray([4,5,6,10,15])\n    pts = nparray([1e2,3e2,5e2,1e3,3e3])\n    fn = interp1d(pts, degs, kind='linear',\n                  bounds_error=False,\n                  fill_value=(min(degs), max(degs)))\n    legendredeg = int(npfloor(fn(npts)))\n\n    return legendredeg", "response": "This is a helper function for centroid detrending that returns the Legendre polynomial degree to use for the given number of points."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef parallel_timebin(lclist,\n                     binsizesec,\n                     maxobjects=None,\n                     outdir=None,\n                     lcformat='hat-sql',\n                     lcformatdir=None,\n                     timecols=None,\n                     magcols=None,\n                     errcols=None,\n                     minbinelems=7,\n                     nworkers=NCPUS,\n                     maxworkertasks=1000):\n    '''This time-bins all the LCs in the list using the specified bin size.\n\n    Parameters\n    ----------\n\n    lclist : list of str\n        The input LCs to process.\n\n    binsizesec : float\n        The time bin size to use in seconds.\n\n    maxobjects : int or None\n        If provided, LC processing will stop at `lclist[maxobjects]`.\n\n    outdir : str or None\n        The directory where output LCs will be written. If None, will write to\n        the same directory as the input LCs.\n\n    lcformat : str\n        This is the `formatkey` associated with your light curve format, which\n        you previously passed in to the `lcproc.register_lcformat`\n        function. This will be used to look up how to find and read the light\n        curve file.\n\n    lcformatdir : str or None\n        If this is provided, gives the path to a directory when you've stored\n        your lcformat description JSONs, other than the usual directories lcproc\n        knows to search for them in. Use this along with `lcformat` to specify\n        an LC format JSON file that's not currently registered with lcproc.\n\n    timecols,magcols,errcols : lists of str\n        The keys in the lcdict produced by your light curve reader function that\n        correspond to the times, mags/fluxes, and associated measurement errors\n        that will be used as inputs to the binning process. 
If these are None,\n        the default values for `timecols`, `magcols`, and `errcols` for your\n        light curve format will be used here.\n\n    minbinelems : int\n        The minimum number of time-bin elements required to accept a time-bin as\n        valid for the output binned light curve.\n\n    nworkers : int\n        Number of parallel workers to launch.\n\n    maxworkertasks : int\n        The maximum number of tasks a parallel worker will complete before being\n        replaced to guard against memory leaks.\n\n    Returns\n    -------\n\n    dict\n        The returned dict contains keys = input LCs, vals = output LCs.\n\n    '''\n\n    if outdir and not os.path.exists(outdir):\n        os.mkdir(outdir)\n\n    if maxobjects is not None:\n        lclist = lclist[:maxobjects]\n\n    tasks = [(x, binsizesec, {'outdir':outdir,\n                              'lcformat':lcformat,\n                              'lcformatdir':lcformatdir,\n                              'timecols':timecols,\n                              'magcols':magcols,\n                              'errcols':errcols,\n                              'minbinelems':minbinelems}) for x in lclist]\n\n    pool = mp.Pool(nworkers, maxtasksperchild=maxworkertasks)\n    results = pool.map(timebinlc_worker, tasks)\n    pool.close()\n    pool.join()\n\n    resdict = {os.path.basename(x):y for (x,y) in zip(lclist, results)}\n\n    return resdict", "response": "This function time-bins all the LCs in the list in parallel using the specified bin size and returns a dict mapping the input LCs to the output LCs."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef parallel_timebin_lcdir(lcdir,\n                           binsizesec,\n                           maxobjects=None,\n                           outdir=None,\n                           lcformat='hat-sql',\n                           lcformatdir=None,\n                           timecols=None,\n                           magcols=None,\n                           errcols=None,\n                           minbinelems=7,\n                           nworkers=NCPUS,\n                           maxworkertasks=1000):\n    '''\n    This time bins all the light curves in the specified directory.\n\n    Parameters\n    ----------\n\n    lcdir : list of str\n        Directory containing the input LCs to process.\n\n    binsizesec : float\n        The time bin size to use in seconds.\n\n    maxobjects : int or None\n        If provided, LC processing will stop at `lclist[maxobjects]`.\n\n    outdir : str or None\n        The directory where output LCs will be written. If None, will write to\n        the same directory as the input LCs.\n\n    lcformat : str\n        This is the `formatkey` associated with your light curve format, which\n        you previously passed in to the `lcproc.register_lcformat`\n        function. This will be used to look up how to find and read the light\n        curve file.\n\n    lcformatdir : str or None\n        If this is provided, gives the path to a directory when you've stored\n        your lcformat description JSONs, other than the usual directories lcproc\n        knows to search for them in. 
Use this along with `lcformat` to specify\n        an LC format JSON file that's not currently registered with lcproc.\n\n    timecols,magcols,errcols : lists of str\n        The keys in the lcdict produced by your light curve reader function that\n        correspond to the times, mags/fluxes, and associated measurement errors\n        that will be used as inputs to the binning process. If these are None,\n        the default values for `timecols`, `magcols`, and `errcols` for your\n        light curve format will be used here.\n\n    minbinelems : int\n        The minimum number of time-bin elements required to accept a time-bin as\n        valid for the output binned light curve.\n\n    nworkers : int\n        Number of parallel workers to launch.\n\n    maxworkertasks : int\n        The maximum number of tasks a parallel worker will complete before being\n        replaced to guard against memory leaks.\n\n    Returns\n    -------\n\n    dict\n        The returned dict contains keys = input LCs, vals = output LCs.\n\n    '''\n    try:\n        formatinfo = get_lcformat(lcformat,\n                                  use_lcformat_dir=lcformatdir)\n        if formatinfo:\n            (fileglob, readerfunc,\n             dtimecols, dmagcols, derrcols,\n             magsarefluxes, normfunc) = formatinfo\n        else:\n            LOGERROR(\"can't figure out the light curve format\")\n            return None\n    except Exception as e:\n        LOGEXCEPTION(\"can't figure out the light curve format\")\n        return None\n\n    lclist = sorted(glob.glob(os.path.join(lcdir, fileglob)))\n\n    return parallel_timebin(lclist,\n                            binsizesec,\n                            maxobjects=maxobjects,\n                            outdir=outdir,\n                            lcformat=lcformat,\n                            timecols=timecols,\n                            magcols=magcols,\n                            errcols=errcols,\n                           
 minbinelems=minbinelems,\n                            nworkers=nworkers,\n                            maxworkertasks=maxworkertasks)", "response": "This function time-bins all the light curves in the specified directory in parallel and returns a dict mapping the input LCs to the output LCs."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef _varfeatures_worker(task):\n    '''\n    This wraps varfeatures.\n\n    '''\n\n    try:\n        (lcfile, outdir, timecols, magcols, errcols,\n         mindet, lcformat, lcformatdir) = task\n        return get_varfeatures(lcfile, outdir,\n                               timecols=timecols,\n                               magcols=magcols,\n                               errcols=errcols,\n                               mindet=mindet,\n                               lcformat=lcformat,\n                               lcformatdir=lcformatdir)\n\n    except Exception as e:\n        return None", "response": "This wraps get_varfeatures for a single task tuple and returns None if an exception is raised."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef serial_varfeatures(lclist,\n                       outdir,\n                       maxobjects=None,\n                       timecols=None,\n                       magcols=None,\n                       errcols=None,\n                       mindet=1000,\n                       lcformat='hat-sql',\n                       lcformatdir=None):\n    '''This runs variability feature extraction for a list of LCs.\n\n    Parameters\n    ----------\n\n    lclist : list of str\n        The list of light curve file names to process.\n\n    outdir : str\n        The directory where the output varfeatures pickle files will be written.\n\n    maxobjects : int\n        The number of LCs to process from `lclist`.\n\n    timecols : list of str or None\n        The timecol keys to use from the lcdict in calculating the features.\n\n    magcols : list of str or None\n        The magcol keys to use from the lcdict in calculating the features.\n\n    errcols : list of str or None\n        The errcol keys to use from the lcdict in calculating the features.\n\n    mindet : int\n        The minimum number of LC points required to generate variability\n        features.\n\n    lcformat : str\n        This is the `formatkey` associated with your light curve format, which\n        you previously passed in to the `lcproc.register_lcformat`\n        function. This will be used to look up how to find and read the light\n        curves specified in `basedir` or `use_list_of_filenames`.\n\n    lcformatdir : str or None\n        If this is provided, gives the path to a directory when you've stored\n        your lcformat description JSONs, other than the usual directories lcproc\n        knows to search for them in. 
Use this along with `lcformat` to specify\n        an LC format JSON file that's not currently registered with lcproc.\n\n    Returns\n    -------\n\n    list of str\n        List of the generated variability features pickles for the input LCs,\n        with results for each magcol in the input `magcol` or light curve\n        format's default `magcol` list.\n\n    '''\n\n    if maxobjects:\n        lclist = lclist[:maxobjects]\n\n    tasks = [(x, outdir, timecols, magcols, errcols,\n              mindet, lcformat, lcformatdir)\n             for x in lclist]\n\n    for task in tqdm(tasks):\n        result = _varfeatures_worker(task)\n\n    return result", "response": "This function runs the variability feature extraction for a list of LCs and outputs a list of varfeatures pickles for each magcol in the input LCs."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef parallel_varfeatures(lclist,\n                         outdir,\n                         maxobjects=None,\n                         timecols=None,\n                         magcols=None,\n                         errcols=None,\n                         mindet=1000,\n                         lcformat='hat-sql',\n                         lcformatdir=None,\n                         nworkers=NCPUS):\n    '''This runs variable feature extraction in parallel for all LCs in `lclist`.\n\n    Parameters\n    ----------\n\n    lclist : list of str\n        The list of light curve file names to process.\n\n    outdir : str\n        The directory where the output varfeatures pickle files will be written.\n\n    maxobjects : int\n        The number of LCs to process from `lclist`.\n\n    timecols : list of str or None\n        The timecol keys to use from the lcdict in calculating the features.\n\n    magcols : list of str or None\n        The magcol keys to use from the lcdict in calculating the features.\n\n    errcols : list of str or None\n        The errcol keys to use from the lcdict in calculating the features.\n\n    mindet : int\n        The minimum number of LC points required to generate variability\n        features.\n\n    lcformat : str\n        This is the `formatkey` associated with your light curve format, which\n        you previously passed in to the `lcproc.register_lcformat`\n        function. This will be used to look up how to find and read the light\n        curves specified in `basedir` or `use_list_of_filenames`.\n\n    lcformatdir : str or None\n        If this is provided, gives the path to a directory when you've stored\n        your lcformat description JSONs, other than the usual directories lcproc\n        knows to search for them in. 
Use this along with `lcformat` to specify\n        an LC format JSON file that's not currently registered with lcproc.\n\n    nworkers : int\n        The number of parallel workers to launch.\n\n    Returns\n    -------\n\n    dict\n        A dict with key:val pairs of input LC file name : the generated\n        variability features pickles for each of the input LCs, with results for\n        each magcol in the input `magcol` or light curve format's default\n        `magcol` list.\n\n    '''\n    # make sure to make the output directory if it doesn't exist\n    if not os.path.exists(outdir):\n        os.makedirs(outdir)\n\n    if maxobjects:\n        lclist = lclist[:maxobjects]\n\n    tasks = [(x, outdir, timecols, magcols, errcols, mindet,\n              lcformat, lcformatdir) for x in lclist]\n\n    with ProcessPoolExecutor(max_workers=nworkers) as executor:\n        resultfutures = executor.map(varfeatures_worker, tasks)\n\n    results = [x for x in resultfutures]\n    resdict = {os.path.basename(x):y for (x,y) in zip(lclist, results)}\n\n    return resdict", "response": "This function runs varfeatures extraction in parallel for all LCs in lclist."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef parallel_varfeatures_lcdir(lcdir,\n                               outdir,\n                               fileglob=None,\n                               maxobjects=None,\n                               timecols=None,\n                               magcols=None,\n                               errcols=None,\n                               recursive=True,\n                               mindet=1000,\n                               lcformat='hat-sql',\n                               lcformatdir=None,\n                               nworkers=NCPUS):\n    '''This runs parallel variable feature extraction for a directory of LCs.\n\n    Parameters\n    ----------\n\n    lcdir : str\n        The directory of light curve files to process.\n\n    outdir : str\n        The directory where the output varfeatures pickle files will be written.\n\n    fileglob : str or None\n        The file glob to use when looking for light curve files in `lcdir`. If\n        None, the default file glob associated for this LC format will be used.\n\n    maxobjects : int\n        The number of LCs to process from `lclist`.\n\n    timecols : list of str or None\n        The timecol keys to use from the lcdict in calculating the features.\n\n    magcols : list of str or None\n        The magcol keys to use from the lcdict in calculating the features.\n\n    errcols : list of str or None\n        The errcol keys to use from the lcdict in calculating the features.\n\n    mindet : int\n        The minimum number of LC points required to generate variability\n        features.\n\n    lcformat : str\n        This is the `formatkey` associated with your light curve format, which\n        you previously passed in to the `lcproc.register_lcformat`\n        function. 
This will be used to look up how to find and read the light\n        curves specified in `basedir` or `use_list_of_filenames`.\n\n    lcformatdir : str or None\n        If this is provided, gives the path to a directory where you've stored\n        your lcformat description JSONs, other than the usual directories lcproc\n        knows to search for them in. Use this along with `lcformat` to specify\n        an LC format JSON file that's not currently registered with lcproc.\n\n    nworkers : int\n        The number of parallel workers to launch.\n\n    Returns\n    -------\n\n    dict\n        A dict with key:val pairs of input LC file name : the generated\n        variability features pickles for each of the input LCs, with results for\n        each magcol in the input `magcol` or light curve format's default\n        `magcol` list.\n\n    '''\n\n    try:\n        formatinfo = get_lcformat(lcformat,\n                                  use_lcformat_dir=lcformatdir)\n        if formatinfo:\n            (dfileglob, readerfunc,\n             dtimecols, dmagcols, derrcols,\n             magsarefluxes, normfunc) = formatinfo\n        else:\n            LOGERROR(\"can't figure out the light curve format\")\n            return None\n    except Exception as e:\n        LOGEXCEPTION(\"can't figure out the light curve format\")\n        return None\n\n    if not fileglob:\n        fileglob = dfileglob\n\n    # now find the files\n    LOGINFO('searching for %s light curves in %s ...' 
% (lcformat, lcdir))\n\n    if recursive is False:\n        matching = glob.glob(os.path.join(lcdir, fileglob))\n\n    else:\n        # use recursive glob for Python 3.5+\n        if sys.version_info[:2] > (3,4):\n\n            matching = glob.glob(os.path.join(lcdir,\n                                              '**',\n                                              fileglob),\n                                 recursive=True)\n\n        # otherwise, use os.walk and glob\n        else:\n\n            # use os.walk to go through the directories\n            walker = os.walk(lcdir)\n            matching = []\n\n            for root, dirs, _files in walker:\n                for sdir in dirs:\n                    searchpath = os.path.join(root,\n                                              sdir,\n                                              fileglob)\n                    foundfiles = glob.glob(searchpath)\n\n                    if foundfiles:\n                        matching.extend(foundfiles)\n\n\n    # now that we have all the files, process them\n    if matching and len(matching) > 0:\n\n        LOGINFO('found %s light curves, getting varfeatures...' 
%\n                len(matching))\n\n        return parallel_varfeatures(matching,\n                                    outdir,\n                                    maxobjects=maxobjects,\n                                    timecols=timecols,\n                                    magcols=magcols,\n                                    errcols=errcols,\n                                    mindet=mindet,\n                                    lcformat=lcformat,\n                                    lcformatdir=lcformatdir,\n                                    nworkers=nworkers)\n\n    else:\n\n        LOGERROR('no light curve files in %s format found in %s' % (lcformat,\n                                                                    lcdir))\n        return None", "response": "This function runs the parallel varfeatures extraction for a directory of LCs."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef checkplot_pickle_to_png(\n        checkplotin,\n        outfile,\n        extrarows=None\n):\n    '''This reads the checkplot pickle or dict provided, and writes out a PNG.\n\n    The output PNG contains most of the information in the input checkplot\n    pickle/dict, and can be used to quickly glance through the highlights\n    instead of having to review the checkplot with the `checkplotserver`\n    webapp. This is useful for exporting read-only views of finalized checkplots\n    from the `checkplotserver` as well, to share them with other people.\n\n    The PNG has 4 x N tiles::\n\n        [    finder    ] [  objectinfo  ] [ varinfo/comments ] [ unphased LC  ]\n        [ periodogram1 ] [ phased LC P1 ] [   phased LC P2   ] [ phased LC P3 ]\n        [ periodogram2 ] [ phased LC P1 ] [   phased LC P2   ] [ phased LC P3 ]\n                                         .\n                                         .\n        [ periodogramN ] [ phased LC P1 ] [ phased LC P2 ] [ phased LC P3 ]\n\n    for N independent period-finding methods producing:\n\n    - periodogram1,2,3...N: the periodograms from each method\n\n    - phased LC P1,P2,P3: the phased lightcurves using the best 3 peaks in each\n      periodogram\n\n    Parameters\n    ----------\n\n    checkplotin : dict or str\n        This is either a checkplotdict produced by\n        :py:func:`astrobase.checkplot.pkl.checkplot_dict` or a checkplot pickle\n        file produced by :py:func:`astrobase.checkplot.pkl.checkplot_pickle`.\n\n    outfile : str\n        The filename of the output PNG file to create.\n\n    extrarows : list of tuples\n        This is a list of 4-element tuples containing paths to PNG files that\n        will be added to the end of the rows generated from the checkplotin\n        pickle/dict. Each tuple represents a row in the final output PNG\n        file. 
If there are fewer than 4 elements per tuple, the missing elements\n        will be filled in with white-space. If there are more than 4 elements\n        per tuple, only the first four will be used.\n\n        The purpose of this kwarg is to incorporate periodograms and phased LC\n        plots (in the form of PNGs) generated from an external period-finding\n        function or program (like VARTOOLS) to allow for comparison with\n        astrobase results.\n\n        NOTE: the PNG files specified in `extrarows` here will be added to those\n        already present in the input checkplotdict['externalplots'] if that is\n        not None because you passed in a similar list of external plots to the\n        :py:func:`astrobase.checkplot.pkl.checkplot_pickle` function earlier. In\n        this case, `extrarows` can be used to add even more external plots if\n        desired.\n\n        Each external plot PNG will be resized to 750 x 480 pixels to fit into\n        an output image cell.\n\n        By convention, each 4-element tuple should contain:\n\n        - a periodogram PNG\n        - phased LC PNG with 1st best peak period from periodogram\n        - phased LC PNG with 2nd best peak period from periodogram\n        - phased LC PNG with 3rd best peak period from periodogram\n\n        Example of extrarows::\n\n            [('/path/to/external/bls-periodogram.png',\n              '/path/to/external/bls-phasedlc-plot-bestpeak.png',\n              '/path/to/external/bls-phasedlc-plot-peak2.png',\n              '/path/to/external/bls-phasedlc-plot-peak3.png'),\n             ('/path/to/external/pdm-periodogram.png',\n              '/path/to/external/pdm-phasedlc-plot-bestpeak.png',\n              '/path/to/external/pdm-phasedlc-plot-peak2.png',\n              '/path/to/external/pdm-phasedlc-plot-peak3.png'),\n            ...]\n\n    Returns\n    -------\n\n    str\n        The absolute path to the generated checkplot PNG.\n\n    '''\n\n    # figure out if the 
checkplotpickle is a filename\n    # python 3\n    if sys.version_info[:2] > (3,2):\n\n        if (isinstance(checkplotin, str) and os.path.exists(checkplotin)):\n            cpd = _read_checkplot_picklefile(checkplotin)\n        elif isinstance(checkplotin, dict):\n            cpd = checkplotin\n        else:\n            LOGERROR('checkplotin: %s of type %s is not a '\n                     'valid checkplot filename (or does not exist), or a dict' %\n                     (os.path.abspath(checkplotin), type(checkplotin)))\n            return None\n\n    # check for unicode in python 2.7\n    else:\n\n        # get the current checkplotdict\n        if ((isinstance(checkplotin, str) or\n             isinstance(checkplotin, unicode)) and\n            os.path.exists(checkplotin)):\n            cpd = _read_checkplot_picklefile(checkplotin)\n        elif isinstance(checkplotin,dict):\n            cpd = checkplotin\n        else:\n            LOGERROR('checkplotin: %s of type %s is not a '\n                     'valid checkplot filename (or does not exist), or a dict' %\n                     (os.path.abspath(checkplotin), type(checkplotin)))\n            return None\n\n    # figure out the dimensions of the output png\n    # each cell is 750 x 480 pixels\n    # a row is made of four cells\n    # - the first row is for object info\n    # - the rest are for periodograms and phased LCs, one row per method\n    # if there are more than three phased LC plots per method, we'll only plot 3\n    if 'pfmethods' in cpd:\n        cplspmethods = cpd['pfmethods']\n    else:\n        cplspmethods = []\n        for pfm in METHODSHORTLABELS:\n            if pfm in cpd:\n                cplspmethods.append(pfm)\n\n\n    cprows = len(cplspmethods)\n\n    # add in any extra rows from neighbors\n    if 'neighbors' in cpd and cpd['neighbors'] and len(cpd['neighbors']) > 0:\n        nbrrows = len(cpd['neighbors'])\n    else:\n        nbrrows = 0\n\n    # add in any extra rows from keyword 
arguments\n    if extrarows and len(extrarows) > 0:\n        erows = len(extrarows)\n    else:\n        erows = 0\n\n    # add in any extra rows from the checkplot dict\n    if ('externalplots' in cpd and\n        cpd['externalplots'] and\n        len(cpd['externalplots']) > 0):\n        cpderows = len(cpd['externalplots'])\n    else:\n        cpderows = 0\n\n    totalwidth = 3000\n    totalheight = 480 + (cprows + erows + nbrrows + cpderows)*480\n\n    # this is the output PNG\n    outimg = Image.new('RGBA',(totalwidth, totalheight),(255,255,255,255))\n\n    # now fill in the rows of the output png. we'll use Pillow to build up the\n    # output image from the already stored plots and stuff in the checkplot\n    # dict.\n\n    ###############################\n    # row 1, cell 1: finder chart #\n    ###############################\n\n    if cpd['finderchart']:\n        finder = Image.open(\n            _base64_to_file(cpd['finderchart'], None, writetostrio=True)\n        )\n        bigfinder = finder.resize((450,450), Image.ANTIALIAS)\n        outimg.paste(bigfinder,(150,20))\n\n    #####################################\n    # row 1, cell 2: object information #\n    #####################################\n\n    # find the font we need from the package data\n    fontpath = os.path.abspath(\n        os.path.join(os.path.dirname(__file__),\n                     '..',\n                     'cpserver',\n                     'cps-assets',\n                     'DejaVuSans.ttf')\n    )\n    # load the font\n    if os.path.exists(fontpath):\n        cpfontnormal = ImageFont.truetype(fontpath, 20)\n        cpfontlarge = ImageFont.truetype(fontpath, 28)\n    else:\n        LOGWARNING('could not find bundled '\n                   'DejaVu Sans font in the astrobase package '\n                   'data, using ugly defaults...')\n        cpfontnormal = ImageFont.load_default()\n        cpfontlarge = ImageFont.load_default()\n\n    # the image draw object\n    objinfodraw = 
ImageDraw.Draw(outimg)\n\n    # write out the object information\n\n    # objectid\n    objinfodraw.text(\n        (625, 25),\n        cpd['objectid'] if cpd['objectid'] else 'no objectid',\n        font=cpfontlarge,\n        fill=(0,0,255,255)\n    )\n    # twomass id\n    if 'twomassid' in cpd['objectinfo']:\n        objinfodraw.text(\n            (625, 60),\n            ('2MASS J%s' % cpd['objectinfo']['twomassid']\n             if cpd['objectinfo']['twomassid']\n             else ''),\n            font=cpfontnormal,\n            fill=(0,0,0,255)\n        )\n    # ndet\n    if 'ndet' in cpd['objectinfo']:\n        objinfodraw.text(\n            (625, 85),\n            ('LC points: %s' % cpd['objectinfo']['ndet']\n             if cpd['objectinfo']['ndet'] is not None\n             else ''),\n            font=cpfontnormal,\n            fill=(0,0,0,255)\n        )\n    else:\n        objinfodraw.text(\n            (625, 85),\n            ('LC points: %s' % cpd['magseries']['times'].size),\n            font=cpfontnormal,\n            fill=(0,0,0,255)\n        )\n    # coords and PM\n    objinfodraw.text(\n        (625, 125),\n        ('Coords and PM'),\n        font=cpfontnormal,\n        fill=(0,0,0,255)\n    )\n    if 'ra' in cpd['objectinfo'] and 'decl' in cpd['objectinfo']:\n        objinfodraw.text(\n            (900, 125),\n            (('RA, Dec: %.3f, %.3f' %\n              (cpd['objectinfo']['ra'], cpd['objectinfo']['decl']))\n             if (cpd['objectinfo']['ra'] is not None and\n                 cpd['objectinfo']['decl'] is not None)\n             else ''),\n            font=cpfontnormal,\n            fill=(0,0,0,255)\n        )\n    else:\n        objinfodraw.text(\n            (900, 125),\n            'RA, Dec: nan, nan',\n            font=cpfontnormal,\n            fill=(0,0,0,255)\n        )\n\n    if 'propermotion' in cpd['objectinfo']:\n        objinfodraw.text(\n            (900, 150),\n            (('Total PM: %.5f mas/yr' % 
cpd['objectinfo']['propermotion'])\n             if (cpd['objectinfo']['propermotion'] is not None)\n             else ''),\n            font=cpfontnormal,\n            fill=(0,0,0,255)\n        )\n    else:\n        objinfodraw.text(\n            (900, 150),\n            'Total PM: nan',\n            font=cpfontnormal,\n            fill=(0,0,0,255)\n        )\n\n    if 'rpmj' in cpd['objectinfo']:\n        objinfodraw.text(\n            (900, 175),\n            (('Reduced PM [Jmag]: %.3f' % cpd['objectinfo']['rpmj'])\n             if (cpd['objectinfo']['rpmj'] is not None)\n             else ''),\n            font=cpfontnormal,\n            fill=(0,0,0,255)\n        )\n    else:\n        objinfodraw.text(\n            (900, 175),\n            'Reduced PM [Jmag]: nan',\n            font=cpfontnormal,\n            fill=(0,0,0,255)\n        )\n\n    # here, we have to deal with two generations of objectinfo dicts\n\n    # first, deal with the new generation of objectinfo dicts\n    if 'available_dereddened_bands' in cpd['objectinfo']:\n\n        #\n        # first, we deal with the bands and mags\n        #\n        # magnitudes\n        objinfodraw.text(\n            (625, 200),\n            'Magnitudes',\n            font=cpfontnormal,\n            fill=(0,0,0,255)\n        )\n\n        # process the various bands\n        # if dereddened mags aren't available, use the observed mags\n        if len(cpd['objectinfo']['available_bands']) > 0:\n\n            # we'll get all the available mags\n            for bandind, band, label in zip(\n                    range(len(cpd['objectinfo']['available_bands'])),\n                    cpd['objectinfo']['available_bands'],\n                    cpd['objectinfo']['available_band_labels']\n            ):\n\n                thisbandmag = cpd['objectinfo'][band]\n\n                # we'll draw stuff in three rows depending on the number of\n                # bands we have to use\n                if bandind in (0,1,2,3,4):\n\n      
              thispos = (900+125*bandind, 200)\n\n                    objinfodraw.text(\n                        thispos,\n                        '%s: %.3f' % (label, thisbandmag),\n                        font=cpfontnormal,\n                        fill=(0,0,0,255)\n                    )\n\n                elif bandind in (5,6,7,8,9):\n\n                    rowbandind = bandind - 5\n\n                    thispos = (900+125*rowbandind, 225)\n\n                    objinfodraw.text(\n                        thispos,\n                        '%s: %.3f' % (label, thisbandmag),\n                        font=cpfontnormal,\n                        fill=(0,0,0,255)\n                    )\n\n                else:\n\n                    rowbandind = bandind - 10\n\n                    thispos = (900+125*rowbandind, 250)\n\n                    objinfodraw.text(\n                        thispos,\n                        '%s: %.3f' % (label, thisbandmag),\n                        font=cpfontnormal,\n                        fill=(0,0,0,255)\n                    )\n\n\n        #\n        # next, deal with the colors\n        #\n        # colors\n        if ('dereddened' in cpd['objectinfo'] and\n            cpd['objectinfo']['dereddened'] is True):\n            deredlabel = \"(dereddened)\"\n        else:\n            deredlabel = \"\"\n\n        objinfodraw.text(\n            (625, 275),\n            'Colors %s' % deredlabel,\n            font=cpfontnormal,\n            fill=(0,0,0,255)\n        )\n\n        if len(cpd['objectinfo']['available_colors']) > 0:\n\n            # we'll get all the available mags (dereddened versions preferred)\n            for colorind, color, colorlabel in zip(\n                    range(len(cpd['objectinfo']['available_colors'])),\n                    cpd['objectinfo']['available_colors'],\n                    cpd['objectinfo']['available_color_labels']\n            ):\n\n                thiscolor = cpd['objectinfo'][color]\n\n                # 
we'll draw stuff in three rows depending on the number of\n                # bands we have to use\n                if colorind in (0,1,2,3,4):\n\n                    thispos = (900+150*colorind, 275)\n\n                    objinfodraw.text(\n                        thispos,\n                        '%s: %.3f' % (colorlabel, thiscolor),\n                        font=cpfontnormal,\n                        fill=(0,0,0,255)\n                    )\n\n                elif colorind in (5,6,7,8,9):\n\n                    thisrowind = colorind - 5\n                    thispos = (900+150*thisrowind, 300)\n\n                    objinfodraw.text(\n                        thispos,\n                        '%s: %.3f' % (colorlabel, thiscolor),\n                        font=cpfontnormal,\n                        fill=(0,0,0,255)\n                    )\n\n                elif colorind in (10,11,12,13,14):\n\n                    thisrowind = colorind - 10\n                    thispos = (900+150*thisrowind, 325)\n\n                    objinfodraw.text(\n                        thispos,\n                        '%s: %.3f' % (colorlabel, thiscolor),\n                        font=cpfontnormal,\n                        fill=(0,0,0,255)\n                    )\n\n                else:\n\n                    thisrowind = colorind - 15\n                    thispos = (900+150*thisrowind, 350)\n\n                    objinfodraw.text(\n                        thispos,\n                        '%s: %.3f' % (colorlabel, thiscolor),\n                        font=cpfontnormal,\n                        fill=(0,0,0,255)\n                    )\n\n    # otherwise, deal with older generation of checkplots\n    else:\n\n        objinfodraw.text(\n            (625, 200),\n            ('Magnitudes'),\n            font=cpfontnormal,\n            fill=(0,0,0,255)\n        )\n\n        objinfodraw.text(\n            (900, 200),\n            ('gri: %.3f, %.3f, %.3f' %\n             
((cpd['objectinfo']['sdssg'] if\n               ('sdssg' in cpd['objectinfo'] and\n                cpd['objectinfo']['sdssg'] is not None)\n               else npnan),\n              (cpd['objectinfo']['sdssr'] if\n               ('sdssr' in cpd['objectinfo'] and\n                cpd['objectinfo']['sdssr'] is not None)\n               else npnan),\n              (cpd['objectinfo']['sdssi'] if\n               ('sdssi' in cpd['objectinfo'] and\n                cpd['objectinfo']['sdssi'] is not None)\n               else npnan))),\n            font=cpfontnormal,\n            fill=(0,0,0,255)\n        )\n        objinfodraw.text(\n            (900, 225),\n            ('JHK: %.3f, %.3f, %.3f' %\n             ((cpd['objectinfo']['jmag'] if\n               ('jmag' in cpd['objectinfo'] and\n                cpd['objectinfo']['jmag'] is not None)\n               else npnan),\n              (cpd['objectinfo']['hmag'] if\n               ('hmag' in cpd['objectinfo'] and\n                cpd['objectinfo']['hmag'] is not None)\n               else npnan),\n              (cpd['objectinfo']['kmag'] if\n               ('kmag' in cpd['objectinfo'] and\n                cpd['objectinfo']['kmag'] is not None)\n               else npnan))),\n            font=cpfontnormal,\n            fill=(0,0,0,255)\n        )\n        objinfodraw.text(\n            (900, 250),\n            ('BV: %.3f, %.3f' %\n             ((cpd['objectinfo']['bmag'] if\n               ('bmag' in cpd['objectinfo'] and\n                cpd['objectinfo']['bmag'] is not None)\n               else npnan),\n              (cpd['objectinfo']['vmag'] if\n               ('vmag' in cpd['objectinfo'] and\n                cpd['objectinfo']['vmag'] is not None)\n               else npnan))),\n            font=cpfontnormal,\n            fill=(0,0,0,255)\n        )\n\n        # colors\n        if ('dereddened' in cpd['objectinfo'] and\n            cpd['objectinfo']['dereddened'] is True):\n            deredlabel = \"(dereddened)\"\n 
       else:\n            deredlabel = \"\"\n\n        objinfodraw.text(\n            (625, 275),\n            'Colors %s' % deredlabel,\n            font=cpfontnormal,\n            fill=(0,0,0,255)\n        )\n\n        objinfodraw.text(\n            (900, 275),\n            ('B - V: %.3f, V - K: %.3f' %\n             ( (cpd['objectinfo']['bvcolor'] if\n                ('bvcolor' in cpd['objectinfo'] and\n                 cpd['objectinfo']['bvcolor'] is not None)\n                else npnan),\n               (cpd['objectinfo']['vkcolor'] if\n                ('vkcolor' in cpd['objectinfo'] and\n                 cpd['objectinfo']['vkcolor'] is not None)\n                else npnan) )),\n            font=cpfontnormal,\n            fill=(0,0,0,255)\n        )\n        objinfodraw.text(\n            (900, 300),\n            ('i - J: %.3f, g - K: %.3f' %\n             ( (cpd['objectinfo']['ijcolor'] if\n                ('ijcolor' in cpd['objectinfo'] and\n                 cpd['objectinfo']['ijcolor'] is not None)\n                else npnan),\n               (cpd['objectinfo']['gkcolor'] if\n                ('gkcolor' in cpd['objectinfo'] and\n                 cpd['objectinfo']['gkcolor'] is not None)\n                else npnan) )),\n            font=cpfontnormal,\n            fill=(0,0,0,255)\n        )\n        objinfodraw.text(\n            (900, 325),\n            ('J - K: %.3f' %\n             ( (cpd['objectinfo']['jkcolor'] if\n                ('jkcolor' in cpd['objectinfo'] and\n                 cpd['objectinfo']['jkcolor'] is not None)\n                else npnan),) ),\n            font=cpfontnormal,\n            fill=(0,0,0,255)\n        )\n\n    #\n    # rest of the object information\n    #\n\n    # color classification\n    if ('color_classes' in cpd['objectinfo'] and\n        cpd['objectinfo']['color_classes']):\n\n        objinfodraw.text(\n            (625, 375),\n            ('star classification by color: %s' %\n             (', 
'.join(cpd['objectinfo']['color_classes']))),\n            font=cpfontnormal,\n            fill=(0,0,0,255)\n        )\n\n    # GAIA neighbors\n    if ( ('gaia_neighbors' in cpd['objectinfo']) and\n         (cpd['objectinfo']['gaia_neighbors'] is not None) and\n         (np.isfinite(cpd['objectinfo']['gaia_neighbors'])) and\n         ('searchradarcsec' in cpd['objectinfo']) and\n         (cpd['objectinfo']['searchradarcsec']) ):\n\n        objinfodraw.text(\n            (625, 400),\n            ('%s GAIA close neighbors within %.1f arcsec' %\n             (cpd['objectinfo']['gaia_neighbors'],\n              cpd['objectinfo']['searchradarcsec'])),\n            font=cpfontnormal,\n            fill=(0,0,0,255)\n        )\n\n    # closest GAIA neighbor\n    if ( ('gaia_closest_distarcsec' in cpd['objectinfo']) and\n         (cpd['objectinfo']['gaia_closest_distarcsec'] is not None) and\n         (np.isfinite(cpd['objectinfo']['gaia_closest_distarcsec'])) and\n         ('gaia_closest_gmagdiff' in cpd['objectinfo']) and\n         (cpd['objectinfo']['gaia_closest_gmagdiff'] is not None) and\n         (np.isfinite(cpd['objectinfo']['gaia_closest_gmagdiff'])) ):\n\n        objinfodraw.text(\n            (625, 425),\n            ('closest GAIA neighbor is %.1f arcsec away, '\n             'GAIA mag (obj-nbr): %.3f' %\n             (cpd['objectinfo']['gaia_closest_distarcsec'],\n              cpd['objectinfo']['gaia_closest_gmagdiff'])),\n            font=cpfontnormal,\n            fill=(0,0,0,255)\n        )\n\n    # object tags\n    if 'objecttags' in cpd['objectinfo'] and cpd['objectinfo']['objecttags']:\n\n        objtagsplit = cpd['objectinfo']['objecttags'].split(',')\n\n        # write three tags per line\n        nobjtaglines = int(np.ceil(len(objtagsplit)/3.0))\n\n        for objtagline in range(nobjtaglines):\n            objtagslice = ','.join(objtagsplit[objtagline*3:objtagline*3+3])\n            objinfodraw.text(\n                (625, 450+objtagline*25),\n       
         objtagslice,\n                font=cpfontnormal,\n                fill=(135, 54, 0, 255)\n            )\n\n\n\n    ################################################\n    # row 1, cell 3: variability info and comments #\n    ################################################\n\n    # objectisvar\n    objisvar = cpd['varinfo']['objectisvar']\n\n    if objisvar == '0':\n        objvarflag = 'Variable star flag not set'\n    elif objisvar == '1':\n        objvarflag = 'Object is probably a variable star'\n    elif objisvar == '2':\n        objvarflag = 'Object is probably not a variable star'\n    elif objisvar == '3':\n        objvarflag = 'Not sure if this object is a variable star'\n    elif objisvar is None:\n        objvarflag = 'Variable star flag not set'\n    elif objisvar is True:\n        objvarflag = 'Object is probably a variable star'\n    elif objisvar is False:\n        objvarflag = 'Object is probably not a variable star'\n    else:\n        objvarflag = 'Variable star flag: %s' % objisvar\n\n    objinfodraw.text(\n        (1650, 125),\n        objvarflag,\n        font=cpfontnormal,\n        fill=(0,0,0,255)\n    )\n\n    # period\n    objinfodraw.text(\n        (1650, 150),\n        ('Period [days]: %.6f' %\n         (cpd['varinfo']['varperiod']\n          if cpd['varinfo']['varperiod'] is not None\n          else np.nan)),\n        font=cpfontnormal,\n        fill=(0,0,0,255)\n    )\n\n    # epoch\n    objinfodraw.text(\n        (1650, 175),\n        ('Epoch [JD]: %.6f' %\n         (cpd['varinfo']['varepoch']\n          if cpd['varinfo']['varepoch'] is not None\n          else np.nan)),\n        font=cpfontnormal,\n        fill=(0,0,0,255)\n    )\n\n    # variability tags\n    if cpd['varinfo']['vartags']:\n\n        vartagsplit = cpd['varinfo']['vartags'].split(',')\n\n        # write three tags per line\n        nvartaglines = int(np.ceil(len(vartagsplit)/3.0))\n\n        for vartagline in range(nvartaglines):\n            vartagslice = 
','.join(vartagsplit[vartagline*3:vartagline*3+3])\n            objinfodraw.text(\n                (1650, 225+vartagline*25),\n                vartagslice,\n                font=cpfontnormal,\n                fill=(135, 54, 0, 255)\n            )\n\n    # object comments\n    if 'comments' in cpd and cpd['comments']:\n\n        commentsplit = cpd['comments'].split(' ')\n\n        # write 10 words per line\n        ncommentlines = int(np.ceil(len(commentsplit)/10.0))\n\n        for commentline in range(ncommentlines):\n            commentslice = ' '.join(\n                commentsplit[commentline*10:commentline*10+10]\n            )\n            objinfodraw.text(\n                (1650, 325+commentline*25),\n                commentslice,\n                font=cpfontnormal,\n                fill=(0,0,0,255)\n            )\n\n    # this handles JSON-ified checkplots returned by LCC server\n    elif 'objectcomments' in cpd and cpd['objectcomments']:\n\n        commentsplit = cpd['objectcomments'].split(' ')\n\n        # write 10 words per line\n        ncommentlines = int(np.ceil(len(commentsplit)/10.0))\n\n        for commentline in range(ncommentlines):\n            commentslice = ' '.join(\n                commentsplit[commentline*10:commentline*10+10]\n            )\n            objinfodraw.text(\n                (1650, 325+commentline*25),\n                commentslice,\n                font=cpfontnormal,\n                fill=(0,0,0,255)\n            )\n\n    #######################################\n    # row 1, cell 4: unphased light curve #\n    #######################################\n\n    if (cpd['magseries'] and\n        'plot' in cpd['magseries'] and\n        cpd['magseries']['plot']):\n        magseries = Image.open(\n            _base64_to_file(cpd['magseries']['plot'], None, writetostrio=True)\n        )\n        outimg.paste(magseries,(750*3,0))\n\n    # this handles JSON-ified checkplots from LCC server\n    elif ('magseries' in cpd and 
isinstance(cpd['magseries'],str)):\n\n        magseries = Image.open(\n            _base64_to_file(cpd['magseries'], None, writetostrio=True)\n        )\n        outimg.paste(magseries,(750*3,0))\n\n\n    ###############################\n    # the rest of the rows in cpd #\n    ###############################\n    for lspmethodind, lspmethod in enumerate(cplspmethods):\n\n        ###############################\n        # the periodogram comes first #\n        ###############################\n\n        if (cpd[lspmethod] and cpd[lspmethod]['periodogram']):\n\n            pgram = Image.open(\n                _base64_to_file(cpd[lspmethod]['periodogram'], None,\n                                writetostrio=True)\n            )\n            outimg.paste(pgram,(0,480 + 480*lspmethodind))\n\n        #############################\n        # best phased LC comes next #\n        #############################\n\n        if (cpd[lspmethod] and 0 in cpd[lspmethod] and cpd[lspmethod][0]):\n\n            plc1 = Image.open(\n                _base64_to_file(cpd[lspmethod][0]['plot'], None,\n                                writetostrio=True)\n            )\n            outimg.paste(plc1,(750,480 + 480*lspmethodind))\n\n        # this handles JSON-ified checkplots from LCC server\n        elif (cpd[lspmethod] and 'phasedlc0' in cpd[lspmethod] and\n              isinstance(cpd[lspmethod]['phasedlc0']['plot'], str)):\n\n            plc1 = Image.open(\n                _base64_to_file(cpd[lspmethod]['phasedlc0']['plot'], None,\n                                writetostrio=True)\n            )\n            outimg.paste(plc1,(750,480 + 480*lspmethodind))\n\n        #################################\n        # 2nd best phased LC comes next #\n        #################################\n\n        if (cpd[lspmethod] and 1 in cpd[lspmethod] and cpd[lspmethod][1]):\n\n            plc2 = Image.open(\n                _base64_to_file(cpd[lspmethod][1]['plot'], None,\n                              
  writetostrio=True)\n            )\n            outimg.paste(plc2,(750*2,480 + 480*lspmethodind))\n\n        # this handles JSON-ified checkplots from LCC server\n        elif (cpd[lspmethod] and 'phasedlc1' in cpd[lspmethod] and\n              isinstance(cpd[lspmethod]['phasedlc1']['plot'], str)):\n\n            plc2 = Image.open(\n                _base64_to_file(cpd[lspmethod]['phasedlc1']['plot'], None,\n                                writetostrio=True)\n            )\n            outimg.paste(plc2,(750*2,480 + 480*lspmethodind))\n\n        #################################\n        # 3rd best phased LC comes next #\n        #################################\n\n        if (cpd[lspmethod] and 2 in cpd[lspmethod] and cpd[lspmethod][2]):\n\n            plc3 = Image.open(\n                _base64_to_file(cpd[lspmethod][2]['plot'], None,\n                                writetostrio=True)\n            )\n            outimg.paste(plc3,(750*3,480 + 480*lspmethodind))\n\n        # this handles JSON-ified checkplots from LCC server\n        elif (cpd[lspmethod] and 'phasedlc2' in cpd[lspmethod] and\n              isinstance(cpd[lspmethod]['phasedlc2']['plot'], str)):\n\n            plc3 = Image.open(\n                _base64_to_file(cpd[lspmethod]['phasedlc2']['plot'], None,\n                                writetostrio=True)\n            )\n            outimg.paste(plc3,(750*3,480 + 480*lspmethodind))\n\n\n    ################################\n    ## ALL DONE WITH BUILDING PNG ##\n    ################################\n\n    #########################\n    # add in any extra rows #\n    #########################\n\n    # from the keyword arguments\n    if erows > 0:\n\n        for erowind, erow in enumerate(extrarows):\n\n            # make sure we never go above 4 plots in a row\n            for ecolind, ecol in enumerate(erow[:4]):\n\n                eplot = Image.open(ecol)\n                eplotresized = eplot.resize((750,480), Image.ANTIALIAS)\n                
outimg.paste(eplotresized,\n                             (750*ecolind,\n                              (cprows+1)*480 + 480*erowind))\n\n    # from the checkplotdict\n    if cpderows > 0:\n\n        for cpderowind, cpderow in enumerate(cpd['externalplots']):\n\n            # make sure we never go above 4 plots in a row\n            for cpdecolind, cpdecol in enumerate(cpderow[:4]):\n\n                cpdeplot = Image.open(cpdecol)\n                cpdeplotresized = cpdeplot.resize((750,480), Image.ANTIALIAS)\n                outimg.paste(cpdeplotresized,\n                             (750*cpdecolind,\n                              (cprows+1)*480 + (erows*480) + 480*cpderowind))\n\n\n    # from neighbors:\n    if nbrrows > 0:\n\n        # we have four tiles\n        # tile 1: neighbor objectid, ra, decl, distance, unphased LC\n        # tile 2: phased LC for gls\n        # tile 3: phased LC for pdm\n        # tile 4: phased LC for any other period finding method\n        #         the priority is like so: ['bls','mav','aov','win']\n\n        for nbrind, nbr in enumerate(cpd['neighbors']):\n\n            # figure out which period finding methods are available for this\n            # neighbor. 
make sure to match the ones from the actual object in\n            # order of priority: 'gls','pdm','bls','aov','mav','acf','win'\n            nbrlspmethods = []\n\n            for lspmethod in cpd['pfmethods']:\n                if lspmethod in nbr:\n                    nbrlspmethods.append(lspmethod)\n\n            # restrict to top three in priority\n            nbrlspmethods = nbrlspmethods[:3]\n\n            try:\n\n                # first panel: neighbor objectid, ra, decl, distance, unphased\n                # LC\n                nbrlc = Image.open(\n                    _base64_to_file(\n                        nbr['magseries']['plot'], None, writetostrio=True\n                    )\n                )\n                outimg.paste(nbrlc,\n                             (750*0,\n                              (cprows+1)*480 + (erows*480) + (cpderows*480) +\n                              480*nbrind))\n\n                # overlay the objectinfo\n                objinfodraw.text(\n                    (98,\n                     (cprows+1)*480 + (erows*480) + (cpderows*480) +\n                     480*nbrind + 15),\n                    ('N%s: %s' % (nbrind + 1, nbr['objectid'])),\n                    font=cpfontlarge,\n                    fill=(0,0,255,255)\n                )\n                # overlay the objectinfo\n                objinfodraw.text(\n                    (98,\n                     (cprows+1)*480 + (erows*480) + (cpderows*480) +\n                     480*nbrind + 50),\n                    ('(RA, DEC) = (%.3f, %.3f), distance: %.1f arcsec' %\n                     (nbr['ra'], nbr['decl'], nbr['dist'])),\n                    font=cpfontnormal,\n                    fill=(0,0,255,255)\n                )\n\n                # second panel: phased LC for gls\n                lsp1lc = Image.open(\n                    _base64_to_file(\n                        nbr[nbrlspmethods[0]][0]['plot'], None,\n                        writetostrio=True\n                    
)\n                )\n                outimg.paste(lsp1lc,\n                             (750*1,\n                              (cprows+1)*480 + (erows*480) + (cpderows*480) +\n                              480*nbrind))\n\n                # second panel: phased LC for gls\n                lsp2lc = Image.open(\n                    _base64_to_file(\n                        nbr[nbrlspmethods[1]][0]['plot'], None,\n                        writetostrio=True\n                    )\n                )\n                outimg.paste(lsp2lc,\n                             (750*2,\n                              (cprows+1)*480 + (erows*480) + (cpderows*480) +\n                              480*nbrind))\n\n                # second panel: phased LC for gls\n                lsp3lc = Image.open(\n                    _base64_to_file(\n                        nbr[nbrlspmethods[2]][0]['plot'], None,\n                        writetostrio=True\n                    )\n                )\n                outimg.paste(lsp3lc,\n                             (750*3,\n                              (cprows+1)*480 + (erows*480) + (cpderows*480) +\n                              480*nbrind))\n\n            except Exception as e:\n\n                LOGERROR('neighbor %s does not have a magseries plot, '\n                         'measurements are probably all nan' % nbr['objectid'])\n\n                # overlay the objectinfo\n                objinfodraw.text(\n                    (98,\n                     (cprows+1)*480 + (erows*480) + (cpderows*480) +\n                     480*nbrind + 15),\n                    ('N%s: %s' %\n                     (nbrind + 1, nbr['objectid'])),\n                    font=cpfontlarge,\n                    fill=(0,0,255,255)\n                )\n\n                if 'ra' in nbr and 'decl' in nbr and 'dist' in nbr:\n\n                    # overlay the objectinfo\n                    objinfodraw.text(\n                        (98,\n                         (cprows+1)*480 
+ (erows*480) + (cpderows*480) +\n                         480*nbrind + 50),\n                        ('(RA, DEC) = (%.3f, %.3f), distance: %.1f arcsec' %\n                         (nbr['ra'], nbr['decl'], nbr['dist'])),\n                        font=cpfontnormal,\n                        fill=(0,0,255,255)\n                    )\n\n                elif 'objectinfo' in nbr:\n\n                    # overlay the objectinfo\n                    objinfodraw.text(\n                        (98,\n                         (cprows+1)*480 + (erows*480) + (cpderows*480) +\n                         480*nbrind + 50),\n                        ('(RA, DEC) = (%.3f, %.3f), distance: %.1f arcsec' %\n                         (nbr['objectinfo']['ra'],\n                          nbr['objectinfo']['decl'],\n                          nbr['objectinfo']['distarcsec'])),\n                        font=cpfontnormal,\n                        fill=(0,0,255,255)\n                    )\n\n\n    #####################\n    ## WRITE FINAL PNG ##\n    #####################\n\n    # check if the output filename is actually an instance of StringIO\n    if sys.version_info[:2] < (3,0):\n\n        is_strio = isinstance(outfile, cStringIO.InputType)\n\n    else:\n\n        is_strio = isinstance(outfile, StrIO)\n\n\n    if not is_strio:\n\n        # check if we've stupidly copied over the same filename as the input\n        # pickle to expected output file\n        if outfile.endswith('pkl'):\n            LOGWARNING('expected output PNG filename ends with .pkl, '\n                       'changed to .png')\n            outfile = outfile.replace('.pkl','.png')\n\n    outimg.save(outfile, format='PNG', optimize=True)\n\n\n    if not is_strio:\n        if os.path.exists(outfile):\n            LOGINFO('checkplot pickle -> checkplot PNG: %s OK' % outfile)\n            return outfile\n        else:\n            LOGERROR('failed to write checkplot PNG')\n            return None\n\n    else:\n        
LOGINFO('checkplot pickle -> StringIO instance OK')\n        return outfile", "response": "This function reads the checkplot pickle or dict provided and writes out a PNG file."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef flare_model(flareparams, times, mags, errs):\n    '''This is a flare model function, similar to Kowalski+ 2011.\n\n    From the paper by Pitkin+ 2014:\n    http://adsabs.harvard.edu/abs/2014MNRAS.445.2268P\n\n    Parameters\n    ----------\n\n    flareparams : list of float\n        This defines the flare model::\n\n            [amplitude,\n             flare_peak_time,\n             rise_gaussian_stdev,\n             decay_time_constant]\n\n        where:\n\n        `amplitude`: the maximum flare amplitude in mags or flux. If flux, then\n        amplitude should be positive. If mags, amplitude should be negative.\n\n        `flare_peak_time`: time at which the flare maximum happens.\n\n        `rise_gaussian_stdev`: the stdev of the gaussian describing the rise of\n        the flare.\n\n        `decay_time_constant`: the time constant of the exponential fall of the\n        flare.\n\n    times,mags,errs : np.array\n        The input time-series of measurements and associated errors for which\n        the model will be generated. The times will be used to generate\n        model mags.\n\n    Returns\n    -------\n\n    (modelmags, times, mags, errs) : tuple\n        Returns the model mags evaluated at the input time values. 
Also returns\n        the input `times`, `mags`, and `errs`.\n\n    '''\n\n    (amplitude, flare_peak_time,\n     rise_gaussian_stdev, decay_time_constant) = flareparams\n\n    zerolevel = np.median(mags)\n    modelmags = np.full_like(times, zerolevel)\n\n    # before peak gaussian rise...\n    modelmags[times < flare_peak_time] = (\n        mags[times < flare_peak_time] +\n        amplitude * np.exp(\n            -((times[times < flare_peak_time] -\n               flare_peak_time) *\n              (times[times < flare_peak_time] -\n               flare_peak_time)) /\n            (2.0*rise_gaussian_stdev*rise_gaussian_stdev)\n        )\n    )\n\n    # after peak exponential decay...\n    modelmags[times > flare_peak_time] = (\n        mags[times > flare_peak_time] +\n        amplitude * np.exp(\n            -((times[times > flare_peak_time] -\n               flare_peak_time)) /\n            (decay_time_constant)\n        )\n    )\n\n    return modelmags, times, mags, errs", "response": "Generates flare model magnitudes at the input times, following Pitkin+ 2014 (similar to Kowalski+ 2011): a Gaussian rise before the flare peak and an exponential decay after it, each added to the input mags. Returns the model mags along with the input times, mags, and errs."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef flare_model_residual(flareparams, times, mags, errs):\n    '''\n    This returns the residual between model mags and the actual mags.\n\n    Parameters\n    ----------\n\n    flareparams : list of float\n        This defines the flare model::\n\n            [amplitude,\n             flare_peak_time,\n             rise_gaussian_stdev,\n             decay_time_constant]\n\n        where:\n\n        `amplitude`: the maximum flare amplitude in mags or flux. If flux, then\n        amplitude should be positive. If mags, amplitude should be negative.\n\n        `flare_peak_time`: time at which the flare maximum happens.\n\n        `rise_gaussian_stdev`: the stdev of the gaussian describing the rise of\n        the flare.\n\n        `decay_time_constant`: the time constant of the exponential fall of the\n        flare.\n\n    times,mags,errs : np.array\n        The input time-series of measurements and associated errors for which\n        the model will be generated. The times will be used to generate\n        model mags.\n\n    Returns\n    -------\n\n    np.array\n        The residuals between the input `mags` and generated `modelmags`,\n        weighted by the measurement errors in `errs`.\n\n    '''\n\n    modelmags, _, _, _ = flare_model(flareparams, times, mags, errs)\n\n    return (mags - modelmags)/errs", "response": "This function returns the residual between the model mags and the actual mags."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef shutdown_check_handler():\n\n    url = 'http://169.254.169.254/latest/meta-data/spot/instance-action'\n\n    try:\n        resp = requests.get(url, timeout=1.0)\n        resp.raise_for_status()\n\n        stopinfo = resp.json()\n        if 'action' in stopinfo and stopinfo['action'] in ('stop',\n                                                           'terminate',\n                                                           'hibernate'):\n            stoptime = stopinfo['time']\n            LOGWARNING('instance is going to %s at %s' % (stopinfo['action'],\n                                                          stoptime))\n\n            resp.close()\n            return True\n        else:\n            resp.close()\n            return False\n\n    except HTTPError as e:\n        resp.close()\n        return False\n\n    except Exception as e:\n        resp.close()\n        return False", "response": "This function checks the AWS EC2 instance metadata URL for a pending spot instance action (stop, terminate, or hibernate). It logs a warning and returns True if one is scheduled, and returns False if no action is pending or the metadata endpoint responds with an HTTP error."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef runcp_producer_loop(\n        lightcurve_list,\n        input_queue,\n        input_bucket,\n        result_queue,\n        result_bucket,\n        pfresult_list=None,\n        runcp_kwargs=None,\n        process_list_slice=None,\n        purge_queues_when_done=False,\n        delete_queues_when_done=False,\n        download_when_done=True,\n        save_state_when_done=True,\n        s3_client=None,\n        sqs_client=None\n):\n    \"\"\"This sends checkplot making tasks to the input queue and monitors the\n    result queue for task completion.\n\n    Parameters\n    ----------\n\n    lightcurve_list : str or list of str\n        This is either a string pointing to a file containing a list of light\n        curves filenames to process or the list itself. The names must\n        correspond to the full filenames of files stored on S3, including all\n        prefixes, but not include the 's3://<bucket name>/' bit (these will be\n        added automatically).\n\n    input_queue : str\n        This is the name of the SQS queue which will receive processing tasks\n        generated by this function. The queue URL will automatically be obtained\n        from AWS.\n\n    input_bucket : str\n        The name of the S3 bucket containing the light curve files to process.\n\n    result_queue : str\n        This is the name of the SQS queue that this function will listen to for\n        messages from the workers as they complete processing on their input\n        elements. This function will attempt to match input sent to the\n        `input_queue` with results coming into the `result_queue` so it knows\n        how many objects have been successfully processed. If this function\n        receives task results that aren't in its own input queue, it will\n        acknowledge them so they complete successfully, but not download them\n        automatically. 
This handles leftover tasks completing from a previous\n        run of this function.\n\n    result_bucket : str\n        The name of the S3 bucket which will receive the results from the\n        workers.\n\n    pfresult_list : list of str or None\n        This is a list of periodfinder result pickle S3 URLs associated with\n        each light curve. If provided, this will be used to add in phased light\n        curve plots to each checkplot pickle. If this is None, the worker loop\n        will produce checkplot pickles that only contain object information,\n        neighbor information, and unphased light curves.\n\n    runcp_kwargs : dict\n        This is a dict used to pass any extra keyword arguments to the\n        `lcproc.checkplotgen.runcp` function that will be run by the worker\n        loop.\n\n    process_list_slice : list\n        This is used to index into the input light curve list so a subset of the\n        full list can be processed in this specific run of this function.\n\n        Use None for a slice index elem to emulate single slice spec behavior:\n\n        process_list_slice = [10, None]  -> lightcurve_list[10:]\n        process_list_slice = [None, 500] -> lightcurve_list[:500]\n\n    purge_queues_when_done : bool\n        If this is True, and this function exits (either when all done, or when\n        it is interrupted with a Ctrl+C), all outstanding elements in the\n        input/output queues that have not yet been acknowledged by workers or by\n        this function will be purged. 
This effectively cancels all outstanding\n        work.\n\n    delete_queues_when_done : bool\n        If this is True, and this function exits (either when all done, or when\n        it is interrupted with a Ctrl+C'), all outstanding work items will be\n        purged from the input/queues and the queues themselves will be deleted.\n\n    download_when_done : bool\n        If this is True, the generated checkplot pickle for each input work item\n        will be downloaded immediately to the current working directory when the\n        worker functions report they're done with it.\n\n    save_state_when_done : bool\n        If this is True, will save the current state of the work item queue and\n        the work items acknowledged as completed to a pickle in the current\n        working directory. Call the `runcp_producer_loop_savedstate` function\n        below to resume processing from this saved state later.\n\n    s3_client : boto3.Client or None\n        If None, this function will instantiate a new `boto3.Client` object to\n        use in its S3 download operations. Alternatively, pass in an existing\n        `boto3.Client` instance to re-use it here.\n\n    sqs_client : boto3.Client or None\n        If None, this function will instantiate a new `boto3.Client` object to\n        use in its SQS operations. 
Alternatively, pass in an existing\n        `boto3.Client` instance to re-use it here.\n\n    Returns\n    -------\n\n    dict or str\n        Returns the current work state as a dict or str path to the generated\n        work state pickle depending on if `save_state_when_done` is True.\n\n    \"\"\"\n\n    if not sqs_client:\n        sqs_client = boto3.client('sqs')\n    if not s3_client:\n        s3_client = boto3.client('s3')\n\n    if isinstance(lightcurve_list, str) and os.path.exists(lightcurve_list):\n\n        # get the LC list\n        with open(lightcurve_list, 'r') as infd:\n            lclist = infd.readlines()\n\n        lclist = [x.replace('\\n','') for x in lclist if len(x) > 0]\n\n        if process_list_slice is not None:\n            lclist = lclist[process_list_slice[0]:process_list_slice[1]]\n\n        lclist = [x[1:] for x in lclist if x.startswith('/')]\n        lclist = ['s3://%s/%s' % (input_bucket, x) for x in lclist]\n\n    # this handles direct invocation using lists of s3:// urls of light curves\n    elif isinstance(lightcurve_list, list):\n        lclist = lightcurve_list\n\n    # set up the input and output queues\n\n    # check if the queues by the input and output names given exist already\n    # if they do, go ahead and use them\n    # if they don't, make new ones.\n    try:\n        inq = sqs_client.get_queue_url(QueueName=input_queue)\n        inq_url = inq['QueueUrl']\n        LOGINFO('input queue already exists, skipping creation...')\n    except ClientError as e:\n        inq = awsutils.sqs_create_queue(input_queue, client=sqs_client)\n        inq_url = inq['url']\n\n    try:\n        outq = sqs_client.get_queue_url(QueueName=result_queue)\n        outq_url = outq['QueueUrl']\n        LOGINFO('result queue already exists, skipping creation...')\n    except ClientError as e:\n        outq = awsutils.sqs_create_queue(result_queue, client=sqs_client)\n        outq_url = outq['url']\n\n    LOGINFO('input queue: %s' % inq_url)\n    
LOGINFO('output queue: %s' % outq_url)\n\n    # wait until queues are up\n    LOGINFO('waiting for queues to become ready...')\n    time.sleep(10.0)\n\n    # for each item in the lightcurve_list, send it to the input queue and wait\n    # until it's done to send another one\n\n    if pfresult_list is None:\n        pfresult_list = [None for x in lclist]\n\n    for lc, pf in zip(lclist, pfresult_list):\n\n        this_item = {\n            'target': lc,\n            'action': 'runcp',\n            'args': (pf,),\n            'kwargs':runcp_kwargs if runcp_kwargs is not None else {},\n            'outbucket': result_bucket,\n            'outqueue': outq_url\n        }\n\n        resp = awsutils.sqs_put_item(inq_url, this_item, client=sqs_client)\n        if resp:\n            LOGINFO('sent %s to queue: %s' % (lc,inq_url))\n\n    # now block until all objects are done\n    done_objects = {}\n\n    LOGINFO('all items queued, waiting for results...')\n\n    # listen to the kill and term signals and raise KeyboardInterrupt when\n    # called\n    signal.signal(signal.SIGINT, kill_handler)\n    signal.signal(signal.SIGTERM, kill_handler)\n\n    while len(list(done_objects.keys())) < len(lclist):\n\n        try:\n\n            result = awsutils.sqs_get_item(outq_url, client=sqs_client)\n\n            if result is not None and len(result) > 0:\n\n                recv = result[0]\n                try:\n                    processed_object = recv['item']['target']\n                except KeyError:\n                    LOGWARNING('unknown target in received item: %s' % recv)\n                    processed_object = 'unknown-lc'\n\n                cpf = recv['item']['cpf']\n                receipt = recv['receipt_handle']\n\n                if processed_object in lclist:\n\n                    if processed_object not in done_objects:\n                        done_objects[processed_object] = [cpf]\n                    else:\n                        
done_objects[processed_object].append(cpf)\n\n                    LOGINFO('done with %s -> %s' % (processed_object, cpf))\n\n                    if download_when_done:\n\n                        getobj = awsutils.awsutils.s3_get_url(\n                            cpf,\n                            client=s3_client\n                        )\n                        LOGINFO('downloaded %s -> %s' % (cpf, getobj))\n\n                else:\n                    LOGWARNING('processed object returned is not in '\n                               'queued target list, probably from an '\n                               'earlier run. accepting but not downloading.')\n\n                awsutils.sqs_delete_item(outq_url, receipt)\n\n        except KeyboardInterrupt as e:\n\n            LOGWARNING('breaking out of producer wait-loop')\n            break\n\n\n    # delete the input and output queues when we're done\n    LOGINFO('done with processing.')\n    time.sleep(1.0)\n\n    if purge_queues_when_done:\n        LOGWARNING('purging queues at exit, please wait 10 seconds...')\n        sqs_client.purge_queue(QueueUrl=inq_url)\n        sqs_client.purge_queue(QueueUrl=outq_url)\n        time.sleep(10.0)\n\n    if delete_queues_when_done:\n        LOGWARNING('deleting queues at exit')\n        awsutils.sqs_delete_queue(inq_url)\n        awsutils.sqs_delete_queue(outq_url)\n\n    work_state = {\n        'done': done_objects,\n        'in_progress': list(set(lclist) - set(done_objects.keys())),\n        'args':((os.path.abspath(lightcurve_list) if\n                 isinstance(lightcurve_list, str) else lightcurve_list),\n                input_queue,\n                input_bucket,\n                result_queue,\n                result_bucket),\n        'kwargs':{'pfresult_list':pfresult_list,\n                  'runcp_kwargs':runcp_kwargs,\n                  'process_list_slice':process_list_slice,\n                  'download_when_done':download_when_done,\n                  
'purge_queues_when_done':purge_queues_when_done,\n                  'save_state_when_done':save_state_when_done,\n                  'delete_queues_when_done':delete_queues_when_done}\n    }\n\n    if save_state_when_done:\n        with open('runcp-queue-producer-loop-state.pkl','wb') as outfd:\n            pickle.dump(work_state, outfd, pickle.HIGHEST_PROTOCOL)\n\n    # at the end, return the done_objects dict\n    # also return the list of unprocessed items if any\n    return work_state", "response": "This function sends checkplot making tasks to the input queue and monitors the result queue for task completion."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef runcp_producer_loop_savedstate(\n        use_saved_state=None,\n        lightcurve_list=None,\n        input_queue=None,\n        input_bucket=None,\n        result_queue=None,\n        result_bucket=None,\n        pfresult_list=None,\n        runcp_kwargs=None,\n        process_list_slice=None,\n        download_when_done=True,\n        purge_queues_when_done=True,\n        save_state_when_done=True,\n        delete_queues_when_done=False,\n        s3_client=None,\n        sqs_client=None\n):\n    \"\"\"This wraps the function above to allow for loading previous state from a\n    file.\n\n    Parameters\n    ----------\n\n    use_saved_state : str or None\n        This is the path to the saved state pickle file produced by a previous\n        run of `runcp_producer_loop`. Will get all of the arguments to run\n        another instance of the loop from that pickle file. If this is None, you\n        MUST provide all of the appropriate arguments to that function.\n\n    lightcurve_list : str or list of str or None\n        This is either a string pointing to a file containing a list of light\n        curves filenames to process or the list itself. The names must\n        correspond to the full filenames of files stored on S3, including all\n        prefixes, but not include the 's3://<bucket name>/' bit (these will be\n        added automatically).\n\n    input_queue : str or None\n        This is the name of the SQS queue which will receive processing tasks\n        generated by this function. 
The queue URL will automatically be obtained\n        from AWS.\n\n    input_bucket : str or None\n        The name of the S3 bucket containing the light curve files to process.\n\n    result_queue : str or None\n        This is the name of the SQS queue that this function will listen to for\n        messages from the workers as they complete processing on their input\n        elements. This function will attempt to match input sent to the\n        `input_queue` with results coming into the `result_queue` so it knows\n        how many objects have been successfully processed. If this function\n        receives task results that aren't in its own input queue, it will\n        acknowledge them so they complete successfully, but not download them\n        automatically. This handles leftover tasks completing from a previous\n        run of this function.\n\n    result_bucket : str or None\n        The name of the S3 bucket which will receive the results from the\n        workers.\n\n    pfresult_list : list of str or None\n        This is a list of periodfinder result pickle S3 URLs associated with\n        each light curve. If provided, this will be used to add in phased light\n        curve plots to each checkplot pickle. 
If this is None, the worker loop\n        will produce checkplot pickles that only contain object information,\n        neighbor information, and unphased light curves.\n\n    runcp_kwargs : dict or None\n        This is a dict used to pass any extra keyword arguments to the\n        `lcproc.checkplotgen.runcp` function that will be run by the worker\n        loop.\n\n    process_list_slice : list or None\n        This is used to index into the input light curve list so a subset of the\n        full list can be processed in this specific run of this function.\n\n        Use None for a slice index elem to emulate single slice spec behavior:\n\n        process_list_slice = [10, None]  -> lightcurve_list[10:]\n        process_list_slice = [None, 500] -> lightcurve_list[:500]\n\n    purge_queues_when_done : bool or None\n        If this is True, and this function exits (either when all done, or when\n        it is interrupted with a Ctrl+C), all outstanding elements in the\n        input/output queues that have not yet been acknowledged by workers or by\n        this function will be purged. This effectively cancels all outstanding\n        work.\n\n    delete_queues_when_done : bool or None\n        If this is True, and this function exits (either when all done, or when\n        it is interrupted with a Ctrl+C'), all outstanding work items will be\n        purged from the input/queues and the queues themselves will be deleted.\n\n    download_when_done : bool or None\n        If this is True, the generated checkplot pickle for each input work item\n        will be downloaded immediately to the current working directory when the\n        worker functions report they're done with it.\n\n    save_state_when_done : bool or None\n        If this is True, will save the current state of the work item queue and\n        the work items acknowledged as completed to a pickle in the current\n        working directory. 
Call the `runcp_producer_loop_savedstate` function\n        below to resume processing from this saved state later.\n\n    s3_client : boto3.Client or None\n        If None, this function will instantiate a new `boto3.Client` object to\n        use in its S3 download operations. Alternatively, pass in an existing\n        `boto3.Client` instance to re-use it here.\n\n    sqs_client : boto3.Client or None\n        If None, this function will instantiate a new `boto3.Client` object to\n        use in its SQS operations. Alternatively, pass in an existing\n        `boto3.Client` instance to re-use it here.\n\n    Returns\n    -------\n\n    dict or str\n        Returns the current work state as a dict or str path to the generated\n        work state pickle depending on if `save_state_when_done` is True.\n\n    \"\"\"\n\n    if use_saved_state is not None and os.path.exists(use_saved_state):\n\n        with open(use_saved_state,'rb') as infd:\n            saved_state = pickle.load(infd)\n\n        # run the producer loop using the saved state's todo list\n        return runcp_producer_loop(\n            saved_state['in_progress'],\n            saved_state['args'][1],\n            saved_state['args'][2],\n            saved_state['args'][3],\n            saved_state['args'][4],\n            **saved_state['kwargs']\n        )\n\n    else:\n\n        return runcp_producer_loop(\n            lightcurve_list,\n            input_queue,\n            input_bucket,\n            result_queue,\n            result_bucket,\n            pfresult_list=pfresult_list,\n            runcp_kwargs=runcp_kwargs,\n            process_list_slice=process_list_slice,\n            download_when_done=download_when_done,\n            purge_queues_when_done=purge_queues_when_done,\n            save_state_when_done=save_state_when_done,\n            delete_queues_when_done=delete_queues_when_done,\n            s3_client=s3_client,\n            sqs_client=sqs_client\n        )", "response": "This 
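function wraps runcp_producer_loop so that it can resume from a saved state pickle written by a previous run. If the pickle exists, it reruns the loop on the saved in-progress items using the saved arguments; otherwise it calls runcp_producer_loop with the arguments passed in directly. It returns the resulting work state."}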
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef runpf_consumer_loop(\n        in_queue_url,\n        workdir,\n        lc_altexts=('',),\n        wait_time_seconds=5,\n        shutdown_check_timer_seconds=60.0,\n        sqs_client=None,\n        s3_client=None\n):\n    \"\"\"This runs period-finding in a loop until interrupted.\n\n    Consumes work task items from an input queue set up by `runpf_producer_loop`\n    above.\n\n    Parameters\n    ----------\n\n    in_queue_url : str\n        The SQS URL of the input queue to listen to for work assignment\n        messages. The task orders will include the input and output S3 bucket\n        names, as well as the URL of the output queue to where this function\n        will report its work-complete or work-failed status.\n\n    workdir : str\n        The directory on the local machine where this worker loop will download\n        the input light curves, process them, and produce its output\n        periodfinding result pickles. These will then be uploaded to the\n        specified S3 output bucket, and then deleted from the local disk.\n\n    lc_altexts : sequence of str\n        If not None, this is a sequence of alternate extensions to try for the\n        input light curve file other than the one provided in the input task\n        order. For example, to get anything that's an .sqlite where .sqlite.gz\n        is expected, use altexts=[''] to strip the .gz.\n\n    wait_time_seconds : int\n        The amount of time to wait in the input SQS queue for an input task\n        order. If this timeout expires and no task has been received, this\n        function goes back to the top of the work loop.\n\n    shutdown_check_timer_seconds : float\n        The amount of time to wait before checking for a pending EC2 shutdown\n        message for the instance this worker loop is operating on. 
If a shutdown\n        is noticed, the worker loop is cancelled in preparation for instance\n        shutdown.\n\n    sqs_client : boto3.Client or None\n        If None, this function will instantiate a new `boto3.Client` object to\n        use in its SQS operations. Alternatively, pass in an existing\n        `boto3.Client` instance to re-use it here.\n\n    s3_client : boto3.Client or None\n        If None, this function will instantiate a new `boto3.Client` object to\n        use in its S3 operations. Alternatively, pass in an existing\n        `boto3.Client` instance to re-use it here.\n\n    Returns\n    -------\n\n    Nothing.\n\n    \"\"\"\n\n    if not sqs_client:\n        sqs_client = boto3.client('sqs')\n    if not s3_client:\n        s3_client = boto3.client('s3')\n\n\n    # listen to the kill and term signals and raise KeyboardInterrupt when\n    # called\n    signal.signal(signal.SIGINT, kill_handler)\n    signal.signal(signal.SIGTERM, kill_handler)\n\n    shutdown_last_time = time.monotonic()\n\n    while True:\n\n        curr_time = time.monotonic()\n\n        if (curr_time - shutdown_last_time) > shutdown_check_timer_seconds:\n            shutdown_check = shutdown_check_handler()\n            if shutdown_check:\n                LOGWARNING('instance will die soon, breaking loop')\n                break\n            shutdown_last_time = time.monotonic()\n\n        try:\n\n            # receive a single message from the inqueue\n            work = awsutils.sqs_get_item(in_queue_url,\n                                         client=sqs_client,\n                                         raiseonfail=True)\n\n            # JSON deserialize the work item\n            if work is not None and len(work) > 0:\n\n                recv = work[0]\n\n                # skip any messages that don't tell us to runpf\n                action = recv['item']['action']\n                if action != 'runpf':\n                    continue\n\n                target = 
recv['item']['target']\n                args = recv['item']['args']\n                kwargs = recv['item']['kwargs']\n                outbucket = recv['item']['outbucket']\n\n                if 'outqueue' in recv['item']:\n                    out_queue_url = recv['item']['outqueue']\n                else:\n                    out_queue_url = None\n\n                receipt = recv['receipt_handle']\n\n                # download the target from S3 to a file in the work directory\n                try:\n\n                    lc_filename = awsutils.s3_get_url(\n                        target,\n                        altexts=lc_altexts,\n                        client=s3_client\n                    )\n\n                    runpf_args = (lc_filename, args[0])\n\n                    # now runpf\n                    pfresult = runpf(\n                        *runpf_args,\n                        **kwargs\n                    )\n\n                    if pfresult and os.path.exists(pfresult):\n\n                        LOGINFO('runpf OK for LC: %s -> %s' %\n                                (lc_filename, pfresult))\n\n                        # check if the file exists already because it's been\n                        # processed somewhere else\n                        resp = s3_client.list_objects_v2(\n                            Bucket=outbucket,\n                            MaxKeys=1,\n                            Prefix=pfresult\n                        )\n                        outbucket_list = resp.get('Contents',[])\n\n                        if outbucket_list and len(outbucket_list) > 0:\n\n                            LOGWARNING(\n                                'not uploading pfresult for %s because '\n                                'it exists in the output bucket already'\n                                % target\n                            )\n                            awsutils.sqs_delete_item(in_queue_url, receipt)\n                            continue\n\n       
                 put_url = awsutils.s3_put_file(pfresult,\n                                                       outbucket,\n                                                       client=s3_client)\n\n                        if put_url is not None:\n\n                            LOGINFO('result uploaded to %s' % put_url)\n\n                            # put the S3 URL of the output into the output\n                            # queue if requested\n                            if out_queue_url is not None:\n\n                                awsutils.sqs_put_item(\n                                    out_queue_url,\n                                    {'pfresult':put_url,\n                                     'target': target,\n                                     'lc_filename':lc_filename,\n                                     'kwargs':kwargs},\n                                    raiseonfail=True\n                                )\n\n                            # delete the result from the local directory\n                            os.remove(pfresult)\n\n                        # if the upload fails, don't acknowledge the\n                        # message. 
might be a temporary S3 failure, so\n                        # another worker might succeed later.\n                        # FIXME: add SNS bits to warn us of failures\n                        else:\n                            LOGERROR('failed to upload %s to S3' % pfresult)\n                            os.remove(pfresult)\n\n                        # delete the input item from the input queue to\n                        # acknowledge its receipt and indicate that\n                        # processing is done and successful\n                        awsutils.sqs_delete_item(in_queue_url, receipt)\n\n                        # delete the light curve file when we're done with it\n                        if ( (lc_filename is not None) and\n                             (os.path.exists(lc_filename)) ):\n                            os.remove(lc_filename)\n\n                    # if runpf failed outright, don't requeue. instead, write a\n                    # ('failed-periodfinding-%s.pkl' % lc_filename) file to the\n                    # output S3 bucket.\n                    else:\n\n                        LOGWARNING('runpf failed for LC: %s' %\n                                   (lc_filename,))\n\n                        with open('failed-periodfinding-%s.pkl' %\n                                  lc_filename, 'wb') as outfd:\n                            pickle.dump(\n                                {'in_queue_url':in_queue_url,\n                                 'target':target,\n                                 'lc_filename':lc_filename,\n                                 'kwargs':kwargs,\n                                 'outbucket':outbucket,\n                                 'out_queue_url':out_queue_url},\n                                outfd, pickle.HIGHEST_PROTOCOL\n                            )\n\n                        put_url = awsutils.s3_put_file(\n                            'failed-periodfinding-%s.pkl' % lc_filename,\n                            
outbucket,\n                            client=s3_client\n                        )\n\n                        # put the S3 URL of the output into the output\n                        # queue if requested\n                        if out_queue_url is not None:\n\n                            awsutils.sqs_put_item(\n                                out_queue_url,\n                                {'pfresult':put_url,\n                                 'lc_filename':lc_filename,\n                                 'kwargs':kwargs},\n                                raiseonfail=True\n                            )\n\n                        # delete the input item from the input queue to\n                        # acknowledge its receipt and indicate that\n                        # processing is done\n                        awsutils.sqs_delete_item(in_queue_url,\n                                                 receipt,\n                                                 raiseonfail=True)\n\n                        # delete the light curve file when we're done with it\n                        if ( (lc_filename is not None) and\n                             (os.path.exists(lc_filename)) ):\n                            os.remove(lc_filename)\n\n\n                except ClientError as e:\n\n                    LOGWARNING('queues have disappeared. 
stopping worker loop')\n                    break\n\n\n                # if there's any other exception, put a failed response into the\n                # output bucket and queue\n                except Exception as e:\n\n                    LOGEXCEPTION('could not process input from queue')\n\n                    if 'lc_filename' in locals():\n\n                        with open('failed-periodfinding-%s.pkl' %\n                                  lc_filename,'wb') as outfd:\n                            pickle.dump(\n                                {'in_queue_url':in_queue_url,\n                                 'target':target,\n                                 'lc_filename':lc_filename,\n                                 'kwargs':kwargs,\n                                 'outbucket':outbucket,\n                                 'out_queue_url':out_queue_url},\n                                outfd, pickle.HIGHEST_PROTOCOL\n                            )\n\n                        put_url = awsutils.s3_put_file(\n                            'failed-periodfinding-%s.pkl' % lc_filename,\n                            outbucket,\n                            client=s3_client\n                        )\n\n\n                        # put the S3 URL of the output into the output\n                        # queue if requested\n                        if out_queue_url is not None:\n\n                            awsutils.sqs_put_item(\n                                out_queue_url,\n                                {'pfresult':put_url,\n                                 'lc_filename':lc_filename,\n                                 'kwargs':kwargs},\n                                raiseonfail=True\n                            )\n\n                        # delete the light curve file when we're done with it\n                        if ( (lc_filename is not None) and\n                             (os.path.exists(lc_filename)) ):\n                            os.remove(lc_filename)\n\n    
                # delete the input item from the input queue to\n                    # acknowledge its receipt and indicate that\n                    # processing is done\n                    awsutils.sqs_delete_item(in_queue_url,\n                                             receipt,\n                                             raiseonfail=True)\n\n\n        # a keyboard interrupt kills the loop\n        except KeyboardInterrupt:\n\n            LOGWARNING('breaking out of the processing loop.')\n            break\n\n\n        # if the queues disappear, then the producer loop is done and we should\n        # exit\n        except ClientError as e:\n\n            LOGWARNING('queues have disappeared. stopping worker loop')\n            break\n\n\n        # any other exception continues the loop we'll write the output file to\n        # the output S3 bucket (and any optional output queue), but add a\n        # failed-* prefix to it to indicate that processing failed. FIXME: could\n        # use a dead-letter queue for this instead\n        except Exception as e:\n\n            LOGEXCEPTION('could not process input from queue')\n\n            if 'lc_filename' in locals():\n\n                with open('failed-periodfinding-%s.pkl' %\n                          lc_filename,'wb') as outfd:\n                    pickle.dump(\n                        {'in_queue_url':in_queue_url,\n                         'target':target,\n                         'kwargs':kwargs,\n                         'outbucket':outbucket,\n                         'out_queue_url':out_queue_url},\n                        outfd, pickle.HIGHEST_PROTOCOL\n                    )\n\n                put_url = awsutils.s3_put_file(\n                    'failed-periodfinding-%s.pkl' % lc_filename,\n                    outbucket,\n                    client=s3_client\n                )\n\n                # put the S3 URL of the output into the output\n                # queue if requested\n                if 
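out_queue_url is not None:\n\n                    awsutils.sqs_put_item(\n                        out_queue_url,\n                        {'cpf':put_url,\n                         'kwargs':kwargs},\n                        raiseonfail=True\n                    )\n                if ( (lc_filename is not None) and\n                     (os.path.exists(lc_filename)) ):\n                    os.remove(lc_filename)\n\n            # delete the input item from the input queue to\n            # acknowledge its receipt and indicate that\n            # processing is done\n            awsutils.sqs_delete_item(in_queue_url, receipt, raiseonfail=True)", "response": "Runs the period-finding worker loop: polls an SQS input queue for `runpf` tasks, downloads each target light curve from S3, runs `runpf` on it, uploads the resulting pickle to the output S3 bucket, and reports the outcome to the optional output queue. The loop exits on an interrupt, a pending instance shutdown, or a ClientError."}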
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef gaussianeb_fit_magseries(times, mags, errs,\n                             ebparams,\n                             sigclip=10.0,\n                             plotfit=False,\n                             magsarefluxes=False,\n                             verbose=True):\n    '''This fits a double inverted gaussian EB model to a magnitude time series.\n\n    Parameters\n    ----------\n\n    times,mags,errs : np.array\n        The input mag/flux time-series to fit the EB model to.\n\n    period : float\n        The period to use for EB fit.\n\n    ebparams : list of float\n        This is a list containing the eclipsing binary parameters::\n\n            ebparams = [period (time),\n                        epoch (time),\n                        pdepth (mags),\n                        pduration (phase),\n                        psdepthratio,\n                        secondaryphase]\n\n        `period` is the period in days.\n\n        `epoch` is the time of primary minimum in JD.\n\n        `pdepth` is the depth of the primary eclipse:\n\n        - for magnitudes -> `pdepth` should be < 0\n        - for fluxes     -> `pdepth` should be > 0\n\n        `pduration` is the length of the primary eclipse in phase.\n\n        `psdepthratio` is the ratio of the secondary eclipse depth to that of\n        the primary eclipse.\n\n        `secondaryphase` is the phase at which the minimum of the secondary\n        eclipse is located. This effectively parameterizes eccentricity.\n\n        If `epoch` is None, this function will do an initial spline fit to find\n        an approximate minimum of the phased light curve using the given period.\n\n        The `pdepth` provided is checked against the value of\n        `magsarefluxes`. 
if `magsarefluxes = True`, the `ebdepth` is forced to\n        be > 0; if `magsarefluxes = False`, the `ebdepth` is forced to be < 0.\n\n    sigclip : float or int or sequence of two floats/ints or None\n        If a single float or int, a symmetric sigma-clip will be performed using\n        the number provided as the sigma-multiplier to cut out from the input\n        time-series.\n\n        If a list of two ints/floats is provided, the function will perform an\n        'asymmetric' sigma-clip. The first element in this list is the sigma\n        value to use for fainter flux/mag values; the second element in this\n        list is the sigma value to use for brighter flux/mag values. For\n        example, `sigclip=[10., 3.]`, will sigclip out greater than 10-sigma\n        dimmings and greater than 3-sigma brightenings. Here the meaning of\n        \"dimming\" and \"brightening\" is set by *physics* (not the magnitude\n        system), which is why the `magsarefluxes` kwarg must be correctly set.\n\n        If `sigclip` is None, no sigma-clipping will be performed, and the\n        time-series (with non-finite elems removed) will be passed through to\n        the output.\n\n    magsarefluxes : bool\n        If True, will treat the input values of `mags` as fluxes for purposes of\n        plotting the fit and sig-clipping.\n\n    plotfit : str or False\n        If this is a string, this function will make a plot for the fit to the\n        mag/flux time-series and writes the plot to the path specified here.\n\n    ignoreinitfail : bool\n        If this is True, ignores the initial failure to find a set of optimized\n        Fourier parameters using the global optimization function and proceeds\n        to do a least-squares fit anyway.\n\n    verbose : bool\n        If True, will indicate progress and warn of any problems.\n\n    Returns\n    -------\n\n    dict\n        This function returns a dict containing the model fit parameters, the\n        minimized chi-sq 
value and the reduced chi-sq value. The form of this\n        dict is mostly standardized across all functions in this module::\n\n            {\n                'fittype':'gaussianeb',\n                'fitinfo':{\n                    'initialparams':the initial EB params provided,\n                    'finalparams':the final model fit EB params,\n                    'finalparamerrs':formal errors in the params,\n                    'leastsqfit':the full tuple returned by scipy.leastsq,\n                    'fitmags': the model fit mags,\n                    'fitepoch': the epoch of minimum light for the fit,\n                },\n                'fitchisq': the minimized value of the fit's chi-sq,\n                'fitredchisq':the reduced chi-sq value,\n                'fitplotfile': the output fit plot if fitplot is not None,\n                'magseries':{\n                    'times':input times in phase order of the model,\n                    'phase':the phases of the model mags,\n                    'mags':input mags/fluxes in the phase order of the model,\n                    'errs':errs in the phase order of the model,\n                    'magsarefluxes':input value of magsarefluxes kwarg\n                }\n            }\n\n\n    '''\n\n    stimes, smags, serrs = sigclip_magseries(times, mags, errs,\n                                             sigclip=sigclip,\n                                             magsarefluxes=magsarefluxes)\n\n    # get rid of zero errs\n    nzind = npnonzero(serrs)\n    stimes, smags, serrs = stimes[nzind], smags[nzind], serrs[nzind]\n\n\n    # check the ebparams\n    ebperiod, ebepoch, ebdepth = ebparams[0:3]\n\n    # check if we have a ebepoch to use\n    if ebepoch is None:\n\n        if verbose:\n            LOGWARNING('no ebepoch given in ebparams, '\n                       'trying to figure it out automatically...')\n        # do a spline fit to figure out the approximate min of the LC\n        try:\n            spfit = 
spline_fit_magseries(times, mags, errs, ebperiod,\n                                         sigclip=sigclip,\n                                         magsarefluxes=magsarefluxes,\n                                         verbose=verbose)\n            ebepoch = spfit['fitinfo']['fitepoch']\n\n        # if the spline-fit fails, try a savgol fit instead\n        except Exception as e:\n            sgfit = savgol_fit_magseries(times, mags, errs, ebperiod,\n                                         sigclip=sigclip,\n                                         magsarefluxes=magsarefluxes,\n                                         verbose=verbose)\n            ebepoch = sgfit['fitinfo']['fitepoch']\n\n        # if everything failed, then bail out and ask for the ebepoch\n        finally:\n\n            if ebepoch is None:\n                LOGERROR(\"couldn't automatically figure out the eb epoch, \"\n                         \"can't continue. please provide it in ebparams.\")\n\n                # assemble the returndict\n                returndict = {\n                    'fittype':'gaussianeb',\n                    'fitinfo':{\n                        'initialparams':ebparams,\n                        'finalparams':None,\n                        'leastsqfit':None,\n                        'fitmags':None,\n                        'fitepoch':None,\n                    },\n                    'fitchisq':npnan,\n                    'fitredchisq':npnan,\n                    'fitplotfile':None,\n                    'magseries':{\n                        'phase':None,\n                        'times':None,\n                        'mags':None,\n                        'errs':None,\n                        'magsarefluxes':magsarefluxes,\n                    },\n                }\n\n                return returndict\n\n            else:\n\n                if ebepoch.size > 1:\n                    if verbose:\n                        LOGWARNING('could not auto-find a single minimum 
'\n                                   'for ebepoch, using the first one returned')\n                    ebparams[1] = ebepoch[0]\n\n                else:\n\n                    if verbose:\n                        LOGWARNING(\n                            'using automatically determined ebepoch = %.5f'\n                            % ebepoch\n                        )\n                    ebparams[1] = ebepoch.item()\n\n    # next, check the ebdepth and fix it to the form required\n    if magsarefluxes:\n        if ebdepth < 0.0:\n            ebparams[2] = -ebdepth\n\n    else:\n        if ebdepth > 0.0:\n            ebparams[2] = -ebdepth\n\n    # finally, do the fit\n    try:\n        leastsqfit = spleastsq(eclipses.invgauss_eclipses_residual,\n                               ebparams,\n                               args=(stimes, smags, serrs),\n                               full_output=True)\n    except Exception as e:\n        leastsqfit = None\n\n    # if the fit succeeded, then we can return the final parameters\n    if leastsqfit and leastsqfit[-1] in (1,2,3,4):\n\n        finalparams = leastsqfit[0]\n        covxmatrix = leastsqfit[1]\n\n        # calculate the chisq and reduced chisq\n        fitmags, phase, ptimes, pmags, perrs = eclipses.invgauss_eclipses_func(\n            finalparams,\n            stimes, smags, serrs\n        )\n        fitchisq = npsum(\n            ((fitmags - pmags)*(fitmags - pmags)) / (perrs*perrs)\n        )\n        fitredchisq = fitchisq/(len(pmags) - len(finalparams) - 1)\n\n        # get the residual variance and calculate the formal 1-sigma errs on the\n        # final parameters\n        residuals = leastsqfit[2]['fvec']\n        residualvariance = (\n            npsum(residuals*residuals)/(pmags.size - finalparams.size)\n        )\n        if covxmatrix is not None:\n            covmatrix = residualvariance*covxmatrix\n            stderrs = npsqrt(npdiag(covmatrix))\n        else:\n            LOGERROR('covxmatrix 
not available, fit probably failed!')\n            stderrs = None\n\n        if verbose:\n            LOGINFO(\n                'final fit done. chisq = %.5f, reduced chisq = %.5f' %\n                (fitchisq, fitredchisq)\n            )\n\n        # get the fit epoch\n        fperiod, fepoch = finalparams[:2]\n\n        # assemble the returndict\n        returndict = {\n            'fittype':'gaussianeb',\n            'fitinfo':{\n                'initialparams':ebparams,\n                'finalparams':finalparams,\n                'finalparamerrs':stderrs,\n                'leastsqfit':leastsqfit,\n                'fitmags':fitmags,\n                'fitepoch':fepoch,\n            },\n            'fitchisq':fitchisq,\n            'fitredchisq':fitredchisq,\n            'fitplotfile':None,\n            'magseries':{\n                'phase':phase,\n                'times':ptimes,\n                'mags':pmags,\n                'errs':perrs,\n                'magsarefluxes':magsarefluxes,\n            },\n        }\n\n        # make the fit plot if required\n        if plotfit and isinstance(plotfit, str):\n\n            make_fit_plot(phase, pmags, perrs, fitmags,\n                          fperiod, ptimes.min(), fepoch,\n                          plotfit,\n                          magsarefluxes=magsarefluxes)\n\n            returndict['fitplotfile'] = plotfit\n\n        return returndict\n\n    # if the leastsq fit failed, return nothing\n    else:\n\n        LOGERROR('eb-fit: least-squared fit to the light curve failed!')\n\n        # assemble the returndict\n        returndict = {\n            'fittype':'gaussianeb',\n            'fitinfo':{\n                'initialparams':ebparams,\n                'finalparams':None,\n                'finalparamerrs':None,\n                'leastsqfit':leastsqfit,\n                'fitmags':None,\n                'fitepoch':None,\n            },\n            'fitchisq':npnan,\n            'fitredchisq':npnan,\n            
'fitplotfile':None,\n            'magseries':{\n                'phase':None,\n                'times':None,\n                'mags':None,\n                'errs':None,\n                'magsarefluxes':magsarefluxes,\n            },\n        }\n\n        return returndict", "response": "This function fits a double inverted gaussian EB model to a magnitude time series."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef spline_fit_magseries(times, mags, errs, period,\n                         knotfraction=0.01,\n                         maxknots=30,\n                         sigclip=30.0,\n                         plotfit=False,\n                         ignoreinitfail=False,\n                         magsarefluxes=False,\n                         verbose=True):\n\n    '''This fits a univariate cubic spline to the phased light curve.\n\n    This fit may be better than the Fourier fit for sharply variable objects,\n    like EBs, so can be used to distinguish them from other types of variables.\n\n    Parameters\n    ----------\n\n    times,mags,errs : np.array\n        The input mag/flux time-series to fit a spline to.\n\n    period : float\n        The period to use for the spline fit.\n\n    knotfraction : float\n        The knot fraction is the number of internal knots to use for the\n        spline. A value of 0.01 (or 1%) of the total number of non-nan\n        observations appears to work quite well, without over-fitting. maxknots\n        controls the maximum number of knots that will be allowed.\n\n    maxknots : int\n        The maximum number of knots that will be used even if `knotfraction`\n        gives a value to use larger than `maxknots`. This helps dealing with\n        over-fitting to short time-scale variations.\n\n    sigclip : float or int or sequence of two floats/ints or None\n        If a single float or int, a symmetric sigma-clip will be performed using\n        the number provided as the sigma-multiplier to cut out from the input\n        time-series.\n\n        If a list of two ints/floats is provided, the function will perform an\n        'asymmetric' sigma-clip. The first element in this list is the sigma\n        value to use for fainter flux/mag values; the second element in this\n        list is the sigma value to use for brighter flux/mag values. 
For\n        example, `sigclip=[10., 3.]`, will sigclip out greater than 10-sigma\n        dimmings and greater than 3-sigma brightenings. Here the meaning of\n        \"dimming\" and \"brightening\" is set by *physics* (not the magnitude\n        system), which is why the `magsarefluxes` kwarg must be correctly set.\n\n        If `sigclip` is None, no sigma-clipping will be performed, and the\n        time-series (with non-finite elems removed) will be passed through to\n        the output.\n\n    magsarefluxes : bool\n        If True, will treat the input values of `mags` as fluxes for purposes of\n        plotting the fit and sig-clipping.\n\n    plotfit : str or False\n        If this is a string, this function will make a plot for the fit to the\n        mag/flux time-series and writes the plot to the path specified here.\n\n    ignoreinitfail : bool\n        If this is True, ignores the initial failure to find a set of optimized\n        Fourier parameters using the global optimization function and proceeds\n        to do a least-squares fit anyway.\n\n    verbose : bool\n        If True, will indicate progress and warn of any problems.\n\n    Returns\n    -------\n\n    dict\n        This function returns a dict containing the model fit parameters, the\n        minimized chi-sq value and the reduced chi-sq value. 
The form of this\n        dict is mostly standardized across all functions in this module::\n\n            {\n                'fittype':'spline',\n                'fitinfo':{\n                    'nknots': the number of knots used for the fit,\n                    'fitmags': the model fit mags,\n                    'fitepoch': the epoch of minimum light for the fit,\n                },\n                'fitchisq': the minimized value of the fit's chi-sq,\n                'fitredchisq':the reduced chi-sq value,\n                'fitplotfile': the output fit plot if plotfit is a string,\n                'magseries':{\n                    'times':input times in phase order of the model,\n                    'phase':the phases of the model mags,\n                    'mags':input mags/fluxes in the phase order of the model,\n                    'errs':errs in the phase order of the model,\n                    'magsarefluxes':input value of magsarefluxes kwarg\n                }\n            }\n\n    '''\n\n    # this is required to fit the spline correctly\n    if errs is None:\n        errs = npfull_like(mags, 0.005)\n\n    # sigclip the magnitude time series\n    stimes, smags, serrs = sigclip_magseries(times, mags, errs,\n                                             sigclip=sigclip,\n                                             magsarefluxes=magsarefluxes)\n    # get rid of zero errs\n    nzind = npnonzero(serrs)\n    stimes, smags, serrs = stimes[nzind], smags[nzind], serrs[nzind]\n\n    # phase the mag series\n    phase, pmags, perrs, ptimes, mintime = (\n        get_phased_quantities(stimes, smags, serrs, period)\n    )\n\n    # now figure out the number of knots, capped at maxknots\n    nobs = len(phase)\n    nknots = int(npfloor(knotfraction*nobs))\n    nknots = maxknots if nknots > maxknots else nknots\n    splineknots = nplinspace(phase[0] + 0.01,\n                             phase[-1] - 0.01,\n                             num=nknots)\n\n    # NOTE: newer 
scipy needs x to be strictly increasing. this means we should\n    # filter out anything that doesn't have np.diff(phase) > 0.0\n    # FIXME: this needs to be tested\n    phase_diffs_ind = npdiff(phase) > 0.0\n    incphase_ind = npconcatenate((nparray([True]), phase_diffs_ind))\n    phase, pmags, perrs = (phase[incphase_ind],\n                           pmags[incphase_ind],\n                           perrs[incphase_ind])\n\n    # generate and fit the spline\n    spl = LSQUnivariateSpline(phase, pmags, t=splineknots, w=1.0/perrs)\n\n    # calculate the spline fit to the actual phases, the chisq and red-chisq\n    fitmags = spl(phase)\n\n    fitchisq = npsum(\n        ((fitmags - pmags)*(fitmags - pmags)) / (perrs*perrs)\n    )\n\n    fitredchisq = fitchisq/(len(pmags) - nknots - 1)\n\n    if verbose:\n        LOGINFO(\n            'spline fit done. nknots = %s,  '\n            'chisq = %.5f, reduced chisq = %.5f' %\n            (nknots, fitchisq, fitredchisq)\n        )\n\n    # figure out the time of light curve minimum (i.e. the fit epoch)\n    # this is when the fit mag is maximum (i.e. 
the faintest)\n    # or if magsarefluxes = True, then this is when fit flux is minimum\n    if not magsarefluxes:\n        fitmagminind = npwhere(fitmags == npmax(fitmags))\n    else:\n        fitmagminind = npwhere(fitmags == npmin(fitmags))\n    if len(fitmagminind[0]) > 1:\n        fitmagminind = (fitmagminind[0][0],)\n    magseriesepoch = ptimes[fitmagminind]\n\n    # assemble the returndict\n    returndict = {\n        'fittype':'spline',\n        'fitinfo':{\n            'nknots':nknots,\n            'fitmags':fitmags,\n            'fitepoch':magseriesepoch\n        },\n        'fitchisq':fitchisq,\n        'fitredchisq':fitredchisq,\n        'fitplotfile':None,\n        'magseries':{\n            'times':ptimes,\n            'phase':phase,\n            'mags':pmags,\n            'errs':perrs,\n            'magsarefluxes':magsarefluxes\n        },\n    }\n\n    # make the fit plot if required\n    if plotfit and isinstance(plotfit, str):\n\n        make_fit_plot(phase, pmags, perrs, fitmags,\n                      period, mintime, magseriesepoch,\n                      plotfit,\n                      magsarefluxes=magsarefluxes)\n\n        returndict['fitplotfile'] = plotfit\n\n    return returndict", "response": "Fits a univariate cubic spline to a phased magnitude/flux time series and returns a dict with the fit magnitudes, the epoch of minimum light, the chi-squared and reduced chi-squared values, and the phased input series."}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nfits a Savitzky - Golay filter to the magnitude time series.", "response": "def savgol_fit_magseries(times, mags, errs, period,\n                         windowlength=None,\n                         polydeg=2,\n                         sigclip=30.0,\n                         plotfit=False,\n                         magsarefluxes=False,\n                         verbose=True):\n\n    '''Fit a Savitzky-Golay filter to the magnitude/flux time series.\n\n    SG fits successive sub-sets (windows) of adjacent data points with a\n    low-order polynomial via least squares. At each point (magnitude), it\n    returns the value of the polynomial at that magnitude's time.  This is made\n    significantly cheaper than *actually* performing least squares for each\n    window through linear algebra tricks that are possible when specifying the\n    window size and polynomial order beforehand.  Numerical Recipes Ch 14.8\n    gives an overview, Eq. 14.8.6 is what Scipy has implemented.\n\n    The idea behind Savitzky-Golay is to preserve higher moments (>=2) of the\n    input data series than would be done by a simple moving window average.\n\n    Note that the filter assumes evenly spaced data, which magnitude time series\n    are not. By *pretending* the data points are evenly spaced, we introduce an\n    additional noise source in the function values. 
This is a relatively small\n    noise source provided that the changes in the magnitude values across the\n    full width of the N=windowlength point window are < sqrt(N/2) times the\n    measurement noise on a single point.\n\n    TODO:\n    - Find correct dof for reduced chi squared in savgol_fit_magseries\n\n    Parameters\n    ----------\n\n    times,mags,errs : np.array\n        The input mag/flux time-series to fit the Savitzky-Golay model to.\n\n    period : float\n        The period to use for the model fit.\n\n    windowlength : None or int\n        The length of the filter window (the number of coefficients). Must be\n        either positive and odd, or None. (The window is the number of points to\n        the left, and to the right, of whatever point is having a polynomial fit\n        to it locally). Bigger windows at fixed polynomial order risk lowering\n        the amplitude of sharp features. If None, this routine (arbitrarily)\n        sets the `windowlength` for phased LCs to be either the number of finite\n        data points divided by 300, or polydeg+3, whichever is bigger.\n\n    polydeg : int\n        This is the order of the polynomial used to fit the samples.  Must be\n        less than `windowlength`. \"Higher-order filters do better at preserving\n        feature heights and widths, but do less smoothing on broader features.\"\n        (Numerical Recipes).\n\n    sigclip : float or int or sequence of two floats/ints or None\n        If a single float or int, a symmetric sigma-clip will be performed using\n        the number provided as the sigma-multiplier to cut out from the input\n        time-series.\n\n        If a list of two ints/floats is provided, the function will perform an\n        'asymmetric' sigma-clip. The first element in this list is the sigma\n        value to use for fainter flux/mag values; the second element in this\n        list is the sigma value to use for brighter flux/mag values. 
For\n        example, `sigclip=[10., 3.]`, will sigclip out greater than 10-sigma\n        dimmings and greater than 3-sigma brightenings. Here the meaning of\n        \"dimming\" and \"brightening\" is set by *physics* (not the magnitude\n        system), which is why the `magsarefluxes` kwarg must be correctly set.\n\n        If `sigclip` is None, no sigma-clipping will be performed, and the\n        time-series (with non-finite elems removed) will be passed through to\n        the output.\n\n    magsarefluxes : bool\n        If True, will treat the input values of `mags` as fluxes for purposes of\n        plotting the fit and sig-clipping.\n\n    plotfit : str or False\n        If this is a string, this function will make a plot for the fit to the\n        mag/flux time-series and write the plot to the path specified here.\n\n    verbose : bool\n        If True, will indicate progress and warn of any problems.\n\n    Returns\n    -------\n\n    dict\n        This function returns a dict containing the model fit parameters, the\n        minimized chi-sq value and the reduced chi-sq value. 
The form of this\n        dict is mostly standardized across all functions in this module::\n\n            {\n                'fittype':'savgol',\n                'fitinfo':{\n                    'windowlength': the window length used for the fit,\n                    'polydeg':the polynomial degree used for the fit,\n                    'fitmags': the model fit mags,\n                    'fitepoch': the epoch of minimum light for the fit,\n                },\n                'fitchisq': the minimized value of the fit's chi-sq,\n                'fitredchisq':the reduced chi-sq value,\n                'fitplotfile': the output fit plot if fitplot is not None,\n                'magseries':{\n                    'times':input times in phase order of the model,\n                    'phase':the phases of the model mags,\n                    'mags':input mags/fluxes in the phase order of the model,\n                    'errs':errs in the phase order of the model,\n                    'magsarefluxes':input value of magsarefluxes kwarg\n                }\n            }\n\n    '''\n    stimes, smags, serrs = sigclip_magseries(times, mags, errs,\n                                             sigclip=sigclip,\n                                             magsarefluxes=magsarefluxes)\n\n    # get rid of zero errs\n    nzind = npnonzero(serrs)\n    stimes, smags, serrs = stimes[nzind], smags[nzind], serrs[nzind]\n\n    phase, pmags, perrs, ptimes, mintime = (\n        get_phased_quantities(stimes, smags, serrs, period)\n    )\n\n    if not isinstance(windowlength, int):\n        windowlength = max(\n            polydeg + 3,\n            int(len(phase)/300)\n        )\n        if windowlength % 2 == 0:\n            windowlength += 1\n\n    if verbose:\n        LOGINFO('applying Savitzky-Golay filter with '\n                'window length %s and polynomial degree %s to '\n                'mag series with %s observations, '\n                'using period %.6f, folded at %.6f' % 
(windowlength,\n                                                       polydeg,\n                                                       len(pmags),\n                                                       period,\n                                                       mintime))\n\n    # generate the function values obtained by applying the SG filter. The\n    # \"wrap\" option is best for phase-folded LCs.\n    sgf = savgol_filter(pmags, windowlength, polydeg, mode='wrap')\n\n    # here the \"fit\" to the phases is the function produced by the\n    # Savitzky-Golay filter. then compute the chisq and red-chisq.\n    fitmags = sgf\n\n    fitchisq = npsum(\n        ((fitmags - pmags)*(fitmags - pmags)) / (perrs*perrs)\n    )\n\n    # TODO: quantify dof for SG filter.\n    nparams = int(len(pmags)/windowlength) * polydeg\n    fitredchisq = fitchisq/(len(pmags) - nparams - 1)\n    fitredchisq = -99.\n\n    if verbose:\n        LOGINFO(\n            'SG filter applied. chisq = %.5f, reduced chisq = %.5f' %\n            (fitchisq, fitredchisq)\n        )\n\n    # figure out the time of light curve minimum (i.e. the fit epoch)\n    # this is when the fit mag is maximum (i.e. 
the faintest)\n    # or if magsarefluxes = True, then this is when fit flux is minimum\n    if not magsarefluxes:\n        fitmagminind = npwhere(fitmags == npmax(fitmags))\n    else:\n        fitmagminind = npwhere(fitmags == npmin(fitmags))\n    if len(fitmagminind[0]) > 1:\n        fitmagminind = (fitmagminind[0][0],)\n    magseriesepoch = ptimes[fitmagminind]\n\n    # assemble the returndict\n    returndict = {\n        'fittype':'savgol',\n        'fitinfo':{\n            'windowlength':windowlength,\n            'polydeg':polydeg,\n            'fitmags':fitmags,\n            'fitepoch':magseriesepoch\n        },\n        'fitchisq':fitchisq,\n        'fitredchisq':fitredchisq,\n        'fitplotfile':None,\n        'magseries':{\n            'times':ptimes,\n            'phase':phase,\n            'mags':pmags,\n            'errs':perrs,\n            'magsarefluxes':magsarefluxes\n        }\n    }\n\n    # make the fit plot if required\n    if plotfit and isinstance(plotfit, str):\n\n        make_fit_plot(phase, pmags, perrs, fitmags,\n                      period, mintime, magseriesepoch,\n                      plotfit,\n                      magsarefluxes=magsarefluxes)\n\n        returndict['fitplotfile'] = plotfit\n\n    return returndict"}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\nfits an arbitrary - order Legendre polynomial to the time - series times mags errs period", "response": "def legendre_fit_magseries(times, mags, errs, period,\n                           legendredeg=10,\n                           sigclip=30.0,\n                           plotfit=False,\n                           magsarefluxes=False,\n                           verbose=True):\n\n    '''Fit an arbitrary-order Legendre series, via least squares, to the\n    magnitude/flux time series.\n\n    This is a series of the form::\n\n        p(x) = c_0*L_0(x) + c_1*L_1(x) + c_2*L_2(x) + ... + c_n*L_n(x)\n\n    where L_i's are Legendre polynomials (also called \"Legendre functions of the\n    first kind\") and c_i's are the coefficients being fit.\n\n    This function is mainly just a wrapper to\n    `numpy.polynomial.legendre.Legendre.fit`.\n\n    Parameters\n    ----------\n\n    times,mags,errs : np.array\n        The input mag/flux time-series to fit a Legendre series polynomial to.\n\n    period : float\n        The period to use for the Legendre fit.\n\n    legendredeg : int\n        This is `n` in the equation above, e.g. if you give `n=5`, you will\n        get 6 coefficients. This number should be much less than the number of\n        data points you are fitting.\n\n    sigclip : float or int or sequence of two floats/ints or None\n        If a single float or int, a symmetric sigma-clip will be performed using\n        the number provided as the sigma-multiplier to cut out from the input\n        time-series.\n\n        If a list of two ints/floats is provided, the function will perform an\n        'asymmetric' sigma-clip. The first element in this list is the sigma\n        value to use for fainter flux/mag values; the second element in this\n        list is the sigma value to use for brighter flux/mag values. 
For\n        example, `sigclip=[10., 3.]`, will sigclip out greater than 10-sigma\n        dimmings and greater than 3-sigma brightenings. Here the meaning of\n        \"dimming\" and \"brightening\" is set by *physics* (not the magnitude\n        system), which is why the `magsarefluxes` kwarg must be correctly set.\n\n        If `sigclip` is None, no sigma-clipping will be performed, and the\n        time-series (with non-finite elems removed) will be passed through to\n        the output.\n\n    magsarefluxes : bool\n        If True, will treat the input values of `mags` as fluxes for purposes of\n        plotting the fit and sig-clipping.\n\n    plotfit : str or False\n        If this is a string, this function will make a plot for the fit to the\n        mag/flux time-series and write the plot to the path specified here.\n\n    verbose : bool\n        If True, will indicate progress and warn of any problems.\n\n    Returns\n    -------\n\n    dict\n        This function returns a dict containing the model fit parameters, the\n        minimized chi-sq value and the reduced chi-sq value. 
The form of this\n        dict is mostly standardized across all functions in this module::\n\n            {\n                'fittype':'legendre',\n                'fitinfo':{\n                    'legendredeg': the Legendre polynomial degree used,\n                    'fitmags': the model fit mags,\n                    'fitepoch': the epoch of minimum light for the fit,\n                },\n                'fitchisq': the minimized value of the fit's chi-sq,\n                'fitredchisq':the reduced chi-sq value,\n                'fitplotfile': the output fit plot if fitplot is not None,\n                'magseries':{\n                    'times':input times in phase order of the model,\n                    'phase':the phases of the model mags,\n                    'mags':input mags/fluxes in the phase order of the model,\n                    'errs':errs in the phase order of the model,\n                    'magsarefluxes':input value of magsarefluxes kwarg\n                }\n            }\n\n\n    '''\n    stimes, smags, serrs = sigclip_magseries(times, mags, errs,\n                                             sigclip=sigclip,\n                                             magsarefluxes=magsarefluxes)\n\n    # get rid of zero errs\n    nzind = npnonzero(serrs)\n    stimes, smags, serrs = stimes[nzind], smags[nzind], serrs[nzind]\n\n\n    phase, pmags, perrs, ptimes, mintime = (\n        get_phased_quantities(stimes, smags, serrs, period)\n    )\n\n    if verbose:\n        LOGINFO('fitting Legendre series with '\n                'maximum Legendre polynomial order %s to '\n                'mag series with %s observations, '\n                'using period %.6f, folded at %.6f' % (legendredeg,\n                                                       len(pmags),\n                                                       period,\n                                                       mintime))\n\n    # Least squares fit of Legendre polynomial series to the data. 
The window\n    # and domain (see \"Using the Convenience Classes\" in the numpy\n    # documentation) are handled automatically, scaling the times to a minimal\n    # domain in [-1,1], in which Legendre polynomials are a complete basis.\n\n    p = Legendre.fit(phase, pmags, legendredeg)\n    coeffs = p.coef\n    fitmags = p(phase)\n\n    # Now compute the chisq and red-chisq.\n\n    fitchisq = npsum(\n        ((fitmags - pmags)*(fitmags - pmags)) / (perrs*perrs)\n    )\n\n    nparams = legendredeg + 1\n    fitredchisq = fitchisq/(len(pmags) - nparams - 1)\n\n    if verbose:\n        LOGINFO(\n            'Legendre fit done. chisq = %.5f, reduced chisq = %.5f' %\n            (fitchisq, fitredchisq)\n        )\n\n    # figure out the time of light curve minimum (i.e. the fit epoch)\n    # this is when the fit mag is maximum (i.e. the faintest)\n    # or if magsarefluxes = True, then this is when fit flux is minimum\n    if not magsarefluxes:\n        fitmagminind = npwhere(fitmags == npmax(fitmags))\n    else:\n        fitmagminind = npwhere(fitmags == npmin(fitmags))\n    if len(fitmagminind[0]) > 1:\n        fitmagminind = (fitmagminind[0][0],)\n    magseriesepoch = ptimes[fitmagminind]\n\n    # assemble the returndict\n    returndict = {\n        'fittype':'legendre',\n        'fitinfo':{\n            'legendredeg':legendredeg,\n            'fitmags':fitmags,\n            'fitepoch':magseriesepoch,\n            'finalparams':coeffs,\n        },\n        'fitchisq':fitchisq,\n        'fitredchisq':fitredchisq,\n        'fitplotfile':None,\n        'magseries':{\n            'times':ptimes,\n            'phase':phase,\n            'mags':pmags,\n            'errs':perrs,\n            'magsarefluxes':magsarefluxes\n        }\n    }\n\n    # make the fit plot if required\n    if plotfit and isinstance(plotfit, str):\n\n        make_fit_plot(phase, pmags, perrs, fitmags,\n                      period, mintime, magseriesepoch,\n                      plotfit,\n          
            magsarefluxes=magsarefluxes)\n\n        returndict['fitplotfile'] = plotfit\n\n    return returndict"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef update_checkplotdict_nbrlcs(\n        checkplotdict,\n        timecol, magcol, errcol,\n        lcformat='hat-sql',\n        lcformatdir=None,\n        verbose=True,\n):\n\n    '''For all neighbors in a checkplotdict, make LCs and phased LCs.\n\n    Parameters\n    ----------\n\n    checkplotdict : dict\n        This is the checkplot to process. The light curves for the neighbors to\n        the object here will be extracted from the stored file paths, and this\n        function will make plots of these time-series. If the object has 'best'\n        periods and epochs generated by period-finder functions in this\n        checkplotdict, phased light curve plots of each neighbor will be made\n        using these to check the effects of blending.\n\n    timecol,magcol,errcol : str\n        The timecol, magcol, and errcol keys used to generate this object's\n        checkplot. This is used to extract the correct times-series from the\n        neighbors' light curves.\n\n    lcformat : str\n        This is the `formatkey` associated with your light curve format, which\n        you previously passed in to the `lcproc.register_lcformat`\n        function. This will be used to look up how to find and read the light\n        curves specified in `basedir` or `use_list_of_filenames`.\n\n    lcformatdir : str or None\n        If this is provided, gives the path to a directory when you've stored\n        your lcformat description JSONs, other than the usual directories lcproc\n        knows to search for them in. 
Use this along with `lcformat` to specify\n        an LC format JSON file that's not currently registered with lcproc.\n\n    Returns\n    -------\n\n    dict\n        The input checkplotdict is returned with the neighbor light curve plots\n        added in.\n\n    '''\n\n    try:\n        formatinfo = get_lcformat(lcformat,\n                                  use_lcformat_dir=lcformatdir)\n        if formatinfo:\n            (dfileglob, readerfunc,\n             dtimecols, dmagcols, derrcols,\n             magsarefluxes, normfunc) = formatinfo\n        else:\n            LOGERROR(\"can't figure out the light curve format\")\n            return checkplotdict\n    except Exception as e:\n        LOGEXCEPTION(\"can't figure out the light curve format\")\n        return checkplotdict\n\n    if not ('neighbors' in checkplotdict and\n            checkplotdict['neighbors'] and\n            len(checkplotdict['neighbors']) > 0):\n\n        LOGERROR('no neighbors for %s, not updating...' %\n                 (checkplotdict['objectid']))\n        return checkplotdict\n\n    # get our object's magkeys to compare to the neighbor\n    objmagkeys = {}\n\n    # handle diff generations of checkplots\n    if 'available_bands' in checkplotdict['objectinfo']:\n        mclist = checkplotdict['objectinfo']['available_bands']\n    else:\n        mclist = ('bmag','vmag','rmag','imag','jmag','hmag','kmag',\n                  'sdssu','sdssg','sdssr','sdssi','sdssz')\n\n    for mc in mclist:\n        if (mc in checkplotdict['objectinfo'] and\n            checkplotdict['objectinfo'][mc] is not None and\n            np.isfinite(checkplotdict['objectinfo'][mc])):\n\n            objmagkeys[mc] = checkplotdict['objectinfo'][mc]\n\n\n    # if there are actually neighbors, go through them in order\n    for nbr in checkplotdict['neighbors']:\n\n        objectid, lcfpath = (nbr['objectid'],\n                             nbr['lcfpath'])\n\n        # get the light curve\n        if not 
os.path.exists(lcfpath):\n            LOGERROR('objectid: %s, neighbor: %s, '\n                     'lightcurve: %s not found, skipping...' %\n                     (checkplotdict['objectid'], objectid, lcfpath))\n            continue\n\n        lcdict = readerfunc(lcfpath)\n\n        # this should handle lists/tuples being returned by readerfunc\n        # we assume that the first element is the actual lcdict\n        # FIXME: figure out how to not need this assumption\n        if ( (isinstance(lcdict, (list, tuple))) and\n             (isinstance(lcdict[0], dict)) ):\n            lcdict = lcdict[0]\n\n\n        # 0. get this neighbor's magcols and get the magdiff and colordiff\n        # between it and the object\n\n        nbrmagkeys = {}\n\n        for mc in objmagkeys:\n\n            if (('objectinfo' in lcdict) and\n                (isinstance(lcdict['objectinfo'], dict)) and\n                (mc in lcdict['objectinfo']) and\n                (lcdict['objectinfo'][mc] is not None) and\n                (np.isfinite(lcdict['objectinfo'][mc]))):\n\n                nbrmagkeys[mc] = lcdict['objectinfo'][mc]\n\n        # now calculate the magdiffs\n        magdiffs = {}\n        for omc in objmagkeys:\n            if omc in nbrmagkeys:\n                magdiffs[omc] = objmagkeys[omc] - nbrmagkeys[omc]\n\n        # calculate colors and colordiffs\n        colordiffs = {}\n\n        # generate the list of colors to get\n        # NOTE: here, we don't really bother with new/old gen checkplots\n        # maybe change this later to handle arbitrary colors\n\n        for ctrio in (['bmag','vmag','bvcolor'],\n                      ['vmag','kmag','vkcolor'],\n                      ['jmag','kmag','jkcolor'],\n                      ['sdssi','jmag','ijcolor'],\n                      ['sdssg','kmag','gkcolor'],\n                      ['sdssg','sdssr','grcolor']):\n            m1, m2, color = ctrio\n\n            if (m1 in objmagkeys and\n                m2 in objmagkeys and\n    
            m1 in nbrmagkeys and\n                m2 in nbrmagkeys):\n\n                objcolor = objmagkeys[m1] - objmagkeys[m2]\n                nbrcolor = nbrmagkeys[m1] - nbrmagkeys[m2]\n                colordiffs[color] = objcolor - nbrcolor\n\n        # finally, add all the color and magdiff info to the nbr dict\n        nbr.update({'magdiffs':magdiffs,\n                    'colordiffs':colordiffs})\n\n        #\n        # process magcols\n        #\n\n        # normalize using the special function if specified\n        if normfunc is not None:\n            lcdict = normfunc(lcdict)\n\n        try:\n\n            # get the times, mags, and errs\n            # dereference the columns and get them from the lcdict\n            if '.' in timecol:\n                timecolget = timecol.split('.')\n            else:\n                timecolget = [timecol]\n            times = _dict_get(lcdict, timecolget)\n\n            if '.' in magcol:\n                magcolget = magcol.split('.')\n            else:\n                magcolget = [magcol]\n            mags = _dict_get(lcdict, magcolget)\n\n            if '.' in errcol:\n                errcolget = errcol.split('.')\n            else:\n                errcolget = [errcol]\n            errs = _dict_get(lcdict, errcolget)\n\n        except KeyError:\n\n            LOGERROR('LC for neighbor: %s (target object: %s) does not '\n                     'have one or more of the required columns: %s, '\n                     'skipping...' 
%\n                     (objectid, checkplotdict['objectid'],\n                      ', '.join([timecol, magcol, errcol])))\n            continue\n\n        # filter the input times, mags, errs; do sigclipping and normalization\n        stimes, smags, serrs = sigclip_magseries(times,\n                                                 mags,\n                                                 errs,\n                                                 magsarefluxes=magsarefluxes,\n                                                 sigclip=4.0)\n\n        # normalize here if not using special normalization\n        if normfunc is None:\n            ntimes, nmags = normalize_magseries(\n                stimes, smags,\n                magsarefluxes=magsarefluxes\n            )\n            xtimes, xmags, xerrs = ntimes, nmags, serrs\n        else:\n            xtimes, xmags, xerrs = stimes, smags, serrs\n\n\n        # check if this neighbor has enough finite points in its LC\n        # fail early if not enough light curve points\n        if ((xtimes is None) or (xmags is None) or (xerrs is None) or\n            (xtimes.size < 49) or (xmags.size < 49) or (xerrs.size < 49)):\n\n            LOGERROR(\"one or more of times, mags, errs appear to be None \"\n                     \"after sig-clipping. are the measurements all nan? 
\"\n                     \"can't make neighbor light curve plots \"\n                     \"for target: %s, neighbor: %s, neighbor LC: %s\" %\n                     (checkplotdict['objectid'],\n                      nbr['objectid'],\n                      nbr['lcfpath']))\n            continue\n\n        #\n        # now we can start doing stuff if everything checks out\n        #\n\n        # make an unphased mag-series plot\n        nbrdict = _pkl_magseries_plot(xtimes,\n                                      xmags,\n                                      xerrs,\n                                      magsarefluxes=magsarefluxes)\n        # update the nbr\n        nbr.update(nbrdict)\n\n        # for each lspmethod in the checkplot, make a corresponding plot for\n        # this neighbor\n\n        # figure out the period finder methods present\n        if 'pfmethods' in checkplotdict:\n            pfmethods = checkplotdict['pfmethods']\n        else:\n            pfmethods = []\n            for cpkey in checkplotdict:\n                for pfkey in PFMETHODS:\n                    if pfkey in cpkey:\n                        pfmethods.append(pfkey)\n\n        for lspt in pfmethods:\n\n            # initialize this lspmethod entry\n            nbr[lspt] = {}\n\n            # we only care about the best period and its options\n            operiod, oepoch = (checkplotdict[lspt][0]['period'],\n                               checkplotdict[lspt][0]['epoch'])\n            (ophasewrap, ophasesort, ophasebin,\n             ominbinelems, oplotxlim) = (\n                 checkplotdict[lspt][0]['phasewrap'],\n                 checkplotdict[lspt][0]['phasesort'],\n                 checkplotdict[lspt][0]['phasebin'],\n                 checkplotdict[lspt][0]['minbinelems'],\n                 checkplotdict[lspt][0]['plotxlim'],\n            )\n\n            # make the phasedlc plot for this period\n            nbr = _pkl_phased_magseries_plot(\n                nbr,\n                
lspt.split('-')[1],  # this splits '<pfindex>-<pfmethod>'\n                0,\n                xtimes, xmags, xerrs,\n                operiod, oepoch,\n                phasewrap=ophasewrap,\n                phasesort=ophasesort,\n                phasebin=ophasebin,\n                minbinelems=ominbinelems,\n                plotxlim=oplotxlim,\n                magsarefluxes=magsarefluxes,\n                verbose=verbose,\n                override_pfmethod=lspt\n            )\n\n    # at this point, this neighbor's dict should be up to date with all\n    # info, magseries plot, and all phased LC plots\n    # return the updated checkplotdict\n    return checkplotdict", "response": "For each neighbor in the checkplotdict, this function computes magnitude and color differences relative to the target object and adds the neighbor's light curve plot and phased light curve plots, using the given timecol, magcol, and errcol."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef runcp_worker(task):\n    '''\n    This is the worker for running checkplots.\n\n    Parameters\n    ----------\n\n    task : tuple\n        This is of the form: (pfpickle, outdir, lcbasedir, kwargs).\n\n    Returns\n    -------\n\n    list of str\n        The list of checkplot pickles returned by the `runcp` function.\n\n    '''\n\n    pfpickle, outdir, lcbasedir, kwargs = task\n\n    try:\n\n        return runcp(pfpickle, outdir, lcbasedir, **kwargs)\n\n    except Exception as e:\n\n        LOGEXCEPTION(' could not make checkplots for %s: %s' % (pfpickle, e))\n        return None", "response": "This is the worker for running checkplots."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef parallel_cp(\n        pfpicklelist,\n        outdir,\n        lcbasedir,\n        fast_mode=False,\n        lcfnamelist=None,\n        cprenorm=False,\n        lclistpkl=None,\n        gaia_max_timeout=60.0,\n        gaia_mirror=None,\n        nbrradiusarcsec=60.0,\n        maxnumneighbors=5,\n        makeneighborlcs=True,\n        xmatchinfo=None,\n        xmatchradiusarcsec=3.0,\n        sigclip=10.0,\n        minobservations=99,\n        lcformat='hat-sql',\n        lcformatdir=None,\n        timecols=None,\n        magcols=None,\n        errcols=None,\n        skipdone=False,\n        done_callback=None,\n        done_callback_args=None,\n        done_callback_kwargs=None,\n        liststartindex=None,\n        maxobjects=None,\n        nworkers=NCPUS,\n):\n    '''This drives the parallel execution of `runcp` for a list of periodfinding\n    result pickles.\n\n    Parameters\n    ----------\n\n    pfpicklelist : list of str or list of Nones\n        This is the list of the filenames of the period-finding result pickles\n        to process. To make checkplots using the light curves directly, set this\n        to a list of Nones with the same length as the list of light curve files\n        that you provide in `lcfnamelist`.\n\n    outdir : str\n        The directory the checkplot pickles will be written to.\n\n    lcbasedir : str\n        The base directory that this function will look in to find the light\n        curves pointed to by the period-finding result files. If you're using\n        `lcfnamelist` to provide a list of light curve filenames directly, this\n        arg is ignored.\n\n    lcfnamelist : list of str or None\n        If this is provided, it must be a list of the input light curve\n        filenames to process. 
These can either be associated with each input\n        period-finder result pickle, or can be provided standalone to make\n        checkplots without phased LC plots in them. In the second case, you must\n        set `pfpicklelist` to a list of Nones that matches the length of\n        `lcfnamelist`.\n\n    cprenorm : bool\n        Set this to True if the light curves should be renormalized by\n        `checkplot.checkplot_pickle`. This is set to False by default because we\n        do our own normalization in this function using the light curve's\n        registered normalization function and pass the normalized times, mags,\n        errs to the `checkplot.checkplot_pickle` function.\n\n    lclistpkl : str or dict\n        This is either the filename of a pickle or the actual dict produced by\n        lcproc.make_lclist. This is used to gather neighbor information.\n\n    nbrradiusarcsec : float\n        The radius in arcseconds to use for a search conducted around the\n        coordinates of this object to look for any potential confusion and\n        blending of variability amplitude caused by their proximity.\n\n    maxnumneighbors : int\n        The maximum number of neighbors that will have their light curves and\n        magnitudes noted in this checkplot as potential blends with the target\n        object.\n\n    makeneighborlcs : bool\n        If True, will make light curve and phased light curve plots for all\n        neighbors found in the object collection for each input object.\n\n    fast_mode : bool or float\n        This runs the external catalog operations in a \"fast\" mode, with short\n        timeouts and not trying to hit external catalogs that take a long time\n        to respond.\n\n        If this is set to True, the default settings for the external requests\n        will then become::\n\n                skyview_lookup = False\n                skyview_timeout = 10.0\n                skyview_retry_failed = False\n                
dust_timeout = 10.0\n                gaia_submit_timeout = 7.0\n                gaia_max_timeout = 10.0\n                gaia_submit_tries = 2\n                complete_query_later = False\n                search_simbad = False\n\n        If this is a float, will run in \"fast\" mode with the provided timeout\n        value in seconds and the following settings::\n\n                skyview_lookup = True\n                skyview_timeout = fast_mode\n                skyview_retry_failed = False\n                dust_timeout = fast_mode\n                gaia_submit_timeout = 0.66*fast_mode\n                gaia_max_timeout = fast_mode\n                gaia_submit_tries = 2\n                complete_query_later = False\n                search_simbad = False\n\n    gaia_max_timeout : float\n        Sets the timeout in seconds to use when waiting for the GAIA service to\n        respond to our request for the object's information. Note that if\n        `fast_mode` is set, this is ignored.\n\n    gaia_mirror : str or None\n        This sets the GAIA mirror to use. This is a key in the\n        `services.gaia.GAIA_URLS` dict which defines the URLs to hit for each\n        mirror.\n\n    xmatchinfo : str or dict\n        This is either the xmatch dict produced by the function\n        `load_xmatch_external_catalogs` above, or the path to the xmatch info\n        pickle file produced by that function.\n\n    xmatchradiusarcsec : float\n        This is the cross-matching radius to use in arcseconds.\n\n    minobservations : int\n        The minimum of observations the input object's mag/flux time-series must\n        have for this function to plot its light curve and phased light\n        curve. 
If the object has less than this number, no light curves will be\n        plotted, but the checkplotdict will still contain all of the other\n        information.\n\n    sigclip : float or int or sequence of two floats/ints or None\n        If a single float or int, a symmetric sigma-clip will be performed using\n        the number provided as the sigma-multiplier to cut out from the input\n        time-series.\n\n        If a list of two ints/floats is provided, the function will perform an\n        'asymmetric' sigma-clip. The first element in this list is the sigma\n        value to use for fainter flux/mag values; the second element in this\n        list is the sigma value to use for brighter flux/mag values. For\n        example, `sigclip=[10., 3.]`, will sigclip out greater than 10-sigma\n        dimmings and greater than 3-sigma brightenings. Here the meaning of\n        \"dimming\" and \"brightening\" is set by *physics* (not the magnitude\n        system), which is why the `magsarefluxes` kwarg must be correctly set.\n\n        If `sigclip` is None, no sigma-clipping will be performed, and the\n        time-series (with non-finite elems removed) will be passed through to\n        the output.\n\n    lcformat : str\n        This is the `formatkey` associated with your light curve format, which\n        you previously passed in to the `lcproc.register_lcformat`\n        function. This will be used to look up how to find and read the light\n        curves specified in `basedir` or `use_list_of_filenames`.\n\n    lcformatdir : str or None\n        If this is provided, gives the path to a directory when you've stored\n        your lcformat description JSONs, other than the usual directories lcproc\n        knows to search for them in. 
Use this along with `lcformat` to specify\n        an LC format JSON file that's not currently registered with lcproc.\n\n    timecols : list of str or None\n        The timecol keys to use from the lcdict in generating this checkplot.\n\n    magcols : list of str or None\n        The magcol keys to use from the lcdict in generating this checkplot.\n\n    errcols : list of str or None\n        The errcol keys to use from the lcdict in generating this checkplot.\n\n    skipdone : bool\n        This indicates if this function will skip creating checkplots that\n        already exist corresponding to the current `objectid` and `magcol`. If\n        `skipdone` is set to True, this will be done.\n\n    done_callback : Python function or None\n        This is used to provide a function to execute after the checkplot\n        pickles are generated. This is useful if you want to stream the results\n        of checkplot making to some other process, e.g. directly running an\n        ingestion into an LCC-Server collection. The function will always get\n        the list of the generated checkplot pickles as its first arg, and all of\n        the kwargs for runcp in the kwargs dict. 
Additional args and kwargs can\n        be provided by giving a list in the `done_callbacks_args` kwarg and a\n        dict in the `done_callbacks_kwargs` kwarg.\n\n        NOTE: the function you pass in here should be pickleable by normal\n        Python if you want to use it with the parallel_cp and parallel_cp_lcdir\n        functions below.\n\n    done_callback_args : tuple or None\n        If not None, contains any args to pass into the `done_callback`\n        function.\n\n    done_callback_kwargs : dict or None\n        If not None, contains any kwargs to pass into the `done_callback`\n        function.\n\n    liststartindex : int\n        The index of the `pfpicklelist` (and `lcfnamelist` if provided) to start\n        working at.\n\n    maxobjects : int\n        The maximum number of objects to process in this run. Use this with\n        `liststartindex` to effectively distribute working on a large list of\n        input period-finding result pickles (and light curves if `lcfnamelist`\n        is also provided) over several sessions or machines.\n\n    nworkers : int\n        The number of parallel workers that will work on the checkplot\n        generation process.\n\n    Returns\n    -------\n\n    dict\n        This returns a dict with keys = input period-finding pickles and vals =\n        list of the corresponding checkplot pickles produced.\n\n    '''\n\n    # work around the Darwin segfault after fork if no network activity in\n    # main thread bug: https://bugs.python.org/issue30385#msg293958\n    if sys.platform == 'darwin':\n        import requests\n        requests.get('http://captive.apple.com/hotspot-detect.html')\n\n    if not os.path.exists(outdir):\n        os.mkdir(outdir)\n\n    # handle the start and end indices\n    if (liststartindex is not None) and (maxobjects is None):\n        pfpicklelist = pfpicklelist[liststartindex:]\n        if lcfnamelist is not None:\n            lcfnamelist = lcfnamelist[liststartindex:]\n\n    elif 
(liststartindex is None) and (maxobjects is not None):\n        pfpicklelist = pfpicklelist[:maxobjects]\n        if lcfnamelist is not None:\n            lcfnamelist = lcfnamelist[:maxobjects]\n\n    elif (liststartindex is not None) and (maxobjects is not None):\n        pfpicklelist = (\n            pfpicklelist[liststartindex:liststartindex+maxobjects]\n        )\n        if lcfnamelist is not None:\n            lcfnamelist = lcfnamelist[liststartindex:liststartindex+maxobjects]\n\n    # if the lcfnamelist is not provided, create a dummy\n    if lcfnamelist is None:\n        lcfnamelist = [None]*len(pfpicklelist)\n\n    tasklist = [(x, outdir, lcbasedir,\n                 {'lcformat':lcformat,\n                  'lcformatdir':lcformatdir,\n                  'lcfname':y,\n                  'timecols':timecols,\n                  'magcols':magcols,\n                  'errcols':errcols,\n                  'lclistpkl':lclistpkl,\n                  'gaia_max_timeout':gaia_max_timeout,\n                  'gaia_mirror':gaia_mirror,\n                  'nbrradiusarcsec':nbrradiusarcsec,\n                  'maxnumneighbors':maxnumneighbors,\n                  'makeneighborlcs':makeneighborlcs,\n                  'xmatchinfo':xmatchinfo,\n                  'xmatchradiusarcsec':xmatchradiusarcsec,\n                  'sigclip':sigclip,\n                  'minobservations':minobservations,\n                  'skipdone':skipdone,\n                  'cprenorm':cprenorm,\n                  'fast_mode':fast_mode,\n                  'done_callback':done_callback,\n                  'done_callback_args':done_callback_args,\n                  'done_callback_kwargs':done_callback_kwargs}) for\n                x,y in zip(pfpicklelist, lcfnamelist)]\n\n    resultfutures = []\n    results = []\n\n    with ProcessPoolExecutor(max_workers=nworkers) as executor:\n        resultfutures = executor.map(runcp_worker, tasklist)\n\n    results = [x for x in resultfutures]\n\n    
executor.shutdown()\n    return results", "response": "This function drives the parallel execution of runcp for a list of period-finding result pickles."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef parallel_cp_pfdir(pfpickledir,\n                      outdir,\n                      lcbasedir,\n                      pfpickleglob='periodfinding-*.pkl*',\n                      lclistpkl=None,\n                      cprenorm=False,\n                      nbrradiusarcsec=60.0,\n                      maxnumneighbors=5,\n                      makeneighborlcs=True,\n                      fast_mode=False,\n                      gaia_max_timeout=60.0,\n                      gaia_mirror=None,\n                      xmatchinfo=None,\n                      xmatchradiusarcsec=3.0,\n                      minobservations=99,\n                      sigclip=10.0,\n                      lcformat='hat-sql',\n                      lcformatdir=None,\n                      timecols=None,\n                      magcols=None,\n                      errcols=None,\n                      skipdone=False,\n                      done_callback=None,\n                      done_callback_args=None,\n                      done_callback_kwargs=None,\n                      maxobjects=None,\n                      nworkers=32):\n\n    '''This drives the parallel execution of `runcp` for a directory of\n    periodfinding pickles.\n\n    Parameters\n    ----------\n\n    pfpickledir : str\n        This is the directory containing all of the period-finding pickles to\n        process.\n\n    outdir : str\n        The directory the checkplot pickles will be written to.\n\n    lcbasedir : str\n        The base directory that this function will look in to find the light\n        curves pointed to by the period-finding result files. 
If you're using\n        `lcfnamelist` to provide a list of light curve filenames directly, this\n        arg is ignored.\n\n    pkpickleglob : str\n        This is a UNIX file glob to select period-finding result pickles in the\n        specified `pfpickledir`.\n\n    lclistpkl : str or dict\n        This is either the filename of a pickle or the actual dict produced by\n        lcproc.make_lclist. This is used to gather neighbor information.\n\n    cprenorm : bool\n        Set this to True if the light curves should be renormalized by\n        `checkplot.checkplot_pickle`. This is set to False by default because we\n        do our own normalization in this function using the light curve's\n        registered normalization function and pass the normalized times, mags,\n        errs to the `checkplot.checkplot_pickle` function.\n\n    nbrradiusarcsec : float\n        The radius in arcseconds to use for a search conducted around the\n        coordinates of this object to look for any potential confusion and\n        blending of variability amplitude caused by their proximity.\n\n    maxnumneighbors : int\n        The maximum number of neighbors that will have their light curves and\n        magnitudes noted in this checkplot as potential blends with the target\n        object.\n\n    makeneighborlcs : bool\n        If True, will make light curve and phased light curve plots for all\n        neighbors found in the object collection for each input object.\n\n    fast_mode : bool or float\n        This runs the external catalog operations in a \"fast\" mode, with short\n        timeouts and not trying to hit external catalogs that take a long time\n        to respond.\n\n        If this is set to True, the default settings for the external requests\n        will then become::\n\n                skyview_lookup = False\n                skyview_timeout = 10.0\n                skyview_retry_failed = False\n                dust_timeout = 10.0\n                
gaia_submit_timeout = 7.0\n                gaia_max_timeout = 10.0\n                gaia_submit_tries = 2\n                complete_query_later = False\n                search_simbad = False\n\n        If this is a float, will run in \"fast\" mode with the provided timeout\n        value in seconds and the following settings::\n\n                skyview_lookup = True\n                skyview_timeout = fast_mode\n                skyview_retry_failed = False\n                dust_timeout = fast_mode\n                gaia_submit_timeout = 0.66*fast_mode\n                gaia_max_timeout = fast_mode\n                gaia_submit_tries = 2\n                complete_query_later = False\n                search_simbad = False\n\n    gaia_max_timeout : float\n        Sets the timeout in seconds to use when waiting for the GAIA service to\n        respond to our request for the object's information. Note that if\n        `fast_mode` is set, this is ignored.\n\n    gaia_mirror : str or None\n        This sets the GAIA mirror to use. This is a key in the\n        `services.gaia.GAIA_URLS` dict which defines the URLs to hit for each\n        mirror.\n\n    xmatchinfo : str or dict\n        This is either the xmatch dict produced by the function\n        `load_xmatch_external_catalogs` above, or the path to the xmatch info\n        pickle file produced by that function.\n\n    xmatchradiusarcsec : float\n        This is the cross-matching radius to use in arcseconds.\n\n    minobservations : int\n        The minimum of observations the input object's mag/flux time-series must\n        have for this function to plot its light curve and phased light\n        curve. 
If the object has less than this number, no light curves will be\n        plotted, but the checkplotdict will still contain all of the other\n        information.\n\n    sigclip : float or int or sequence of two floats/ints or None\n        If a single float or int, a symmetric sigma-clip will be performed using\n        the number provided as the sigma-multiplier to cut out from the input\n        time-series.\n\n        If a list of two ints/floats is provided, the function will perform an\n        'asymmetric' sigma-clip. The first element in this list is the sigma\n        value to use for fainter flux/mag values; the second element in this\n        list is the sigma value to use for brighter flux/mag values. For\n        example, `sigclip=[10., 3.]`, will sigclip out greater than 10-sigma\n        dimmings and greater than 3-sigma brightenings. Here the meaning of\n        \"dimming\" and \"brightening\" is set by *physics* (not the magnitude\n        system), which is why the `magsarefluxes` kwarg must be correctly set.\n\n        If `sigclip` is None, no sigma-clipping will be performed, and the\n        time-series (with non-finite elems removed) will be passed through to\n        the output.\n\n    lcformat : str\n        This is the `formatkey` associated with your light curve format, which\n        you previously passed in to the `lcproc.register_lcformat`\n        function. This will be used to look up how to find and read the light\n        curves specified in `basedir` or `use_list_of_filenames`.\n\n    lcformatdir : str or None\n        If this is provided, gives the path to a directory when you've stored\n        your lcformat description JSONs, other than the usual directories lcproc\n        knows to search for them in. 
Use this along with `lcformat` to specify\n        an LC format JSON file that's not currently registered with lcproc.\n\n    timecols : list of str or None\n        The timecol keys to use from the lcdict in generating this checkplot.\n\n    magcols : list of str or None\n        The magcol keys to use from the lcdict in generating this checkplot.\n\n    errcols : list of str or None\n        The errcol keys to use from the lcdict in generating this checkplot.\n\n    skipdone : bool\n        This indicates if this function will skip creating checkplots that\n        already exist corresponding to the current `objectid` and `magcol`. If\n        `skipdone` is set to True, this will be done.\n\n    done_callback : Python function or None\n        This is used to provide a function to execute after the checkplot\n        pickles are generated. This is useful if you want to stream the results\n        of checkplot making to some other process, e.g. directly running an\n        ingestion into an LCC-Server collection. The function will always get\n        the list of the generated checkplot pickles as its first arg, and all of\n        the kwargs for runcp in the kwargs dict. 
Additional args and kwargs can\n        be provided by giving a list in the `done_callbacks_args` kwarg and a\n        dict in the `done_callbacks_kwargs` kwarg.\n\n        NOTE: the function you pass in here should be pickleable by normal\n        Python if you want to use it with the parallel_cp and parallel_cp_lcdir\n        functions below.\n\n    done_callback_args : tuple or None\n        If not None, contains any args to pass into the `done_callback`\n        function.\n\n    done_callback_kwargs : dict or None\n        If not None, contains any kwargs to pass into the `done_callback`\n        function.\n\n    maxobjects : int\n        The maximum number of objects to process in this run.\n\n    nworkers : int\n        The number of parallel workers that will work on the checkplot\n        generation process.\n\n    Returns\n    -------\n\n    dict\n        This returns a dict with keys = input period-finding pickles and vals =\n        list of the corresponding checkplot pickles produced.\n\n    '''\n\n    pfpicklelist = sorted(glob.glob(os.path.join(pfpickledir, pfpickleglob)))\n\n    LOGINFO('found %s period-finding pickles, running cp...' 
%\n            len(pfpicklelist))\n\n    return parallel_cp(pfpicklelist,\n                       outdir,\n                       lcbasedir,\n                       fast_mode=fast_mode,\n                       lclistpkl=lclistpkl,\n                       nbrradiusarcsec=nbrradiusarcsec,\n                       gaia_max_timeout=gaia_max_timeout,\n                       gaia_mirror=gaia_mirror,\n                       maxnumneighbors=maxnumneighbors,\n                       makeneighborlcs=makeneighborlcs,\n                       xmatchinfo=xmatchinfo,\n                       xmatchradiusarcsec=xmatchradiusarcsec,\n                       sigclip=sigclip,\n                       minobservations=minobservations,\n                       cprenorm=cprenorm,\n                       maxobjects=maxobjects,\n                       lcformat=lcformat,\n                       lcformatdir=lcformatdir,\n                       timecols=timecols,\n                       magcols=magcols,\n                       errcols=errcols,\n                       skipdone=skipdone,\n                       nworkers=nworkers,\n                       done_callback=done_callback,\n                       done_callback_args=done_callback_args,\n                       done_callback_kwargs=done_callback_kwargs)", "response": "This function drives the parallel execution of runcp for a directory of period-finding pickles."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef runpf(lcfile,\n          outdir,\n          timecols=None,\n          magcols=None,\n          errcols=None,\n          lcformat='hat-sql',\n          lcformatdir=None,\n          pfmethods=('gls','pdm','mav','win'),\n          pfkwargs=({},{},{},{}),\n          sigclip=10.0,\n          getblssnr=False,\n          nworkers=NCPUS,\n          minobservations=500,\n          excludeprocessed=False,\n          raiseonfail=False):\n    '''This runs the period-finding for a single LC.\n\n    Parameters\n    ----------\n\n    lcfile : str\n        The light curve file to run period-finding on.\n\n    outdir : str\n        The output directory where the result pickle will go.\n\n    timecols : list of str or None\n        The timecol keys to use from the lcdict in calculating the features.\n\n    magcols : list of str or None\n        The magcol keys to use from the lcdict in calculating the features.\n\n    errcols : list of str or None\n        The errcol keys to use from the lcdict in calculating the features.\n\n    lcformat : str\n        This is the `formatkey` associated with your light curve format, which\n        you previously passed in to the `lcproc.register_lcformat`\n        function. This will be used to look up how to find and read the light\n        curves specified in `basedir` or `use_list_of_filenames`.\n\n    lcformatdir : str or None\n        If this is provided, gives the path to a directory when you've stored\n        your lcformat description JSONs, other than the usual directories lcproc\n        knows to search for them in. Use this along with `lcformat` to specify\n        an LC format JSON file that's not currently registered with lcproc.\n\n    pfmethods : list of str\n        This is a list of period finding methods to run. Each element is a\n        string matching the keys of the `PFMETHODS` dict above. 
By default, this\n        runs GLS, PDM, AoVMH, and the spectral window Lomb-Scargle periodogram.\n\n    pfkwargs : list of dicts\n        This is used to provide any special kwargs as dicts to each\n        period-finding method function specified in `pfmethods`.\n\n    sigclip : float or int or sequence of two floats/ints or None\n        If a single float or int, a symmetric sigma-clip will be performed using\n        the number provided as the sigma-multiplier to cut out from the input\n        time-series.\n\n        If a list of two ints/floats is provided, the function will perform an\n        'asymmetric' sigma-clip. The first element in this list is the sigma\n        value to use for fainter flux/mag values; the second element in this\n        list is the sigma value to use for brighter flux/mag values. For\n        example, `sigclip=[10., 3.]`, will sigclip out greater than 10-sigma\n        dimmings and greater than 3-sigma brightenings. Here the meaning of\n        \"dimming\" and \"brightening\" is set by *physics* (not the magnitude\n        system), which is why the `magsarefluxes` kwarg must be correctly set.\n\n        If `sigclip` is None, no sigma-clipping will be performed, and the\n        time-series (with non-finite elems removed) will be passed through to\n        the output.\n\n    getblssnr : bool\n        If this is True and BLS is one of the methods specified in `pfmethods`,\n        will also calculate the stats for each best period in the BLS results:\n        transit depth, duration, ingress duration, refit period and epoch, and\n        the SNR of the transit.\n\n    nworkers : int\n        The number of parallel period-finding workers to launch.\n\n    minobservations : int\n        The minimum number of finite LC points required to process a light\n        curve.\n\n    excludeprocessed : bool\n        If this is True, light curves that have existing period-finding result\n        pickles in `outdir` will not be processed.\n\n     
   FIXME: currently, this uses a dumb method of excluding already-processed\n        files. A smarter way to do this is to (i) generate a SHA512 cachekey\n        based on a repr of `{'lcfile', 'timecols', 'magcols', 'errcols',\n        'lcformat', 'pfmethods', 'sigclip', 'getblssnr', 'pfkwargs'}`, (ii) make\n        sure all list kwargs in the dict are sorted, (iii) check if the output\n        file has the same cachekey in its filename (last 8 chars of cachekey\n        should work), so the result was processed in exactly the same way as\n        specifed in the input to this function, and can therefore be\n        ignored. Will implement this later.\n\n    raiseonfail : bool\n        If something fails and this is True, will raise an Exception instead of\n        returning None at the end.\n\n    Returns\n    -------\n\n    str\n        The path to the output period-finding result pickle.\n\n    '''\n\n    try:\n        formatinfo = get_lcformat(lcformat,\n                                  use_lcformat_dir=lcformatdir)\n        if formatinfo:\n            (dfileglob, readerfunc,\n             dtimecols, dmagcols, derrcols,\n             magsarefluxes, normfunc) = formatinfo\n        else:\n            LOGERROR(\"can't figure out the light curve format\")\n            return None\n    except Exception as e:\n        LOGEXCEPTION(\"can't figure out the light curve format\")\n        return None\n\n    # override the default timecols, magcols, and errcols\n    # using the ones provided to the function\n    if timecols is None:\n        timecols = dtimecols\n    if magcols is None:\n        magcols = dmagcols\n    if errcols is None:\n        errcols = derrcols\n\n    try:\n\n        # get the LC into a dict\n        lcdict = readerfunc(lcfile)\n\n        # this should handle lists/tuples being returned by readerfunc\n        # we assume that the first element is the actual lcdict\n        # FIXME: figure out how to not need this assumption\n        if ( 
(isinstance(lcdict, (list, tuple))) and\n             (isinstance(lcdict[0], dict)) ):\n            lcdict = lcdict[0]\n\n        outfile = os.path.join(outdir, 'periodfinding-%s.pkl' %\n                               squeeze(lcdict['objectid']).replace(' ', '-'))\n\n        # if excludeprocessed is True, return the output file if it exists and\n        # has a size that is at least 100 kilobytes (this should be enough to\n        # contain the minimal results of this function).\n        if excludeprocessed:\n\n            test_outfile = os.path.exists(outfile)\n            test_outfile_gz = os.path.exists(outfile+'.gz')\n\n            if (test_outfile and os.stat(outfile).st_size > 102400):\n\n                LOGWARNING('periodfinding result for %s already exists at %s, '\n                           'skipping because excludeprocessed=True'\n                           % (lcfile, outfile))\n                return outfile\n\n            elif (test_outfile_gz and os.stat(outfile+'.gz').st_size > 102400):\n\n                LOGWARNING(\n                    'gzipped periodfinding result for %s already '\n                    'exists at %s, skipping because excludeprocessed=True'\n                    % (lcfile, outfile+'.gz')\n                )\n                return outfile+'.gz'\n\n\n        # this is the final returndict\n        resultdict = {\n            'objectid':lcdict['objectid'],\n            'lcfbasename':os.path.basename(lcfile),\n            'kwargs':{'timecols':timecols,\n                      'magcols':magcols,\n                      'errcols':errcols,\n                      'lcformat':lcformat,\n                      'lcformatdir':lcformatdir,\n                      'pfmethods':pfmethods,\n                      'pfkwargs':pfkwargs,\n                      'sigclip':sigclip,\n                      'getblssnr':getblssnr}\n        }\n\n        # normalize using the special function if specified\n        if normfunc is not None:\n            lcdict = 
normfunc(lcdict)\n\n        for tcol, mcol, ecol in zip(timecols, magcols, errcols):\n\n            # dereference the columns and get them from the lcdict\n            if '.' in tcol:\n                tcolget = tcol.split('.')\n            else:\n                tcolget = [tcol]\n            times = _dict_get(lcdict, tcolget)\n\n            if '.' in mcol:\n                mcolget = mcol.split('.')\n            else:\n                mcolget = [mcol]\n            mags = _dict_get(lcdict, mcolget)\n\n            if '.' in ecol:\n                ecolget = ecol.split('.')\n            else:\n                ecolget = [ecol]\n            errs = _dict_get(lcdict, ecolget)\n\n\n            # normalize here if not using special normalization\n            if normfunc is None:\n                ntimes, nmags = normalize_magseries(\n                    times, mags,\n                    magsarefluxes=magsarefluxes\n                )\n\n                times, mags, errs = ntimes, nmags, errs\n\n\n            # run each of the requested period-finder functions\n            resultdict[mcol] = {}\n\n            # check if we have enough non-nan observations to proceed\n            finmags = mags[np.isfinite(mags)]\n\n            if finmags.size < minobservations:\n\n                LOGERROR('not enough non-nan observations for '\n                         'this LC. have: %s, required: %s, '\n                         'magcol: %s, skipping...' 
%\n                         (finmags.size, minobservations, mcol))\n                continue\n\n            pfmkeys = []\n\n            for pfmind, pfm, pfkw in zip(range(len(pfmethods)),\n                                         pfmethods,\n                                         pfkwargs):\n\n                pf_func = PFMETHODS[pfm]\n\n                # get any optional kwargs for this function\n                pf_kwargs = pfkw\n                pf_kwargs.update({'verbose':False,\n                                  'nworkers':nworkers,\n                                  'magsarefluxes':magsarefluxes,\n                                  'sigclip':sigclip})\n\n                # we'll always prefix things with their index to allow multiple\n                # invocations and results from the same period-finder (for\n                # different period ranges, for example).\n                pfmkey = '%s-%s' % (pfmind, pfm)\n                pfmkeys.append(pfmkey)\n\n                # run this period-finder and save its results to the output dict\n                resultdict[mcol][pfmkey] = pf_func(\n                    times, mags, errs,\n                    **pf_kwargs\n                )\n\n\n            #\n            # done with running the period finders\n            #\n            # append the pfmkeys list to the magcol dict\n            resultdict[mcol]['pfmethods'] = pfmkeys\n\n            # check if we need to get the SNR from any BLS pfresults\n            if 'bls' in pfmethods and getblssnr:\n\n                # we need to scan thru the pfmethods to get to any BLS pfresults\n                for pfmk in resultdict[mcol]['pfmethods']:\n\n                    if 'bls' in pfmk:\n\n                        try:\n\n                            bls = resultdict[mcol][pfmk]\n\n                            # calculate the SNR for the BLS as well\n                            blssnr = bls_snr(bls, times, mags, errs,\n                                             
magsarefluxes=magsarefluxes,\n                                             verbose=False)\n\n                            # add the SNR results to the BLS result dict\n                            resultdict[mcol][pfmk].update({\n                                'snr':blssnr['snr'],\n                                'transitdepth':blssnr['transitdepth'],\n                                'transitduration':blssnr['transitduration'],\n                            })\n\n                            # update the BLS result dict with the refit periods\n                            # and epochs using the results from bls_snr\n                            resultdict[mcol][pfmk].update({\n                                'nbestperiods':blssnr['period'],\n                                'epochs':blssnr['epoch']\n                            })\n\n                        except Exception as e:\n\n                            LOGEXCEPTION('could not calculate BLS SNR for %s' %\n                                         lcfile)\n                            # add the SNR null results to the BLS result dict\n                            resultdict[mcol][pfmk].update({\n                                'snr':[np.nan,np.nan,np.nan,np.nan,np.nan],\n                                'transitdepth':[np.nan,np.nan,np.nan,\n                                                np.nan,np.nan],\n                                'transitduration':[np.nan,np.nan,np.nan,\n                                                   np.nan,np.nan],\n                            })\n\n            elif 'bls' in pfmethods:\n\n                # we need to scan thru the pfmethods to get to any BLS pfresults\n                for pfmk in resultdict[mcol]['pfmethods']:\n\n                    if 'bls' in pfmk:\n\n                        # add the SNR null results to the BLS result dict\n                        resultdict[mcol][pfmk].update({\n                            'snr':[np.nan,np.nan,np.nan,np.nan,np.nan],\n                     
       'transitdepth':[np.nan,np.nan,np.nan,\n                                            np.nan,np.nan],\n                            'transitduration':[np.nan,np.nan,np.nan,\n                                               np.nan,np.nan],\n                        })\n\n\n        # once all mag cols have been processed, write out the pickle\n        with open(outfile, 'wb') as outfd:\n            pickle.dump(resultdict, outfd, protocol=pickle.HIGHEST_PROTOCOL)\n\n        return outfile\n\n    except Exception as e:\n\n        LOGEXCEPTION('failed to run for %s, because: %s' % (lcfile, e))\n\n        if raiseonfail:\n            raise\n\n        return None", "response": "This function runs period-finding on a single light curve file and writes the results to a pickle."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef parallel_pf_lcdir(lcdir,\n                      outdir,\n                      fileglob=None,\n                      recursive=True,\n                      timecols=None,\n                      magcols=None,\n                      errcols=None,\n                      lcformat='hat-sql',\n                      lcformatdir=None,\n                      pfmethods=('gls','pdm','mav','win'),\n                      pfkwargs=({},{},{},{}),\n                      sigclip=10.0,\n                      getblssnr=False,\n                      nperiodworkers=NCPUS,\n                      ncontrolworkers=1,\n                      liststartindex=None,\n                      listmaxobjects=None,\n                      minobservations=500,\n                      excludeprocessed=True):\n    '''This runs parallel light curve period finding for directory of LCs.\n\n    Parameters\n    ----------\n\n    lcdir : str\n        The directory containing the LCs to process.\n\n    outdir : str\n        The directory where the resulting period-finding pickles will go.\n\n    fileglob : str or None\n        The UNIX file glob to use to search for LCs in `lcdir`. 
If None, the\n        default file glob associated with the registered LC format will be used\n        instead.\n\n    recursive : bool\n        If True, will search recursively in `lcdir` for light curves to process.\n\n    timecols : list of str or None\n        The timecol keys to use from the lcdict in calculating the features.\n\n    magcols : list of str or None\n        The magcol keys to use from the lcdict in calculating the features.\n\n    errcols : list of str or None\n        The errcol keys to use from the lcdict in calculating the features.\n\n    lcformat : str\n        This is the `formatkey` associated with your light curve format, which\n        you previously passed in to the `lcproc.register_lcformat`\n        function. This will be used to look up how to find and read the light\n        curves specified in `basedir` or `use_list_of_filenames`.\n\n    lcformatdir : str or None\n        If this is provided, gives the path to a directory when you've stored\n        your lcformat description JSONs, other than the usual directories lcproc\n        knows to search for them in. Use this along with `lcformat` to specify\n        an LC format JSON file that's not currently registered with lcproc.\n\n    pfmethods : list of str\n        This is a list of period finding methods to run. Each element is a\n        string matching the keys of the `PFMETHODS` dict above. 
By default, this\n        runs GLS, PDM, AoVMH, and the spectral window Lomb-Scargle periodogram.\n\n    pfkwargs : list of dicts\n        This is used to provide any special kwargs as dicts to each\n        period-finding method function specified in `pfmethods`.\n\n    sigclip : float or int or sequence of two floats/ints or None\n        If a single float or int, a symmetric sigma-clip will be performed using\n        the number provided as the sigma-multiplier to cut out from the input\n        time-series.\n\n        If a list of two ints/floats is provided, the function will perform an\n        'asymmetric' sigma-clip. The first element in this list is the sigma\n        value to use for fainter flux/mag values; the second element in this\n        list is the sigma value to use for brighter flux/mag values. For\n        example, `sigclip=[10., 3.]`, will sigclip out greater than 10-sigma\n        dimmings and greater than 3-sigma brightenings. Here the meaning of\n        \"dimming\" and \"brightening\" is set by *physics* (not the magnitude\n        system), which is why the `magsarefluxes` kwarg must be correctly set.\n\n        If `sigclip` is None, no sigma-clipping will be performed, and the\n        time-series (with non-finite elems removed) will be passed through to\n        the output.\n\n    getblssnr : bool\n        If this is True and BLS is one of the methods specified in `pfmethods`,\n        will also calculate the stats for each best period in the BLS results:\n        transit depth, duration, ingress duration, refit period and epoch, and\n        the SNR of the transit.\n\n    nperiodworkers : int\n        The number of parallel period-finding workers to launch per object task.\n\n    ncontrolworkers : int\n        The number of controlling processes to launch. 
This effectively sets how\n        many objects from `lclist` will be processed in parallel.\n\n    liststartindex : int or None\n        This sets the index from where to start in `lclist`.\n\n    listmaxobjects : int or None\n        This sets the maximum number of objects in `lclist` to run\n        period-finding for in this invocation. Together with `liststartindex`,\n        `listmaxobjects` can be used to distribute processing over several\n        independent machines if the number of light curves is very large.\n\n    minobservations : int\n        The minimum number of finite LC points required to process a light\n        curve.\n\n    excludeprocessed : bool\n        If this is True, light curves that have existing period-finding result\n        pickles in `outdir` will not be processed.\n\n        FIXME: currently, this uses a dumb method of excluding already-processed\n        files. A smarter way to do this is to (i) generate a SHA512 cachekey\n        based on a repr of `{'lcfile', 'timecols', 'magcols', 'errcols',\n        'lcformat', 'pfmethods', 'sigclip', 'getblssnr', 'pfkwargs'}`, (ii) make\n        sure all list kwargs in the dict are sorted, (iii) check if the output\n        file has the same cachekey in its filename (last 8 chars of cachekey\n        should work), so the result was processed in exactly the same way as\n        specified in the input to this function, and can therefore be\n        ignored. 
Will implement this later.\n\n    Returns\n    -------\n\n    list of str\n        A list of the period-finding pickles created for all of input LCs\n        processed.\n\n    '''\n\n    try:\n        formatinfo = get_lcformat(lcformat,\n                                  use_lcformat_dir=lcformatdir)\n        if formatinfo:\n            (dfileglob, readerfunc,\n             dtimecols, dmagcols, derrcols,\n             magsarefluxes, normfunc) = formatinfo\n        else:\n            LOGERROR(\"can't figure out the light curve format\")\n            return None\n    except Exception as e:\n        LOGEXCEPTION(\"can't figure out the light curve format\")\n        return None\n\n    if not fileglob:\n        fileglob = dfileglob\n\n    # now find the files\n    LOGINFO('searching for %s light curves in %s ...' % (lcformat, lcdir))\n\n    if recursive is False:\n        matching = glob.glob(os.path.join(lcdir, fileglob))\n\n    else:\n        # use recursive glob for Python 3.5+\n        if sys.version_info[:2] > (3,4):\n\n            matching = glob.glob(os.path.join(lcdir,\n                                              '**',\n                                              fileglob),recursive=True)\n\n        # otherwise, use os.walk and glob\n        else:\n\n            # use os.walk to go through the directories\n            walker = os.walk(lcdir)\n            matching = []\n\n            for root, dirs, _files in walker:\n                for sdir in dirs:\n                    searchpath = os.path.join(root,\n                                              sdir,\n                                              fileglob)\n                    foundfiles = glob.glob(searchpath)\n\n                    if foundfiles:\n                        matching.extend(foundfiles)\n\n\n    # now that we have all the files, process them\n    if matching and len(matching) > 0:\n\n        # this helps us process things in deterministic order when we distribute\n        # processing over 
several machines\n        matching = sorted(matching)\n\n        LOGINFO('found %s light curves, running pf...' % len(matching))\n\n        return parallel_pf(matching,\n                           outdir,\n                           timecols=timecols,\n                           magcols=magcols,\n                           errcols=errcols,\n                           lcformat=lcformat,\n                           lcformatdir=lcformatdir,\n                           pfmethods=pfmethods,\n                           pfkwargs=pfkwargs,\n                           getblssnr=getblssnr,\n                           sigclip=sigclip,\n                           nperiodworkers=nperiodworkers,\n                           ncontrolworkers=ncontrolworkers,\n                           liststartindex=liststartindex,\n                           listmaxobjects=listmaxobjects,\n                           minobservations=minobservations,\n                           excludeprocessed=excludeprocessed)\n\n    else:\n\n        LOGERROR('no light curve files in %s format found in %s' % (lcformat,\n                                                                    lcdir))\n        return None", "response": "This function runs parallel light curve period-finding for a directory of LCs."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef collect_nonperiodic_features(\n        featuresdir,\n        magcol,\n        outfile,\n        pklglob='varfeatures-*.pkl',\n        featurestouse=NONPERIODIC_FEATURES_TO_COLLECT,\n        maxobjects=None,\n        labeldict=None,\n        labeltype='binary',\n):\n    '''This collects variability features into arrays for use with the classifier.\n\n    Parameters\n    ----------\n\n    featuresdir : str\n        This is the directory where all the varfeatures pickles are. Use\n        `pklglob` to specify the glob to search for. The `varfeatures` pickles\n        contain objectids, a light curve magcol, and features as dict\n        key-vals. The :py:mod:`astrobase.lcproc.lcvfeatures` module can be used\n        to produce these.\n\n    magcol : str\n        This is the key in each varfeatures pickle corresponding to the magcol\n        of the light curve the variability features were extracted from.\n\n    outfile : str\n        This is the filename of the output pickle that will be written\n        containing a dict of all the features extracted into np.arrays.\n\n    pklglob : str\n        This is the UNIX file glob to use to search for varfeatures pickle files\n        in `featuresdir`.\n\n    featurestouse : list of str\n        Each varfeatures pickle can contain any combination of non-periodic,\n        stellar, and periodic features; these must have the same names as\n        elements in the list of strings provided in `featurestouse`.  This tries\n        to get all the features listed in NONPERIODIC_FEATURES_TO_COLLECT by\n        default. If `featurestouse` is provided as a list, gets only the\n        features listed in this kwarg instead.\n\n    maxobjects : int or None\n        This controls how many pickles from the featuresdir to process. 
If None,\n        will process all varfeatures pickles.\n\n    labeldict : dict or None\n        If this is provided, it must be a dict with the following key:val list::\n\n            '<objectid>':<label value>\n\n        for each objectid collected from the varfeatures pickles. This will turn\n        the collected information into a training set for classifiers.\n\n        Example: to carry out non-periodic variable feature collection of fake\n        LCS prepared by :py:mod:`astrobase.fakelcs.generation`, use the value\n        of the 'isvariable' dict elem from the `fakelcs-info.pkl` here, like\n        so::\n\n            labeldict={x:y for x,y in zip(fakelcinfo['objectid'],\n                                          fakelcinfo['isvariable'])}\n\n    labeltype : {'binary', 'classes'}\n        This is either 'binary' or 'classes' for binary/multi-class\n        classification respectively.\n\n    Returns\n    -------\n\n    dict\n        This returns a dict with all of the features collected into np.arrays,\n        ready to use as input to a scikit-learn classifier.\n\n    '''\n\n    # list of input pickles generated by varfeatures in lcproc.py\n    pklist = glob.glob(os.path.join(featuresdir, pklglob))\n\n    if maxobjects:\n        pklist = pklist[:maxobjects]\n\n\n    # fancy progress bar with tqdm if present\n    if TQDM:\n        listiterator = tqdm(pklist)\n    else:\n        listiterator = pklist\n\n    # go through all the varfeatures arrays\n\n    feature_dict = {'objectids':[],'magcol':magcol, 'availablefeatures':[]}\n\n    LOGINFO('collecting features for magcol: %s' % magcol)\n\n    for pkl in listiterator:\n\n        with open(pkl,'rb') as infd:\n            varf = pickle.load(infd)\n\n        # update the objectid list\n        objectid = varf['objectid']\n        if objectid not in feature_dict['objectids']:\n            feature_dict['objectids'].append(objectid)\n\n        thisfeatures = varf[magcol]\n\n        if featurestouse and 
len(featurestouse) > 0:\n            featurestoget = featurestouse\n        else:\n            featurestoget = NONPERIODIC_FEATURES_TO_COLLECT\n\n        # collect all the features for this magcol/objectid combination\n        for feature in featurestoget:\n\n            # update the global feature list if necessary\n            if ((feature not in feature_dict['availablefeatures']) and\n                (feature in thisfeatures)):\n\n                feature_dict['availablefeatures'].append(feature)\n                feature_dict[feature] = []\n\n            if feature in thisfeatures:\n\n                feature_dict[feature].append(\n                    thisfeatures[feature]\n                )\n\n    # now that we've collected all the objects and their features, turn the list\n    # into arrays, and then concatenate them\n    for feat in feature_dict['availablefeatures']:\n        feature_dict[feat] = np.array(feature_dict[feat])\n\n    feature_dict['objectids'] = np.array(feature_dict['objectids'])\n\n    feature_array = np.column_stack([feature_dict[feat] for feat in\n                                     feature_dict['availablefeatures']])\n    feature_dict['features_array'] = feature_array\n\n\n    # if there's a labeldict available, use it to generate a label array. 
this\n    # feature collection is now a training set.\n    if isinstance(labeldict, dict):\n\n        labelarray = np.zeros(feature_dict['objectids'].size, dtype=np.int64)\n\n        # populate the labels for each object in the training set\n        for ind, objectid in enumerate(feature_dict['objectids']):\n\n            if objectid in labeldict:\n\n                # if this is a binary classifier training set, convert bools to\n                # ones and zeros\n                if labeltype == 'binary':\n\n                    if labeldict[objectid]:\n                        labelarray[ind] = 1\n\n                # otherwise, use the actual class label integer\n                elif labeltype == 'classes':\n                    labelarray[ind] = labeldict[objectid]\n\n        feature_dict['labels_array'] = labelarray\n\n\n    feature_dict['kwargs'] = {'pklglob':pklglob,\n                              'featurestouse':featurestouse,\n                              'maxobjects':maxobjects,\n                              'labeltype':labeltype}\n\n    # write the info to the output pickle\n    with open(outfile,'wb') as outfd:\n        pickle.dump(feature_dict, outfd, pickle.HIGHEST_PROTOCOL)\n\n    # return the feature_dict\n    return feature_dict", "response": "This function collects non-periodic variability features from varfeatures pickles into a dict of np.arrays for use with a classifier, writes the dict to a pickle, and returns it."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef train_rf_classifier(\n        collected_features,\n        test_fraction=0.25,\n        n_crossval_iterations=20,\n        n_kfolds=5,\n        crossval_scoring_metric='f1',\n        classifier_to_pickle=None,\n        nworkers=-1,\n):\n\n    '''This gets the best RF classifier after running cross-validation.\n\n    - splits the training set into test/train samples\n    - does `KFold` stratified cross-validation using `RandomizedSearchCV`\n    - gets the `RandomForestClassifier` with the best performance after CV\n    - gets the confusion matrix for the test set\n\n    Runs on the output dict from functions that produce dicts similar to that\n    produced by `collect_nonperiodic_features` above.\n\n    Parameters\n    ----------\n\n    collected_features : dict or str\n        This is either the dict produced by a `collect_*_features` function or\n        the pickle produced by the same.\n\n    test_fraction : float\n        This sets the fraction of the input set that will be used as the\n        test set after training.\n\n    n_crossval_iterations : int\n        This sets the number of iterations to use when running the\n        cross-validation.\n\n    n_kfolds : int\n        This sets the number of K-folds to use on the data when doing a\n        test-train split.\n\n    crossval_scoring_metric : str\n        This is a string that describes how the cross-validation score is\n        calculated for each iteration. See the URL below for how to specify this\n        parameter:\n\n        http://scikit-learn.org/stable/modules/model_evaluation.html#scoring-parameter\n        By default, this is tuned for binary classification and uses the F1\n        scoring metric. Change the `crossval_scoring_metric` to another metric\n        (probably 'accuracy') for multi-class classification, e.g. 
for periodic\n        variable classification.\n\n    classifier_to_pickle : str\n        If this is a string indicating the name of a pickle file to write, will\n        write the trained classifier to the pickle that can be later loaded and\n        used to classify data.\n\n    nworkers : int\n        This is the number of parallel workers to use in the\n        RandomForestClassifier. Set to -1 to use all CPUs on your machine.\n\n    Returns\n    -------\n\n    dict\n        A dict containing the trained classifier, cross-validation results, the\n        input data set, and all input kwargs used is returned, along with\n        cross-validation score metrics.\n\n    '''\n\n    if (isinstance(collected_features,str) and\n        os.path.exists(collected_features)):\n        with open(collected_features,'rb') as infd:\n            fdict = pickle.load(infd)\n    elif isinstance(collected_features, dict):\n        fdict = collected_features\n    else:\n        LOGERROR(\"can't figure out the input collected_features arg\")\n        return None\n\n    tfeatures = fdict['features_array']\n    tlabels = fdict['labels_array']\n    tfeaturenames = fdict['availablefeatures']\n    tmagcol = fdict['magcol']\n    tobjectids = fdict['objectids']\n\n\n    # split the training set into training/test samples using stratification\n    # to keep the same fraction of variable/nonvariables in each\n    training_features, testing_features, training_labels, testing_labels = (\n        train_test_split(\n            tfeatures,\n            tlabels,\n            test_size=test_fraction,\n            random_state=RANDSEED,\n            stratify=tlabels\n        )\n    )\n\n    # get a random forest classifier\n    clf = RandomForestClassifier(n_jobs=nworkers,\n                                 random_state=RANDSEED)\n\n\n    # this is the grid def for hyperparam optimization\n    rf_hyperparams = {\n        \"max_depth\": [3,4,5,None],\n        \"n_estimators\":sp_randint(100,2000),\n    
    \"max_features\": sp_randint(1, 5),\n        \"min_samples_split\": sp_randint(2, 11),\n        \"min_samples_leaf\": sp_randint(2, 11),\n    }\n\n    # run the stratified kfold cross-validation on training features using our\n    # random forest classifier object\n    cvsearch = RandomizedSearchCV(\n        clf,\n        param_distributions=rf_hyperparams,\n        n_iter=n_crossval_iterations,\n        scoring=crossval_scoring_metric,\n        cv=StratifiedKFold(n_splits=n_kfolds,\n                           shuffle=True,\n                           random_state=RANDSEED),\n        random_state=RANDSEED\n    )\n\n\n    LOGINFO('running grid-search CV to optimize RF hyperparameters...')\n    cvsearch_classifiers = cvsearch.fit(training_features,\n                                        training_labels)\n\n    # report on the classifiers' performance\n    _gridsearch_report(cvsearch_classifiers.cv_results_)\n\n    # get the best classifier after CV is done\n    bestclf = cvsearch_classifiers.best_estimator_\n    bestclf_score = cvsearch_classifiers.best_score_\n    bestclf_hyperparams = cvsearch_classifiers.best_params_\n\n    # test this classifier on the testing set\n    test_predicted_labels = bestclf.predict(testing_features)\n\n    recscore = recall_score(testing_labels, test_predicted_labels)\n    precscore = precision_score(testing_labels,test_predicted_labels)\n    f1score = f1_score(testing_labels, test_predicted_labels)\n    confmatrix = confusion_matrix(testing_labels, test_predicted_labels)\n\n    # write the classifier, its training/testing set, and its stats to the\n    # pickle if requested\n    outdict = {'features':tfeatures,\n               'labels':tlabels,\n               'feature_names':tfeaturenames,\n               'magcol':tmagcol,\n               'objectids':tobjectids,\n               'kwargs':{'test_fraction':test_fraction,\n                         'n_crossval_iterations':n_crossval_iterations,\n                         
'n_kfolds':n_kfolds,\n                         'crossval_scoring_metric':crossval_scoring_metric,\n                         'nworkers':nworkers},\n               'collect_kwargs':fdict['kwargs'],\n               'testing_features':testing_features,\n               'testing_labels':testing_labels,\n               'training_features':training_features,\n               'training_labels':training_labels,\n               'best_classifier':bestclf,\n               'best_score':bestclf_score,\n               'best_hyperparams':bestclf_hyperparams,\n               'best_recall':recscore,\n               'best_precision':precscore,\n               'best_f1':f1score,\n               'best_confmatrix':confmatrix}\n\n\n    if classifier_to_pickle:\n\n        with open(classifier_to_pickle,'wb') as outfd:\n            pickle.dump(outdict, outfd, pickle.HIGHEST_PROTOCOL)\n\n\n    # return this classifier and accompanying info\n    return outdict", "response": "Train the RF classifier on the input set."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef apply_rf_classifier(classifier,\n                        varfeaturesdir,\n                        outpickle,\n                        maxobjects=None):\n    '''This applies an RF classifier trained using `train_rf_classifier`\n    to varfeatures pickles in `varfeaturesdir`.\n\n    Parameters\n    ----------\n\n    classifier : dict or str\n        This is the output dict or pickle created by `train_rf_classifier`. This\n        will contain a `feature_names` key that will be used to collect the same\n        features used to train the classifier from the varfeatures pickles in\n        varfeaturesdir.\n\n    varfeaturesdir : str\n        The directory containing the varfeatures pickles for objects that will\n        be classified by the trained `classifier`.\n\n    outpickle : str\n        This is a filename for the pickle that will be written containing the\n        result dict from this function.\n\n    maxobjects : int\n        This sets the number of objects to process in `varfeaturesdir`.\n\n    Returns\n    -------\n\n    dict\n        The classification results after running the trained `classifier` are\n        returned as a dict. 
This contains predicted labels and their prediction\n        probabilities.\n\n    '''\n\n    if isinstance(classifier,str) and os.path.exists(classifier):\n        with open(classifier,'rb') as infd:\n            clfdict = pickle.load(infd)\n    elif isinstance(classifier, dict):\n        clfdict = classifier\n    else:\n        LOGERROR(\"can't figure out the input classifier arg\")\n        return None\n\n\n    # get the features to extract from clfdict\n    if 'feature_names' not in clfdict:\n        LOGERROR(\"feature_names not present in classifier input, \"\n                 \"can't figure out which ones to extract from \"\n                 \"varfeature pickles in %s\" % varfeaturesdir)\n        return None\n\n    # get the feature labeltype, pklglob, and maxobjects from classifier's\n    # collect_kwargs elem.\n    featurestouse = clfdict['feature_names']\n    pklglob = clfdict['collect_kwargs']['pklglob']\n    magcol = clfdict['magcol']\n\n\n    # extract the features used by the classifier from the varfeatures pickles\n    # in varfeaturesdir using the pklglob provided\n    featfile = os.path.join(\n        os.path.dirname(outpickle),\n        'actual-collected-features.pkl'\n    )\n\n    features = collect_nonperiodic_features(\n        varfeaturesdir,\n        magcol,\n        featfile,\n        pklglob=pklglob,\n        featurestouse=featurestouse,\n        maxobjects=maxobjects\n    )\n\n    # now use the trained classifier on these features\n    bestclf = clfdict['best_classifier']\n\n    predicted_labels = bestclf.predict(features['features_array'])\n\n    # FIXME: do we need to use the probability calibration curves to fix these\n    # probabilities? probably. 
figure out how to do this.\n    predicted_label_probs = bestclf.predict_proba(\n        features['features_array']\n    )\n\n    outdict = {\n        'features':features,\n        'featfile':featfile,\n        'classifier':clfdict,\n        'predicted_labels':predicted_labels,\n        'predicted_label_probs':predicted_label_probs,\n    }\n\n    with open(outpickle,'wb') as outfd:\n        pickle.dump(outdict, outfd, pickle.HIGHEST_PROTOCOL)\n\n    return outdict", "response": "This function applies an RF classifier trained using train_rf_classifier to the varfeatures pickles in varfeaturesdir and returns the predicted labels and their probabilities as a dict."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef plot_training_results(classifier,\n                          classlabels,\n                          outfile):\n    '''This plots the training results from the classifier run on the training\n    set.\n\n    - plots the confusion matrix\n\n    - plots the feature importances\n\n    - FIXME: plot the learning curves too, see:\n      http://scikit-learn.org/stable/modules/learning_curve.html\n\n    Parameters\n    ----------\n\n    classifier : dict or str\n        This is the output dict or pickle created by `get_rf_classifier`\n        containing the trained classifier.\n\n    classlabels : list of str\n        This contains all of the class labels for the current classification\n        problem.\n\n    outfile : str\n        This is the filename where the plots will be written.\n\n    Returns\n    -------\n\n    str\n        The path to the generated plot file.\n\n    '''\n\n    if isinstance(classifier,str) and os.path.exists(classifier):\n        with open(classifier,'rb') as infd:\n            clfdict = pickle.load(infd)\n    elif isinstance(classifier, dict):\n        clfdict = classifier\n    else:\n        LOGERROR(\"can't figure out the input classifier arg\")\n        return None\n\n    confmatrix = clfdict['best_confmatrix']\n    overall_feature_importances = clfdict[\n        'best_classifier'\n    ].feature_importances_\n    feature_importances_per_tree = np.array([\n        tree.feature_importances_\n        for tree in clfdict['best_classifier'].estimators_\n    ])\n    stdev_feature_importances = np.std(feature_importances_per_tree,axis=0)\n\n    feature_names = np.array(clfdict['feature_names'])\n\n    plt.figure(figsize=(6.4*3.0,4.8))\n\n    # confusion matrix\n    plt.subplot(121)\n    classes = np.array(classlabels)\n    plt.imshow(confmatrix, interpolation='nearest', cmap=plt.cm.Blues)\n    tick_marks = np.arange(len(classes))\n    
plt.xticks(tick_marks, classes)\n    plt.yticks(tick_marks, classes)\n    plt.title('evaluation set confusion matrix')\n    plt.ylabel('actual class')\n    plt.xlabel('predicted class')\n\n    thresh = confmatrix.max() / 2.\n    for i, j in itertools.product(range(confmatrix.shape[0]),\n                                  range(confmatrix.shape[1])):\n        plt.text(j, i, confmatrix[i, j],\n                 horizontalalignment=\"center\",\n                 color=\"white\" if confmatrix[i, j] > thresh else \"black\")\n\n    # feature importances\n    plt.subplot(122)\n\n    features = np.array(feature_names)\n    sorted_ind = np.argsort(overall_feature_importances)[::-1]\n\n    features = features[sorted_ind]\n    feature_names = feature_names[sorted_ind]\n    overall_feature_importances = overall_feature_importances[sorted_ind]\n    stdev_feature_importances = stdev_feature_importances[sorted_ind]\n\n    plt.bar(np.arange(0,features.size),\n            overall_feature_importances,\n            yerr=stdev_feature_importances,\n            width=0.8,\n            color='grey')\n    plt.xticks(np.arange(0,features.size),\n               features,\n               rotation=90)\n    plt.yticks([0.0,0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9,1.0])\n    plt.xlim(-0.75, features.size - 1.0 + 0.75)\n    plt.ylim(0.0,0.9)\n    plt.ylabel('relative importance')\n    plt.title('relative importance of features')\n\n    plt.subplots_adjust(wspace=0.1)\n\n    plt.savefig(outfile,\n                bbox_inches='tight',\n                dpi=100)\n    plt.close('all')\n    return outfile", "response": "This function plots the training results from the classifier run on the training set."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef _fourier_residual(fourierparams,\n                      phase,\n                      mags):\n    '''\n    This is the residual objective function to be minimized by\n    `scipy.optimize.leastsq`.\n\n    The parameters are the same as `_fourier_func` above.\n\n    '''\n\n    f = _fourier_func(fourierparams, phase, mags)\n    residual = mags - f\n\n    return residual", "response": "This function computes the residual between the observed mags and the Fourier model, for use as a least-squares objective."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef plot_phased_magseries(times,\n                          mags,\n                          period,\n                          epoch='min',\n                          fitknotfrac=0.01,\n                          errs=None,\n                          magsarefluxes=False,\n                          normto='globalmedian',\n                          normmingap=4.0,\n                          sigclip=30.0,\n                          phasewrap=True,\n                          phasesort=True,\n                          phasebin=None,\n                          plotphaselim=(-0.8,0.8),\n                          yrange=None,\n                          xtimenotphase=False,\n                          xaxlabel='phase',\n                          yaxlabel=None,\n                          modelmags=None,\n                          modeltimes=None,\n                          modelerrs=None,\n                          outfile=None,\n                          plotdpi=100):\n    '''Plots a phased magnitude/flux time-series using the period provided.\n\n    Parameters\n    ----------\n\n    times,mags : np.array\n        The mag/flux time-series to plot as a function of phase given `period`.\n\n    period : float\n        The period to use to phase-fold the time-series. Should be the same unit\n        as `times` (usually in days)\n\n    epoch : 'min' or float or None\n        This indicates how to get the epoch to use for phasing the light curve:\n\n        - If None, uses the `min(times)` as the epoch for phasing.\n\n        - If epoch is the string 'min', then fits a cubic spline to the phased\n          light curve using `min(times)` as the initial epoch, finds the\n          magnitude/flux minimum of this phased light curve fit, and finally\n          uses the that time value as the epoch. 
This is useful for plotting\n          planetary transits and eclipsing binary phased light curves so that\n          phase 0.0 corresponds to the mid-center time of primary eclipse (or\n          transit).\n\n        - If epoch is a float, then uses that directly to phase the light\n          curve and as the epoch of the phased mag series plot.\n\n    fitknotfrac : float\n        If `epoch='min'`, this function will attempt to fit a cubic spline to\n        the phased light curve to find a time of light minimum as phase\n        0.0. This kwarg sets the number of knots to generate the spline as a\n        fraction of the total number of measurements in the input\n        time-series. By default, this is set so that 100 knots are used to\n        generate a spline for fitting the phased light curve consisting of 10000\n        measurements.\n\n    errs : np.array or None\n        If this is provided, contains the measurement errors associated with\n        each measurement of flux/mag in time-series. Providing this kwarg will\n        add errbars to the output plot.\n\n    magsarefluxes : bool\n        Indicates if the input `mags` array is actually an array of flux\n        measurements instead of magnitude measurements. If this is set to True,\n        then the plot y-axis will be set as appropriate for mag or fluxes.\n\n    normto : {'globalmedian', 'zero'} or a float\n        Sets the normalization target::\n\n          'globalmedian' -> norms each mag to the global median of the LC column\n          'zero'         -> norms each mag to zero\n          a float        -> norms each mag to this specified float value.\n\n    normmingap : float\n        This defines how much the difference between consecutive measurements is\n        allowed to be to consider them as parts of different timegroups. 
By\n        default it is set to 4.0 days.\n\n    sigclip : float or int or sequence of two floats/ints or None\n        If a single float or int, a symmetric sigma-clip will be performed using\n        the number provided as the sigma-multiplier to cut out from the input\n        time-series.\n\n        If a list of two ints/floats is provided, the function will perform an\n        'asymmetric' sigma-clip. The first element in this list is the sigma\n        value to use for fainter flux/mag values; the second element in this\n        list is the sigma value to use for brighter flux/mag values. For\n        example, `sigclip=[10., 3.]`, will sigclip out greater than 10-sigma\n        dimmings and greater than 3-sigma brightenings. Here the meaning of\n        \"dimming\" and \"brightening\" is set by *physics* (not the magnitude\n        system), which is why the `magsarefluxes` kwarg must be correctly set.\n\n        If `sigclip` is None, no sigma-clipping will be performed, and the\n        time-series (with non-finite elems removed) will be passed through to\n        the output.\n\n    phasewrap : bool\n        If this is True, the phased time-series will be wrapped around phase\n        0.0.\n\n    phasesort : bool\n        If this is True, the phased time-series will be sorted in phase.\n\n    phasebin : float or None\n        If this is provided, indicates the bin size to use to group together\n        measurements closer than this amount in phase. This is in units of\n        phase. The binned phased light curve will be overplotted on top of the\n        phased light curve. Useful for when one has many measurement points and\n        needs to pick out a small trend in an otherwise noisy phased light\n        curve.\n\n    plotphaselim : sequence of two floats or None\n        The x-axis limits to use when making the phased light curve plot. 
By\n        default, this is (-0.8, 0.8), which places phase 0.0 at the center of\n        the plot and covers approximately two cycles in phase to make any trends\n        clear.\n\n    yrange : list of two floats or None\n        This is used to provide a custom y-axis range to the plot. If None, will\n        automatically determine y-axis range.\n\n    xtimenotphase : bool\n        If True, the x-axis gets units of time (multiplies phase by period).\n\n    xaxlabel : str\n        Sets the label for the x-axis.\n\n    yaxlabel : str or None\n        Sets the label for the y-axis. If this is None, the appropriate label\n        will be used based on the value of the `magsarefluxes` kwarg.\n\n    modeltimes,modelmags,modelerrs : np.array or None\n        If all of these are provided, then this function will overplot the\n        values of modeltimes and modelmags on top of the actual phased light\n        curve. This is useful for plotting variability models on top of the\n        light curve (e.g. plotting a Mandel-Agol transit model over the actual\n        phased light curve. 
These arrays will be phased using the already\n        provided period and epoch.\n\n    outfile : str or StringIO/BytesIO or matplotlib.axes.Axes or None\n        - a string filename for the file where the plot will be written.\n        - a StringIO/BytesIO object to where the plot will be written.\n        - a matplotlib.axes.Axes object to where the plot will be written.\n        - if None, plots to 'magseries-phased-plot.png' in current dir.\n\n    plotdpi : int\n        Sets the resolution in DPI for PNG plots (default = 100).\n\n    Returns\n    -------\n\n    str or StringIO/BytesIO or matplotlib.axes.Axes\n        This returns based on the input:\n\n        - If `outfile` is a str or None, the path to the generated plot file is\n          returned.\n        - If `outfile` is a StringIO/BytesIO object, will return the\n          StringIO/BytesIO object to which the plot was written.\n        - If `outfile` is a matplotlib.axes.Axes object, will return the Axes\n          object with the plot elements added to it. 
One can then directly\n          include this Axes object in some other Figure.\n\n    '''\n\n    # sigclip the magnitude timeseries\n    stimes, smags, serrs = sigclip_magseries(times,\n                                             mags,\n                                             errs,\n                                             magsarefluxes=magsarefluxes,\n                                             sigclip=sigclip)\n\n\n    # check if we need to normalize\n    if normto is not False:\n        stimes, smags = normalize_magseries(stimes, smags,\n                                            normto=normto,\n                                            magsarefluxes=magsarefluxes,\n                                            mingap=normmingap)\n\n        if ( isinstance(modelmags, np.ndarray) and\n             isinstance(modeltimes, np.ndarray) ):\n\n            stimes, smags = normalize_magseries(modeltimes, modelmags,\n                                                normto=normto,\n                                                magsarefluxes=magsarefluxes,\n                                                mingap=normmingap)\n\n    # figure out the epoch, if it's None, use the min of the time\n    if epoch is None:\n        epoch = stimes.min()\n\n    # if the epoch is 'min', then fit a spline to the light curve phased\n    # using the min of the time, find the fit mag minimum and use the time for\n    # that as the epoch\n    elif isinstance(epoch, str) and epoch == 'min':\n\n        try:\n            spfit = spline_fit_magseries(stimes, smags, serrs, period,\n                                         knotfraction=fitknotfrac)\n            epoch = spfit['fitinfo']['fitepoch']\n            if len(epoch) != 1:\n                epoch = epoch[0]\n        except Exception as e:\n            LOGEXCEPTION('spline fit failed, using min(times) as epoch')\n            epoch = npmin(stimes)\n\n\n    # now phase the data light curve (and optionally, phase bin the light 
curve)\n    if errs is not None:\n\n        phasedlc = phase_magseries_with_errs(stimes, smags, serrs, period,\n                                             epoch, wrap=phasewrap,\n                                             sort=phasesort)\n        plotphase = phasedlc['phase']\n        plotmags = phasedlc['mags']\n        ploterrs = phasedlc['errs']\n\n        # if we're supposed to bin the phases, do so\n        if phasebin:\n\n            binphasedlc = phase_bin_magseries_with_errs(plotphase, plotmags,\n                                                        ploterrs,\n                                                        binsize=phasebin)\n            binplotphase = binphasedlc['binnedphases']\n            binplotmags = binphasedlc['binnedmags']\n            binploterrs = binphasedlc['binnederrs']\n\n    else:\n\n        phasedlc = phase_magseries(stimes, smags, period, epoch,\n                                   wrap=phasewrap, sort=phasesort)\n        plotphase = phasedlc['phase']\n        plotmags = phasedlc['mags']\n        ploterrs = None\n\n        # if we're supposed to bin the phases, do so\n        if phasebin:\n\n            binphasedlc = phase_bin_magseries(plotphase,\n                                              plotmags,\n                                              binsize=phasebin)\n            binplotphase = binphasedlc['binnedphases']\n            binplotmags = binphasedlc['binnedmags']\n            binploterrs = None\n\n    # phase the model light curve\n    modelplotphase, modelplotmags = None, None\n\n    if ( isinstance(modelerrs,np.ndarray) and\n         isinstance(modeltimes,np.ndarray) and\n         isinstance(modelmags,np.ndarray) ):\n\n        modelphasedlc = phase_magseries_with_errs(modeltimes, modelmags,\n                                                  modelerrs, period, epoch,\n                                                  wrap=phasewrap,\n                                                  sort=phasesort)\n        
modelplotphase = modelphasedlc['phase']\n        modelplotmags = modelphasedlc['mags']\n\n    # note that we never will phase-bin the model (no point).\n    elif ( not isinstance(modelerrs,np.ndarray) and\n           isinstance(modeltimes,np.ndarray) and\n           isinstance(modelmags,np.ndarray) ):\n\n        modelphasedlc = phase_magseries(modeltimes, modelmags, period, epoch,\n                                        wrap=phasewrap, sort=phasesort)\n        modelplotphase = modelphasedlc['phase']\n        modelplotmags = modelphasedlc['mags']\n\n    # finally, make the plots\n\n    # check if the outfile is actually an Axes object\n    if isinstance(outfile, matplotlib.axes.Axes):\n        ax = outfile\n\n    # otherwise, it's just a normal file or StringIO/BytesIO\n    else:\n        fig = plt.figure()\n        fig.set_size_inches(7.5,4.8)\n        ax = plt.gca()\n\n    if xtimenotphase:\n        plotphase *= period\n\n    if phasebin:\n        ax.errorbar(plotphase, plotmags, fmt='o',\n                    color='#B2BEB5',\n                    yerr=ploterrs,\n                    markersize=3.0,\n                    markeredgewidth=0.0,\n                    ecolor='#B2BEB5',\n                    capsize=0)\n        if xtimenotphase:\n            binplotphase *= period\n        ax.errorbar(binplotphase, binplotmags, fmt='bo', yerr=binploterrs,\n                    markersize=5.0, markeredgewidth=0.0, ecolor='#B2BEB5',\n                    capsize=0)\n\n    else:\n        ax.errorbar(plotphase, plotmags, fmt='ko', yerr=ploterrs,\n                    markersize=3.0, markeredgewidth=0.0, ecolor='#B2BEB5',\n                    capsize=0)\n\n    if (isinstance(modelplotphase, np.ndarray) and\n        isinstance(modelplotmags, np.ndarray)):\n\n        if xtimenotphase:\n            modelplotphase *= period\n        ax.plot(modelplotphase, modelplotmags, zorder=5, linewidth=2,\n                alpha=0.9, color='#181c19')\n\n    # make a grid\n    
ax.grid(color='#a9a9a9',\n            alpha=0.9,\n            zorder=0,\n            linewidth=1.0,\n            linestyle=':')\n\n    # make lines for phase 0.0, 0.5, and -0.5\n    ax.axvline(0.0,alpha=0.9,linestyle='dashed',color='g')\n    if not xtimenotphase:\n        ax.axvline(-0.5,alpha=0.9,linestyle='dashed',color='g')\n        ax.axvline(0.5,alpha=0.9,linestyle='dashed',color='g')\n    else:\n        ax.axvline(-period*0.5,alpha=0.9,linestyle='dashed',color='g')\n        ax.axvline(period*0.5,alpha=0.9,linestyle='dashed',color='g')\n\n    # fix the ticks to use no offsets\n    ax.get_yaxis().get_major_formatter().set_useOffset(False)\n    ax.get_xaxis().get_major_formatter().set_useOffset(False)\n\n    # get the yrange\n    if yrange and isinstance(yrange,(list,tuple)) and len(yrange) == 2:\n        ymin, ymax = yrange\n    else:\n        ymin, ymax = ax.get_ylim()\n\n    # set the y axis labels and range\n    if not yaxlabel:\n        if not magsarefluxes:\n            ax.set_ylim(ymax, ymin)\n            yaxlabel = 'magnitude'\n        else:\n            ax.set_ylim(ymin, ymax)\n            yaxlabel = 'flux'\n\n    # set the x axis limit\n    if not plotphaselim:\n        ax.set_xlim((npmin(plotphase)-0.1,\n                     npmax(plotphase)+0.1))\n    else:\n        if xtimenotphase:\n            ax.set_xlim((period*plotphaselim[0],period*plotphaselim[1]))\n        else:\n            ax.set_xlim((plotphaselim[0],plotphaselim[1]))\n\n    # set up the axis labels and plot title\n    ax.set_xlabel(xaxlabel)\n    ax.set_ylabel(yaxlabel)\n    ax.set_title('period: %.6f d - epoch: %.6f' % (period, epoch))\n\n    LOGINFO('using period: %.6f d and epoch: %.6f' % (period, epoch))\n\n    # check if the output filename is actually an instance of StringIO\n    if sys.version_info[:2] < (3,0):\n\n        is_Strio = isinstance(outfile, cStringIO.InputType)\n\n    else:\n\n        is_Strio = isinstance(outfile, Strio)\n\n    # make the figure\n    if (outfile and\n 
       not is_Strio and\n        not isinstance(outfile, matplotlib.axes.Axes)):\n\n        if outfile.endswith('.png'):\n            fig.savefig(outfile, bbox_inches='tight', dpi=plotdpi)\n        else:\n            fig.savefig(outfile, bbox_inches='tight')\n        plt.close()\n        return period, epoch, os.path.abspath(outfile)\n\n    elif outfile and is_Strio:\n\n        fig.savefig(outfile, bbox_inches='tight', dpi=plotdpi, format='png')\n        return outfile\n\n    elif outfile and isinstance(outfile, matplotlib.axes.Axes):\n\n        return outfile\n\n    elif not outfile and dispok:\n\n        plt.show()\n        plt.close()\n        return period, epoch\n\n    else:\n\n        LOGWARNING('no output file specified and no $DISPLAY set, '\n                   'saving to magseries-phased-plot.png in current directory')\n        outfile = 'magseries-phased-plot.png'\n        plt.savefig(outfile, bbox_inches='tight', dpi=plotdpi)\n        plt.close()\n        return period, epoch, os.path.abspath(outfile)", "response": "Plots a phased magnitude/flux time-series using the given period."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef skyview_stamp(ra, decl,\n                  survey='DSS2 Red',\n                  scaling='Linear',\n                  flip=True,\n                  convolvewith=None,\n                  forcefetch=False,\n                  cachedir='~/.astrobase/stamp-cache',\n                  timeout=10.0,\n                  retry_failed=False,\n                  savewcsheader=True,\n                  verbose=False):\n    '''This downloads a DSS FITS stamp centered on the coordinates specified.\n\n    This wraps the function :py:func:`astrobase.services.skyview.get_stamp`,\n    which downloads Digitized Sky Survey stamps in FITS format from the NASA\n    SkyView service:\n\n    https://skyview.gsfc.nasa.gov/current/cgi/query.pl\n\n    Also adds some useful operations on top of the FITS file returned.\n\n    Parameters\n    ----------\n\n    ra,decl : float\n        The center coordinates for the stamp in decimal degrees.\n\n    survey : str\n        The survey name to get the stamp from. This is one of the\n        values in the 'SkyView Surveys' option boxes on the SkyView\n        webpage. Currently, we've only tested using 'DSS2 Red' as the value for\n        this kwarg, but the other ones should work in principle.\n\n    scaling : str\n        This is the pixel value scaling function to use.\n\n    flip : bool\n        Will flip the downloaded image top to bottom. This should usually be\n        True because matplotlib and FITS have different image coord origin\n        conventions. 
Alternatively, set this to False and use the\n        `origin='lower'` in any call to `matplotlib.pyplot.imshow` when plotting\n        this image.\n\n    convolvewith : astropy.convolution Kernel object or None\n        If `convolvewith` is an astropy.convolution Kernel object from:\n\n        http://docs.astropy.org/en/stable/convolution/kernels.html\n\n        then, this function will return the stamp convolved with that\n        kernel. This can be useful to see effects of wide-field telescopes (like\n        the HATNet and HATSouth lenses) degrading the nominal 1 arcsec/px of\n        DSS, causing blending of targets and any variability.\n\n    forcefetch : bool\n        If True, will disregard any existing cached copies of the stamp already\n        downloaded corresponding to the requested center coordinates and\n        redownload the FITS from the SkyView service.\n\n    cachedir : str\n        This is the path to the astrobase cache directory. All downloaded FITS\n        stamps are stored here as .fits.gz files so we can immediately respond\n        with the cached copy when a request is made for a coordinate center\n        that's already been downloaded.\n\n    timeout : float\n        Sets the timeout in seconds to wait for a response from the NASA SkyView\n        service.\n\n    retry_failed : bool\n        If the initial request to SkyView fails, and this is True, will retry\n        until it succeeds.\n\n    savewcsheader : bool\n        If this is True, also returns the WCS header of the downloaded FITS\n        stamp in addition to the FITS image itself. 
Useful for projecting object\n        coordinates onto image xy coordinates for visualization.\n\n    verbose : bool\n        If True, indicates progress.\n\n    Returns\n    -------\n\n    tuple or array or None\n        This returns based on the value of `savewcsheader`:\n\n        - If `savewcsheader=True`, returns a tuple:\n          (FITS stamp image as a numpy array, FITS header)\n        - If `savewcsheader=False`, returns only the FITS stamp image as numpy\n          array.\n        - If the stamp retrieval fails, returns None.\n\n    '''\n\n    stampdict = get_stamp(ra, decl,\n                          survey=survey,\n                          scaling=scaling,\n                          forcefetch=forcefetch,\n                          cachedir=cachedir,\n                          timeout=timeout,\n                          retry_failed=retry_failed,\n                          verbose=verbose)\n    #\n    # DONE WITH FETCHING STUFF\n    #\n    if stampdict:\n\n        # open the frame\n        stampfits = pyfits.open(stampdict['fitsfile'])\n        header = stampfits[0].header\n        frame = stampfits[0].data\n        stampfits.close()\n\n        # finally, we can process the frame\n        if flip:\n            frame = np.flipud(frame)\n\n        if verbose:\n            LOGINFO('fetched stamp successfully for (%.3f, %.3f)'\n                    % (ra, decl))\n\n\n        if convolvewith:\n\n            convolved = aconv.convolve(frame, convolvewith)\n            if savewcsheader:\n                return convolved, header\n            else:\n                return convolved\n\n        else:\n\n            if savewcsheader:\n                return frame, header\n            else:\n                return frame\n\n    else:\n        LOGERROR('could not fetch the requested stamp for '\n                 'coords: (%.3f, %.3f) from survey: %s and scaling: %s'\n                 % (ra, decl, survey, scaling))\n        return None", "response": "This function 
downloads a DSS FITS stamp centered on the coordinates specified."}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nmaking a plot of periodograms obtained from periodbase functions.", "response": "def plot_periodbase_lsp(lspinfo, outfile=None, plotdpi=100):\n\n    '''Makes a plot of periodograms obtained from `periodbase` functions.\n\n    This takes the output dict produced by any `astrobase.periodbase`\n    period-finder function or a pickle filename containing such a dict and makes\n    a periodogram plot.\n\n    Parameters\n    ----------\n\n    lspinfo : dict or str\n        If lspinfo is a dict, it must be a dict produced by an\n        `astrobase.periodbase` period-finder function or a dict from your own\n        period-finder function or routine that is of the form below with at\n        least these keys::\n\n            {'periods': np.array of all periods searched by the period-finder,\n             'lspvals': np.array of periodogram power value for each period,\n             'bestperiod': a float value that is the period with the highest\n                           peak in the periodogram, i.e. 
the most-likely actual\n                           period,\n             'method': a three-letter code naming the period-finder used; must\n                       be one of the keys in the `METHODLABELS` dict above,\n             'nbestperiods': a list of the periods corresponding to periodogram\n                             peaks (`nbestlspvals` below) to annotate on the\n                             periodogram plot so they can be called out\n                             visually,\n             'nbestlspvals': a list of the power values associated with\n                             periodogram peaks to annotate on the periodogram\n                             plot so they can be called out visually; should be\n                             the same length as `nbestperiods` above}\n\n        If lspinfo is a str, then it must be a path to a pickle file that\n        contains a dict of the form described above.\n\n    outfile : str or None\n        If this is a str, will write the periodogram plot to the file specified\n        by this string. 
If this is None, will write to a file called\n        'lsp-plot.png' in the current working directory.\n\n    plotdpi : int\n        Sets the resolution in DPI of the output periodogram plot PNG file.\n\n    Returns\n    -------\n\n    str\n        Absolute path to the periodogram plot file created.\n\n    '''\n\n    # get the lspinfo from a pickle file transparently\n    if isinstance(lspinfo,str) and os.path.exists(lspinfo):\n        LOGINFO('loading LSP info from pickle %s' % lspinfo)\n        with open(lspinfo,'rb') as infd:\n            lspinfo = pickle.load(infd)\n\n    try:\n\n        # get the things to plot out of the data\n        periods = lspinfo['periods']\n        lspvals = lspinfo['lspvals']\n        bestperiod = lspinfo['bestperiod']\n        lspmethod = lspinfo['method']\n\n        # make the LSP plot on the first subplot\n        plt.plot(periods, lspvals)\n        plt.xscale('log',basex=10)\n        plt.xlabel('Period [days]')\n        plt.ylabel(PLOTYLABELS[lspmethod])\n        plottitle = '%s best period: %.6f d' % (METHODSHORTLABELS[lspmethod],\n                                                bestperiod)\n        plt.title(plottitle)\n\n        # show the best five peaks on the plot\n        for bestperiod, bestpeak in zip(lspinfo['nbestperiods'],\n                                        lspinfo['nbestlspvals']):\n\n            plt.annotate('%.6f' % bestperiod,\n                         xy=(bestperiod, bestpeak), xycoords='data',\n                         xytext=(0.0,25.0), textcoords='offset points',\n                         arrowprops=dict(arrowstyle=\"->\"),fontsize='x-small')\n\n        # make a grid\n        plt.grid(color='#a9a9a9',\n                 alpha=0.9,\n                 zorder=0,\n                 linewidth=1.0,\n                 linestyle=':')\n\n        # make the figure\n        if outfile and isinstance(outfile, str):\n\n            if outfile.endswith('.png'):\n                
plt.savefig(outfile,bbox_inches='tight',dpi=plotdpi)\n            else:\n                plt.savefig(outfile,bbox_inches='tight')\n\n            plt.close()\n            return os.path.abspath(outfile)\n\n        elif dispok:\n\n            plt.show()\n            plt.close()\n            return\n\n        else:\n\n            LOGWARNING('no output file specified and no $DISPLAY set, '\n                       'saving to lsp-plot.png in current directory')\n            outfile = 'lsp-plot.png'\n            plt.savefig(outfile,bbox_inches='tight',dpi=plotdpi)\n            plt.close()\n            return os.path.abspath(outfile)\n\n    except Exception as e:\n\n        LOGEXCEPTION('could not plot this LSP, appears to be empty')\n        return"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef read_hatpi_textlc(lcfile):\n    '''\n    This reads in a textlc that is complete up to the TFA stage.\n\n    '''\n\n    if 'TF1' in lcfile:\n        thiscoldefs = COLDEFS + [('itf1',float)]\n    elif 'TF2' in lcfile:\n        thiscoldefs = COLDEFS + [('itf2',float)]\n    elif 'TF3' in lcfile:\n        thiscoldefs = COLDEFS + [('itf3',float)]\n\n    LOGINFO('reading %s' % lcfile)\n\n    if lcfile.endswith('.gz'):\n        infd = gzip.open(lcfile,'r')\n    else:\n        infd = open(lcfile,'r')\n\n\n    with infd:\n\n        lclines = infd.read().decode().split('\\n')\n        lclines = [x.split() for x in lclines if ('#' not in x and len(x) > 0)]\n        ndet = len(lclines)\n\n        if ndet > 0:\n\n            lccols = list(zip(*lclines))\n            lcdict = {x[0]:y for (x,y) in zip(thiscoldefs, lccols)}\n\n            # convert to ndarray\n            for col in thiscoldefs:\n                lcdict[col[0]] = np.array([col[1](x) for x in lcdict[col[0]]])\n\n        else:\n\n            lcdict = {}\n            LOGWARNING('no detections in %s' % lcfile)\n            # convert to empty ndarrays\n            for col in thiscoldefs:\n                lcdict[col[0]] = np.array([])\n\n        # add the object's name to the lcdict\n        hatid = HATIDREGEX.findall(lcfile)\n        lcdict['objectid'] = hatid[0] if hatid else 'unknown object'\n\n        # add the columns to the lcdict\n        lcdict['columns'] = [x[0] for x in thiscoldefs]\n\n        # add some basic info similar to usual HATLCs\n        lcdict['objectinfo'] = {\n            'ndet':ndet,\n            'hatid':hatid[0] if hatid else 'unknown object',\n            'network':'HP',\n        }\n\n        # break out the {stationid}-{framenum}{framesub}_{ccdnum} framekey\n        # into separate columns\n        framekeyelems = FRAMEREGEX.findall('\\n'.join(lcdict['frk']))\n\n        
lcdict['stf'] = np.array([(int(x[0]) if x[0].isdigit() else np.nan)\n                                  for x in framekeyelems])\n        lcdict['cfn'] = np.array([(int(x[1]) if x[1].isdigit() else np.nan)\n                                  for x in framekeyelems])\n        lcdict['cfs'] = np.array([x[2] for x in framekeyelems])\n        lcdict['ccd'] = np.array([(int(x[3]) if x[3].isdigit() else np.nan)\n                                  for x in framekeyelems])\n\n        # update the column list with these columns\n        lcdict['columns'].extend(['stf','cfn','cfs','ccd'])\n\n        # add more objectinfo: 'stations', etc.\n        lcdict['objectinfo']['network'] = 'HP'\n        lcdict['objectinfo']['stations'] = [\n            'HP%s' % x for x in np.unique(lcdict['stf']).tolist()\n        ]\n\n\n    return lcdict", "response": "Reads in a textlc that is complete up to the TFA stage and returns it as an lcdict."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef lcdict_to_pickle(lcdict, outfile=None):\n    '''This just writes the lcdict to a pickle.\n\n    If outfile is None, then will try to get the name from the\n    lcdict['objectid'] and write to <objectid>-hplc.pkl. If that fails, will\n    write to a file named hplc.pkl.\n\n    '''\n\n    if not outfile and lcdict['objectid']:\n        outfile = '%s-hplc.pkl' % lcdict['objectid']\n    elif not outfile and not lcdict['objectid']:\n        outfile = 'hplc.pkl'\n\n    with open(outfile,'wb') as outfd:\n        pickle.dump(lcdict, outfd, protocol=pickle.HIGHEST_PROTOCOL)\n\n    if os.path.exists(outfile):\n        LOGINFO('lcdict for object: %s -> %s OK' % (lcdict['objectid'],\n                                                    outfile))\n        return outfile\n    else:\n        LOGERROR('could not make a pickle for this lcdict!')\n        return None", "response": "This writes the lcdict to a pickle file and returns the output path, or None if the file could not be written."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef read_hatpi_pklc(lcfile):\n    '''\n    This just reads a pickle LC. Returns an lcdict.\n\n    '''\n\n    try:\n\n        if lcfile.endswith('.gz'):\n            infd = gzip.open(lcfile,'rb')\n        else:\n            infd = open(lcfile,'rb')\n\n        lcdict = pickle.load(infd)\n        infd.close()\n\n        return lcdict\n\n    except UnicodeDecodeError:\n\n        if lcfile.endswith('.gz'):\n            infd = gzip.open(lcfile,'rb')\n        else:\n            infd = open(lcfile,'rb')\n\n        LOGWARNING('pickle %s was probably from Python 2 '\n                   'and failed to load without using \"latin1\" encoding. '\n                   'This is probably a numpy issue: '\n                   'http://stackoverflow.com/q/11305790' % lcfile)\n        lcdict = pickle.load(infd, encoding='latin1')\n        infd.close()\n\n        return lcdict", "response": "This just reads a pickle LC. Returns an lcdict."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef concatenate_textlcs_for_objectid(lcbasedir,\n                                     objectid,\n                                     aperture='TF1',\n                                     postfix='.gz',\n                                     sortby='rjd',\n                                     normalize=True,\n                                     recursive=True):\n    '''This concatenates all text LCs for an objectid with the given aperture.\n\n    Does not care about overlaps or duplicates. The light curves must all be\n    from the same aperture.\n\n    The intended use is to concatenate light curves across CCDs or instrument\n    changes for a single object. These can then be normalized later using\n    standard astrobase tools to search for variability and/or periodicity.\n\n\n    lcbasedir is the directory to start searching in.\n\n    objectid is the object to search for.\n\n    aperture is the aperture postfix to use: (TF1 = aperture 1,\n                                              TF2 = aperture 2,\n                                              TF3 = aperture 3)\n\n    sortby is a column to sort the final concatenated light curve by in\n    ascending order.\n\n    If normalize is True, then each light curve's magnitude columns are\n    normalized to zero, and the whole light curve is then normalized to the\n    global median magnitude for each magnitude column.\n\n    If recursive is True, then the function will search recursively in lcbasedir\n    for any light curves matching the specified criteria. This may take a while,\n    especially on network filesystems.\n\n    The returned lcdict has an extra column: 'lcn' that tracks which measurement\n    belongs to which input light curve. This can be used with\n    lcdict['concatenated'] which relates input light curve index to input light\n    curve filepath. 
Finally, there is an 'nconcatenated' key in the lcdict that\n    contains the total number of concatenated light curves.\n\n    '''\n    LOGINFO('looking for light curves for %s, aperture %s in directory: %s'\n            % (objectid, aperture, lcbasedir))\n\n    if recursive is False:\n\n        matching = glob.glob(os.path.join(lcbasedir,\n                                          '*%s*%s*%s' % (objectid,\n                                                         aperture,\n                                                         postfix)))\n    else:\n        # use recursive glob for Python 3.5+\n        if sys.version_info[:2] > (3,4):\n\n            matching = glob.glob(os.path.join(lcbasedir,\n                                              '**',\n                                              '*%s*%s*%s' % (objectid,\n                                                             aperture,\n                                                             postfix)),\n                                 recursive=True)\n            LOGINFO('found %s files: %s' % (len(matching), repr(matching)))\n\n        # otherwise, use os.walk and glob\n        else:\n\n            # use os.walk to go through the directories\n            walker = os.walk(lcbasedir)\n            matching = []\n\n            for root, dirs, _files in walker:\n                for sdir in dirs:\n                    searchpath = os.path.join(root,\n                                              sdir,\n                                              '*%s*%s*%s' % (objectid,\n                                                             aperture,\n                                                             postfix))\n                    foundfiles = glob.glob(searchpath)\n\n                    if foundfiles:\n                        matching.extend(foundfiles)\n                        LOGINFO(\n                            'found %s in dir: %s' % (repr(foundfiles),\n                                                 
    os.path.join(root,sdir))\n                        )\n\n    # now that we have all the files, concatenate them\n    # a single file will be returned as normalized\n    if matching and len(matching) > 0:\n        clcdict = concatenate_textlcs(matching,\n                                      sortby=sortby,\n                                      normalize=normalize)\n        return clcdict\n    else:\n        LOGERROR('did not find any light curves for %s and aperture %s' %\n                 (objectid, aperture))\n        return None", "response": "This function concatenates all text LCs for a given objectid with the given aperture."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef parallel_concat_worker(task):\n    '''\n    This is a worker for the function below.\n\n    task[0] = lcbasedir\n    task[1] = objectid\n    task[2] = {'aperture','postfix','sortby','normalize','outdir','recursive'}\n\n    '''\n\n    lcbasedir, objectid, kwargs = task\n\n    try:\n        return concat_write_pklc(lcbasedir, objectid, **kwargs)\n    except Exception as e:\n        LOGEXCEPTION('failed LC concatenation for %s in %s'\n                     % (objectid, lcbasedir))\n        return None", "response": "This function is a parallel worker that unpacks a single task tuple (lcbasedir, objectid, kwargs), calls concat_write_pklc with those arguments, and returns its result, or None if an exception is raised."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef parallel_concat_lcdir(lcbasedir,\n                          objectidlist,\n                          aperture='TF1',\n                          postfix='.gz',\n                          sortby='rjd',\n                          normalize=True,\n                          outdir=None,\n                          recursive=True,\n                          nworkers=32,\n                          maxworkertasks=1000):\n    '''This concatenates all text LCs for the given objectidlist.\n\n\n    '''\n\n    if not outdir:\n        outdir = 'pklcs'\n\n    if not os.path.exists(outdir):\n        os.mkdir(outdir)\n\n    tasks = [(lcbasedir, x, {'aperture':aperture,\n                             'postfix':postfix,\n                             'sortby':sortby,\n                             'normalize':normalize,\n                             'outdir':outdir,\n                             'recursive':recursive}) for x in objectidlist]\n\n    pool = mp.Pool(nworkers, maxtasksperchild=maxworkertasks)\n    results = pool.map(parallel_concat_worker, tasks)\n\n    pool.close()\n    pool.join()\n\n    return {x:y for (x,y) in zip(objectidlist, results)}", "response": "This function concatenates all text LCs for the given objectidlist."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef merge_hatpi_textlc_apertures(lclist):\n    '''This merges all TFA text LCs with separate apertures for a single object.\n\n    The framekey column will be used as the join column across all light curves\n    in lclist. Missing values will be filled in with nans. This function assumes\n    all light curves are in the format specified in COLDEFS above and readable\n    by read_hatpi_textlc above (i.e. have a single column for TFA mags for a\n    specific aperture at the end).\n\n    '''\n\n    lcaps = {}\n    framekeys = []\n\n    for lc in lclist:\n\n        lcd = read_hatpi_textlc(lc)\n\n        # figure what aperture this is and put it into the lcdict. if two LCs\n        # with the same aperture (i.e. TF1 and TF1) are provided, the later one\n        # in the lclist will overwrite the previous one,\n        for col in lcd['columns']:\n            if col.startswith('itf'):\n                lcaps[col] = lcd\n        thisframekeys = lcd['frk'].tolist()\n        framekeys.extend(thisframekeys)\n\n    # uniqify the framekeys\n    framekeys = sorted(list(set(framekeys)))", "response": "This function merges all TFA text LCs with separate apertures for a single object."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef read_hatpi_binnedlc(binnedpklf, textlcf, timebinsec):\n    '''This reads a binnedlc pickle produced by the HATPI prototype pipeline.\n\n    Converts it into a standard lcdict as produced by the read_hatpi_textlc\n    function above by using the information in unbinnedtextlc for the same\n    object.\n\n    Adds a 'binned' key to the standard lcdict containing the binned mags, etc.\n\n    '''\n\n    LOGINFO('reading binned LC %s' % binnedpklf)\n\n    # read the textlc\n    lcdict = read_hatpi_textlc(textlcf)\n\n    # read the binned LC\n\n    if binnedpklf.endswith('.gz'):\n        infd = gzip.open(binnedpklf,'rb')\n    else:\n        infd = open(binnedpklf,'rb')\n\n\n    try:\n        binned = pickle.load(infd)\n    except Exception as e:\n        infd.seek(0)\n        binned = pickle.load(infd, encoding='latin1')\n    infd.close()\n\n    # now that we have both, pull out the required columns from the binnedlc\n    blckeys = binned.keys()\n\n    lcdict['binned'] = {}\n\n    for key in blckeys:\n\n        # get EPD stuff\n        if (key == 'epdlc' and\n            'AP0' in binned[key] and\n            'AP1' in binned[key] and\n            'AP2' in binned[key]):\n\n            # we'll have to generate errors because we don't have any in the\n            # generated binned LC.\n\n            ap0mad = np.nanmedian(np.abs(binned[key]['AP0'] -\n                                         np.nanmedian(binned[key]['AP0'])))\n            ap1mad = np.nanmedian(np.abs(binned[key]['AP1'] -\n                                         np.nanmedian(binned[key]['AP1'])))\n            ap2mad = np.nanmedian(np.abs(binned[key]['AP2'] -\n                                         np.nanmedian(binned[key]['AP2'])))\n\n\n            lcdict['binned']['iep1'] = {'times':binned[key]['RJD'],\n                                        'mags':binned[key]['AP0'],\n                            
            'errs':np.full_like(binned[key]['AP0'],\n                                                            ap0mad),\n                                        'nbins':binned[key]['nbins'],\n                                        'timebins':binned[key]['jdbins'],\n                                        'timebinsec':timebinsec}\n            lcdict['binned']['iep2'] = {'times':binned[key]['RJD'],\n                                        'mags':binned[key]['AP1'],\n                                        'errs':np.full_like(binned[key]['AP1'],\n                                                            ap1mad),\n                                        'nbins':binned[key]['nbins'],\n                                        'timebins':binned[key]['jdbins'],\n                                        'timebinsec':timebinsec}\n            lcdict['binned']['iep3'] = {'times':binned[key]['RJD'],\n                                        'mags':binned[key]['AP2'],\n                                        'errs':np.full_like(binned[key]['AP2'],\n                                                            ap2mad),\n                                        'nbins':binned[key]['nbins'],\n                                        'timebins':binned[key]['jdbins'],\n                                        'timebinsec':timebinsec}\n\n        # get TFA stuff for aperture 1\n        if ((key == 'tfalc.TF1' or key == 'tfalc.TF1.gz') and\n            'AP0' in binned[key]):\n\n            # we'll have to generate errors because we don't have any in the\n            # generated binned LC.\n\n            ap0mad = np.nanmedian(np.abs(binned[key]['AP0'] -\n                                         np.nanmedian(binned[key]['AP0'])))\n\n\n            lcdict['binned']['itf1'] = {'times':binned[key]['RJD'],\n                                        'mags':binned[key]['AP0'],\n                                        'errs':np.full_like(binned[key]['AP0'],\n                                            
                ap0mad),\n                                        'nbins':binned[key]['nbins'],\n                                        'timebins':binned[key]['jdbins'],\n                                        'timebinsec':timebinsec}\n\n        # get TFA stuff for aperture 1\n        if ((key == 'tfalc.TF2' or key == 'tfalc.TF2.gz') and\n            'AP0' in binned[key]):\n\n            # we'll have to generate errors because we don't have any in the\n            # generated binned LC.\n\n            ap0mad = np.nanmedian(np.abs(binned[key]['AP0'] -\n                                         np.nanmedian(binned[key]['AP0'])))\n\n\n            lcdict['binned']['itf2'] = {'times':binned[key]['RJD'],\n                                        'mags':binned[key]['AP0'],\n                                        'errs':np.full_like(binned[key]['AP0'],\n                                                            ap0mad),\n                                        'nbins':binned[key]['nbins'],\n                                        'timebins':binned[key]['jdbins'],\n                                        'timebinsec':timebinsec}\n\n        # get TFA stuff for aperture 1\n        if ((key == 'tfalc.TF3' or key == 'tfalc.TF3.gz') and\n            'AP0' in binned[key]):\n\n            # we'll have to generate errors because we don't have any in the\n            # generated binned LC.\n\n            ap0mad = np.nanmedian(np.abs(binned[key]['AP0'] -\n                                         np.nanmedian(binned[key]['AP0'])))\n\n\n            lcdict['binned']['itf3'] = {'times':binned[key]['RJD'],\n                                        'mags':binned[key]['AP0'],\n                                        'errs':np.full_like(binned[key]['AP0'],\n                                                            ap0mad),\n                                        'nbins':binned[key]['nbins'],\n                                        'timebins':binned[key]['jdbins'],\n                      
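                  'timebinsec':timebinsec}\n\n    # all done, check if we succeeded\n    if lcdict['binned']:\n\n        return lcdict\n\n    else:\n\n        LOGERROR('no binned measurements found in %s!' % binnedpklf)\n        return None", "response": "This function reads a binned LC pickle produced by the HATPI prototype pipeline and returns a standard lcdict, built from the matching text LC, with an added 'binned' key holding the binned times, mags, and MAD-based errors for each aperture. It returns None if no binned measurements are found."}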
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\nadding catalog info to objectinfo key of all pklcs in pklcdir.", "response": "def pklc_fovcatalog_objectinfo(\n        pklcdir,\n        fovcatalog,\n        fovcatalog_columns=[0,1,2,\n                            6,7,\n                            8,9,\n                            10,11,\n                            13,14,15,16,\n                            17,18,19,\n                            20,21],\n        fovcatalog_colnames=['objectid','ra','decl',\n                             'jmag','jmag_err',\n                             'hmag','hmag_err',\n                             'kmag','kmag_err',\n                             'bmag','vmag','rmag','imag',\n                             'sdssu','sdssg','sdssr',\n                             'sdssi','sdssz'],\n        fovcatalog_colformats=('U20,f8,f8,'\n                               'f8,f8,'\n                               'f8,f8,'\n                               'f8,f8,'\n                               'f8,f8,f8,f8,'\n                               'f8,f8,f8,'\n                               'f8,f8')\n):\n    '''Adds catalog info to objectinfo key of all pklcs in lcdir.\n\n    If fovcatalog, fovcatalog_columns, fovcatalog_colnames are provided, uses\n    them to find all the additional information listed in the fovcatalog_colname\n    keys, and writes this info to the objectinfo key of each lcdict. This makes\n    it easier for astrobase tools to work on these light curve.\n\n    The default set up for fovcatalog is to use a text file generated by the\n    HATPI pipeline before auto-calibrating a field. 
The format is specified as\n    above in _columns,  _colnames, and _colformats.\n\n    '''\n\n    if fovcatalog.endswith('.gz'):\n        catfd = gzip.open(fovcatalog)\n    else:\n        catfd = open(fovcatalog)\n\n    # read the catalog using the colformats, etc.\n    fovcat = np.genfromtxt(catfd,\n                           usecols=fovcatalog_columns,\n                           names=fovcatalog_colnames,\n                           dtype=fovcatalog_colformats)\n    catfd.close()\n\n    pklclist = sorted(glob.glob(os.path.join(pklcdir, '*HAT*-pklc.pkl')))\n\n    updatedpklcs, failedpklcs = [], []\n\n    for pklc in pklclist:\n\n        lcdict = read_hatpi_pklc(pklc)\n        objectid = lcdict['objectid']\n\n        catind = np.where(fovcat['objectid'] == objectid)\n\n        # if we found catalog info for this object, put it into objectinfo\n        # (check the match count; a lone match at index 0 is falsy as a bool)\n        if len(catind) > 0 and catind[0].size > 0:\n\n            lcdict['objectinfo'].update(\n                {x:y for x,y in zip(\n                    fovcatalog_colnames,\n                    [fovcat[z][catind].item() for\n                     z in fovcatalog_colnames]\n                )\n                }\n            )\n\n            # write the LC back to the pickle (tempfile for atomicity)\n            with open(pklc+'-tmp','wb') as outfd:\n                pickle.dump(lcdict, outfd, pickle.HIGHEST_PROTOCOL)\n\n            # overwrite previous once we know it exists\n            if os.path.exists(pklc+'-tmp'):\n                shutil.move(pklc+'-tmp',pklc)\n\n                LOGINFO('updated %s with catalog info for %s at %.3f, %.3f OK' %\n                        (pklc, objectid,\n                         lcdict['objectinfo']['ra'],\n                         lcdict['objectinfo']['decl']))\n\n                updatedpklcs.append(pklc)\n\n        # otherwise, do nothing\n        else:\n            failedpklcs.append(pklc)\n\n    # end of pklclist processing\n    return updatedpklcs, failedpklcs"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef _base64_to_file(b64str, outfpath, writetostrio=False):\n    '''This converts the base64 encoded string to a file.\n\n    Parameters\n    ----------\n\n    b64str : str\n        A base64 encoded strin that is the output of `base64.b64encode`.\n\n    outfpath : str\n        The path to where the file will be written. This should include an\n        appropriate extension for the file (e.g. a base64 encoded string that\n        represents a PNG should have its `outfpath` end in a '.png') so the OS\n        can open these files correctly.\n\n    writetostrio : bool\n        If this is True, will return a StringIO object with the binary stream\n        decoded from the base64-encoded input string `b64str`. This can be\n        useful to embed these into other files without having to write them to\n        disk.\n\n    Returns\n    -------\n\n    str or StringIO object\n        If `writetostrio` is False, will return the output file's path as a\n        str. If it is True, will return a StringIO object directly. 
If writing\n        the file fails in either case, will return None.\n\n    '''\n\n    try:\n\n        filebytes = base64.b64decode(b64str)\n\n        # if we're writing back to a stringio object\n        if writetostrio:\n\n            outobj = StrIO(filebytes)\n            return outobj\n\n        # otherwise, we're writing to an actual file\n        else:\n\n            with open(outfpath,'wb') as outfd:\n                outfd.write(filebytes)\n\n            if os.path.exists(outfpath):\n                return outfpath\n            else:\n                LOGERROR('could not write output file: %s' % outfpath)\n                return None\n\n    except Exception as e:\n\n        LOGEXCEPTION('failed while trying to convert '\n                     'b64 string to file %s' % outfpath)\n        return None", "response": "This function converts the base64 encoded string to a file."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef get_phased_quantities(stimes, smags, serrs, period):\n    '''Does phase-folding for the mag/flux time-series given a period.\n\n    Given finite and sigma-clipped times, magnitudes, and errors, along with the\n    period at which to phase-fold the data, perform the phase-folding and\n    return the phase-folded values.\n\n    Parameters\n    ----------\n\n    stimes,smags,serrs : np.array\n        The sigma-clipped and finite input mag/flux time-series arrays to\n        operate on.\n\n    period : float\n        The period to phase the mag/flux time-series at. stimes.min() is used as\n        the epoch value to fold the times-series around.\n\n    Returns\n    -------\n\n    (phase, pmags, perrs, ptimes, mintime) : tuple\n        The tuple returned contains the following items:\n\n        - `phase`: phase-sorted values of phase at each of stimes\n        - `pmags`: phase-sorted magnitudes at each phase\n        - `perrs`: phase-sorted errors\n        - `ptimes`: phase-sorted times\n        - `mintime`: earliest time in stimes.\n\n    '''\n\n    # phase the mag series using the given period and faintest mag time\n    # mintime = stimes[npwhere(smags == npmax(smags))]\n\n    # phase the mag series using the given period and epoch = min(stimes)\n    mintime = np.min(stimes)\n\n    # calculate the unsorted phase, then sort it\n    iphase = (stimes - mintime)/period - np.floor((stimes - mintime)/period)\n    phasesortind = np.argsort(iphase)\n\n    # these are the final quantities to use for the Fourier fits\n    phase = iphase[phasesortind]\n    pmags = smags[phasesortind]\n    perrs = serrs[phasesortind]\n\n    # get the times sorted in phase order (useful to get the fit mag minimum\n    # with respect to phase -- the light curve minimum)\n    ptimes = stimes[phasesortind]\n\n    return phase, pmags, perrs, ptimes, mintime", "response": "Does phase - folding 
for the mag/flux time-series given a period. Returns the phase-sorted phases, magnitudes, errors, and times, along with the earliest time in stimes, which is used as the epoch."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef make_fit_plot(phase, pmags, perrs, fitmags,\n                  period, mintime, magseriesepoch,\n                  plotfit,\n                  magsarefluxes=False,\n                  wrap=False,\n                  model_over_lc=False):\n    '''This makes a plot of the LC model fit.\n\n    Parameters\n    ----------\n\n    phase,pmags,perrs : np.array\n        The actual mag/flux time-series.\n\n    fitmags : np.array\n        The model fit time-series.\n\n    period : float\n        The period at which the phased LC was generated.\n\n    mintime : float\n        The minimum time value.\n\n    magseriesepoch : float\n        The value of time around which the phased LC was folded.\n\n    plotfit : str\n        The name of a file to write the plot to.\n\n    magsarefluxes : bool\n        Set this to True if the values in `pmags` and `fitmags` are actually\n        fluxes.\n\n    wrap : bool\n        If True, will wrap the phased LC around 0.0 to make some phased LCs\n        easier to look at.\n\n    model_over_lc : bool\n        Usually, this function will plot the actual LC over the model LC. 
Set\n        this to True to plot the model over the actual LC; this is most useful\n        when you have a very dense light curve and want to be able to see how it\n        follows the model.\n\n    Returns\n    -------\n\n    Nothing.\n\n    '''\n\n    # set up the figure\n    plt.close('all')\n    plt.figure(figsize=(8,4.8))\n\n    if model_over_lc:\n        model_z = 100\n        lc_z = 0\n    else:\n        model_z = 0\n        lc_z = 100\n\n\n    if not wrap:\n\n        plt.plot(phase, fitmags, linewidth=3.0, color='red',zorder=model_z)\n        plt.plot(phase,pmags,\n                 marker='o',\n                 markersize=1.0,\n                 linestyle='none',\n                 rasterized=True, color='k',zorder=lc_z)\n\n        # set the x axis ticks and label\n        plt.gca().set_xticks(\n            [0.0,0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9,1.0]\n        )\n\n    else:\n        plt.plot(np.concatenate([phase-1.0,phase]),\n                 np.concatenate([fitmags,fitmags]),\n                 linewidth=3.0,\n                 color='red',zorder=model_z)\n        plt.plot(np.concatenate([phase-1.0,phase]),\n                 np.concatenate([pmags,pmags]),\n                 marker='o',\n                 markersize=1.0,\n                 linestyle='none',\n                 rasterized=True, color='k',zorder=lc_z)\n\n        plt.gca().set_xlim((-0.8,0.8))\n        # set the x axis ticks and label\n        plt.gca().set_xticks(\n            [-0.8,-0.7,-0.6,-0.5,-0.4,-0.3,-0.2,-0.1,\n             0.0,0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8]\n        )\n\n    # set the y axis limit and label\n    ymin, ymax = plt.ylim()\n    if not magsarefluxes:\n        plt.gca().invert_yaxis()\n        plt.ylabel('magnitude')\n    else:\n        plt.ylabel('flux')\n\n\n    plt.xlabel('phase')\n    plt.title('period: %.6f, folded at %.6f, fit epoch: %.6f' %\n              (period, mintime, magseriesepoch))\n    plt.savefig(plotfit)\n    plt.close()", "response": "This function makes a 
plot of the model fit against the phased light curve and saves it to the file given by plotfit."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef objectlist_radeclbox(radeclbox,\n                         gaia_mirror=None,\n                         columns=('source_id',\n                                  'ra','dec',\n                                  'phot_g_mean_mag',\n                                  'l','b',\n                                  'parallax, parallax_error',\n                                  'pmra','pmra_error',\n                                  'pmdec','pmdec_error'),\n                         extra_filter=None,\n                         returnformat='csv',\n                         forcefetch=False,\n                         cachedir='~/.astrobase/gaia-cache',\n                         verbose=True,\n                         timeout=15.0,\n                         refresh=2.0,\n                         maxtimeout=300.0,\n                         maxtries=3,\n                         complete_query_later=True):\n\n    '''This queries the GAIA TAP service for a list of objects in an equatorial\n    coordinate box.\n\n    Parameters\n    ----------\n\n    radeclbox : sequence of four floats\n        This defines the box to search in::\n\n            [ra_min, ra_max, decl_min, decl_max]\n\n    gaia_mirror : {'gaia','heidelberg','vizier'} or None\n        This is the key used to select a GAIA catalog mirror from the\n        `GAIA_URLS` dict above. If set, the specified mirror will be used. 
If\n        None, a random mirror chosen from that dict will be used.\n\n    columns : sequence of str\n        This indicates which columns from the GAIA table to request for the\n        objects found within the search radius.\n\n    extra_filter: str or None\n        If this is provided, must be a valid ADQL filter string that is used to\n        further filter the cone-search results.\n\n    returnformat : {'csv','votable','json'}\n        The returned file format to request from the GAIA catalog service.\n\n    forcefetch : bool\n        If this is True, the query will be retried even if cached results for\n        it exist.\n\n    cachedir : str\n        This points to the directory where results will be downloaded.\n\n    verbose : bool\n        If True, will indicate progress and warn of any issues.\n\n    timeout : float\n        This sets the amount of time in seconds to wait for the service to\n        respond to our initial request.\n\n    refresh : float\n        This sets the amount of time in seconds to wait before checking if the\n        result file is available. If the results file isn't available after\n        `refresh` seconds have elapsed, the function will wait for `refresh`\n        seconds continuously, until `maxtimeout` is reached or the results file\n        becomes available.\n\n    maxtimeout : float\n        The maximum amount of time in seconds to wait for a result to become\n        available after submitting our query request.\n\n    maxtries : int\n        The maximum number of tries (across all mirrors tried) to make to either\n        submit the request or download the results, before giving up.\n\n    completequerylater : bool\n        If set to True, a submitted query that does not return a result before\n        `maxtimeout` has passed will be cancelled but its input request\n        parameters and the result URL provided by the service will be saved. 
If\n        this function is then called later with these same input request\n        parameters, it will check if the query finally finished and a result is\n        available. If so, will download the results instead of submitting a new\n        query. If it's not done yet, will start waiting for results again. To\n        force launch a new query with the same request parameters, set the\n        `forcefetch` kwarg to True.\n\n    Returns\n    -------\n\n    dict\n        This returns a dict of the following form::\n\n            {'params':dict of the input params used for the query,\n             'provenance':'cache' or 'new download',\n             'result':path to the file on disk with the downloaded data table}\n\n    '''\n\n    # this was generated using the awesome query generator at:\n    # https://gea.esac.esa.int/archive/\n\n    # NOTE: here we don't resolve the table name right away. this is because\n    # some of the GAIA mirrors use different table names, so we leave the table\n    # name to be resolved by the lower level tap_query function. 
this is done by\n    # the {{table}} construct.\n    query = (\n        \"select {columns} from {{table}} where \"\n        \"CONTAINS(POINT('ICRS',{{table}}.ra, {{table}}.dec),\"\n        \"BOX('ICRS',{ra_center:.5f},{decl_center:.5f},\"\n        \"{ra_width:.5f},{decl_height:.5f}))=1\"\n        \"{extra_filter_str}\"\n    )\n\n    ra_min, ra_max, decl_min, decl_max = radeclbox\n    ra_center = (ra_max + ra_min)/2.0\n    decl_center = (decl_max + decl_min)/2.0\n    ra_width = ra_max - ra_min\n    decl_height = decl_max - decl_min\n\n    if extra_filter is not None:\n        extra_filter_str = ' and %s ' % extra_filter\n    else:\n        extra_filter_str = ''\n\n    formatted_query = query.format(columns=', '.join(columns),\n                                   extra_filter_str=extra_filter_str,\n                                   ra_center=ra_center,\n                                   decl_center=decl_center,\n                                   ra_width=ra_width,\n                                   decl_height=decl_height)\n\n    return tap_query(formatted_query,\n                     gaia_mirror=gaia_mirror,\n                     returnformat=returnformat,\n                     forcefetch=forcefetch,\n                     cachedir=cachedir,\n                     verbose=verbose,\n                     timeout=timeout,\n                     refresh=refresh,\n                     maxtimeout=maxtimeout,\n                     maxtries=maxtries,\n                     complete_query_later=complete_query_later)", "response": "This function queries the GAIA TAP service for a list of objects inside an equatorial coordinate box and returns a dict with the query parameters, the provenance, and the path to the downloaded result file."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef generalized_lsp_value(times, mags, errs, omega):\n    '''Generalized LSP value for a single omega.\n\n    The relations used are::\n\n        P(w) = (1/YY) * (YC*YC/CC + YS*YS/SS)\n\n        where: YC, YS, CC, and SS are all calculated at T\n\n        and where: tan 2omegaT = 2*CS/(CC - SS)\n\n        and where:\n\n        Y = sum( w_i*y_i )\n        C = sum( w_i*cos(wT_i) )\n        S = sum( w_i*sin(wT_i) )\n\n        YY = sum( w_i*y_i*y_i ) - Y*Y\n        YC = sum( w_i*y_i*cos(wT_i) ) - Y*C\n        YS = sum( w_i*y_i*sin(wT_i) ) - Y*S\n\n        CpC = sum( w_i*cos(w_T_i)*cos(w_T_i) )\n        CC = CpC - C*C\n        SS = (1 - CpC) - S*S\n        CS = sum( w_i*cos(w_T_i)*sin(w_T_i) ) - C*S\n\n    Parameters\n    ----------\n\n    times,mags,errs : np.array\n        The time-series to calculate the periodogram value for.\n\n    omega : float\n        The frequency to calculate the periodogram value at.\n\n    Returns\n    -------\n\n    periodogramvalue : float\n        The normalized periodogram at the specified test frequency `omega`.\n\n    '''\n\n    one_over_errs2 = 1.0/(errs*errs)\n\n    W = npsum(one_over_errs2)\n    wi = one_over_errs2/W\n\n    sin_omegat = npsin(omega*times)\n    cos_omegat = npcos(omega*times)\n\n    sin2_omegat = sin_omegat*sin_omegat\n    cos2_omegat = cos_omegat*cos_omegat\n    sincos_omegat = sin_omegat*cos_omegat\n\n    # calculate some more sums and terms\n    Y = npsum( wi*mags )\n    C = npsum( wi*cos_omegat )\n    S = npsum( wi*sin_omegat )\n\n    CpS = npsum( wi*sincos_omegat )\n    CpC = npsum( wi*cos2_omegat )\n    CS = CpS - C*S\n    CC = CpC - C*C\n    SS = 1 - CpC - S*S  # use SpS = 1 - CpC\n\n    # calculate tau\n    tan_omega_tau_top = 2.0*CS\n    tan_omega_tau_bottom = CC - SS\n    tan_omega_tau = tan_omega_tau_top/tan_omega_tau_bottom\n    tau = nparctan(tan_omega_tau)/(2.0*omega)\n\n    YpY = npsum( 
wi*mags*mags)\n\n    YpC = npsum( wi*mags*cos_omegat )\n    YpS = npsum( wi*mags*sin_omegat )\n\n    # SpS = npsum( wi*sin2_omegat )\n\n    # the final terms\n    YY = YpY - Y*Y\n    YC = YpC - Y*C\n    YS = YpS - Y*S\n\n    periodogramvalue = (YC*YC/CC + YS*YS/SS)/YY\n\n    return periodogramvalue", "response": "Generalized LSP value for a single omega."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef generalized_lsp_value_withtau(times, mags, errs, omega):\n    '''Generalized LSP value for a single omega.\n\n    This uses tau to provide an arbitrary time-reference point.\n\n    The relations used are::\n\n        P(w) = (1/YY) * (YC*YC/CC + YS*YS/SS)\n\n        where: YC, YS, CC, and SS are all calculated at T\n\n        and where: tan 2omegaT = 2*CS/(CC - SS)\n\n        and where:\n\n        Y = sum( w_i*y_i )\n        C = sum( w_i*cos(wT_i) )\n        S = sum( w_i*sin(wT_i) )\n\n        YY = sum( w_i*y_i*y_i ) - Y*Y\n        YC = sum( w_i*y_i*cos(wT_i) ) - Y*C\n        YS = sum( w_i*y_i*sin(wT_i) ) - Y*S\n\n        CpC = sum( w_i*cos(w_T_i)*cos(w_T_i) )\n        CC = CpC - C*C\n        SS = (1 - CpC) - S*S\n        CS = sum( w_i*cos(w_T_i)*sin(w_T_i) ) - C*S\n\n    Parameters\n    ----------\n\n    times,mags,errs : np.array\n        The time-series to calculate the periodogram value for.\n\n    omega : float\n        The frequency to calculate the periodogram value at.\n\n    Returns\n    -------\n\n    periodogramvalue : float\n        The normalized periodogram at the specified test frequency `omega`.\n\n    '''\n\n    one_over_errs2 = 1.0/(errs*errs)\n\n    W = npsum(one_over_errs2)\n    wi = one_over_errs2/W\n\n    sin_omegat = npsin(omega*times)\n    cos_omegat = npcos(omega*times)\n\n    sin2_omegat = sin_omegat*sin_omegat\n    cos2_omegat = cos_omegat*cos_omegat\n    sincos_omegat = sin_omegat*cos_omegat\n\n    # calculate some more sums and terms\n    Y = npsum( wi*mags )\n    C = npsum( wi*cos_omegat )\n    S = npsum( wi*sin_omegat )\n\n    CpS = npsum( wi*sincos_omegat )\n    CpC = npsum( wi*cos2_omegat )\n    CS = CpS - C*S\n    CC = CpC - C*C\n    SS = 1 - CpC - S*S  # use SpS = 1 - CpC\n\n    # calculate tau\n    tan_omega_tau_top = 2.0*CS\n    tan_omega_tau_bottom = CC - SS\n    tan_omega_tau = tan_omega_tau_top/tan_omega_tau_bottom\n    tau = 
nparctan(tan_omega_tau)/(2.0*omega)\n\n    # now we need to calculate all the bits at tau\n    sin_omega_tau = npsin(omega*(times - tau))\n    cos_omega_tau = npcos(omega*(times - tau))\n    sin2_omega_tau = sin_omega_tau*sin_omega_tau\n    cos2_omega_tau = cos_omega_tau*cos_omega_tau\n    sincos_omega_tau = sin_omega_tau*cos_omega_tau\n\n    C_tau = npsum(wi*cos_omega_tau)\n    S_tau = npsum(wi*sin_omega_tau)\n\n    CpS_tau = npsum( wi*sincos_omega_tau )\n    CpC_tau = npsum( wi*cos2_omega_tau )\n    CS_tau = CpS_tau - C_tau*S_tau\n    CC_tau = CpC_tau - C_tau*C_tau\n    SS_tau = 1 - CpC_tau - S_tau*S_tau  # use SpS = 1 - CpC\n\n    YpY = npsum( wi*mags*mags)\n\n    YpC_tau = npsum( wi*mags*cos_omega_tau )\n    YpS_tau = npsum( wi*mags*sin_omega_tau )\n\n    # SpS = npsum( wi*sin2_omegat )\n\n    # the final terms\n    YY = YpY - Y*Y\n    YC_tau = YpC_tau - Y*C_tau\n    YS_tau = YpS_tau - Y*S_tau\n\n    periodogramvalue = (YC_tau*YC_tau/CC_tau + YS_tau*YS_tau/SS_tau)/YY\n\n    return periodogramvalue", "response": "Generalized LSP value for a single omega, with the cosine and sine sums evaluated at times shifted by the time offset tau."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef specwindow_lsp_value(times, mags, errs, omega):\n    '''This calculates the peak associated with the spectral window function\n    for times and at the specified omega.\n\n    NOTE: this is classical Lomb-Scargle, not the Generalized\n    Lomb-Scargle. `mags` and `errs` are silently ignored since we're calculating\n    the periodogram of the observing window function. These are kept to present\n    a consistent external API so the `pgen_lsp` function below can call this\n    transparently.\n\n    Parameters\n    ----------\n\n    times,mags,errs : np.array\n        The time-series to calculate the periodogram value for.\n\n    omega : float\n        The frequency to calculate the periodogram value at.\n\n    Returns\n    -------\n\n    periodogramvalue : float\n        The normalized periodogram at the specified test frequency `omega`.\n\n    '''\n\n    norm_times = times - times.min()\n\n    tau = (\n        (1.0/(2.0*omega)) *\n        nparctan( npsum(npsin(2.0*omega*norm_times)) /\n                  npsum(npcos(2.0*omega*norm_times)) )\n    )\n\n    lspval_top_cos = (npsum(1.0 * npcos(omega*(norm_times-tau))) *\n                      npsum(1.0 * npcos(omega*(norm_times-tau))))\n    lspval_bot_cos = npsum( (npcos(omega*(norm_times-tau))) *\n                            (npcos(omega*(norm_times-tau))) )\n\n    lspval_top_sin = (npsum(1.0 * npsin(omega*(norm_times-tau))) *\n                      npsum(1.0 * npsin(omega*(norm_times-tau))))\n    lspval_bot_sin = npsum( (npsin(omega*(norm_times-tau))) *\n                            (npsin(omega*(norm_times-tau))) )\n\n    lspval = 0.5 * ( (lspval_top_cos/lspval_bot_cos) +\n                     (lspval_top_sin/lspval_bot_sin) )\n\n    return lspval", "response": "This function calculates the peak associated with the spectral window function for the given times at the specified omega."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef pgen_lsp(\n        times,\n        mags,\n        errs,\n        magsarefluxes=False,\n        startp=None,\n        endp=None,\n        stepsize=1.0e-4,\n        autofreq=True,\n        nbestpeaks=5,\n        periodepsilon=0.1,\n        sigclip=10.0,\n        nworkers=None,\n        workchunksize=None,\n        glspfunc=_glsp_worker_withtau,\n        verbose=True\n):\n    '''This calculates the generalized Lomb-Scargle periodogram.\n\n    Uses the algorithm from Zechmeister and Kurster (2009).\n\n    Parameters\n    ----------\n\n    times,mags,errs : np.array\n        The mag/flux time-series with associated measurement errors to run the\n        period-finding on.\n\n    magsarefluxes : bool\n        If the input measurement values in `mags` and `errs` are in fluxes, set\n        this to True.\n\n    startp,endp : float or None\n        The minimum and maximum periods to consider for the transit search.\n\n    stepsize : float\n        The step-size in frequency to use when constructing a frequency grid for\n        the period search.\n\n    autofreq : bool\n        If this is True, the value of `stepsize` will be ignored and the\n        :py:func:`astrobase.periodbase.get_frequency_grid` function will be used\n        to generate a frequency grid based on `startp`, and `endp`. If these are\n        None as well, `startp` will be set to 0.1 and `endp` will be set to\n        `times.max() - times.min()`.\n\n    nbestpeaks : int\n        The number of 'best' peaks to return from the periodogram results,\n        starting from the global maximum of the periodogram peak values.\n\n    periodepsilon : float\n        The fractional difference between successive values of 'best' periods\n        when sorting by periodogram power to consider them as separate periods\n        (as opposed to part of the same periodogram peak). 
This is used to avoid\n        broad peaks in the periodogram and make sure the 'best' periods returned\n        are all actually independent.\n\n    sigclip : float or int or sequence of two floats/ints or None\n        If a single float or int, a symmetric sigma-clip will be performed using\n        the number provided as the sigma-multiplier to cut out from the input\n        time-series.\n\n        If a list of two ints/floats is provided, the function will perform an\n        'asymmetric' sigma-clip. The first element in this list is the sigma\n        value to use for fainter flux/mag values; the second element in this\n        list is the sigma value to use for brighter flux/mag values. For\n        example, `sigclip=[10., 3.]`, will sigclip out greater than 10-sigma\n        dimmings and greater than 3-sigma brightenings. Here the meaning of\n        \"dimming\" and \"brightening\" is set by *physics* (not the magnitude\n        system), which is why the `magsarefluxes` kwarg must be correctly set.\n\n        If `sigclip` is None, no sigma-clipping will be performed, and the\n        time-series (with non-finite elems removed) will be passed through to\n        the output.\n\n    nworkers : int\n        The number of parallel workers to use when calculating the periodogram.\n\n    workchunksize : None or int\n        If this is an int, will use chunks of the given size to break up the\n        work for the parallel workers. If None, the chunk size is set to 1.\n\n    glspfunc : Python function\n        The worker function to use to calculate the periodogram. 
This can be\n        used to make this function calculate the time-series sampling window\n        function instead of the time-series measurements' GLS periodogram by\n        passing in `_glsp_worker_specwindow` instead of the default\n        `_glsp_worker_withtau` function.\n\n    verbose : bool\n        If this is True, will indicate progress and details about the frequency\n        grid used for the period search.\n\n    Returns\n    -------\n\n    dict\n        This function returns a dict, referred to as an `lspinfo` dict in other\n        astrobase functions that operate on periodogram results. This is a\n        standardized format across all astrobase period-finders, and is of the\n        form below::\n\n            {'bestperiod': the best period value in the periodogram,\n             'bestlspval': the periodogram peak associated with the best period,\n             'nbestpeaks': the input value of nbestpeaks,\n             'nbestlspvals': nbestpeaks-size list of best period peak values,\n             'nbestperiods': nbestpeaks-size list of best periods,\n             'lspvals': the full array of periodogram powers,\n             'periods': the full array of periods considered,\n             'method':'gls' -> the name of the period-finder method,\n             'kwargs':{ dict of all of the input kwargs for record-keeping}}\n\n    '''\n\n    # get rid of nans first and sigclip\n    stimes, smags, serrs = sigclip_magseries(times,\n                                             mags,\n                                             errs,\n                                             magsarefluxes=magsarefluxes,\n                                             sigclip=sigclip)\n\n    # get rid of zero errs\n    nzind = npnonzero(serrs)\n    stimes, smags, serrs = stimes[nzind], smags[nzind], serrs[nzind]\n\n\n    # make sure there are enough points to calculate a spectrum\n    if len(stimes) > 9 and len(smags) > 9 and len(serrs) > 9:\n\n        # get the 
frequencies to use\n        if startp:\n            endf = 1.0/startp\n        else:\n            # default start period is 0.1 day\n            endf = 1.0/0.1\n\n        if endp:\n            startf = 1.0/endp\n        else:\n            # default end period is length of time series\n            startf = 1.0/(stimes.max() - stimes.min())\n\n        # if we're not using autofreq, then use the provided frequencies\n        if not autofreq:\n            omegas = 2*pi_value*nparange(startf, endf, stepsize)\n            if verbose:\n                LOGINFO(\n                    'using %s frequency points, start P = %.3f, end P = %.3f' %\n                    (omegas.size, 1.0/endf, 1.0/startf)\n                )\n        else:\n            # this gets an automatic grid of frequencies to use\n            freqs = get_frequency_grid(stimes,\n                                       minfreq=startf,\n                                       maxfreq=endf)\n            omegas = 2*pi_value*freqs\n            if verbose:\n                LOGINFO(\n                    'using autofreq with %s frequency points, '\n                    'start P = %.3f, end P = %.3f' %\n                    (omegas.size, 1.0/freqs.max(), 1.0/freqs.min())\n                )\n\n        # map to parallel workers\n        if (not nworkers) or (nworkers > NCPUS):\n            nworkers = NCPUS\n            if verbose:\n                LOGINFO('using %s workers...' % nworkers)\n\n        pool = Pool(nworkers)\n\n        tasks = [(stimes, smags, serrs, x) for x in omegas]\n        if workchunksize:\n            lsp = pool.map(glspfunc, tasks, chunksize=workchunksize)\n        else:\n            lsp = pool.map(glspfunc, tasks)\n\n        pool.close()\n        pool.join()\n        del pool\n\n        lsp = nparray(lsp)\n        periods = 2.0*pi_value/omegas\n\n        # find the nbestpeaks for the periodogram: 1. sort the lsp array by\n        # highest value first 2. 
go down the values until we find five\n        # values that are separated by at least periodepsilon in period\n\n        # make sure to filter out non-finite values of lsp\n\n        finitepeakind = npisfinite(lsp)\n        finlsp = lsp[finitepeakind]\n        finperiods = periods[finitepeakind]\n\n        # make sure that finlsp has finite values before we work on it\n        try:\n\n            bestperiodind = npargmax(finlsp)\n\n        except ValueError:\n\n            LOGERROR('no finite periodogram values '\n                     'for this mag series, skipping...')\n            return {'bestperiod':npnan,\n                    'bestlspval':npnan,\n                    'nbestpeaks':nbestpeaks,\n                    'nbestlspvals':None,\n                    'nbestperiods':None,\n                    'lspvals':None,\n                    'omegas':omegas,\n                    'periods':None,\n                    'method':'gls',\n                    'kwargs':{'startp':startp,\n                              'endp':endp,\n                              'stepsize':stepsize,\n                              'autofreq':autofreq,\n                              'periodepsilon':periodepsilon,\n                              'nbestpeaks':nbestpeaks,\n                              'sigclip':sigclip}}\n\n        sortedlspind = npargsort(finlsp)[::-1]\n        sortedlspperiods = finperiods[sortedlspind]\n        sortedlspvals = finlsp[sortedlspind]\n\n        # now get the nbestpeaks\n        nbestperiods, nbestlspvals, peakcount = (\n            [finperiods[bestperiodind]],\n            [finlsp[bestperiodind]],\n            1\n        )\n        prevperiod = sortedlspperiods[0]\n\n        # find the best nbestpeaks in the lsp and their periods\n        for period, lspval in zip(sortedlspperiods, sortedlspvals):\n\n            if peakcount == nbestpeaks:\n                break\n            perioddiff = abs(period - prevperiod)\n            bestperiodsdiff = [abs(period - x) for x in 
nbestperiods]\n\n            # print('prevperiod = %s, thisperiod = %s, '\n            #       'perioddiff = %s, peakcount = %s' %\n            #       (prevperiod, period, perioddiff, peakcount))\n\n            # this ensures that this period is different from the last\n            # period and from all the other existing best periods by\n            # periodepsilon to make sure we jump to an entire different peak\n            # in the periodogram\n            if (perioddiff > (periodepsilon*prevperiod) and\n                all(x > (periodepsilon*period) for x in bestperiodsdiff)):\n                nbestperiods.append(period)\n                nbestlspvals.append(lspval)\n                peakcount = peakcount + 1\n\n            prevperiod = period\n\n\n        return {'bestperiod':finperiods[bestperiodind],\n                'bestlspval':finlsp[bestperiodind],\n                'nbestpeaks':nbestpeaks,\n                'nbestlspvals':nbestlspvals,\n                'nbestperiods':nbestperiods,\n                'lspvals':lsp,\n                'omegas':omegas,\n                'periods':periods,\n                'method':'gls',\n                'kwargs':{'startp':startp,\n                          'endp':endp,\n                          'stepsize':stepsize,\n                          'autofreq':autofreq,\n                          'periodepsilon':periodepsilon,\n                          'nbestpeaks':nbestpeaks,\n                          'sigclip':sigclip}}\n\n    else:\n\n        LOGERROR('no good detections for these times and mags, skipping...')\n        return {'bestperiod':npnan,\n                'bestlspval':npnan,\n                'nbestpeaks':nbestpeaks,\n                'nbestlspvals':None,\n                'nbestperiods':None,\n                'lspvals':None,\n                'omegas':None,\n                'periods':None,\n                'method':'gls',\n                'kwargs':{'startp':startp,\n                          'endp':endp,\n                     
     'stepsize':stepsize,\n                          'autofreq':autofreq,\n                          'periodepsilon':periodepsilon,\n                          'nbestpeaks':nbestpeaks,\n                          'sigclip':sigclip}}", "response": "This function calculates the generalized Lomb-Scargle periodogram for the given time-series."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef specwindow_lsp(\n        times,\n        mags,\n        errs,\n        magsarefluxes=False,\n        startp=None,\n        endp=None,\n        stepsize=1.0e-4,\n        autofreq=True,\n        nbestpeaks=5,\n        periodepsilon=0.1,\n        sigclip=10.0,\n        nworkers=None,\n        glspfunc=_glsp_worker_specwindow,\n        verbose=True\n):\n    '''This calculates the spectral window function.\n\n    Wraps the `pgen_lsp` function above to use the specific worker for\n    calculating the window-function.\n\n    Parameters\n    ----------\n\n    times,mags,errs : np.array\n        The mag/flux time-series with associated measurement errors to run the\n        period-finding on.\n\n    magsarefluxes : bool\n        If the input measurement values in `mags` and `errs` are in fluxes, set\n        this to True.\n\n    startp,endp : float or None\n        The minimum and maximum periods to consider for the transit search.\n\n    stepsize : float\n        The step-size in frequency to use when constructing a frequency grid for\n        the period search.\n\n    autofreq : bool\n        If this is True, the value of `stepsize` will be ignored and the\n        :py:func:`astrobase.periodbase.get_frequency_grid` function will be used\n        to generate a frequency grid based on `startp`, and `endp`. If these are\n        None as well, `startp` will be set to 0.1 and `endp` will be set to\n        `times.max() - times.min()`.\n\n    nbestpeaks : int\n        The number of 'best' peaks to return from the periodogram results,\n        starting from the global maximum of the periodogram peak values.\n\n    periodepsilon : float\n        The fractional difference between successive values of 'best' periods\n        when sorting by periodogram power to consider them as separate periods\n        (as opposed to part of the same periodogram peak). 
This is used to avoid\n        broad peaks in the periodogram and make sure the 'best' periods returned\n        are all actually independent.\n\n    sigclip : float or int or sequence of two floats/ints or None\n        If a single float or int, a symmetric sigma-clip will be performed using\n        the number provided as the sigma-multiplier to cut out from the input\n        time-series.\n\n        If a list of two ints/floats is provided, the function will perform an\n        'asymmetric' sigma-clip. The first element in this list is the sigma\n        value to use for fainter flux/mag values; the second element in this\n        list is the sigma value to use for brighter flux/mag values. For\n        example, `sigclip=[10., 3.]`, will sigclip out greater than 10-sigma\n        dimmings and greater than 3-sigma brightenings. Here the meaning of\n        \"dimming\" and \"brightening\" is set by *physics* (not the magnitude\n        system), which is why the `magsarefluxes` kwarg must be correctly set.\n\n        If `sigclip` is None, no sigma-clipping will be performed, and the\n        time-series (with non-finite elems removed) will be passed through to\n        the output.\n\n    nworkers : int\n        The number of parallel workers to use when calculating the periodogram.\n\n    glspfunc : Python function\n        The worker function to use to calculate the periodogram. 
This is used to\n        make the `pgen_lsp` function calculate the time-series sampling\n        window function instead of the time-series measurements' GLS periodogram\n        by passing in `_glsp_worker_specwindow` instead of the default\n        `_glsp_worker_withtau` function.\n\n    verbose : bool\n        If this is True, will indicate progress and details about the frequency\n        grid used for the period search.\n\n    Returns\n    -------\n\n    dict\n        This function returns a dict, referred to as an `lspinfo` dict in other\n        astrobase functions that operate on periodogram results. This is a\n        standardized format across all astrobase period-finders, and is of the\n        form below::\n\n            {'bestperiod': the best period value in the periodogram,\n             'bestlspval': the periodogram peak associated with the best period,\n             'nbestpeaks': the input value of nbestpeaks,\n             'nbestlspvals': nbestpeaks-size list of best period peak values,\n             'nbestperiods': nbestpeaks-size list of best periods,\n             'lspvals': the full array of periodogram powers,\n             'periods': the full array of periods considered,\n             'method':'win' -> the name of the period-finder method,\n             'kwargs':{ dict of all of the input kwargs for record-keeping}}\n\n    '''\n\n    # run the LSP using glsp_worker_specwindow as the worker\n    lspres = pgen_lsp(\n        times,\n        mags,\n        errs,\n        magsarefluxes=magsarefluxes,\n        startp=startp,\n        endp=endp,\n        autofreq=autofreq,\n        nbestpeaks=nbestpeaks,\n        periodepsilon=periodepsilon,\n        stepsize=stepsize,\n        nworkers=nworkers,\n        sigclip=sigclip,\n        glspfunc=glspfunc,\n        verbose=verbose\n    )\n\n    # update the resultdict to indicate we're a spectral window function\n    lspres['method'] = 'win'\n\n    if lspres['lspvals'] is not None:\n\n        # 
renormalize the periodogram to between 0 and 1 like the usual GLS.\n        lspmax = npnanmax(lspres['lspvals'])\n\n        if npisfinite(lspmax):\n\n            lspres['lspvals'] = lspres['lspvals']/lspmax\n            lspres['nbestlspvals'] = [\n                x/lspmax for x in lspres['nbestlspvals']\n            ]\n            lspres['bestlspval'] = lspres['bestlspval']/lspmax\n\n    return lspres", "response": "This function calculates the spectral window function for a time-series given its times, mags, and errs."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef get_new_apikey(lcc_server):\n    '''This gets a new API key from the specified LCC-Server.\n\n    NOTE: this only gets an anonymous API key. To get an API key tied to a user\n    account (and associated privilege level), see the `import_apikey` function\n    below.\n\n    Parameters\n    ----------\n\n    lcc_server : str\n        The base URL of the LCC-Server from where the API key will be fetched.\n\n    Returns\n    -------\n\n    (apikey, expiry) : tuple\n        This returns a tuple with the API key and its expiry date.\n\n    '''\n\n    USERHOME = os.path.expanduser('~')\n    APIKEYFILE = os.path.join(USERHOME,\n                              '.astrobase',\n                              'lccs',\n                              'apikey-%s' % lcc_server.replace(\n                                  'https://',\n                                  'https-'\n                              ).replace(\n                                  'http://',\n                                  'http-'\n                              ))\n\n    # url for getting an API key\n    url = '%s/api/key' % lcc_server\n\n    # get the API key\n    resp = urlopen(url)\n\n    if resp.code == 200:\n\n        respdict = json.loads(resp.read())\n\n    else:\n\n        LOGERROR('could not fetch the API key from LCC-Server at: %s' %\n                 lcc_server)\n        LOGERROR('the HTTP status code was: %s' % resp.code)\n        return None\n\n    #\n    # now that we have an API key dict, get the API key out of it and write it\n    # to the APIKEYFILE\n    #\n    apikey = respdict['result']['apikey']\n    expires = respdict['result']['expires']\n\n    # write this to the apikey file\n\n    if not os.path.exists(os.path.dirname(APIKEYFILE)):\n        os.makedirs(os.path.dirname(APIKEYFILE))\n\n    with open(APIKEYFILE,'w') as outfd:\n        outfd.write('%s %s\\n' % (apikey, 
expires))\n\n    # chmod it to the correct value\n    os.chmod(APIKEYFILE, 0o100600)\n\n    LOGINFO('key fetched successfully from: %s. expires on: %s' % (lcc_server,\n                                                                   expires))\n    LOGINFO('written to: %s' % APIKEYFILE)\n\n    return apikey, expires", "response": "This function gets a new anonymous API key from the specified LCC-Server and writes it to an apikey file."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef submit_post_searchquery(url, data, apikey):\n    '''This submits a POST query to an LCC-Server search API endpoint.\n\n    Handles streaming of the results, and returns the final JSON stream. Also\n    handles results that time out.\n\n    Parameters\n    ----------\n\n    url : str\n        The URL of the search API endpoint to hit. This is something like\n        `https://data.hatsurveys.org/api/conesearch`\n\n    data : dict\n        A dict of the search query parameters to pass to the search service.\n\n    apikey : str\n        The API key to use to access the search service. API keys are required\n        for all POST request made to an LCC-Server's API endpoints.\n\n    Returns\n    -------\n\n    (status_flag, data_dict, dataset_id) : tuple\n        This returns a tuple containing the status of the request: ('complete',\n        'failed', 'background', etc.), a dict parsed from the JSON result of the\n        request, and a dataset ID, which can be used to reconstruct the URL on\n        the LCC-Server where the results can be browsed.\n\n    '''\n\n    # first, we need to convert any columns and collections items to broken out\n    # params\n    postdata = {}\n\n    for key in data:\n\n        if key == 'columns':\n            postdata['columns[]'] = data[key]\n        elif key == 'collections':\n            postdata['collections[]'] = data[key]\n        else:\n            postdata[key] = data[key]\n\n    # do the urlencode with doseq=True\n    # we also need to encode to bytes\n    encoded_postdata = urlencode(postdata, doseq=True).encode()\n\n    # if apikey is not None, add it in as an Authorization: Bearer [apikey]\n    # header\n    if apikey:\n        headers = {'Authorization':'Bearer: %s' % apikey}\n    else:\n        headers = {}\n\n    LOGINFO('submitting search query to LCC-Server API URL: %s' % url)\n\n    try:\n\n        # hit the server 
with a POST request\n        req = Request(url, data=encoded_postdata, headers=headers)\n        resp = urlopen(req)\n\n        if resp.code == 200:\n\n            # we'll iterate over the lines in the response\n            # this works super-well for ND-JSON!\n            for line in resp:\n\n                data = json.loads(line)\n                msg = data['message']\n                status = data['status']\n\n                if status != 'failed':\n                    LOGINFO('status: %s, %s' % (status, msg))\n                else:\n                    LOGERROR('status: %s, %s' % (status, msg))\n\n                # here, we'll decide what to do about the query\n\n                # completed query or query sent to background...\n                if status in ('ok','background'):\n\n                    setid = data['result']['setid']\n                    # save the data pickle to astrobase lccs directory\n                    outpickle = os.path.join(os.path.expanduser('~'),\n                                             '.astrobase',\n                                             'lccs',\n                                             'query-%s.pkl' % setid)\n                    if not os.path.exists(os.path.dirname(outpickle)):\n                        os.makedirs(os.path.dirname(outpickle))\n\n                    with open(outpickle,'wb') as outfd:\n                        pickle.dump(data, outfd, pickle.HIGHEST_PROTOCOL)\n                        LOGINFO('saved query info to %s, use this to '\n                                'download results later with '\n                                'retrieve_dataset_files' % outpickle)\n\n                    # we're done at this point, return\n                    return status, data, data['result']['setid']\n\n                # the query probably failed...\n                elif status == 'failed':\n\n                    # we're done at this point, return\n                    return status, data, None\n\n\n        # if the 
response was not OK, then we probably failed\n        else:\n\n            try:\n                data = json.load(resp)\n                msg = data['message']\n\n                LOGERROR(msg)\n                return 'failed', None, None\n\n            except Exception as e:\n\n                LOGEXCEPTION('failed to submit query to %s' % url)\n                return 'failed', None, None\n\n    except HTTPError as e:\n\n        LOGERROR('could not submit query to LCC API at: %s' % url)\n        LOGERROR('HTTP status code was %s, reason: %s' % (e.code, e.reason))\n        return 'failed', None, None", "response": "This function submits a POST query to an LCC-Server search API endpoint, streams the results, and returns the request status, the parsed JSON result, and a dataset ID that can be used to reconstruct the URL on the LCC-Server where the results can be browsed."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef object_info(lcc_server, objectid, db_collection_id):\n    '''This gets information on a single object from the LCC-Server.\n\n    Returns a dict with all of the available information on an object, including\n    finding charts, comments, object type and variability tags, and\n    period-search results (if available).\n\n    If you have an LCC-Server API key present in `~/.astrobase/lccs/` that is\n    associated with an LCC-Server user account, objects that are visible to this\n    user will be returned, even if they are not visible to the public. Use this\n    to look up objects that have been marked as 'private' or 'shared'.\n\n    NOTE: you can pass the result dict returned by this function directly into\n    the `astrobase.checkplot.checkplot_pickle_to_png` function, e.g.::\n\n        astrobase.checkplot.checkplot_pickle_to_png(result_dict,\n                                                    'object-%s-info.png' %\n                                                    result_dict['objectid'])\n\n    to generate a quick PNG overview of the object information.\n\n    Parameters\n    ----------\n\n    lcc_server : str\n        This is the base URL of the LCC-Server to talk to.\n\n    objectid : str\n        This is the unique database ID of the object to retrieve info for. This\n        is always returned as the `db_oid` column in LCC-Server search results.\n\n    db_collection_id : str\n        This is the collection ID which will be searched for the object. This is\n        always returned as the `collection` column in LCC-Server search results.\n\n    Returns\n    -------\n\n    dict\n        A dict containing the object info is returned. 
Some important items in\n        the result dict:\n\n        - `objectinfo`: all object magnitude, color, GAIA cross-match, and\n          object type information available for this object\n\n        - `objectcomments`: comments on the object's variability if available\n\n        - `varinfo`: variability comments, variability features, type tags,\n          period and epoch information if available\n\n        - `neighbors`: information on the neighboring objects of this object in\n          its parent light curve collection\n\n        - `xmatch`: information on any cross-matches to external catalogs\n          (e.g. KIC, EPIC, TIC, APOGEE, etc.)\n\n        - `finderchart`: a base-64 encoded PNG image of the object's DSS2 RED\n          finder chart. To convert this to an actual PNG, try the function:\n          `astrobase.checkplot.pkl_io._b64_to_file`.\n\n        - `magseries`: a base-64 encoded PNG image of the object's light\n          curve. To convert this to an actual PNG, try the function:\n          `astrobase.checkplot.pkl_io._b64_to_file`.\n\n        - `pfmethods`: a list of period-finding methods applied to the object if\n          any. If this list is present, use the keys in it to get to the actual\n          period-finding results for each method. 
These will contain base-64\n          encoded PNGs of the periodogram and phased light curves using the best\n          three peaks in the periodogram, as well as period and epoch\n          information.\n\n    '''\n\n    urlparams = {\n        'objectid':objectid,\n        'collection':db_collection_id\n    }\n\n    urlqs = urlencode(urlparams)\n    url = '%s/api/object?%s' % (lcc_server, urlqs)\n\n    try:\n\n        LOGINFO(\n            'getting info for %s in collection %s from %s' % (\n                objectid,\n                db_collection_id,\n                lcc_server\n            )\n        )\n\n        # check if we have an API key already\n        have_apikey, apikey, expires = check_existing_apikey(lcc_server)\n\n        # if not, get a new one\n        if not have_apikey:\n            apikey, expires = get_new_apikey(lcc_server)\n\n        # if apikey is not None, add it in as an Authorization: Bearer [apikey]\n        # header\n        if apikey:\n            headers = {'Authorization':'Bearer: %s' % apikey}\n        else:\n            headers = {}\n\n        # hit the server\n        req = Request(url, data=None, headers=headers)\n        resp = urlopen(req)\n        objectinfo = json.loads(resp.read())['result']\n        return objectinfo\n\n    except HTTPError as e:\n\n        if e.code == 404:\n\n            LOGERROR(\n                'additional info for object %s not '\n                'found in collection: %s' % (objectid,\n                                             db_collection_id)\n            )\n\n        else:\n\n            LOGERROR('could not retrieve object info, '\n                     'URL used: %s, error code: %s, reason: %s' %\n                     (url, e.code, e.reason))\n\n\n        return None", "response": "This function gets all of the available information on a single object from the LCC-Server."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef list_recent_datasets(lcc_server, nrecent=25):\n    '''This lists recent publicly visible datasets available on the LCC-Server.\n\n    If you have an LCC-Server API key present in `~/.astrobase/lccs/` that is\n    associated with an LCC-Server user account, datasets that belong to this\n    user will be returned as well, even if they are not visible to the public.\n\n    Parameters\n    ----------\n\n    lcc_server : str\n        This is the base URL of the LCC-Server to talk to.\n\n    nrecent : int\n        This indicates how many recent public datasets you want to list. This is\n        always capped at 1000.\n\n    Returns\n    -------\n\n    list of dicts\n        Returns a list of dicts, with each dict containing info on each dataset.\n\n    '''\n\n    urlparams = {'nsets':nrecent}\n    urlqs = urlencode(urlparams)\n\n    url = '%s/api/datasets?%s' % (lcc_server, urlqs)\n\n    try:\n\n        LOGINFO(\n            'getting list of recent publicly '\n            'visible and owned datasets from %s' % (\n                lcc_server,\n            )\n        )\n\n        # check if we have an API key already\n        have_apikey, apikey, expires = check_existing_apikey(lcc_server)\n\n        # if not, get a new one\n        if not have_apikey:\n            apikey, expires = get_new_apikey(lcc_server)\n\n        # if apikey is not None, add it in as an Authorization: Bearer [apikey]\n        # header\n        if apikey:\n            headers = {'Authorization':'Bearer: %s' % apikey}\n        else:\n            headers = {}\n\n        # hit the server\n        req = Request(url, data=None, headers=headers)\n        resp = urlopen(req)\n        recent_datasets = json.loads(resp.read())['result']\n        return recent_datasets\n\n    except HTTPError as e:\n\n        LOGERROR('could not retrieve recent datasets list, '\n                 'URL used: %s, 
error code: %s, reason: %s' %\n                 (url, e.code, e.reason))\n\n        return None", "response": "This function returns a list of recent publicly visible datasets available on the LCC-Server."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what the following Python 3 function is doing\ndef stetson_kindex(fmags, ferrs):\n    '''This calculates the Stetson K index (a robust measure of the kurtosis).\n\n    Parameters\n    ----------\n\n    fmags,ferrs : np.array\n        The input mag/flux time-series to process. Must have no non-finite\n        elems.\n\n    Returns\n    -------\n\n    float\n        The Stetson K variability index.\n\n    '''\n\n    # use a fill-in value for the errors if they're None\n    if ferrs is None:\n        ferrs = npfull_like(fmags, 0.005)\n\n    ndet = len(fmags)\n\n    if ndet > 9:\n\n        # get the median\n        medmag = npmedian(fmags)\n\n        # get the stetson index elements\n        delta_prefactor = (ndet/(ndet - 1))\n        sigma_i = delta_prefactor*(fmags - medmag)/ferrs\n\n        stetsonk = (\n            npsum(npabs(sigma_i))/(npsqrt(npsum(sigma_i*sigma_i))) *\n            (ndet**(-0.5))\n        )\n\n        return stetsonk\n\n    else:\n\n        LOGERROR('not enough detections in this magseries '\n                 'to calculate stetson K index')\n        return npnan", "response": "This calculates the Stetson K index, a robust measure of the kurtosis."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef nonperiodic_lightcurve_features(times, mags, errs, magsarefluxes=False):\n    '''This calculates the following nonperiodic features of the light curve,\n    listed in Richards, et al. 2011):\n\n    - amplitude\n    - beyond1std\n    - flux_percentile_ratio_mid20\n    - flux_percentile_ratio_mid35\n    - flux_percentile_ratio_mid50\n    - flux_percentile_ratio_mid65\n    - flux_percentile_ratio_mid80\n    - linear_trend\n    - max_slope\n    - median_absolute_deviation\n    - median_buffer_range_percentage\n    - pair_slope_trend\n    - percent_amplitude\n    - percent_difference_flux_percentile\n    - skew\n    - stdev\n    - timelength\n    - mintime\n    - maxtime\n\n    Parameters\n    ----------\n\n    times,mags,errs : np.array\n        The input mag/flux time-series to process.\n\n    magsarefluxes : bool\n        If True, will treat values in `mags` as fluxes instead of magnitudes.\n\n    Returns\n    -------\n\n    dict\n        A dict containing all of the features listed above.\n\n    '''\n\n    # remove nans first\n    finiteind = npisfinite(times) & npisfinite(mags) & npisfinite(errs)\n    ftimes, fmags, ferrs = times[finiteind], mags[finiteind], errs[finiteind]\n\n    # remove zero errors\n    nzind = npnonzero(ferrs)\n    ftimes, fmags, ferrs = ftimes[nzind], fmags[nzind], ferrs[nzind]\n\n    ndet = len(fmags)\n\n    if ndet > 9:\n\n        # calculate the moments\n        moments = lightcurve_moments(ftimes, fmags, ferrs)\n\n        # calculate the flux measures\n        fluxmeasures = lightcurve_flux_measures(ftimes, fmags, ferrs,\n                                                magsarefluxes=magsarefluxes)\n\n        # calculate the point-to-point measures\n        ptpmeasures = lightcurve_ptp_measures(ftimes, fmags, ferrs)\n\n        # get the length in time\n        mintime, maxtime = npmin(ftimes), npmax(ftimes)\n        
timelength = maxtime - mintime\n\n        # get the amplitude\n        series_amplitude = 0.5*(npmax(fmags) - npmin(fmags))\n\n        # calculate the linear fit to the entire mag series\n        fitcoeffs = nppolyfit(ftimes, fmags, 1, w=1.0/(ferrs*ferrs))\n        # assuming nppolyfit is numpy.polyfit, coefficients are returned\n        # highest degree first, so the slope is fitcoeffs[0]\n        series_linear_slope = fitcoeffs[0]\n\n        # roll fmags by 1\n        rolled_fmags = nproll(fmags,1)\n\n        # calculate the magnitude ratio (from the WISE paper)\n        series_magratio = (\n            (npmax(fmags) - moments['median']) / (npmax(fmags) - npmin(fmags) )\n        )\n\n        # this is the dictionary returned containing all the measures\n        measures = {\n            'ndet':fmags.size,\n            'mintime':mintime,\n            'maxtime':maxtime,\n            'timelength':timelength,\n            'amplitude':series_amplitude,\n            'ndetobslength_ratio':ndet/timelength,\n            'linear_fit_slope':series_linear_slope,\n            'magnitude_ratio':series_magratio,\n        }\n        if moments:\n            measures.update(moments)\n        if ptpmeasures:\n            measures.update(ptpmeasures)\n        if fluxmeasures:\n            measures.update(fluxmeasures)\n\n        return measures\n\n    else:\n\n        LOGERROR('not enough detections in this magseries '\n                 'to calculate non-periodic features')\n        return None", "response": "This function calculates the non-periodic light curve features."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef gilliland_cdpp(times, mags, errs,\n                   windowlength=97,\n                   polyorder=2,\n                   binsize=23400,  # in seconds: 6.5 hours for classic CDPP\n                   sigclip=5.0,\n                   magsarefluxes=False,\n                   **kwargs):\n    '''This calculates the CDPP of a timeseries using the method in the paper:\n\n    Gilliland, R. L., Chaplin, W. J., Dunham, E. W., et al. 2011, ApJS, 197, 6\n    (http://adsabs.harvard.edu/abs/2011ApJS..197....6G)\n\n    The steps are:\n\n    - pass the time-series through a Savitsky-Golay filter.\n\n      - we use `scipy.signal.savgol_filter`, `**kwargs` are passed to this.\n\n      - also see: http://scipy.github.io/old-wiki/pages/Cookbook/SavitzkyGolay.\n\n      - the `windowlength` is the number of LC points to use (Kepler uses 2 days\n        = (1440 minutes/day / 30 minutes/LC point) x 2 days = 96 -> 97 LC\n        points).\n\n      - the `polyorder` is a quadratic by default.\n\n\n    - subtract the smoothed time-series from the actual light curve.\n\n    - sigma clip the remaining LC.\n\n    - get the binned mag series by averaging over 6.5 hour bins, only retaining\n      bins with at least 7 points.\n\n    - the standard deviation of the binned averages is the CDPP.\n\n    - multiply this by 1.168 to correct for over-subtraction of white-noise.\n\n\n    Parameters\n    ----------\n\n    times,mags,errs : np.array\n        The input mag/flux time-series to calculate CDPP for.\n\n    windowlength : int\n        The smoothing window size to use.\n\n    polyorder : int\n        The polynomial order to use in the Savitsky-Golay smoothing.\n\n    binsize : int\n        The bin size to use for binning the light curve.\n\n    sigclip : float or int or sequence of two floats/ints or None\n        If a single float or int, a symmetric sigma-clip will be performed using\n    
    the number provided as the sigma-multiplier to cut out from the input\n        time-series.\n\n        If a list of two ints/floats is provided, the function will perform an\n        'asymmetric' sigma-clip. The first element in this list is the sigma\n        value to use for fainter flux/mag values; the second element in this\n        list is the sigma value to use for brighter flux/mag values. For\n        example, `sigclip=[10., 3.]`, will sigclip out greater than 10-sigma\n        dimmings and greater than 3-sigma brightenings. Here the meaning of\n        \"dimming\" and \"brightening\" is set by *physics* (not the magnitude\n        system), which is why the `magsarefluxes` kwarg must be correctly set.\n\n        If `sigclip` is None, no sigma-clipping will be performed, and the\n        time-series (with non-finite elems removed) will be passed through to\n        the output.\n\n    magsarefluxes : bool\n        If True, indicates the input time-series is fluxes and not mags.\n\n    kwargs : additional kwargs\n        These are passed directly to `scipy.signal.savgol_filter`.\n\n    Returns\n    -------\n\n    float\n        The calculated CDPP value.\n\n    '''\n\n    # if no errs are given, assume 0.1% errors\n    if errs is None:\n        errs = 0.001*mags\n\n    # get rid of nans first\n    find = npisfinite(times) & npisfinite(mags) & npisfinite(errs)\n    ftimes = times[find]\n    fmags = mags[find]\n    ferrs = errs[find]\n\n    if ftimes.size < (3*windowlength):\n        LOGERROR('not enough LC points to calculate CDPP')\n        return npnan\n\n    # now get the smoothed mag series using the filter\n    # kwargs are provided to the savgol_filter function\n    smoothed = savgol_filter(fmags, windowlength, polyorder, **kwargs)\n    subtracted = fmags - smoothed\n\n    # sigclip the subtracted light curve\n    stimes, smags, serrs = sigclip_magseries(ftimes, subtracted, ferrs,\n                                             
magsarefluxes=magsarefluxes)\n\n    # bin over 6.5 hour bins and throw away all bins with less than 7 elements\n    binned = time_bin_magseries_with_errs(stimes, smags, serrs,\n                                          binsize=binsize,\n                                          minbinelems=7)\n    bmags = binned['binnedmags']\n\n    # stdev of bin mags x 1.168 -> CDPP\n    cdpp = npstd(bmags) * 1.168\n\n    return cdpp", "response": "This function calculates the CDPP of a time-series using the method of Gilliland et al. (2011)."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef all_nonperiodic_features(times, mags, errs,\n                             magsarefluxes=False,\n                             stetson_weightbytimediff=True):\n    '''This rolls up the feature functions above and returns a single dict.\n\n    NOTE: this doesn't calculate the CDPP to save time since binning and\n    smoothing takes a while for dense light curves.\n\n    Parameters\n    ----------\n\n    times,mags,errs : np.array\n        The input mag/flux time-series to calculate the features for.\n\n    magsarefluxes : bool\n        If True, indicates `mags` is actually an array of flux values.\n\n    stetson_weightbytimediff : bool\n        If this is True, the Stetson index for any pair of mags will be\n        reweighted by the difference in times between them using the scheme in\n        Fruth+ 2012 and Zhang+ 2003 (as seen in Sokolovsky+ 2017)::\n\n            w_i = exp(- (t_i+1 - t_i)/ delta_t )\n\n    Returns\n    -------\n\n    dict\n        Returns a dict with all of the variability features.\n\n    '''\n\n    # remove nans first\n    finiteind = npisfinite(times) & npisfinite(mags) & npisfinite(errs)\n    ftimes, fmags, ferrs = times[finiteind], mags[finiteind], errs[finiteind]\n\n    # remove zero errors\n    nzind = npnonzero(ferrs)\n    ftimes, fmags, ferrs = ftimes[nzind], fmags[nzind], ferrs[nzind]\n\n    xfeatures = nonperiodic_lightcurve_features(times, mags, errs,\n                                                magsarefluxes=magsarefluxes)\n    stetj = stetson_jindex(ftimes, fmags, ferrs,\n                           weightbytimediff=stetson_weightbytimediff)\n    stetk = stetson_kindex(fmags, ferrs)\n\n    xfeatures.update({'stetsonj':stetj,\n                      'stetsonk':stetk})\n\n    return xfeatures", "response": "This function rolls up the feature functions above and returns a single dict with all non-periodic features."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef _bls_runner(times,\n                mags,\n                nfreq,\n                freqmin,\n                stepsize,\n                nbins,\n                minduration,\n                maxduration):\n    '''This runs the pyeebls.eebls function using the given inputs.\n\n    Parameters\n    ----------\n\n    times,mags : np.array\n        The input magnitude time-series to search for transits.\n\n    nfreq : int\n        The number of frequencies to use when searching for transits.\n\n    freqmin : float\n        The minimum frequency of the period-search -> max period that will be\n        used for the search.\n\n    stepsize : float\n        The step-size in frequency to use to generate a frequency-grid.\n\n    nbins : int\n        The number of phase bins to use.\n\n    minduration : float\n        The minimum fractional transit duration that will be considered.\n\n    maxduration : float\n        The maximum fractional transit duration that will be considered.\n\n    Returns\n    -------\n\n    dict\n        Returns a dict of the form::\n\n            {\n                'power':           the periodogram power array,\n                'bestperiod':      the best period found,\n                'bestpower':       the highest peak of the periodogram power,\n                'transdepth':      transit depth found by eebls.f,\n                'transduration':   transit duration found by eebls.f,\n                'transingressbin': transit ingress bin found by eebls.f,\n                'transegressbin':  transit egress bin found by eebls.f,\n            }\n\n    '''\n\n    workarr_u = npones(times.size)\n    workarr_v = npones(times.size)\n\n    blsresult = eebls(times, mags,\n                      workarr_u, workarr_v,\n                      nfreq, freqmin, stepsize,\n                      nbins, minduration, maxduration)\n\n    return 
{'power':blsresult[0],\n            'bestperiod':blsresult[1],\n            'bestpower':blsresult[2],\n            'transdepth':blsresult[3],\n            'transduration':blsresult[4],\n            'transingressbin':blsresult[5],\n            'transegressbin':blsresult[6]}", "response": "This function runs the pyeebls.eebls function using the given inputs and returns its results as a dict."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef bls_serial_pfind(\n        times, mags, errs,\n        magsarefluxes=False,\n        startp=0.1,  # search from 0.1 d to...\n        endp=100.0,  # ... 100.0 d -- don't search full timebase\n        stepsize=5.0e-4,\n        mintransitduration=0.01,  # minimum transit length in phase\n        maxtransitduration=0.4,   # maximum transit length in phase\n        nphasebins=200,\n        autofreq=True,  # figure out f0, nf, and df automatically\n        periodepsilon=0.1,\n        nbestpeaks=5,\n        sigclip=10.0,\n        verbose=True,\n        get_stats=True,\n):\n    '''Runs the Box Least Squares Fitting Search for transit-shaped signals.\n\n    Based on eebls.f from Kovacs et al. 2002 and python-bls from Foreman-Mackey\n    et al. 2015. This is the serial version (which is good enough in most cases\n    because BLS in Fortran is fairly fast). If nfreq > 5e5, this will take a\n    while.\n\n    Parameters\n    ----------\n\n    times,mags,errs : np.array\n        The magnitude/flux time-series to search for transits.\n\n    magsarefluxes : bool\n        If the input measurement values in `mags` and `errs` are in fluxes, set\n        this to True.\n\n    startp,endp : float\n        The minimum and maximum periods to consider for the transit search.\n\n    stepsize : float\n        The step-size in frequency to use when constructing a frequency grid for\n        the period search.\n\n    mintransitduration,maxtransitduration : float\n        The minimum and maximum transitdurations (in units of phase) to consider\n        for the transit search.\n\n    nphasebins : int\n        The number of phase bins to use in the period search.\n\n    autofreq : bool\n        If this is True, the values of `stepsize` and `nphasebins` will be\n        ignored, and these, along with a frequency-grid, will be determined\n        based on the following relations::\n\n    
        nphasebins = int(ceil(2.0/mintransitduration))\n            if nphasebins > 3000:\n                nphasebins = 3000\n\n            stepsize = 0.25*mintransitduration/(times.max()-times.min())\n\n            minfreq = 1.0/endp\n            maxfreq = 1.0/startp\n            nfreq = int(ceil((maxfreq - minfreq)/stepsize))\n\n    periodepsilon : float\n        The fractional difference between successive values of 'best' periods\n        when sorting by periodogram power to consider them as separate periods\n        (as opposed to part of the same periodogram peak). This is used to avoid\n        broad peaks in the periodogram and make sure the 'best' periods returned\n        are all actually independent.\n\n    nbestpeaks : int\n        The number of 'best' peaks to return from the periodogram results,\n        starting from the global maximum of the periodogram peak values.\n\n    sigclip : float or int or sequence of two floats/ints or None\n        If a single float or int, a symmetric sigma-clip will be performed using\n        the number provided as the sigma-multiplier to cut out from the input\n        time-series.\n\n        If a list of two ints/floats is provided, the function will perform an\n        'asymmetric' sigma-clip. The first element in this list is the sigma\n        value to use for fainter flux/mag values; the second element in this\n        list is the sigma value to use for brighter flux/mag values. For\n        example, `sigclip=[10., 3.]`, will sigclip out greater than 10-sigma\n        dimmings and greater than 3-sigma brightenings. 
Here the meaning of\n        \"dimming\" and \"brightening\" is set by *physics* (not the magnitude\n        system), which is why the `magsarefluxes` kwarg must be correctly set.\n\n        If `sigclip` is None, no sigma-clipping will be performed, and the\n        time-series (with non-finite elems removed) will be passed through to\n        the output.\n\n    verbose : bool\n        If this is True, will indicate progress and details about the frequency\n        grid used for the period search.\n\n    get_stats : bool\n        If True, runs :py:func:`.bls_stats_singleperiod` for each of the best\n        periods in the output and injects the output into the output dict so you\n        only have to run this function to get the periods and their stats.\n\n    Returns\n    -------\n\n    dict\n        This function returns a dict, referred to as an `lspinfo` dict in other\n        astrobase functions that operate on periodogram results. This is a\n        standardized format across all astrobase period-finders, and is of the\n        form below::\n\n            {'bestperiod': the best period value in the periodogram,\n             'bestlspval': the periodogram peak associated with the best period,\n             'nbestpeaks': the input value of nbestpeaks,\n             'nbestlspvals': nbestpeaks-size list of best period peak values,\n             'nbestperiods': nbestpeaks-size list of best periods,\n             'stats': BLS stats for each best period,\n             'lspvals': the full array of periodogram powers,\n             'frequencies': the full array of frequencies considered,\n             'periods': the full array of periods considered,\n             'blsresult': the result dict from the eebls.f wrapper function,\n             'stepsize': the actual stepsize used,\n             'nfreq': the actual nfreq used,\n             'nphasebins': the actual nphasebins used,\n             'mintransitduration': the input mintransitduration,\n             
'maxtransitduration': the input maxtransitdurations,\n             'method':'bls' -> the name of the period-finder method,\n             'kwargs':{ dict of all of the input kwargs for record-keeping}}\n\n    '''\n\n    # get rid of nans first and sigclip\n    stimes, smags, serrs = sigclip_magseries(times,\n                                             mags,\n                                             errs,\n                                             magsarefluxes=magsarefluxes,\n                                             sigclip=sigclip)\n\n    # make sure there are enough points to calculate a spectrum\n    if len(stimes) > 9 and len(smags) > 9 and len(serrs) > 9:\n\n        # if we're setting up everything automatically\n        if autofreq:\n\n            # figure out the best number of phasebins to use\n            nphasebins = int(npceil(2.0/mintransitduration))\n            if nphasebins > 3000:\n                nphasebins = 3000\n\n            # use heuristic to figure out best timestep\n            stepsize = 0.25*mintransitduration/(stimes.max()-stimes.min())\n\n            # now figure out the frequencies to use\n            minfreq = 1.0/endp\n            maxfreq = 1.0/startp\n            nfreq = int(npceil((maxfreq - minfreq)/stepsize))\n\n            # say what we're using\n            if verbose:\n                LOGINFO('min P: %s, max P: %s, nfreq: %s, '\n                        'minfreq: %s, maxfreq: %s' % (startp, endp, nfreq,\n                                                      minfreq, maxfreq))\n                LOGINFO('autofreq = True: using AUTOMATIC values for '\n                        'freq stepsize: %s, nphasebins: %s, '\n                        'min transit duration: %s, max transit duration: %s' %\n                        (stepsize, nphasebins,\n                         mintransitduration, maxtransitduration))\n\n        else:\n\n            minfreq = 1.0/endp\n            maxfreq = 1.0/startp\n            nfreq = 
int(npceil((maxfreq - minfreq)/stepsize))\n\n            # say what we're using\n            if verbose:\n                LOGINFO('min P: %s, max P: %s, nfreq: %s, '\n                        'minfreq: %s, maxfreq: %s' % (startp, endp, nfreq,\n                                                      minfreq, maxfreq))\n                LOGINFO('autofreq = False: using PROVIDED values for '\n                        'freq stepsize: %s, nphasebins: %s, '\n                        'min transit duration: %s, max transit duration: %s' %\n                        (stepsize, nphasebins,\n                         mintransitduration, maxtransitduration))\n\n\n        if nfreq > 5.0e5:\n\n            if verbose:\n                LOGWARNING('more than 5.0e5 frequencies to go through; '\n                           'this will take a while. '\n                           'you might want to use the '\n                           'periodbase.bls_parallel_pfind function instead')\n\n        if minfreq < (1.0/(stimes.max() - stimes.min())):\n\n            if verbose:\n                LOGWARNING('the requested max P = %.3f is larger than '\n                           'the time base of the observations = %.3f, '\n                           ' will make minfreq = 2 x 1/timebase'\n                           % (endp, stimes.max() - stimes.min()))\n            minfreq = 2.0/(stimes.max() - stimes.min())\n            if verbose:\n                LOGINFO('new minfreq: %s, maxfreq: %s' %\n                        (minfreq, maxfreq))\n\n\n        # run BLS\n        try:\n\n            blsresult = _bls_runner(stimes,\n                                    smags,\n                                    nfreq,\n                                    minfreq,\n                                    stepsize,\n                                    nphasebins,\n                                    mintransitduration,\n                                    maxtransitduration)\n\n            # find the peaks in the BLS. 
this uses wavelet transforms to\n            # smooth the spectrum and find peaks. a similar thing would be\n            # to do a convolution with a gaussian kernel or a tophat\n            # function, calculate d/dx(result), then get indices where this\n            # is zero\n            # blspeakinds = find_peaks_cwt(blsresults['power'],\n            #                              nparray([2.0,3.0,4.0,5.0]))\n\n\n\n            frequencies = minfreq + nparange(nfreq)*stepsize\n            periods = 1.0/frequencies\n            lsp = blsresult['power']\n\n            # find the nbestpeaks for the periodogram: 1. sort the lsp array\n            # by highest value first 2. go down the values until we find\n            # five values that are separated by at least periodepsilon in\n            # period\n            # make sure to get only the finite peaks in the periodogram\n            # this is needed because BLS may produce infs for some peaks\n            finitepeakind = npisfinite(lsp)\n            finlsp = lsp[finitepeakind]\n            finperiods = periods[finitepeakind]\n\n            # make sure that finlsp has finite values before we work on it\n            try:\n\n                bestperiodind = npargmax(finlsp)\n\n            except ValueError:\n\n                LOGERROR('no finite periodogram values '\n                         'for this mag series, skipping...')\n                return {'bestperiod':npnan,\n                        'bestlspval':npnan,\n                        'nbestpeaks':nbestpeaks,\n                        'nbestlspvals':None,\n                        'nbestperiods':None,\n                        'lspvals':None,\n                        'periods':None,\n                        'method':'bls',\n                        'kwargs':{'startp':startp,\n                                  'endp':endp,\n                                  'stepsize':stepsize,\n                                  'mintransitduration':mintransitduration,\n               
                   'maxtransitduration':maxtransitduration,\n                                  'nphasebins':nphasebins,\n                                  'autofreq':autofreq,\n                                  'periodepsilon':periodepsilon,\n                                  'nbestpeaks':nbestpeaks,\n                                  'sigclip':sigclip,\n                                  'magsarefluxes':magsarefluxes}}\n\n            sortedlspind = npargsort(finlsp)[::-1]\n            sortedlspperiods = finperiods[sortedlspind]\n            sortedlspvals = finlsp[sortedlspind]\n\n            # now get the nbestpeaks\n            nbestperiods, nbestlspvals, peakcount = (\n                [finperiods[bestperiodind]],\n                [finlsp[bestperiodind]],\n                1\n            )\n            prevperiod = sortedlspperiods[0]\n\n            # find the best nbestpeaks in the lsp and their periods\n            for period, lspval in zip(sortedlspperiods, sortedlspvals):\n\n                if peakcount == nbestpeaks:\n                    break\n                perioddiff = abs(period - prevperiod)\n                bestperiodsdiff = [abs(period - x) for x in nbestperiods]\n\n                # print('prevperiod = %s, thisperiod = %s, '\n                #       'perioddiff = %s, peakcount = %s' %\n                #       (prevperiod, period, perioddiff, peakcount))\n\n                # this ensures that this period is different from the last\n                # period and from all the other existing best periods by\n                # periodepsilon to make sure we jump to an entire different\n                # peak in the periodogram\n                if (perioddiff > (periodepsilon*prevperiod) and\n                    all(x > (periodepsilon*period)\n                        for x in bestperiodsdiff)):\n                    nbestperiods.append(period)\n                    nbestlspvals.append(lspval)\n                    peakcount = peakcount + 1\n\n                
prevperiod = period\n\n\n            # generate the return dict\n            resultdict = {\n                'bestperiod':finperiods[bestperiodind],\n                'bestlspval':finlsp[bestperiodind],\n                'nbestpeaks':nbestpeaks,\n                'nbestlspvals':nbestlspvals,\n                'nbestperiods':nbestperiods,\n                'lspvals':lsp,\n                'frequencies':frequencies,\n                'periods':periods,\n                'blsresult':blsresult,\n                'stepsize':stepsize,\n                'nfreq':nfreq,\n                'nphasebins':nphasebins,\n                'mintransitduration':mintransitduration,\n                'maxtransitduration':maxtransitduration,\n                'method':'bls',\n                'kwargs':{'startp':startp,\n                          'endp':endp,\n                          'stepsize':stepsize,\n                          'mintransitduration':mintransitduration,\n                          'maxtransitduration':maxtransitduration,\n                          'nphasebins':nphasebins,\n                          'autofreq':autofreq,\n                          'periodepsilon':periodepsilon,\n                          'nbestpeaks':nbestpeaks,\n                          'sigclip':sigclip,\n                          'magsarefluxes':magsarefluxes}\n            }\n\n            # get stats if requested\n            if get_stats:\n                resultdict['stats'] = []\n                for bp in nbestperiods:\n\n                    if verbose:\n                        LOGINFO(\"Getting stats for best period: %.6f\" % bp)\n\n                    this_pstats = bls_stats_singleperiod(\n                        times, mags, errs, bp,\n                        magsarefluxes=resultdict['kwargs']['magsarefluxes'],\n                        sigclip=resultdict['kwargs']['sigclip'],\n                        nphasebins=resultdict['nphasebins'],\n                        
mintransitduration=resultdict['mintransitduration'],\n                        maxtransitduration=resultdict['maxtransitduration'],\n                        verbose=verbose,\n                    )\n                    resultdict['stats'].append(this_pstats)\n\n\n            return resultdict\n\n        except Exception as e:\n\n            LOGEXCEPTION('BLS failed!')\n            return {'bestperiod':npnan,\n                    'bestlspval':npnan,\n                    'nbestpeaks':nbestpeaks,\n                    'nbestlspvals':None,\n                    'nbestperiods':None,\n                    'lspvals':None,\n                    'periods':None,\n                    'blsresult':None,\n                    'stepsize':stepsize,\n                    'nfreq':nfreq,\n                    'nphasebins':nphasebins,\n                    'mintransitduration':mintransitduration,\n                    'maxtransitduration':maxtransitduration,\n                    'method':'bls',\n                    'kwargs':{'startp':startp,\n                              'endp':endp,\n                              'stepsize':stepsize,\n                              'mintransitduration':mintransitduration,\n                              'maxtransitduration':maxtransitduration,\n                              'nphasebins':nphasebins,\n                              'autofreq':autofreq,\n                              'periodepsilon':periodepsilon,\n                              'nbestpeaks':nbestpeaks,\n                              'sigclip':sigclip,\n                              'magsarefluxes':magsarefluxes}}\n\n\n    else:\n\n        LOGERROR('no good detections for these times and mags, skipping...')\n        return {'bestperiod':npnan,\n                'bestlspval':npnan,\n                'nbestpeaks':nbestpeaks,\n                'nbestlspvals':None,\n                'nbestperiods':None,\n                'lspvals':None,\n                'periods':None,\n                'blsresult':None,\n     
           'stepsize':stepsize,\n                'nfreq':None,\n                'nphasebins':None,\n                'mintransitduration':mintransitduration,\n                'maxtransitduration':maxtransitduration,\n                'method':'bls',\n                'kwargs':{'startp':startp,\n                          'endp':endp,\n                          'stepsize':stepsize,\n                          'mintransitduration':mintransitduration,\n                          'maxtransitduration':maxtransitduration,\n                          'nphasebins':nphasebins,\n                          'autofreq':autofreq,\n                          'periodepsilon':periodepsilon,\n                          'nbestpeaks':nbestpeaks,\n                          'sigclip':sigclip,\n                          'magsarefluxes':magsarefluxes}}", "response": "This function runs the Box Least Squares Fitting Search for transit-shaped signals."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef bls_parallel_pfind(\n        times, mags, errs,\n        magsarefluxes=False,\n        startp=0.1,  # by default, search from 0.1 d to...\n        endp=100.0,  # ... 100.0 d -- don't search full timebase\n        stepsize=1.0e-4,\n        mintransitduration=0.01,  # minimum transit length in phase\n        maxtransitduration=0.4,   # maximum transit length in phase\n        nphasebins=200,\n        autofreq=True,  # figure out f0, nf, and df automatically\n        nbestpeaks=5,\n        periodepsilon=0.1,  # 0.1\n        sigclip=10.0,\n        verbose=True,\n        nworkers=None,\n        get_stats=True,\n):\n    '''Runs the Box Least Squares Fitting Search for transit-shaped signals.\n\n    Based on eebls.f from Kovacs et al. 2002 and python-bls from Foreman-Mackey\n    et al. 2015. Breaks up the full frequency space into chunks and passes them\n    to parallel BLS workers.\n\n    NOTE: the combined BLS spectrum produced by this function is not identical\n    to that produced by running BLS in one shot for the entire frequency\n    space. There are differences on the order of 1.0e-3 or so in the respective\n    peak values, but peaks appear at the same frequencies for both methods. This\n    is likely due to different aliasing caused by smaller chunks of the\n    frequency space used by the parallel workers in this function. 
When in\n    doubt, confirm results for this parallel implementation by comparing to\n    those from the serial implementation above.\n\n    Parameters\n    ----------\n\n    times,mags,errs : np.array\n        The magnitude/flux time-series to search for transits.\n\n    magsarefluxes : bool\n        If the input measurement values in `mags` and `errs` are in fluxes, set\n        this to True.\n\n    startp,endp : float\n        The minimum and maximum periods to consider for the transit search.\n\n    stepsize : float\n        The step-size in frequency to use when constructing a frequency grid for\n        the period search.\n\n    mintransitduration,maxtransitduration : float\n        The minimum and maximum transit durations (in units of phase) to\n        consider for the transit search.\n\n    nphasebins : int\n        The number of phase bins to use in the period search.\n\n    autofreq : bool\n        If this is True, the values of `stepsize` and `nphasebins` will be\n        ignored, and these, along with a frequency-grid, will be determined\n        based on the following relations::\n\n            nphasebins = int(ceil(2.0/mintransitduration))\n            if nphasebins > 3000:\n                nphasebins = 3000\n\n            stepsize = 0.25*mintransitduration/(times.max()-times.min())\n\n            minfreq = 1.0/endp\n            maxfreq = 1.0/startp\n            nfreq = int(ceil((maxfreq - minfreq)/stepsize))\n\n    periodepsilon : float\n        The fractional difference between successive values of 'best' periods\n        when sorting by periodogram power to consider them as separate periods\n        (as opposed to part of the same periodogram peak). 
This is used to avoid\n        broad peaks in the periodogram and make sure the 'best' periods returned\n        are all actually independent.\n\n    nbestpeaks : int\n        The number of 'best' peaks to return from the periodogram results,\n        starting from the global maximum of the periodogram peak values.\n\n    sigclip : float or int or sequence of two floats/ints or None\n        If a single float or int, a symmetric sigma-clip will be performed using\n        the number provided as the sigma-multiplier to cut out from the input\n        time-series.\n\n        If a list of two ints/floats is provided, the function will perform an\n        'asymmetric' sigma-clip. The first element in this list is the sigma\n        value to use for fainter flux/mag values; the second element in this\n        list is the sigma value to use for brighter flux/mag values. For\n        example, `sigclip=[10., 3.]`, will sigclip out greater than 10-sigma\n        dimmings and greater than 3-sigma brightenings. Here the meaning of\n        \"dimming\" and \"brightening\" is set by *physics* (not the magnitude\n        system), which is why the `magsarefluxes` kwarg must be correctly set.\n\n        If `sigclip` is None, no sigma-clipping will be performed, and the\n        time-series (with non-finite elems removed) will be passed through to\n        the output.\n\n    verbose : bool\n        If this is True, will indicate progress and details about the frequency\n        grid used for the period search.\n\n    nworkers : int or None\n        The number of parallel workers to launch for period-search. 
If None,\n        nworkers = NCPUS.\n\n    get_stats : bool\n        If True, runs :py:func:`.bls_stats_singleperiod` for each of the best\n        periods in the output and injects the output into the output dict so you\n        only have to run this function to get the periods and their stats.\n\n    Returns\n    -------\n\n    dict\n        This function returns a dict, referred to as an `lspinfo` dict in other\n        astrobase functions that operate on periodogram results. This is a\n        standardized format across all astrobase period-finders, and is of the\n        form below::\n\n            {'bestperiod': the best period value in the periodogram,\n             'bestlspval': the periodogram peak associated with the best period,\n             'nbestpeaks': the input value of nbestpeaks,\n             'nbestlspvals': nbestpeaks-size list of best period peak values,\n             'nbestperiods': nbestpeaks-size list of best periods,\n             'stats': list of stats dicts returned for each best period,\n             'lspvals': the full array of periodogram powers,\n             'frequencies': the full array of frequencies considered,\n             'periods': the full array of periods considered,\n             'blsresult': list of result dicts from eebls.f wrapper functions,\n             'stepsize': the actual stepsize used,\n             'nfreq': the actual nfreq used,\n             'nphasebins': the actual nphasebins used,\n             'mintransitduration': the input mintransitduration,\n             'maxtransitduration': the input maxtransitdurations,\n             'method':'bls' -> the name of the period-finder method,\n             'kwargs':{ dict of all of the input kwargs for record-keeping}}\n\n    '''\n\n    # get rid of nans first and sigclip\n    stimes, smags, serrs = sigclip_magseries(times,\n                                             mags,\n                                             errs,\n                                             
magsarefluxes=magsarefluxes,\n                                             sigclip=sigclip)\n\n    # make sure there are enough points to calculate a spectrum\n    if len(stimes) > 9 and len(smags) > 9 and len(serrs) > 9:\n\n        # if we're setting up everything automatically\n        if autofreq:\n\n            # figure out the best number of phasebins to use\n            nphasebins = int(npceil(2.0/mintransitduration))\n            if nphasebins > 3000:\n                nphasebins = 3000\n\n            # use heuristic to figure out best timestep\n            stepsize = 0.25*mintransitduration/(stimes.max()-stimes.min())\n\n            # now figure out the frequencies to use\n            minfreq = 1.0/endp\n            maxfreq = 1.0/startp\n            nfreq = int(npceil((maxfreq - minfreq)/stepsize))\n\n            # say what we're using\n            if verbose:\n                LOGINFO('min P: %s, max P: %s, nfreq: %s, '\n                        'minfreq: %s, maxfreq: %s' % (startp, endp, nfreq,\n                                                      minfreq, maxfreq))\n                LOGINFO('autofreq = True: using AUTOMATIC values for '\n                        'freq stepsize: %s, nphasebins: %s, '\n                        'min transit duration: %s, max transit duration: %s' %\n                        (stepsize, nphasebins,\n                         mintransitduration, maxtransitduration))\n\n        else:\n\n            minfreq = 1.0/endp\n            maxfreq = 1.0/startp\n            nfreq = int(npceil((maxfreq - minfreq)/stepsize))\n\n            # say what we're using\n            if verbose:\n                LOGINFO('min P: %s, max P: %s, nfreq: %s, '\n                        'minfreq: %s, maxfreq: %s' % (startp, endp, nfreq,\n                                                      minfreq, maxfreq))\n                LOGINFO('autofreq = False: using PROVIDED values for '\n                        'freq stepsize: %s, nphasebins: %s, '\n                     
   'min transit duration: %s, max transit duration: %s' %\n                        (stepsize, nphasebins,\n                         mintransitduration, maxtransitduration))\n\n        # check the minimum frequency\n        if minfreq < (1.0/(stimes.max() - stimes.min())):\n\n            minfreq = 2.0/(stimes.max() - stimes.min())\n            if verbose:\n                LOGWARNING('the requested max P = %.3f is larger than '\n                           'the time base of the observations = %.3f, '\n                           ' will make minfreq = 2 x 1/timebase'\n                           % (endp, stimes.max() - stimes.min()))\n                LOGINFO('new minfreq: %s, maxfreq: %s' %\n                        (minfreq, maxfreq))\n\n\n        #############################\n        ## NOW RUN BLS IN PARALLEL ##\n        #############################\n\n        # fix number of CPUs if needed\n        if not nworkers or nworkers > NCPUS:\n            nworkers = NCPUS\n            if verbose:\n                LOGINFO('using %s workers...' 
% nworkers)\n\n        # the frequencies array to be searched\n        frequencies = minfreq + nparange(nfreq)*stepsize\n\n        # break up the tasks into chunks\n        csrem = int(fmod(nfreq, nworkers))\n        csint = int(float(nfreq/nworkers))\n        chunk_minfreqs, chunk_nfreqs = [], []\n\n        for x in range(nworkers):\n\n            this_minfreqs = frequencies[x*csint]\n\n            # handle usual nfreqs\n            if x < (nworkers - 1):\n                this_nfreqs = frequencies[x*csint:x*csint+csint].size\n            else:\n                this_nfreqs = frequencies[x*csint:x*csint+csint+csrem].size\n\n            chunk_minfreqs.append(this_minfreqs)\n            chunk_nfreqs.append(this_nfreqs)\n\n\n        # populate the tasks list\n        tasks = [(stimes, smags,\n                  chunk_minf, chunk_nf,\n                  stepsize, nphasebins,\n                  mintransitduration, maxtransitduration)\n                 for (chunk_nf, chunk_minf)\n                 in zip(chunk_minfreqs, chunk_nfreqs)]\n\n        if verbose:\n            for ind, task in enumerate(tasks):\n                LOGINFO('worker %s: minfreq = %.6f, nfreqs = %s' %\n                        (ind+1, task[3], task[2]))\n            LOGINFO('running...')\n\n        # return tasks\n\n        # start the pool\n        pool = Pool(nworkers)\n        results = pool.map(_parallel_bls_worker, tasks)\n\n        pool.close()\n        pool.join()\n        del pool\n\n        # now concatenate the output lsp arrays\n        lsp = npconcatenate([x['power'] for x in results])\n        periods = 1.0/frequencies\n\n        # find the nbestpeaks for the periodogram: 1. sort the lsp array\n        # by highest value first 2. 
go down the values until we find\n        # five values that are separated by at least periodepsilon in\n        # period\n\n        # make sure to get only the finite peaks in the periodogram\n        # this is needed because BLS may produce infs for some peaks\n        finitepeakind = npisfinite(lsp)\n        finlsp = lsp[finitepeakind]\n        finperiods = periods[finitepeakind]\n\n        # make sure that finlsp has finite values before we work on it\n        try:\n\n            bestperiodind = npargmax(finlsp)\n\n        except ValueError:\n\n            LOGERROR('no finite periodogram values '\n                     'for this mag series, skipping...')\n            return {'bestperiod':npnan,\n                    'bestlspval':npnan,\n                    'nbestpeaks':nbestpeaks,\n                    'nbestlspvals':None,\n                    'nbestperiods':None,\n                    'lspvals':None,\n                    'periods':None,\n                    'blsresult':None,\n                    'method':'bls',\n                    'kwargs':{'startp':startp,\n                              'endp':endp,\n                              'stepsize':stepsize,\n                              'mintransitduration':mintransitduration,\n                              'maxtransitduration':maxtransitduration,\n                              'nphasebins':nphasebins,\n                              'autofreq':autofreq,\n                              'periodepsilon':periodepsilon,\n                              'nbestpeaks':nbestpeaks,\n                              'sigclip':sigclip,\n                              'magsarefluxes':magsarefluxes}}\n\n        sortedlspind = npargsort(finlsp)[::-1]\n        sortedlspperiods = finperiods[sortedlspind]\n        sortedlspvals = finlsp[sortedlspind]\n\n        # now get the nbestpeaks\n        nbestperiods, nbestlspvals, peakcount = (\n            [finperiods[bestperiodind]],\n            [finlsp[bestperiodind]],\n            1\n        )\n  
      prevperiod = sortedlspperiods[0]\n\n        # find the best nbestpeaks in the lsp and their periods\n        for period, lspval in zip(sortedlspperiods, sortedlspvals):\n\n            if peakcount == nbestpeaks:\n                break\n            perioddiff = abs(period - prevperiod)\n            bestperiodsdiff = [abs(period - x) for x in nbestperiods]\n\n            # this ensures that this period is different from the last\n            # period and from all the other existing best periods by\n            # periodepsilon to make sure we jump to an entire different\n            # peak in the periodogram\n            if (perioddiff > (periodepsilon*prevperiod) and\n                all(x > (periodepsilon*period) for x in bestperiodsdiff)):\n                nbestperiods.append(period)\n                nbestlspvals.append(lspval)\n                peakcount = peakcount + 1\n\n            prevperiod = period\n\n\n        # generate the return dict\n        resultdict = {\n            'bestperiod':finperiods[bestperiodind],\n            'bestlspval':finlsp[bestperiodind],\n            'nbestpeaks':nbestpeaks,\n            'nbestlspvals':nbestlspvals,\n            'nbestperiods':nbestperiods,\n            'lspvals':lsp,\n            'frequencies':frequencies,\n            'periods':periods,\n            'blsresult':results,\n            'stepsize':stepsize,\n            'nfreq':nfreq,\n            'nphasebins':nphasebins,\n            'mintransitduration':mintransitduration,\n            'maxtransitduration':maxtransitduration,\n            'method':'bls',\n            'kwargs':{'startp':startp,\n                      'endp':endp,\n                      'stepsize':stepsize,\n                      'mintransitduration':mintransitduration,\n                      'maxtransitduration':maxtransitduration,\n                      'nphasebins':nphasebins,\n                      'autofreq':autofreq,\n                      'periodepsilon':periodepsilon,\n                      
'nbestpeaks':nbestpeaks,\n                      'sigclip':sigclip,\n                      'magsarefluxes':magsarefluxes}\n        }\n\n        # get stats if requested\n        if get_stats:\n\n            resultdict['stats'] = []\n\n            for bp in nbestperiods.copy():\n\n                if verbose:\n                    LOGINFO(\"Getting stats for best period: %.6f\" % bp)\n\n                this_pstats = bls_stats_singleperiod(\n                    times, mags, errs, bp,\n                    magsarefluxes=resultdict['kwargs']['magsarefluxes'],\n                    sigclip=resultdict['kwargs']['sigclip'],\n                    nphasebins=resultdict['nphasebins'],\n                    mintransitduration=resultdict['mintransitduration'],\n                    maxtransitduration=resultdict['maxtransitduration'],\n                    verbose=verbose,\n                )\n                resultdict['stats'].append(this_pstats)\n\n        return resultdict\n\n    else:\n\n        LOGERROR('no good detections for these times and mags, skipping...')\n        return {'bestperiod':npnan,\n                'bestlspval':npnan,\n                'nbestpeaks':nbestpeaks,\n                'nbestlspvals':None,\n                'nbestperiods':None,\n                'lspvals':None,\n                'periods':None,\n                'blsresult':None,\n                'stepsize':stepsize,\n                'nfreq':None,\n                'nphasebins':None,\n                'mintransitduration':mintransitduration,\n                'maxtransitduration':maxtransitduration,\n                'method':'bls',\n                'kwargs':{'startp':startp,\n                          'endp':endp,\n                          'stepsize':stepsize,\n                          'mintransitduration':mintransitduration,\n                          'maxtransitduration':maxtransitduration,\n                          'nphasebins':nphasebins,\n                          'autofreq':autofreq,\n                      
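    'periodepsilon':periodepsilon,\n                          'nbestpeaks':nbestpeaks,\n                          'sigclip':sigclip,\n                          'magsarefluxes':magsarefluxes}}", "response": "Runs the Box Least Squares (BLS) fitting search for transit-shaped signals in parallel, splitting the frequency grid into chunks handled by separate workers, and returns an lspinfo dict with the best periods."}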
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef bls_stats_singleperiod(times, mags, errs, period,\n                           magsarefluxes=False,\n                           sigclip=10.0,\n                           perioddeltapercent=10,\n                           nphasebins=200,\n                           mintransitduration=0.01,\n                           maxtransitduration=0.4,\n                           ingressdurationfraction=0.1,\n                           verbose=True):\n    '''This calculates the SNR, depth, duration, a refit period, and time of\n    center-transit for a single period.\n\n    The equation used for SNR is::\n\n        SNR = (transit model depth / RMS of LC with transit model subtracted)\n              * sqrt(number of points in transit)\n\n    NOTE: you should set the kwargs `sigclip`, `nphasebins`,\n    `mintransitduration`, `maxtransitduration` to what you used for an initial\n    BLS run to detect transits in the input light curve to match those input\n    conditions.\n\n    Parameters\n    ----------\n\n    times,mags,errs : np.array\n        These contain the magnitude/flux time-series and any associated errors.\n\n    period : float\n        The period to search around and refit the transits. This will be used to\n        calculate the start and end periods of a rerun of BLS to calculate the\n        stats.\n\n    magsarefluxes : bool\n        Set to True if the input measurements in `mags` are actually fluxes and\n        not magnitudes.\n\n    sigclip : float or int or sequence of two floats/ints or None\n        If a single float or int, a symmetric sigma-clip will be performed using\n        the number provided as the sigma-multiplier to cut out from the input\n        time-series.\n\n        If a list of two ints/floats is provided, the function will perform an\n        'asymmetric' sigma-clip. 
The first element in this list is the sigma\n        value to use for fainter flux/mag values; the second element in this\n        list is the sigma value to use for brighter flux/mag values. For\n        example, `sigclip=[10., 3.]`, will sigclip out greater than 10-sigma\n        dimmings and greater than 3-sigma brightenings. Here the meaning of\n        \"dimming\" and \"brightening\" is set by *physics* (not the magnitude\n        system), which is why the `magsarefluxes` kwarg must be correctly set.\n\n        If `sigclip` is None, no sigma-clipping will be performed, and the\n        time-series (with non-finite elems removed) will be passed through to\n        the output.\n\n    perioddeltapercent : float\n        The fraction of the period provided to use to search around this\n        value. This is a percentage. The period range searched will then be::\n\n            [period - (perioddeltapercent/100.0)*period,\n             period + (perioddeltapercent/100.0)*period]\n\n    nphasebins : int\n        The number of phase bins to use in the BLS run.\n\n    mintransitduration : float\n        The minimum transit duration in phase to consider.\n\n    maxtransitduration : float\n        The maximum transit duration to consider.\n\n    ingressdurationfraction : float\n        The fraction of the transit duration to use to generate an initial value\n        of the transit ingress duration for the BLS model refit. This will be\n        fit by this function.\n\n    verbose : bool\n        If True, will indicate progress and any problems encountered.\n\n    Returns\n    -------\n\n    dict\n        A dict of the following form is returned::\n\n            {'period': the refit best period,\n             'epoch': the refit epoch (i.e. 
mid-transit time),\n             'snr':the SNR of the transit,\n             'transitdepth':the depth of the transit,\n             'transitduration':the duration of the transit,\n             'nphasebins':the input value of nphasebins,\n             'transingressbin':the phase bin containing transit ingress,\n             'transegressbin':the phase bin containing transit egress,\n             'blsmodel':the full BLS model used along with its parameters,\n             'subtractedmags':BLS model - phased light curve,\n             'phasedmags':the phase light curve,\n             'phases': the phase values}\n\n    '''\n\n    # get rid of nans first and sigclip\n    stimes, smags, serrs = sigclip_magseries(times,\n                                             mags,\n                                             errs,\n                                             magsarefluxes=magsarefluxes,\n                                             sigclip=sigclip)\n\n\n    # make sure there are enough points to calculate a spectrum\n    if len(stimes) > 9 and len(smags) > 9 and len(serrs) > 9:\n\n        # get the period interval\n        startp = period - perioddeltapercent*period/100.0\n\n        if startp < 0:\n            startp = period\n\n        endp = period + perioddeltapercent*period/100.0\n\n        # rerun BLS in serial mode around the specified period to get the\n        # transit depth, duration, ingress and egress bins\n        blsres = bls_serial_pfind(stimes, smags, serrs,\n                                  verbose=verbose,\n                                  startp=startp,\n                                  endp=endp,\n                                  nphasebins=nphasebins,\n                                  mintransitduration=mintransitduration,\n                                  maxtransitduration=maxtransitduration,\n                                  magsarefluxes=magsarefluxes,\n                                  get_stats=False,\n                              
    sigclip=None)\n\n        if (not blsres or\n            'blsresult' not in blsres or\n            blsres['blsresult'] is None):\n            LOGERROR(\"BLS failed during a period-search \"\n                     \"performed around the input best period: %.6f. \"\n                     \"Can't continue. \" % period)\n            return None\n\n        thistransdepth = blsres['blsresult']['transdepth']\n        thistransduration = blsres['blsresult']['transduration']\n        thisbestperiod = blsres['bestperiod']\n        thistransingressbin = blsres['blsresult']['transingressbin']\n        thistransegressbin = blsres['blsresult']['transegressbin']\n        thisnphasebins = nphasebins\n\n        stats = _get_bls_stats(stimes,\n                               smags,\n                               serrs,\n                               thistransdepth,\n                               thistransduration,\n                               ingressdurationfraction,\n                               nphasebins,\n                               thistransingressbin,\n                               thistransegressbin,\n                               thisbestperiod,\n                               thisnphasebins,\n                               magsarefluxes=magsarefluxes,\n                               verbose=verbose)\n\n        return stats\n\n\n    # if there aren't enough points in the mag series, bail out\n    else:\n\n        LOGERROR('no good detections for these times and mags, skipping...')\n        return None", "response": "Calculates the SNR, transit depth, transit duration, refit period, and epoch (time of center-transit) for a single period by rerunning BLS around that period."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef bls_snr(blsdict,\n            times,\n            mags,\n            errs,\n            assumeserialbls=False,\n            magsarefluxes=False,\n            sigclip=10.0,\n            npeaks=None,\n            perioddeltapercent=10,\n            ingressdurationfraction=0.1,\n            verbose=True):\n    '''Calculates the signal to noise ratio for each best peak in the BLS\n    periodogram, along with transit depth, duration, and refit period and epoch.\n\n    The following equation is used for SNR::\n\n        SNR = (transit model depth / RMS of LC with transit model subtracted)\n              * sqrt(number of points in transit)\n\n    Parameters\n    ----------\n\n    blsdict : dict\n        This is an lspinfo dict produced by either `bls_parallel_pfind` or\n        `bls_serial_pfind` in this module, or by your own BLS function. If you\n        provide results in a dict from an external BLS function, make sure this\n        matches the form below::\n\n            {'bestperiod': the best period value in the periodogram,\n             'bestlspval': the periodogram peak associated with the best period,\n             'nbestpeaks': the input value of nbestpeaks,\n             'nbestlspvals': nbestpeaks-size list of best period peak values,\n             'nbestperiods': nbestpeaks-size list of best periods,\n             'lspvals': the full array of periodogram powers,\n             'frequencies': the full array of frequencies considered,\n             'periods': the full array of periods considered,\n             'blsresult': list of result dicts from eebls.f wrapper functions,\n             'stepsize': the actual stepsize used,\n             'nfreq': the actual nfreq used,\n             'nphasebins': the actual nphasebins used,\n             'mintransitduration': the input mintransitduration,\n             'maxtransitduration': the input 
maxtransitdurations,\n             'method':'bls' -> the name of the period-finder method,\n             'kwargs':{ dict of all of the input kwargs for record-keeping}}\n\n    times,mags,errs : np.array\n        These contain the magnitude/flux time-series and any associated errors.\n\n    assumeserialbls : bool\n        If this is True, this function will not rerun BLS around each best peak\n        in the input lspinfo dict to refit the periods and epochs. A rerun is\n        usually required for `bls_parallel_pfind` results, so set this to False\n        if you use results from that function. The parallel method breaks up the\n        frequency space into chunks for speed, and the results may not exactly\n        match those from a regular BLS run.\n\n    magsarefluxes : bool\n        Set to True if the input measurements in `mags` are actually fluxes and\n        not magnitudes.\n\n    npeaks : int or None\n        This controls how many of the periods in `blsdict['nbestperiods']` to\n        find the SNR for. If it's None, then this will calculate the SNR for all\n        of them. If it's an integer between 1 and\n        `len(blsdict['nbestperiods'])`, will calculate for only the specified\n        number of peak periods, starting from the best period.\n\n    perioddeltapercent : float\n        The fraction of the period provided to use to search around this\n        value. This is a percentage. The period range searched will then be::\n\n            [period - (perioddeltapercent/100.0)*period,\n             period + (perioddeltapercent/100.0)*period]\n\n    ingressdurationfraction : float\n        The fraction of the transit duration to use to generate an initial value\n        of the transit ingress duration for the BLS model refit. 
This will be\n        fit by this function.\n\n    verbose : bool\n        If True, will indicate progress and any problems encountered.\n\n    Returns\n    -------\n\n    dict\n        A dict of the following form is returned::\n\n            {'npeaks': the number of periodogram peaks requested to get SNR for,\n             'period': list of refit best periods for each requested peak,\n             'epoch': list of refit epochs (i.e. mid-transit times),\n             'snr':list of SNRs of the transit for each requested peak,\n             'transitdepth':list of depths of the transits,\n             'transitduration':list of durations of the transits,\n             'nphasebins':the input value of nphasebins,\n             'transingressbin':the phase bin containing transit ingress,\n             'transegressbin':the phase bin containing transit egress,\n             'allblsmodels':the full BLS models used along with their parameters,\n             'allsubtractedmags':BLS models - phased light curves,\n             'allphasedmags':the phase light curves,\n             'allphases': the phase values}\n\n    '''\n\n    # figure out how many periods to work on\n    if (npeaks and (0 < npeaks < len(blsdict['nbestperiods']))):\n        nperiods = npeaks\n    else:\n        if verbose:\n            LOGWARNING('npeaks not specified or invalid, '\n                       'getting SNR for all %s BLS peaks' %\n                       len(blsdict['nbestperiods']))\n        nperiods = len(blsdict['nbestperiods'])\n\n    nbestperiods = blsdict['nbestperiods'][:nperiods]\n\n    # get rid of nans first and sigclip\n    stimes, smags, serrs = sigclip_magseries(times,\n                                             mags,\n                                             errs,\n                                             magsarefluxes=magsarefluxes,\n                                             sigclip=sigclip)\n\n\n    # make sure there are enough points to calculate a spectrum\n    if 
len(stimes) > 9 and len(smags) > 9 and len(serrs) > 9:\n\n        nbestsnrs = []\n        transitdepth, transitduration = [], []\n        nphasebins, transingressbin, transegressbin = [], [], []\n\n        # keep these around for diagnostics\n        allsubtractedmags = []\n        allphasedmags = []\n        allphases = []\n        allblsmodels = []\n\n        # these are refit periods and epochs\n        refitperiods = []\n        refitepochs = []\n\n        for period in nbestperiods:\n\n            # get the period interval\n            startp = period - perioddeltapercent*period/100.0\n\n            if startp < 0:\n                startp = period\n\n            endp = period + perioddeltapercent*period/100.0\n\n            # see if we need to rerun bls_serial_pfind\n            if not assumeserialbls:\n\n                # run bls_serial_pfind with the kwargs copied over from the\n                # initial run. replace only the startp, endp, verbose, sigclip\n                # kwarg values\n                prevkwargs = blsdict['kwargs'].copy()\n                prevkwargs['verbose'] = verbose\n                prevkwargs['startp'] = startp\n                prevkwargs['endp'] = endp\n                prevkwargs['sigclip'] = None\n\n                blsres = bls_serial_pfind(stimes,\n                                          smags,\n                                          serrs,\n                                          **prevkwargs)\n\n            else:\n                blsres = blsdict\n\n            thistransdepth = blsres['blsresult']['transdepth']\n            thistransduration = blsres['blsresult']['transduration']\n            thisbestperiod = blsres['bestperiod']\n            thistransingressbin = blsres['blsresult']['transingressbin']\n            thistransegressbin = blsres['blsresult']['transegressbin']\n            thisnphasebins = blsdict['kwargs']['nphasebins']\n\n            stats = _get_bls_stats(stimes,\n                                   smags,\n 
                                  serrs,\n                                   thistransdepth,\n                                   thistransduration,\n                                   ingressdurationfraction,\n                                   nphasebins,\n                                   thistransingressbin,\n                                   thistransegressbin,\n                                   thisbestperiod,\n                                   thisnphasebins,\n                                   magsarefluxes=magsarefluxes,\n                                   verbose=verbose)\n\n\n            # update the lists with results from this peak\n            nbestsnrs.append(stats['snr'])\n            transitdepth.append(stats['transitdepth'])\n            transitduration.append(stats['transitduration'])\n            transingressbin.append(stats['transingressbin'])\n            transegressbin.append(stats['transegressbin'])\n            nphasebins.append(stats['nphasebins'])\n\n            # update the refit periods and epochs\n            refitperiods.append(stats['period'])\n            refitepochs.append(stats['epoch'])\n\n            # update the diagnostics\n            allsubtractedmags.append(stats['subtractedmags'])\n            allphasedmags.append(stats['phasedmags'])\n            allphases.append(stats['phases'])\n            allblsmodels.append(stats['blsmodel'])\n\n        # done with working on each peak\n\n    # if there aren't enough points in the mag series, bail out\n    else:\n\n        LOGERROR('no good detections for these times and mags, skipping...')\n        nbestsnrs = None\n        transitdepth, transitduration = None, None\n        nphasebins, transingressbin, transegressbin = None, None, None\n        allsubtractedmags, allphases, allphasedmags = None, None, None\n        # these must also be defined here, since the return dict uses them\n        allblsmodels = None\n        refitperiods, refitepochs = None, None\n\n    return {'npeaks':npeaks,\n            'period':refitperiods,\n            'epoch':refitepochs,\n            'snr':nbestsnrs,\n            'transitdepth':transitdepth,\n 
           'transitduration':transitduration,\n            'nphasebins':nphasebins,\n            'transingressbin':transingressbin,\n            'transegressbin':transegressbin,\n            'allblsmodels':allblsmodels,\n            'allsubtractedmags':allsubtractedmags,\n            'allphasedmags':allphasedmags,\n            'allphases':allphases}", "response": "Calculates the signal-to-noise ratio, refit period and epoch, and transit depth and duration for each of the best peaks in a BLS periodogram result."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef _collect_tfa_stats(task):\n    '''\n    This is a parallel worker to gather LC stats.\n\n    task[0] = lcfile\n    task[1] = lcformat\n    task[2] = lcformatdir\n    task[3] = timecols\n    task[4] = magcols\n    task[5] = errcols\n    task[6] = custom_bandpasses\n\n    '''\n\n    try:\n\n        (lcfile, lcformat, lcformatdir,\n         timecols, magcols, errcols,\n         custom_bandpasses) = task\n\n        try:\n            formatinfo = get_lcformat(lcformat,\n                                      use_lcformat_dir=lcformatdir)\n            if formatinfo:\n                (dfileglob, readerfunc,\n                 dtimecols, dmagcols, derrcols,\n                 magsarefluxes, normfunc) = formatinfo\n            else:\n                LOGERROR(\"can't figure out the light curve format\")\n                return None\n        except Exception as e:\n            LOGEXCEPTION(\"can't figure out the light curve format\")\n            return None\n\n        # override the default timecols, magcols, and errcols\n        # using the ones provided to the function\n        if timecols is None:\n            timecols = dtimecols\n        if magcols is None:\n            magcols = dmagcols\n        if errcols is None:\n            errcols = derrcols\n\n        # get the LC into a dict\n        lcdict = readerfunc(lcfile)\n\n        # this should handle lists/tuples being returned by readerfunc\n        # we assume that the first element is the actual lcdict\n        # FIXME: figure out how to not need this assumption\n        if ( (isinstance(lcdict, (list, tuple))) and\n             (isinstance(lcdict[0], dict)) ):\n            lcdict = lcdict[0]\n\n        #\n        # collect the necessary stats for this light curve\n        #\n\n        # 1. number of observations\n        # 2. median mag\n        # 3. eta_normal\n        # 4. MAD\n        # 5. 
objectid\n        # 6. get mags and colors from objectinfo if there's one in lcdict\n\n        if 'objectid' in lcdict:\n            objectid = lcdict['objectid']\n        elif 'objectinfo' in lcdict and 'objectid' in lcdict['objectinfo']:\n            objectid = lcdict['objectinfo']['objectid']\n        elif 'objectinfo' in lcdict and 'hatid' in lcdict['objectinfo']:\n            objectid = lcdict['objectinfo']['hatid']\n        else:\n            LOGERROR('no objectid present in lcdict for LC %s, '\n                     'using filename prefix as objectid' % lcfile)\n            objectid = os.path.splitext(os.path.basename(lcfile))[0]\n\n        if 'objectinfo' in lcdict:\n\n            colorfeat = starfeatures.color_features(\n                lcdict['objectinfo'],\n                deredden=False,\n                custom_bandpasses=custom_bandpasses\n            )\n\n        else:\n            LOGERROR('no objectinfo dict in lcdict, '\n                     'could not get magnitudes for LC %s, '\n                     'cannot use for TFA template ensemble' %\n                     lcfile)\n            return None\n\n\n        # this is the initial dict\n        resultdict = {'objectid':objectid,\n                      'ra':lcdict['objectinfo']['ra'],\n                      'decl':lcdict['objectinfo']['decl'],\n                      'colorfeat':colorfeat,\n                      'lcfpath':os.path.abspath(lcfile),\n                      'lcformat':lcformat,\n                      'lcformatdir':lcformatdir,\n                      'timecols':timecols,\n                      'magcols':magcols,\n                      'errcols':errcols}\n\n        for tcol, mcol, ecol in zip(timecols, magcols, errcols):\n\n            try:\n\n                # dereference the columns and get them from the lcdict\n                if '.' 
in tcol:\n                    tcolget = tcol.split('.')\n                else:\n                    tcolget = [tcol]\n                times = _dict_get(lcdict, tcolget)\n\n                if '.' in mcol:\n                    mcolget = mcol.split('.')\n                else:\n                    mcolget = [mcol]\n                mags = _dict_get(lcdict, mcolget)\n\n                if '.' in ecol:\n                    ecolget = ecol.split('.')\n                else:\n                    ecolget = [ecol]\n                errs = _dict_get(lcdict, ecolget)\n\n                # normalize here if not using special normalization\n                if normfunc is None:\n                    ntimes, nmags = normalize_magseries(\n                        times, mags,\n                        magsarefluxes=magsarefluxes\n                    )\n\n                    times, mags, errs = ntimes, nmags, errs\n\n                # get the variability features for this object\n                varfeat = varfeatures.all_nonperiodic_features(\n                    times, mags, errs\n                )\n\n                resultdict[mcol] = varfeat\n\n            except Exception as e:\n\n                LOGEXCEPTION('%s, magcol: %s, probably ran into all-nans' %\n                             (lcfile, mcol))\n                resultdict[mcol] = {'ndet':0,\n                                    'mad':np.nan,\n                                    'eta_normal':np.nan}\n\n\n        return resultdict\n\n    except Exception as e:\n\n        LOGEXCEPTION('could not execute get_tfa_stats for task: %s' %\n                     repr(task))\n        return None", "response": "Parallel worker that collects the per-light-curve stats needed for TFA template selection: object ID, coordinates, color features, and non-periodic variability features for each magnitude column."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef _reform_templatelc_for_tfa(task):\n    '''\n    This is a parallel worker that reforms light curves for TFA.\n\n    task[0] = lcfile\n    task[1] = lcformat\n    task[2] = lcformatdir\n    task[3] = timecol\n    task[4] = magcol\n    task[5] = errcol\n    task[6] = timebase\n    task[7] = interpolate_type\n    task[8] = sigclip\n\n    '''\n\n    try:\n\n        (lcfile, lcformat, lcformatdir,\n         tcol, mcol, ecol,\n         timebase, interpolate_type, sigclip) = task\n\n        try:\n            formatinfo = get_lcformat(lcformat,\n                                      use_lcformat_dir=lcformatdir)\n            if formatinfo:\n                (dfileglob, readerfunc,\n                 dtimecols, dmagcols, derrcols,\n                 magsarefluxes, normfunc) = formatinfo\n            else:\n                LOGERROR(\"can't figure out the light curve format\")\n                return None\n        except Exception as e:\n            LOGEXCEPTION(\"can't figure out the light curve format\")\n            return None\n\n        # get the LC into a dict\n        lcdict = readerfunc(lcfile)\n\n        # this should handle lists/tuples being returned by readerfunc\n        # we assume that the first element is the actual lcdict\n        # FIXME: figure out how to not need this assumption\n        if ( (isinstance(lcdict, (list, tuple))) and\n             (isinstance(lcdict[0], dict)) ):\n            lcdict = lcdict[0]\n\n        outdict = {}\n\n        # dereference the columns and get them from the lcdict\n        if '.' in tcol:\n            tcolget = tcol.split('.')\n        else:\n            tcolget = [tcol]\n        times = _dict_get(lcdict, tcolget)\n\n        if '.' in mcol:\n            mcolget = mcol.split('.')\n        else:\n            mcolget = [mcol]\n        mags = _dict_get(lcdict, mcolget)\n\n        if '.' 
in ecol:\n            ecolget = ecol.split('.')\n        else:\n            ecolget = [ecol]\n        errs = _dict_get(lcdict, ecolget)\n\n        # normalize here if not using special normalization\n        if normfunc is None:\n            ntimes, nmags = normalize_magseries(\n                times, mags,\n                magsarefluxes=magsarefluxes\n            )\n\n            times, mags, errs = ntimes, nmags, errs\n\n        #\n        # now we'll do: 1. sigclip, 2. reform to timebase, 3. renorm to zero\n        #\n\n        # 1. sigclip as requested\n        stimes, smags, serrs = sigclip_magseries(times,\n                                                 mags,\n                                                 errs,\n                                                 sigclip=sigclip)\n\n        # 2. now, we'll reform to the timebase\n        mags_interpolator = spi.interp1d(stimes, smags,\n                                         kind=interpolate_type,\n                                         fill_value='extrapolate')\n        errs_interpolator = spi.interp1d(stimes, serrs,\n                                         kind=interpolate_type,\n                                         fill_value='extrapolate')\n\n        interpolated_mags = mags_interpolator(timebase)\n        interpolated_errs = errs_interpolator(timebase)\n\n        # 3. renorm to zero\n        magmedian = np.median(interpolated_mags)\n\n        renormed_mags = interpolated_mags - magmedian\n\n        # update the dict\n        outdict = {'mags':renormed_mags,\n                   'errs':interpolated_errs,\n                   'origmags':interpolated_mags}\n\n        #\n        # done with this magcol\n        #\n        return outdict\n\n    except Exception as e:\n\n        LOGEXCEPTION('reform LC task failed: %s' % repr(task))\n        return None", "response": "Parallel worker that reforms a template light curve for TFA: it sigma-clips the series, interpolates it onto the common timebase, and renormalizes the magnitudes to zero median."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef tfa_templates_lclist(\n        lclist,\n        lcinfo_pkl=None,\n        outfile=None,\n        target_template_frac=0.1,\n        max_target_frac_obs=0.25,\n        min_template_number=10,\n        max_template_number=1000,\n        max_rms=0.15,\n        max_mult_above_magmad=1.5,\n        max_mult_above_mageta=1.5,\n        xieta_bins=20,\n        mag_bandpass='sdssr',\n        custom_bandpasses=None,\n        mag_bright_limit=10.0,\n        mag_faint_limit=12.0,\n        process_template_lcs=True,\n        template_sigclip=5.0,\n        template_interpolate='nearest',\n        lcformat='hat-sql',\n        lcformatdir=None,\n        timecols=None,\n        magcols=None,\n        errcols=None,\n        nworkers=NCPUS,\n        maxworkertasks=1000,\n):\n    '''This selects template objects for TFA.\n\n    Selection criteria for TFA template ensemble objects:\n\n    - not variable: use a poly fit to the mag-MAD relation and eta-normal\n      variability index to get nonvar objects\n\n    - not more than 10% of the total number of objects in the field or\n      `max_tfa_templates` at most\n\n    - allow shuffling of the templates if the target ends up in them\n\n    - nothing with less than the median number of observations in the field\n\n    - sigma-clip the input time series observations\n\n    - TODO: uniform sampling in tangent plane coordinates (we'll need ra and\n      decl)\n\n    This also determines the effective cadence that all TFA LCs will be binned\n    to as the template LC with the largest number of non-nan observations will\n    be used. 
All template LCs will be renormed to zero.\n\n    Parameters\n    ----------\n\n    lclist : list of str\n        This is a list of light curves to use as input to generate the template\n        set.\n\n    lcinfo_pkl : str or None\n        If provided, is a file path to a pickle file created by this function on\n        a previous run containing the LC information. This will be loaded\n        directly instead of having to re-run LC info collection.\n\n    outfile : str or None\n        This is the pickle filename to which the TFA template list will be\n        written to. If None, a default file name will be used for this.\n\n    target_template_frac : float\n        This is the fraction of total objects in lclist to use for the number of\n        templates.\n\n    max_target_frac_obs : float\n        This sets the number of templates to generate if the number of\n        observations for the light curves is smaller than the number of objects\n        in the collection. The number of templates will be set to this fraction\n        of the number of observations if this is the case.\n\n    min_template_number : int\n        This is the minimum number of templates to generate.\n\n    max_template_number : int\n        This is the maximum number of templates to generate. 
If\n        `target_template_frac` times the number of objects is greater than\n        `max_template_number`, only `max_template_number` templates will be\n        used.\n\n    max_rms : float\n        This is the maximum light curve RMS for an object to consider it as a\n        possible template ensemble member.\n\n    max_mult_above_magmad : float\n        This is the maximum multiplier above the mag-RMS fit to consider an\n        object as variable and thus not part of the template ensemble.\n\n    max_mult_above_mageta : float\n        This is the maximum multiplier above the mag-eta (variable index) fit to\n        consider an object as variable and thus not part of the template\n        ensemble.\n\n    mag_bandpass : str\n        This sets the key in the light curve dict's objectinfo dict to use as\n        the canonical magnitude for the object and apply any magnitude limits\n        to.\n\n    custom_bandpasses : dict or None\n        This can be used to provide any custom band name keys to the star\n        feature collection function.\n\n    mag_bright_limit : float or list of floats\n        This sets the brightest mag (in the `mag_bandpass` filter) for a\n        potential member of the TFA template ensemble. If this is a single\n        float, the value will be used for all magcols. If this is a list of\n        floats with len = len(magcols), the specific bright limits will be used\n        for each magcol individually.\n\n    mag_faint_limit : float or list of floats\n        This sets the faintest mag (in the `mag_bandpass` filter) for a\n        potential member of the TFA template ensemble. If this is a single\n        float, the value will be used for all magcols. If this is a list of\n        floats with len = len(magcols), the specific faint limits will be used\n        for each magcol individually.\n\n    process_template_lcs : bool\n        If True, will reform the template light curves to the chosen\n        time-base. 
If False, will only select light curves for templates but not\n        process them. This is useful for initial exploration of how the template\n        LC are selected.\n\n    template_sigclip : float or sequence of floats or None\n        This sets the sigma-clip to be applied to the template light curves.\n\n    template_interpolate : str\n        This sets the kwarg to pass to `scipy.interpolate.interp1d` to set the\n        kind of interpolation to use when reforming light curves to the TFA\n        template timebase.\n\n    lcformat : str\n        This is the `formatkey` associated with your light curve format, which\n        you previously passed in to the `lcproc.register_lcformat`\n        function. This will be used to look up how to find and read the light\n        curves specified in `basedir` or `use_list_of_filenames`.\n\n    lcformatdir : str or None\n        If this is provided, gives the path to a directory when you've stored\n        your lcformat description JSONs, other than the usual directories lcproc\n        knows to search for them in. Use this along with `lcformat` to specify\n        an LC format JSON file that's not currently registered with lcproc.\n\n    timecols : list of str or None\n        The timecol keys to use from the lcdict in calculating the features.\n\n    magcols : list of str or None\n        The magcol keys to use from the lcdict in calculating the features.\n\n    errcols : list of str or None\n        The errcol keys to use from the lcdict in calculating the features.\n\n    nworkers : int\n        The number of parallel workers to launch.\n\n    maxworkertasks : int\n        The maximum number of tasks to run per worker before it is replaced by a\n        fresh one.\n\n    Returns\n    -------\n\n    dict\n        This function returns a dict that can be passed directly to\n        `apply_tfa_magseries` below. 
It can optionally produce a pickle with the\n        same dict, which can also be passed to that function.\n\n    '''\n\n    try:\n        formatinfo = get_lcformat(lcformat,\n                                  use_lcformat_dir=lcformatdir)\n        if formatinfo:\n            (dfileglob, readerfunc,\n             dtimecols, dmagcols, derrcols,\n             magsarefluxes, normfunc) = formatinfo\n        else:\n            LOGERROR(\"can't figure out the light curve format\")\n            return None\n    except Exception as e:\n        LOGEXCEPTION(\"can't figure out the light curve format\")\n        return None\n\n    # override the default timecols, magcols, and errcols\n    # using the ones provided to the function\n    if timecols is None:\n        timecols = dtimecols\n    if magcols is None:\n        magcols = dmagcols\n    if errcols is None:\n        errcols = derrcols\n\n    LOGINFO('collecting light curve information for %s objects in list...' %\n            len(lclist))\n\n    #\n    # check if we have cached results for this run\n    #\n\n    # case where we provide a cache info pkl directly\n    if lcinfo_pkl and os.path.exists(lcinfo_pkl):\n\n        with open(lcinfo_pkl,'rb') as infd:\n            results = pickle.load(infd)\n\n    # case where we don't have an info pickle or an outfile\n    elif ((not outfile) and\n          os.path.exists('tfa-collected-lcinfo-%s.pkl' % lcformat)):\n\n        with open('tfa-collected-lcinfo-%s.pkl' % lcformat, 'rb') as infd:\n            results = pickle.load(infd)\n\n    # case where we don't have an info pickle but do have an outfile\n    elif (outfile and os.path.exists('tfa-collected-lcinfo-%s-%s' %\n                                     (lcformat, os.path.basename(outfile)))):\n\n        with open(\n                'tfa-collected-lcinfo-%s-%s' %\n                (lcformat, os.path.basename(outfile)),\n                'rb'\n        ) as infd:\n            results = pickle.load(infd)\n\n    # case where we 
have to redo the LC info collection\n    else:\n\n        # first, we'll collect the light curve info\n        tasks = [(x, lcformat, lcformatdir,\n                  timecols, magcols, errcols,\n                  custom_bandpasses) for x in lclist]\n\n        pool = mp.Pool(nworkers, maxtasksperchild=maxworkertasks)\n        results = pool.map(_collect_tfa_stats, tasks)\n        pool.close()\n        pool.join()\n\n        # save these results so we don't have to redo if something breaks here\n        if not outfile:\n            with open('tfa-collected-lcinfo-%s.pkl' % lcformat,'wb') as outfd:\n                pickle.dump(results, outfd, pickle.HIGHEST_PROTOCOL)\n        else:\n            with open(\n                    'tfa-collected-lcinfo-%s-%s' %\n                    (lcformat, os.path.basename(outfile)),\n                    'wb'\n            ) as outfd:\n                pickle.dump(results, outfd, pickle.HIGHEST_PROTOCOL)\n\n    #\n    # now, go through the light curve information\n    #\n\n    # find the center RA and center DEC -> median of all LC RAs and DECs\n    all_ras = np.array([res['ra'] for res in results])\n    all_decls = np.array([res['decl'] for res in results])\n    center_ra = np.nanmedian(all_ras)\n    center_decl = np.nanmedian(all_decls)\n\n    outdict = {\n        'timecols':[],\n        'magcols':[],\n        'errcols':[],\n        'center_ra':center_ra,\n        'center_decl':center_decl,\n    }\n\n    # for each magcol, we'll generate a separate template list\n    for tcol, mcol, ecol in zip(timecols, magcols, errcols):\n\n        if '.' 
in tcol:\n            tcolget = tcol.split('.')\n        else:\n            tcolget = [tcol]\n\n\n        # these are the containers for possible template collection LC info\n        (lcmag, lcmad, lceta,\n         lcndet, lcobj, lcfpaths,\n         lcra, lcdecl) = [], [], [], [], [], [], [], []\n\n        outdict['timecols'].append(tcol)\n        outdict['magcols'].append(mcol)\n        outdict['errcols'].append(ecol)\n\n        # add to the collection of all light curves\n        outdict[mcol] = {'collection':{'mag':[],\n                                       'mad':[],\n                                       'eta':[],\n                                       'ndet':[],\n                                       'obj':[],\n                                       'lcf':[],\n                                       'ra':[],\n                                       'decl':[]}}\n\n        LOGINFO('magcol: %s, collecting prospective template LC info...' %\n                mcol)\n\n\n        # collect the template LCs for this magcol\n        for result in results:\n\n            # we'll only append objects that have all of these elements\n            try:\n\n                thismag = result['colorfeat'][mag_bandpass]\n                thismad = result[mcol]['mad']\n                thiseta = result[mcol]['eta_normal']\n                thisndet = result[mcol]['ndet']\n                thisobj = result['objectid']\n                thislcf = result['lcfpath']\n                thisra = result['ra']\n                thisdecl = result['decl']\n\n                outdict[mcol]['collection']['mag'].append(thismag)\n                outdict[mcol]['collection']['mad'].append(thismad)\n                outdict[mcol]['collection']['eta'].append(thiseta)\n                outdict[mcol]['collection']['ndet'].append(thisndet)\n                outdict[mcol]['collection']['obj'].append(thisobj)\n                outdict[mcol]['collection']['lcf'].append(thislcf)\n                
outdict[mcol]['collection']['ra'].append(thisra)\n                outdict[mcol]['collection']['decl'].append(thisdecl)\n\n\n                # check if we have more than one bright or faint limit elem\n                if isinstance(mag_bright_limit, (list, tuple)):\n                    use_bright_maglim = mag_bright_limit[\n                        magcols.index(mcol)\n                    ]\n                else:\n                    use_bright_maglim = mag_bright_limit\n                if isinstance(mag_faint_limit, (list, tuple)):\n                    use_faint_maglim = mag_faint_limit[\n                        magcols.index(mcol)\n                    ]\n                else:\n                    use_faint_maglim = mag_faint_limit\n\n                # make sure the object lies in the mag limits and RMS limits we\n                # set before to try to accept it into the TFA ensemble\n                if ((use_bright_maglim < thismag < use_faint_maglim) and\n                    (1.4826*thismad < max_rms)):\n\n                    lcmag.append(thismag)\n                    lcmad.append(thismad)\n                    lceta.append(thiseta)\n                    lcndet.append(thisndet)\n                    lcobj.append(thisobj)\n                    lcfpaths.append(thislcf)\n                    lcra.append(thisra)\n                    lcdecl.append(thisdecl)\n\n            except Exception as e:\n                pass\n\n        # make sure we have enough LCs to work on\n        if len(lcobj) >= min_template_number:\n\n            LOGINFO('magcol: %s, %s objects eligible for '\n                    'template selection after filtering on mag '\n                    'limits (%s, %s) and max RMS (%s)' %\n                    (mcol, len(lcobj),\n                     mag_bright_limit, mag_faint_limit, max_rms))\n\n            lcmag = np.array(lcmag)\n            lcmad = np.array(lcmad)\n            lceta = np.array(lceta)\n            lcndet = np.array(lcndet)\n            lcobj = 
np.array(lcobj)\n            lcfpaths = np.array(lcfpaths)\n            lcra = np.array(lcra)\n            lcdecl = np.array(lcdecl)\n\n            sortind = np.argsort(lcmag)\n            lcmag = lcmag[sortind]\n            lcmad = lcmad[sortind]\n            lceta = lceta[sortind]\n            lcndet = lcndet[sortind]\n            lcobj = lcobj[sortind]\n            lcfpaths = lcfpaths[sortind]\n            lcra = lcra[sortind]\n            lcdecl = lcdecl[sortind]\n\n            # 1. get the mag-MAD relation\n\n            # this is needed for spline fitting\n            # should take care of the pesky 'x must be strictly increasing' bit\n            splfit_ind = np.diff(lcmag) > 0.0\n            splfit_ind = np.concatenate((np.array([True]), splfit_ind))\n\n            fit_lcmag = lcmag[splfit_ind]\n            fit_lcmad = lcmad[splfit_ind]\n            fit_lceta = lceta[splfit_ind]\n\n            magmadfit = np.poly1d(np.polyfit(\n                fit_lcmag,\n                fit_lcmad,\n                2\n            ))\n            magmadind = lcmad/magmadfit(lcmag) < max_mult_above_magmad\n\n            # 2. get the mag-eta relation\n            magetafit = np.poly1d(np.polyfit(\n                fit_lcmag,\n                fit_lceta,\n                2\n            ))\n            magetaind = magetafit(lcmag)/lceta < max_mult_above_mageta\n\n            # 3. 
get the median ndet\n            median_ndet = np.median(lcndet)\n            ndetind = lcndet >= median_ndet\n\n            # form the final template ensemble\n            templateind = magmadind & magetaind & ndetind\n\n            # check again if we have enough LCs in the template\n            if templateind.sum() >= min_template_number:\n\n                LOGINFO('magcol: %s, %s objects selectable for TFA templates' %\n                        (mcol, templateind.sum()))\n\n                templatemag = lcmag[templateind]\n                templatemad = lcmad[templateind]\n                templateeta = lceta[templateind]\n                templatendet = lcndet[templateind]\n                templateobj = lcobj[templateind]\n                templatelcf = lcfpaths[templateind]\n                templatera = lcra[templateind]\n                templatedecl = lcdecl[templateind]\n\n                # now, check if we have no more than the required fraction of\n                # TFA templates\n                target_number_templates = int(target_template_frac*len(results))\n\n                if target_number_templates > max_template_number:\n                    target_number_templates = max_template_number\n\n                LOGINFO('magcol: %s, selecting %s TFA templates randomly' %\n                        (mcol, target_number_templates))\n\n                # FIXME: how do we select uniformly in xi-eta?\n                # 1. 2D histogram the data into binsize (nx, ny)\n                # 2. random uniform select from 0 to nx-1, 0 to ny-1\n                # 3. pick object from selected bin\n                # 4. continue until we have target_number_templates\n                # 5. 
make sure the same object isn't picked twice\n\n                # get the xi-eta\n                template_cxi, template_ceta = coordutils.xieta_from_radecl(\n                    templatera,\n                    templatedecl,\n                    center_ra,\n                    center_decl\n                )\n\n                cxi_bins = np.linspace(template_cxi.min(),\n                                       template_cxi.max(),\n                                       num=xieta_bins)\n                ceta_bins = np.linspace(template_ceta.min(),\n                                        template_ceta.max(),\n                                        num=xieta_bins)\n\n                digitized_cxi_inds = np.digitize(template_cxi, cxi_bins)\n                digitized_ceta_inds = np.digitize(template_ceta, ceta_bins)\n\n                # pick target_number_templates indexes out of the bins\n                targetind = npr.choice(xieta_bins,\n                                       target_number_templates,\n                                       replace=True)\n\n                # put together the template lists\n                selected_template_obj = []\n                selected_template_lcf = []\n                selected_template_ndet = []\n                selected_template_ra = []\n                selected_template_decl = []\n                selected_template_mag = []\n                selected_template_mad = []\n                selected_template_eta = []\n\n                for ind in targetind:\n\n                    pass\n\n\n                # select random uniform objects from the template candidates\n                targetind = npr.choice(templateobj.size,\n                                       target_number_templates,\n                                       replace=False)\n\n                templatemag = templatemag[targetind]\n                templatemad = templatemad[targetind]\n                templateeta = templateeta[targetind]\n                templatendet = 
templatendet[targetind]\n                templateobj = templateobj[targetind]\n                templatelcf = templatelcf[targetind]\n                templatera = templatera[targetind]\n                templatedecl = templatedecl[targetind]\n\n                # get the max ndet so far to use that LC as the timebase\n                maxndetind = templatendet == templatendet.max()\n                timebaselcf = templatelcf[maxndetind][0]\n                timebasendet = templatendet[maxndetind][0]\n\n                LOGINFO('magcol: %s, selected %s as template time '\n                        'base LC with %s observations' %\n                        (mcol, timebaselcf, timebasendet))\n\n                if process_template_lcs:\n\n                    timebaselcdict = readerfunc(timebaselcf)\n\n                    if ( (isinstance(timebaselcdict, (list, tuple))) and\n                         (isinstance(timebaselcdict[0], dict)) ):\n                        timebaselcdict = timebaselcdict[0]\n\n                    # this is the timebase to use for all of the templates\n                    timebase = _dict_get(timebaselcdict, tcolget)\n\n                else:\n                    timebase = None\n\n                # also check if the number of templates is longer than the\n                # actual timebase of the observations. this will cause issues\n                # with overcorrections and will probably break TFA\n                if target_number_templates > timebasendet:\n\n                    LOGWARNING('The number of TFA templates (%s) is '\n                               'larger than the number of observations '\n                               'of the time base (%s). This will likely '\n                               'overcorrect all light curves to a '\n                               'constant level. 
'\n                               'Will use up to %s x timebase ndet '\n                               'templates instead' %\n                               (target_number_templates,\n                                timebasendet,\n                                max_target_frac_obs))\n\n                    # regen the templates based on the new number\n                    newmaxtemplates = int(max_target_frac_obs*timebasendet)\n\n                    # choose this number out of the already chosen templates\n                    # randomly\n\n                    LOGWARNING('magcol: %s, re-selecting %s TFA '\n                               'templates randomly' %\n                               (mcol, newmaxtemplates))\n\n                    # FIXME: how do we select uniformly in ra-decl?\n                    # 1. 2D histogram the data into binsize (nx, ny)\n                    # 2. random uniform select from 0 to nx-1, 0 to ny-1\n                    # 3. pick object from selected bin\n                    # 4. continue until we have target_number_templates\n                    # 5. 
make sure the same object isn't picked twice\n\n                    # select random uniform objects from the template candidates\n                    targetind = npr.choice(templateobj.size,\n                                           newmaxtemplates,\n                                           replace=False)\n\n                    templatemag = templatemag[targetind]\n                    templatemad = templatemad[targetind]\n                    templateeta = templateeta[targetind]\n                    templatendet = templatendet[targetind]\n                    templateobj = templateobj[targetind]\n                    templatelcf = templatelcf[targetind]\n                    templatera = templatera[targetind]\n                    templatedecl = templatedecl[targetind]\n\n                    # get the max ndet so far to use that LC as the timebase\n                    maxndetind = templatendet == templatendet.max()\n                    timebaselcf = templatelcf[maxndetind][0]\n                    timebasendet = templatendet[maxndetind][0]\n                    LOGWARNING('magcol: %s, re-selected %s as template time '\n                               'base LC with %s observations' %\n                               (mcol, timebaselcf, timebasendet))\n\n                    if process_template_lcs:\n\n                        timebaselcdict = readerfunc(timebaselcf)\n\n                        if ( (isinstance(timebaselcdict, (list, tuple))) and\n                             (isinstance(timebaselcdict[0], dict)) ):\n                            timebaselcdict = timebaselcdict[0]\n\n                        # this is the timebase to use for all of the templates\n                        timebase = _dict_get(timebaselcdict, tcolget)\n\n                    else:\n\n                        timebase = None\n\n                #\n                # end of check for ntemplates > timebase ndet\n                #\n\n                if process_template_lcs:\n\n                    
LOGINFO('magcol: %s, reforming TFA template LCs to '\n                            ' chosen timebase...' % mcol)\n\n                    # reform all template LCs to this time base, normalize to\n                    # zero, and sigclip as requested. this is a parallel op\n                    # first, we'll collect the light curve info\n                    tasks = [(x, lcformat, lcformatdir,\n                              tcol, mcol, ecol,\n                              timebase, template_interpolate,\n                              template_sigclip) for x\n                             in templatelcf]\n\n                    pool = mp.Pool(nworkers, maxtasksperchild=maxworkertasks)\n                    reform_results = pool.map(_reform_templatelc_for_tfa, tasks)\n                    pool.close()\n                    pool.join()\n\n                    # generate a 2D array for the template magseries with\n                    # dimensions = (n_objects, n_lcpoints)\n                    template_magseries = np.array([x['mags']\n                                                   for x in reform_results])\n                    template_errseries = np.array([x['errs']\n                                                   for x in reform_results])\n\n                else:\n                    template_magseries = None\n                    template_errseries = None\n\n                # put everything into a templateinfo dict for this magcol\n                outdict[mcol].update({\n                    'timebaselcf':timebaselcf,\n                    'timebase':timebase,\n                    'trendfits':{'mag-mad':magmadfit,\n                                 'mag-eta':magetafit},\n                    'template_objects':templateobj,\n                    'template_ra':templatera,\n                    'template_decl':templatedecl,\n                    'template_mag':templatemag,\n                    'template_mad':templatemad,\n                    'template_eta':templateeta,\n            
        'template_ndet':templatendet,\n                    'template_magseries':template_magseries,\n                    'template_errseries':template_errseries\n                })\n\n                # make a KDTree on the template coordinates\n                outdict[mcol]['template_radecl_kdtree'] = (\n                    coordutils.make_kdtree(\n                        templatera, templatedecl\n                    )\n                )\n\n            # if we don't have enough, return nothing for this magcol\n            else:\n                LOGERROR('not enough objects meeting requested '\n                         'MAD, eta, ndet conditions to '\n                         'select templates for magcol: %s' % mcol)\n                continue\n\n        else:\n\n            LOGERROR('nobjects: %s, not enough in requested mag range to '\n                     'select templates for magcol: %s' % (len(lcobj),mcol))\n\n            continue\n\n        # make the plots for mag-MAD/mag-eta relation and fits used\n        plt.plot(lcmag, lcmad, marker='o', linestyle='none', ms=1.0)\n        modelmags = np.linspace(lcmag.min(), lcmag.max(), num=1000)\n        plt.plot(modelmags, outdict[mcol]['trendfits']['mag-mad'](modelmags))\n        plt.yscale('log')\n        plt.xlabel('catalog magnitude')\n        plt.ylabel('light curve MAD')\n        plt.title('catalog mag vs. light curve MAD and fit')\n        plt.savefig('catmag-%s-lcmad-fit.png' % mcol,\n                    bbox_inches='tight')\n        plt.close('all')\n\n        plt.plot(lcmag, lceta, marker='o', linestyle='none', ms=1.0)\n        modelmags = np.linspace(lcmag.min(), lcmag.max(), num=1000)\n        plt.plot(modelmags, outdict[mcol]['trendfits']['mag-eta'](modelmags))\n        plt.yscale('log')\n        plt.xlabel('catalog magnitude')\n        plt.ylabel('light curve eta variable index')\n        plt.title('catalog mag vs. 
light curve eta and fit')\n        plt.savefig('catmag-%s-lceta-fit.png' % mcol,\n                    bbox_inches='tight')\n        plt.close('all')\n\n\n    #\n    # end of operating on each magcol\n    #\n\n    # save the templateinfo dict to a pickle if requested\n    if outfile:\n\n        if outfile.endswith('.gz'):\n            outfd = gzip.open(outfile,'wb')\n        else:\n            outfd = open(outfile,'wb')\n\n        with outfd:\n            pickle.dump(outdict, outfd, protocol=pickle.HIGHEST_PROTOCOL)\n\n    # return the templateinfo dict\n    return outdict", "response": "This function selects TFA template objects for each magcol, returns the resulting templateinfo dict, and optionally writes it to a pickle."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef apply_tfa_magseries(lcfile,\n                        timecol,\n                        magcol,\n                        errcol,\n                        templateinfo,\n                        mintemplatedist_arcmin=10.0,\n                        lcformat='hat-sql',\n                        lcformatdir=None,\n                        interp='nearest',\n                        sigclip=5.0):\n    '''This applies the TFA correction to an LC given TFA template information.\n\n    Parameters\n    ----------\n\n    lcfile : str\n        This is the light curve file to apply the TFA correction to.\n\n    timecol,magcol,errcol : str\n        These are the column keys in the lcdict for the LC file to apply the TFA\n        correction to.\n\n    templateinfo : dict or str\n        This is either the dict produced by `tfa_templates_lclist` or the pickle\n        produced by the same function.\n\n    mintemplatedist_arcmin : float\n        This sets the minimum distance required from the target object for\n        objects in the TFA template ensemble. Objects closer than this distance\n        will be removed from the ensemble.\n\n    lcformat : str\n        This is the `formatkey` associated with your light curve format, which\n        you previously passed in to the `lcproc.register_lcformat`\n        function. This will be used to look up how to find and read the light\n        curves specified in `basedir` or `use_list_of_filenames`.\n\n    lcformatdir : str or None\n        If this is provided, gives the path to a directory when you've stored\n        your lcformat description JSONs, other than the usual directories lcproc\n        knows to search for them in. 
Use this along with `lcformat` to specify\n        an LC format JSON file that's not currently registered with lcproc.\n\n    interp : str\n        This is passed to scipy.interpolate.interp1d as the kind of\n        interpolation to use when reforming this light curve to the timebase of\n        the TFA templates.\n\n    sigclip : float or sequence of two floats or None\n        This is the sigma clip to apply to this light curve before running TFA\n        on it.\n\n    Returns\n    -------\n\n    str\n        This returns the filename of the light curve file generated after TFA\n        applications. This is a pickle (that can be read by `lcproc.read_pklc`)\n        in the same directory as `lcfile`. The `magcol` will be encoded in the\n        filename, so each `magcol` in `lcfile` gets its own output file.\n\n    '''\n\n    try:\n        formatinfo = get_lcformat(lcformat,\n                                  use_lcformat_dir=lcformatdir)\n        if formatinfo:\n            (dfileglob, readerfunc,\n             dtimecols, dmagcols, derrcols,\n             magsarefluxes, normfunc) = formatinfo\n        else:\n            LOGERROR(\"can't figure out the light curve format\")\n            return None\n    except Exception as e:\n        LOGEXCEPTION(\"can't figure out the light curve format\")\n        return None\n\n    # get the templateinfo from a pickle if necessary\n    if isinstance(templateinfo,str) and os.path.exists(templateinfo):\n        with open(templateinfo,'rb') as infd:\n            templateinfo = pickle.load(infd)\n\n    lcdict = readerfunc(lcfile)\n    if ((isinstance(lcdict, (tuple, list))) and\n        isinstance(lcdict[0], dict)):\n        lcdict = lcdict[0]\n\n    objectid = lcdict['objectid']\n\n    # this is the initial template array\n    tmagseries = templateinfo[magcol][\n        'template_magseries'\n    ][::]\n\n    # if the object itself is in the template ensemble, remove it\n    if objectid in 
templateinfo[magcol]['template_objects']:\n\n        LOGWARNING('object %s found in the TFA template ensemble, removing...' %\n                   objectid)\n\n        templateind = templateinfo[magcol]['template_objects'] == objectid\n\n        # get the objects in the tmagseries not corresponding to the current\n        # object's index\n        tmagseries = tmagseries[~templateind,:]\n\n    # check if there are close matches to the current object in the templates\n    object_matches = coordutils.conesearch_kdtree(\n        templateinfo[magcol]['template_radecl_kdtree'],\n        lcdict['objectinfo']['ra'], lcdict['objectinfo']['decl'],\n        mintemplatedist_arcmin/60.0\n    )\n\n    if len(object_matches) > 0:\n\n        LOGWARNING(\n            \"object %s is within %.1f arcminutes of %s \"\n            \"template objects. Will remove these objects \"\n            \"from the template applied to this object.\" %\n            (objectid, mintemplatedist_arcmin, len(object_matches))\n        )\n        removalind = np.full(\n            templateinfo[magcol]['template_objects'].size,\n            False, dtype=bool\n        )\n        removalind[np.array(object_matches)] = True\n        tmagseries = tmagseries[~removalind,:]\n\n    #\n    # finally, proceed to TFA\n    #\n\n    # this is the normal matrix\n    normal_matrix = np.dot(tmagseries, tmagseries.T)\n\n    # get the pseudo-inverse of the matrix\n    normal_matrix_inverse = spla.pinv(normal_matrix)\n\n    # get the timebase from the template\n    timebase = templateinfo[magcol]['timebase']\n\n    # use this to reform the target lc in the same manner as that for a TFA\n    # template LC\n    reformed_targetlc = _reform_templatelc_for_tfa((\n        lcfile,\n        lcformat,\n        lcformatdir,\n        timecol,\n        magcol,\n        errcol,\n        timebase,\n        interp,\n        sigclip\n    ))\n\n    # calculate the scalar products of the target and template magseries\n    scalar_products = 
np.dot(tmagseries, reformed_targetlc['mags'])\n\n    # calculate the corrections\n    corrections = np.dot(normal_matrix_inverse, scalar_products)\n\n    # finally, get the corrected time series for the target object\n    corrected_magseries = (\n        reformed_targetlc['origmags'] -\n        np.dot(tmagseries.T, corrections)\n    )\n\n    outdict = {\n        'times':timebase,\n        'mags':corrected_magseries,\n        'errs':reformed_targetlc['errs'],\n        'mags_median':np.median(corrected_magseries),\n        'mags_mad': np.median(np.abs(corrected_magseries -\n                                     np.median(corrected_magseries))),\n        'work':{'tmagseries':tmagseries,\n                'normal_matrix':normal_matrix,\n                'normal_matrix_inverse':normal_matrix_inverse,\n                'scalar_products':scalar_products,\n                'corrections':corrections,\n                'reformed_targetlc':reformed_targetlc},\n    }\n\n\n    # we'll write back the tfa times and mags to the lcdict\n    lcdict['tfa'] = outdict\n    outfile = os.path.join(\n        os.path.dirname(lcfile),\n        '%s-tfa-%s-pklc.pkl' % (\n            squeeze(objectid).replace(' ','-'),\n            magcol\n        )\n    )\n    with open(outfile,'wb') as outfd:\n        pickle.dump(lcdict, outfd, pickle.HIGHEST_PROTOCOL)\n\n    return outfile", "response": "This function applies the TFA correction to an LC given TFA template information."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef parallel_tfa_lclist(lclist,\n                        templateinfo,\n                        timecols=None,\n                        magcols=None,\n                        errcols=None,\n                        lcformat='hat-sql',\n                        lcformatdir=None,\n                        interp='nearest',\n                        sigclip=5.0,\n                        mintemplatedist_arcmin=10.0,\n                        nworkers=NCPUS,\n                        maxworkertasks=1000):\n    '''This applies TFA in parallel to all LCs in the given list of file names.\n\n    Parameters\n    ----------\n\n    lclist : list of str\n        This is a list of light curve files to apply TFA correction to.\n\n    templateinfo : dict or str\n        This is either the dict produced by `tfa_templates_lclist` or the pickle\n        produced by the same function.\n\n    timecols : list of str or None\n        The timecol keys to use from the lcdict in applying TFA corrections.\n\n    magcols : list of str or None\n        The magcol keys to use from the lcdict in applying TFA corrections.\n\n    errcols : list of str or None\n        The errcol keys to use from the lcdict in applying TFA corrections.\n\n    lcformat : str\n        This is the `formatkey` associated with your light curve format, which\n        you previously passed in to the `lcproc.register_lcformat`\n        function. This will be used to look up how to find and read the light\n        curves specified in `basedir` or `use_list_of_filenames`.\n\n    lcformatdir : str or None\n        If this is provided, gives the path to a directory where you've stored\n        your lcformat description JSONs, other than the usual directories lcproc\n        knows to search for them in. 
Use this along with `lcformat` to specify\n        an LC format JSON file that's not currently registered with lcproc.\n\n    interp : str\n        This is passed to scipy.interpolate.interp1d as the kind of\n        interpolation to use when reforming the light curves to the timebase of\n        the TFA templates.\n\n    sigclip : float or sequence of two floats or None\n        This is the sigma clip to apply to the light curves before running TFA\n        on it.\n\n    mintemplatedist_arcmin : float\n        This sets the minimum distance required from the target object for\n        objects in the TFA template ensemble. Objects closer than this distance\n        will be removed from the ensemble.\n\n    nworkers : int\n        The number of parallel workers to launch\n\n    maxworkertasks : int\n        The maximum number of tasks per worker allowed before it's replaced by a\n        fresh one.\n\n    Returns\n    -------\n\n    dict\n        Contains the input file names and output TFA light curve filenames per\n        input file organized by each `magcol` in `magcols`.\n\n    '''\n\n    # open the templateinfo first\n    if isinstance(templateinfo,str) and os.path.exists(templateinfo):\n        with open(templateinfo,'rb') as infd:\n            templateinfo = pickle.load(infd)\n\n    try:\n        formatinfo = get_lcformat(lcformat,\n                                  use_lcformat_dir=lcformatdir)\n        if formatinfo:\n            (dfileglob, readerfunc,\n             dtimecols, dmagcols, derrcols,\n             magsarefluxes, normfunc) = formatinfo\n        else:\n            LOGERROR(\"can't figure out the light curve format\")\n            return None\n    except Exception as e:\n        LOGEXCEPTION(\"can't figure out the light curve format\")\n        return None\n\n    # override the default timecols, magcols, and errcols\n    # using the ones provided to the function\n    # we'll get the defaults from the templateinfo object\n    if timecols is 
None:\n        timecols = templateinfo['timecols']\n    if magcols is None:\n        magcols = templateinfo['magcols']\n    if errcols is None:\n        errcols = templateinfo['errcols']\n\n    outdict = {}\n\n    # run by magcol\n    for t, m, e in zip(timecols, magcols, errcols):\n\n        tasks = [(x, t, m, e, templateinfo,\n                  lcformat, lcformatdir,\n                  interp, sigclip) for\n                 x in lclist]\n\n        pool = mp.Pool(nworkers, maxtasksperchild=maxworkertasks)\n        results = pool.map(_parallel_tfa_worker, tasks)\n        pool.close()\n        pool.join()\n\n        outdict[m] = results\n\n    return outdict", "response": "This function applies TFA in parallel to all light curve files in a given list of file names."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef parallel_tfa_lcdir(lcdir,\n                       templateinfo,\n                       lcfileglob=None,\n                       timecols=None,\n                       magcols=None,\n                       errcols=None,\n                       lcformat='hat-sql',\n                       lcformatdir=None,\n                       interp='nearest',\n                       sigclip=5.0,\n                       mintemplatedist_arcmin=10.0,\n                       nworkers=NCPUS,\n                       maxworkertasks=1000):\n    '''This applies TFA in parallel to all LCs in a directory.\n\n    Parameters\n    ----------\n\n    lcdir : str\n        This is the directory containing the light curve files to process.\n\n    templateinfo : dict or str\n        This is either the dict produced by `tfa_templates_lclist` or the pickle\n        produced by the same function.\n\n    lcfileglob : str or None\n        The UNIX file glob to use when searching for light curve files in\n        `lcdir`. If None, the default file glob associated with registered LC\n        format provided is used.\n\n    timecols : list of str or None\n        The timecol keys to use from the lcdict in applying TFA corrections.\n\n    magcols : list of str or None\n        The magcol keys to use from the lcdict in applying TFA corrections.\n\n    errcols : list of str or None\n        The errcol keys to use from the lcdict in applying TFA corrections.\n\n    lcformat : str\n        This is the `formatkey` associated with your light curve format, which\n        you previously passed in to the `lcproc.register_lcformat`\n        function. 
This will be used to look up how to find and read the light\n        curves specified in `basedir` or `use_list_of_filenames`.\n\n    lcformatdir : str or None\n        If this is provided, gives the path to a directory when you've stored\n        your lcformat description JSONs, other than the usual directories lcproc\n        knows to search for them in. Use this along with `lcformat` to specify\n        an LC format JSON file that's not currently registered with lcproc.\n\n    interp : str\n        This is passed to scipy.interpolate.interp1d as the kind of\n        interpolation to use when reforming the light curves to the timebase of\n        the TFA templates.\n\n    sigclip : float or sequence of two floats or None\n        This is the sigma clip to apply to the light curves before running TFA\n        on it.\n\n    mintemplatedist_arcmin : float\n        This sets the minimum distance required from the target object for\n        objects in the TFA template ensemble. Objects closer than this distance\n        will be removed from the ensemble.\n\n    nworkers : int\n        The number of parallel workers to launch\n\n    maxworkertasks : int\n        The maximum number of tasks per worker allowed before it's replaced by a\n        fresh one.\n\n    Returns\n    -------\n\n    dict\n        Contains the input file names and output TFA light curve filenames per\n        input file organized by each `magcol` in `magcols`.\n\n    '''\n\n    # open the templateinfo first\n    if isinstance(templateinfo,str) and os.path.exists(templateinfo):\n        with open(templateinfo,'rb') as infd:\n            templateinfo = pickle.load(infd)\n\n    try:\n        formatinfo = get_lcformat(lcformat,\n                                  use_lcformat_dir=lcformatdir)\n        if formatinfo:\n            (dfileglob, readerfunc,\n             dtimecols, dmagcols, derrcols,\n             magsarefluxes, normfunc) = formatinfo\n        else:\n            LOGERROR(\"can't figure out 
the light curve format\")\n            return None\n    except Exception as e:\n        LOGEXCEPTION(\"can't figure out the light curve format\")\n        return None\n\n    # find all the files matching the lcglob in lcdir\n    if lcfileglob is None:\n        lcfileglob = dfileglob\n\n    lclist = sorted(glob.glob(os.path.join(lcdir, lcfileglob)))\n\n    return parallel_tfa_lclist(\n        lclist,\n        templateinfo,\n        timecols=timecols,\n        magcols=magcols,\n        errcols=errcols,\n        lcformat=lcformat,\n        lcformatdir=lcformatdir,\n        interp=interp,\n        sigclip=sigclip,\n        mintemplatedist_arcmin=mintemplatedist_arcmin,\n        nworkers=nworkers,\n        maxworkertasks=maxworkertasks\n    )", "response": "This function applies TFA in parallel to all light curve files in a directory."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef update_checkplot_objectinfo(cpf,\n                                fast_mode=False,\n                                findercmap='gray_r',\n                                finderconvolve=None,\n                                deredden_object=True,\n                                custom_bandpasses=None,\n                                gaia_submit_timeout=10.0,\n                                gaia_submit_tries=3,\n                                gaia_max_timeout=180.0,\n                                gaia_mirror=None,\n                                complete_query_later=True,\n                                lclistpkl=None,\n                                nbrradiusarcsec=60.0,\n                                maxnumneighbors=5,\n                                plotdpi=100,\n                                findercachedir='~/.astrobase/stamp-cache',\n                                verbose=True):\n    '''This updates a checkplot objectinfo dict.\n\n    Useful in cases where a previous round of GAIA/finderchart/external catalog\n    acquisition failed. This will preserve the following keys in the checkplot\n    if they exist::\n\n        comments\n        varinfo\n        objectinfo.objecttags\n\n    Parameters\n    ----------\n\n    cpf : str\n        The path to the checkplot pickle to update.\n\n    fast_mode : bool or float\n        This runs the external catalog operations in a \"fast\" mode, with short\n        timeouts and not trying to hit external catalogs that take a long time\n        to respond. See the docstring for\n        :py:func:`astrobase.checkplot.pkl_utils._pkl_finder_objectinfo` for\n        details on how this works. If this is True, will run in \"fast\" mode with\n        default timeouts (5 seconds in most cases). 
If this is a float, will run\n        in \"fast\" mode with the provided timeout value in seconds.\n\n    findercmap : str or matplotlib.colors.Colormap object\n        The Colormap object to use for the finder chart image.\n\n    finderconvolve : astropy.convolution.Kernel object or None\n        If not None, the Kernel object to use for convolving the finder image.\n\n    deredden_object : bool\n        If this is True, will use the 2MASS DUST service to get extinction\n        coefficients in various bands, and then try to deredden the magnitudes\n        and colors of the object already present in the checkplot's objectinfo\n        dict.\n\n    custom_bandpasses : dict\n        This is a dict used to provide custom bandpass definitions for any\n        magnitude measurements in the objectinfo dict that are not automatically\n        recognized by the `varclass.starfeatures.color_features` function. See\n        its docstring for details on the required format.\n\n    gaia_submit_timeout : float\n        Sets the timeout in seconds to use when submitting a request to look up\n        the object's information to the GAIA service. Note that if `fast_mode`\n        is set, this is ignored.\n\n    gaia_submit_tries : int\n        Sets the maximum number of times the GAIA services will be contacted to\n        obtain this object's information. If `fast_mode` is set, this is\n        ignored, and the services will be contacted only once (meaning that a\n        failure to respond will be silently ignored and no GAIA data will be\n        added to the checkplot's objectinfo dict).\n\n    gaia_max_timeout : float\n        Sets the timeout in seconds to use when waiting for the GAIA service to\n        respond to our request for the object's information. Note that if\n        `fast_mode` is set, this is ignored.\n\n    gaia_mirror : str\n        This sets the GAIA mirror to use. 
This is a key in the\n        :py:data:`astrobase.services.gaia.GAIA_URLS` dict which defines the URLs\n        to hit for each mirror.\n\n    complete_query_later : bool\n        If this is True, saves the state of GAIA queries that are not yet\n        complete when `gaia_max_timeout` is reached while waiting for the GAIA\n        service to respond to our request. A later call for GAIA info on the\n        same object will attempt to pick up the results from the existing query\n        if it's completed. If `fast_mode` is True, this is ignored.\n\n    lclistpkl : dict or str\n        If this is provided, must be a dict resulting from reading a catalog\n        produced by the `lcproc.catalogs.make_lclist` function or a str path\n        pointing to the pickle file produced by that function. This catalog is\n        used to find neighbors of the current object in the current light curve\n        collection. Looking at neighbors of the object within the radius\n        specified by `nbrradiusarcsec` is useful for light curves produced by\n        instruments that have a large pixel scale, so are susceptible to\n        blending of variability and potential confusion of neighbor variability\n        with that of the actual object being looked at. If this is None, no\n        neighbor lookups will be performed.\n\n    nbrradiusarcsec : float\n        The radius in arcseconds to use for a search conducted around the\n        coordinates of this object to look for any potential confusion and\n        blending of variability amplitude caused by their proximity.\n\n    maxnumneighbors : int\n        The maximum number of neighbors that will have their light curves and\n        magnitudes noted in this checkplot as potential blends with the target\n        object.\n\n    plotdpi : int\n        The resolution in DPI of the plots to generate in this function\n        (e.g. 
the finder chart, etc.)\n\n    findercachedir : str\n        The path to the astrobase cache directory for finder chart downloads\n        from the NASA SkyView service.\n\n    verbose : bool\n        If True, will indicate progress and warn about potential problems.\n\n    Returns\n    -------\n\n    str\n        Path to the updated checkplot pickle file.\n\n    '''\n\n    cpd = _read_checkplot_picklefile(cpf)\n\n    if cpd['objectinfo']['objecttags'] is not None:\n        objecttags = cpd['objectinfo']['objecttags'][::]\n    else:\n        objecttags = None\n\n    varinfo = deepcopy(cpd['varinfo'])\n\n    if 'comments' in cpd and cpd['comments'] is not None:\n        comments = cpd['comments'][::]\n    else:\n        comments = None\n\n    newcpd = _pkl_finder_objectinfo(cpd['objectinfo'],\n                                    varinfo,\n                                    findercmap,\n                                    finderconvolve,\n                                    cpd['sigclip'],\n                                    cpd['normto'],\n                                    cpd['normmingap'],\n                                    fast_mode=fast_mode,\n                                    deredden_object=deredden_object,\n                                    custom_bandpasses=custom_bandpasses,\n                                    gaia_submit_timeout=gaia_submit_timeout,\n                                    gaia_submit_tries=gaia_submit_tries,\n                                    gaia_max_timeout=gaia_max_timeout,\n                                    gaia_mirror=gaia_mirror,\n                                    complete_query_later=complete_query_later,\n                                    lclistpkl=lclistpkl,\n                                    nbrradiusarcsec=nbrradiusarcsec,\n                                    maxnumneighbors=maxnumneighbors,\n                                    plotdpi=plotdpi,\n                                    
findercachedir=findercachedir,\n                                    verbose=verbose)\n\n    #\n    # don't update neighbors or finder chart if the new one is bad\n    #\n    if (newcpd['finderchart'] is None and\n        cpd['finderchart'] is not None):\n        newcpd['finderchart'] = deepcopy(\n            cpd['finderchart']\n        )\n\n    if (newcpd['neighbors'] is None and\n        cpd['neighbors'] is not None):\n        newcpd['neighbors'] = deepcopy(\n            cpd['neighbors']\n        )\n\n    #\n    # if there's existing GAIA info, don't overwrite if the new objectinfo dict\n    # doesn't have any\n    #\n    if (('failed' in newcpd['objectinfo']['gaia_status'] or\n         ('gaiaid' in newcpd['objectinfo'] and\n          newcpd['objectinfo']['gaiaid'] is None)) and\n        'ok' in cpd['objectinfo']['gaia_status']):\n\n        newcpd['objectinfo']['gaia_status'] = deepcopy(\n            cpd['objectinfo']['gaia_status']\n        )\n        if 'gaiaid' in cpd['objectinfo']:\n            newcpd['objectinfo']['gaiaid'] = deepcopy(\n                cpd['objectinfo']['gaiaid']\n            )\n        newcpd['objectinfo']['gaiamag'] = deepcopy(\n            cpd['objectinfo']['gaiamag']\n        )\n        newcpd['objectinfo']['gaia_absmag'] = deepcopy(\n            cpd['objectinfo']['gaia_absmag']\n        )\n        newcpd['objectinfo']['gaia_parallax'] = deepcopy(\n            cpd['objectinfo']['gaia_parallax']\n        )\n        newcpd['objectinfo']['gaia_parallax_err'] = deepcopy(\n            cpd['objectinfo']['gaia_parallax_err']\n        )\n        newcpd['objectinfo']['gaia_pmra'] = deepcopy(\n            cpd['objectinfo']['gaia_pmra']\n        )\n        newcpd['objectinfo']['gaia_pmra_err'] = deepcopy(\n            cpd['objectinfo']['gaia_pmra_err']\n        )\n        newcpd['objectinfo']['gaia_pmdecl'] = deepcopy(\n            cpd['objectinfo']['gaia_pmdecl']\n        )\n        newcpd['objectinfo']['gaia_pmdecl_err'] = deepcopy(\n            
cpd['objectinfo']['gaia_pmdecl_err']\n        )\n\n    if (not np.isfinite(newcpd['objectinfo']['gaia_neighbors']) and\n        np.isfinite(cpd['objectinfo']['gaia_neighbors'])):\n        newcpd['objectinfo']['gaia_neighbors'] = deepcopy(\n            cpd['objectinfo']['gaia_neighbors']\n        )\n    if (not np.isfinite(newcpd['objectinfo']['gaia_closest_distarcsec']) and\n        np.isfinite(cpd['objectinfo']['gaia_closest_distarcsec'])):\n        newcpd['objectinfo']['gaia_closest_distarcsec'] = deepcopy(\n            cpd['objectinfo']['gaia_closest_distarcsec']\n        )\n    if (not np.isfinite(newcpd['objectinfo']['gaia_closest_gmagdiff']) and\n        np.isfinite(cpd['objectinfo']['gaia_closest_gmagdiff'])):\n        newcpd['objectinfo']['gaia_closest_gmagdiff'] = deepcopy(\n            cpd['objectinfo']['gaia_closest_gmagdiff']\n        )\n\n    if (newcpd['objectinfo']['gaia_ids'] is None and\n        cpd['objectinfo']['gaia_ids'] is not None):\n        newcpd['objectinfo']['gaia_ids'] = deepcopy(\n            cpd['objectinfo']['gaia_ids']\n        )\n    if (newcpd['objectinfo']['gaia_xypos'] is None and\n        cpd['objectinfo']['gaia_xypos'] is not None):\n        newcpd['objectinfo']['gaia_xypos'] = deepcopy(\n            cpd['objectinfo']['gaia_xypos']\n        )\n    if (newcpd['objectinfo']['gaia_mags'] is None and\n        cpd['objectinfo']['gaia_mags'] is not None):\n        newcpd['objectinfo']['gaia_mags'] = deepcopy(\n            cpd['objectinfo']['gaia_mags']\n        )\n    if (newcpd['objectinfo']['gaia_parallaxes'] is None and\n        cpd['objectinfo']['gaia_parallaxes'] is not None):\n        newcpd['objectinfo']['gaia_parallaxes'] = deepcopy(\n            cpd['objectinfo']['gaia_parallaxes']\n        )\n    if (newcpd['objectinfo']['gaia_parallax_errs'] is None and\n        cpd['objectinfo']['gaia_parallax_errs'] is not None):\n        newcpd['objectinfo']['gaia_parallax_errs'] = deepcopy(\n            
cpd['objectinfo']['gaia_parallax_errs']\n        )\n    if (newcpd['objectinfo']['gaia_pmras'] is None and\n        cpd['objectinfo']['gaia_pmras'] is not None):\n        newcpd['objectinfo']['gaia_pmras'] = deepcopy(\n            cpd['objectinfo']['gaia_pmras']\n        )\n    if (newcpd['objectinfo']['gaia_pmra_errs'] is None and\n        cpd['objectinfo']['gaia_pmra_errs'] is not None):\n        newcpd['objectinfo']['gaia_pmra_errs'] = deepcopy(\n            cpd['objectinfo']['gaia_pmra_errs']\n        )\n    if (newcpd['objectinfo']['gaia_pmdecls'] is None and\n        cpd['objectinfo']['gaia_pmdecls'] is not None):\n        newcpd['objectinfo']['gaia_pmdecls'] = deepcopy(\n            cpd['objectinfo']['gaia_pmdecls']\n        )\n    if (newcpd['objectinfo']['gaia_pmdecl_errs'] is None and\n        cpd['objectinfo']['gaia_pmdecl_errs'] is not None):\n        newcpd['objectinfo']['gaia_pmdecl_errs'] = deepcopy(\n            cpd['objectinfo']['gaia_pmdecl_errs']\n        )\n    if (newcpd['objectinfo']['gaia_absolute_mags'] is None and\n        cpd['objectinfo']['gaia_absolute_mags'] is not None):\n        newcpd['objectinfo']['gaia_absolute_mags'] = deepcopy(\n            cpd['objectinfo']['gaia_absolute_mags']\n        )\n    if (newcpd['objectinfo']['gaiak_colors'] is None and\n        cpd['objectinfo']['gaiak_colors'] is not None):\n        newcpd['objectinfo']['gaiak_colors'] = deepcopy(\n            cpd['objectinfo']['gaiak_colors']\n        )\n    if (newcpd['objectinfo']['gaia_dists'] is None and\n        cpd['objectinfo']['gaia_dists'] is not None):\n        newcpd['objectinfo']['gaia_dists'] = deepcopy(\n            cpd['objectinfo']['gaia_dists']\n        )\n\n    #\n    # don't overwrite good SIMBAD info with bad\n    #\n    if ('failed' in newcpd['objectinfo']['simbad_status'] and\n        'ok' in cpd['objectinfo']['simbad_status']):\n        newcpd['objectinfo']['simbad_status'] = deepcopy(\n            cpd['objectinfo']['simbad_status']\n        
)\n\n    if (newcpd['objectinfo']['simbad_nmatches'] is None and\n        cpd['objectinfo']['simbad_nmatches'] is not None):\n        newcpd['objectinfo']['simbad_nmatches'] = deepcopy(\n            cpd['objectinfo']['simbad_nmatches']\n        )\n    if (newcpd['objectinfo']['simbad_mainid'] is None and\n        cpd['objectinfo']['simbad_mainid'] is not None):\n        newcpd['objectinfo']['simbad_mainid'] = deepcopy(\n            cpd['objectinfo']['simbad_mainid']\n        )\n    if (newcpd['objectinfo']['simbad_objtype'] is None and\n        cpd['objectinfo']['simbad_objtype'] is not None):\n        newcpd['objectinfo']['simbad_objtype'] = deepcopy(\n            cpd['objectinfo']['simbad_objtype']\n        )\n    if (newcpd['objectinfo']['simbad_allids'] is None and\n        cpd['objectinfo']['simbad_allids'] is not None):\n        newcpd['objectinfo']['simbad_allids'] = deepcopy(\n            cpd['objectinfo']['simbad_allids']\n        )\n    if (newcpd['objectinfo']['simbad_distarcsec'] is None and\n        cpd['objectinfo']['simbad_distarcsec'] is not None):\n        newcpd['objectinfo']['simbad_distarcsec'] = deepcopy(\n            cpd['objectinfo']['simbad_distarcsec']\n        )\n    if (newcpd['objectinfo']['simbad_best_mainid'] is None and\n        cpd['objectinfo']['simbad_best_mainid'] is not None):\n        newcpd['objectinfo']['simbad_best_mainid'] = deepcopy(\n            cpd['objectinfo']['simbad_best_mainid']\n        )\n    if (newcpd['objectinfo']['simbad_best_objtype'] is None and\n        cpd['objectinfo']['simbad_best_objtype'] is not None):\n        newcpd['objectinfo']['simbad_best_objtype'] = deepcopy(\n            cpd['objectinfo']['simbad_best_objtype']\n        )\n    if (newcpd['objectinfo']['simbad_best_allids'] is None and\n        cpd['objectinfo']['simbad_best_allids'] is not None):\n        newcpd['objectinfo']['simbad_best_allids'] = deepcopy(\n            cpd['objectinfo']['simbad_best_allids']\n        )\n    if 
(newcpd['objectinfo']['simbad_best_distarcsec'] is None and\n        cpd['objectinfo']['simbad_best_distarcsec'] is not None):\n        newcpd['objectinfo']['simbad_best_distarcsec'] = deepcopy(\n            cpd['objectinfo']['simbad_best_distarcsec']\n        )\n\n\n    #\n    # update the objectinfo dict\n    #\n    cpd.update(newcpd)\n    cpd['objectinfo']['objecttags'] = objecttags\n    cpd['comments'] = comments\n\n    newcpf = _write_checkplot_picklefile(cpd, outfile=cpf)\n\n    return newcpf", "response": "This function updates the checkplot objectinfo dict with the information from the external catalog."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef _read_pklc(lcfile):\n    '''\n    This just reads a light curve pickle file.\n\n    Parameters\n    ----------\n\n    lcfile : str\n        The file name of the pickle to open.\n\n    Returns\n    -------\n\n    dict\n        This returns an lcdict.\n\n    '''\n\n    if lcfile.endswith('.gz'):\n\n        try:\n            with gzip.open(lcfile,'rb') as infd:\n                lcdict = pickle.load(infd)\n        except UnicodeDecodeError:\n            with gzip.open(lcfile,'rb') as infd:\n                lcdict = pickle.load(infd, encoding='latin1')\n\n    else:\n\n        try:\n            with open(lcfile,'rb') as infd:\n                lcdict = pickle.load(infd)\n        except UnicodeDecodeError:\n            with open(lcfile,'rb') as infd:\n                lcdict = pickle.load(infd, encoding='latin1')\n\n    return lcdict", "response": "This just reads a light curve pickle file and returns a dict."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef get_lcformat(formatkey, use_lcformat_dir=None):\n    '''This loads an LC format description from a previously-saved JSON file.\n\n    Parameters\n    ----------\n\n    formatkey : str\n        The key used to refer to the LC format. This is part of the JSON file's\n        name, e.g. the format key 'hat-csv' maps to the format JSON file:\n        '<astrobase install path>/data/lcformats/hat-csv.json'.\n\n    use_lcformat_dir : str or None\n        If provided, must be the path to a directory that contains the\n        corresponding lcformat JSON file for `formatkey`. If this is None, this\n        function will look for lcformat JSON files corresponding to the given\n        `formatkey`:\n\n        - first, in the directory specified in this kwarg,\n        - if not found there, in the home directory: ~/.astrobase/lcformat-jsons\n        - if not found there, in: <astrobase install path>/data/lcformats\n\n    Returns\n    -------\n\n    tuple\n        A tuple of the following form is returned::\n\n            (fileglob       : the file glob of the associated LC files,\n             readerfunc_in  : the imported Python function for reading LCs,\n             timecols       : list of time col keys to get from the lcdict,\n             magcols        : list of mag col keys to get from the lcdict ,\n             errcols        : list of err col keys to get from the lcdict,\n             magsarefluxes  : True if the measurements are fluxes not mags,\n             normfunc_in    : the imported Python function for normalizing LCs)\n\n        All `astrobase.lcproc` functions can then use this tuple to dynamically\n        import your LC reader and normalization functions to work with your LC\n        format transparently.\n\n    '''\n\n    if isinstance(use_lcformat_dir, str):\n\n        # look for the lcformat JSON\n        lcformat_jsonpath = os.path.join(\n            
use_lcformat_dir,\n            '%s.json' % formatkey\n        )\n\n        if not os.path.exists(lcformat_jsonpath):\n\n            lcformat_jsonpath = os.path.join(\n                os.path.expanduser('~/.astrobase/lcformat-jsons'),\n                '%s.json' % formatkey\n            )\n\n            if not os.path.exists(lcformat_jsonpath):\n\n                install_path = os.path.dirname(__file__)\n                install_path = os.path.abspath(\n                    os.path.join(install_path, '..', 'data','lcformats')\n                )\n\n                lcformat_jsonpath = os.path.join(\n                    install_path,\n                    '%s.json' % formatkey\n                )\n\n                if not os.path.exists(lcformat_jsonpath):\n                    LOGERROR('could not find an lcformat JSON '\n                             'for formatkey: %s in any of: '\n                             'use_lcformat_dir, home directory, '\n                             'astrobase installed data directory'\n                             % formatkey)\n                    return None\n\n    else:\n\n        lcformat_jsonpath = os.path.join(\n            os.path.expanduser('~/.astrobase/lcformat-jsons'),\n            '%s.json' % formatkey\n        )\n\n        if not os.path.exists(lcformat_jsonpath):\n\n            install_path = os.path.dirname(__file__)\n            install_path = os.path.abspath(\n                os.path.join(install_path, '..', 'data','lcformats')\n            )\n\n            lcformat_jsonpath = os.path.join(\n                install_path,\n                '%s.json' % formatkey\n            )\n\n            if not os.path.exists(lcformat_jsonpath):\n                LOGERROR('could not find an lcformat JSON '\n                         'for formatkey: %s in any of: '\n                         'use_lcformat_dir, home directory, '\n                         'astrobase installed data directory'\n                         % formatkey)\n                
return None\n\n    # load the found lcformat JSON\n    with open(lcformat_jsonpath) as infd:\n        lcformatdict = json.load(infd)\n\n    readerfunc_module = lcformatdict['lcreader_module']\n    readerfunc = lcformatdict['lcreader_func']\n    readerfunc_kwargs = lcformatdict['lcreader_kwargs']\n    normfunc_module = lcformatdict['lcnorm_module']\n    normfunc = lcformatdict['lcnorm_func']\n    normfunc_kwargs = lcformatdict['lcnorm_kwargs']\n\n    fileglob = lcformatdict['fileglob']\n    timecols = lcformatdict['timecols']\n    magcols = lcformatdict['magcols']\n    errcols = lcformatdict['errcols']\n    magsarefluxes = lcformatdict['magsarefluxes']\n\n    # import all the required bits\n    # see if we can import the reader module\n    readermodule = _check_extmodule(readerfunc_module, formatkey)\n\n    if not readermodule:\n        LOGERROR(\"could not import the required \"\n                 \"module: %s to read %s light curves\" %\n                 (readerfunc_module, formatkey))\n        return None\n\n    # then, get the function we need to read the light curve\n    try:\n        readerfunc_in = getattr(readermodule, readerfunc)\n    except AttributeError:\n        LOGEXCEPTION('Could not get the specified reader '\n                     'function: %s for lcformat: %s '\n                     'from module: %s'\n                     % (readerfunc, formatkey, readerfunc_module))\n        raise\n\n    # see if we can import the normalization module\n    if normfunc_module:\n        normmodule = _check_extmodule(normfunc_module, formatkey)\n        if not normmodule:\n            LOGERROR(\"could not import the required \"\n                     \"module: %s to normalize %s light curves\" %\n                     (normfunc_module, formatkey))\n            return None\n\n    else:\n        normmodule = None\n\n    # finally, get the function we need to normalize the light curve\n    if normfunc_module and normfunc:\n        try:\n            normfunc_in = 
getattr(normmodule, normfunc)\n        except AttributeError:\n            LOGEXCEPTION('Could not get the specified norm '\n                         'function: %s for lcformat: %s '\n                         'from module: %s'\n                         % (normfunc, formatkey, normfunc_module))\n            raise\n\n    else:\n        normfunc_in = None\n\n\n    # add in any optional kwargs that need to be there for readerfunc\n    if isinstance(readerfunc_kwargs, dict):\n        readerfunc_in = partial(readerfunc_in, **readerfunc_kwargs)\n\n    # add in any optional kwargs that need to be there for normfunc\n    if normfunc_in is not None:\n        if isinstance(normfunc_kwargs, dict):\n            normfunc_in = partial(normfunc_in, **normfunc_kwargs)\n\n    # assemble the return tuple\n    # this can be used directly by other lcproc functions\n    returntuple = (\n        fileglob,\n        readerfunc_in,\n        timecols,\n        magcols,\n        errcols,\n        magsarefluxes,\n        normfunc_in,\n    )\n\n    return returntuple", "response": "This function loads an LC format description from a previously-saved JSON file."}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\nuploads a file to an Amazon S3 bucket.", "response": "def s3_put_file(local_file, bucket, client=None, raiseonfail=False):\n    \"\"\"This uploads a file to S3.\n\n    Parameters\n    ----------\n\n    local_file : str\n        Path to the file to upload to S3.\n\n    bucket : str\n        The AWS S3 bucket to upload the file to.\n\n    client : boto3.Client or None\n        If None, this function will instantiate a new `boto3.Client` object to\n        use in its operations. Alternatively, pass in an existing `boto3.Client`\n        instance to re-use it here.\n\n    raiseonfail : bool\n        If True, will re-raise whatever Exception caused the operation to fail\n        and break out immediately.\n\n    Returns\n    -------\n\n    str or None\n        If the file upload is successful, returns the s3:// URL of the uploaded\n        file. If it failed, will return None.\n\n    \"\"\"\n\n    if not client:\n        client = boto3.client('s3')\n\n    try:\n        client.upload_file(local_file, bucket, os.path.basename(local_file))\n        return 's3://%s/%s' % (bucket, os.path.basename(local_file))\n    except Exception as e:\n        LOGEXCEPTION('could not upload %s to bucket: %s' % (local_file,\n                                                            bucket))\n\n        if raiseonfail:\n            raise\n\n        return None"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef s3_delete_file(bucket, filename, client=None, raiseonfail=False):\n\n    if not client:\n        client = boto3.client('s3')\n\n    try:\n        resp = client.delete_object(Bucket=bucket, Key=filename)\n        if not resp:\n            LOGERROR('could not delete file %s from bucket %s' % (filename,\n                                                                  bucket))\n        else:\n            return resp['DeleteMarker']\n    except Exception as e:\n        LOGEXCEPTION('could not delete file %s from bucket %s' % (filename,\n                                                                  bucket))\n        if raiseonfail:\n            raise\n\n        return None", "response": "This function deletes a file from an S3 bucket."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef sqs_create_queue(queue_name, options=None, client=None):\n\n    if not client:\n        client = boto3.client('sqs')\n\n    try:\n\n        if isinstance(options, dict):\n            resp = client.create_queue(QueueName=queue_name, Attributes=options)\n        else:\n            resp = client.create_queue(QueueName=queue_name)\n\n        if resp is not None:\n            return {'url':resp['QueueUrl'],\n                    'name':queue_name}\n        else:\n            LOGERROR('could not create the specified queue: %s with options: %s'\n                     % (queue_name, options))\n            return None\n\n    except Exception as e:\n        LOGEXCEPTION('could not create the specified queue: %s with options: %s'\n                     % (queue_name, options))\n        return None", "response": "This function creates an SQS queue."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef sqs_delete_item(queue_url,\n                    receipt_handle,\n                    client=None,\n                    raiseonfail=False):\n    \"\"\"This deletes a message from the queue, effectively acknowledging its\n    receipt.\n\n    Call this only when all messages retrieved from the queue have been\n    processed, since this will prevent redelivery of these messages to other\n    queue workers pulling from the same queue channel.\n\n    Parameters\n    ----------\n\n    queue_url : str\n        The SQS URL of the queue where we got the messages from. This should be\n        the same queue used to retrieve the messages in `sqs_get_item`.\n\n    receipt_handle : str\n        The receipt handle of the queue message that we're responding to, and\n        will acknowledge receipt of. This will be present in each message\n        retrieved using `sqs_get_item`.\n\n    client : boto3.Client or None\n        If None, this function will instantiate a new `boto3.Client` object to\n        use in its operations. Alternatively, pass in an existing `boto3.Client`\n        instance to re-use it here.\n\n    raiseonfail : bool\n        If True, will re-raise whatever Exception caused the operation to fail\n        and break out immediately.\n\n    Returns\n    -------\n\n    Nothing.\n\n    \"\"\"\n\n    if not client:\n        client = boto3.client('sqs')\n\n    try:\n\n        client.delete_message(\n            QueueUrl=queue_url,\n            ReceiptHandle=receipt_handle\n        )\n\n    except Exception as e:\n\n        LOGEXCEPTION(\n            'could not delete message with receipt handle: '\n            '%s from queue: %s' % (receipt_handle, queue_url)\n        )\n\n        if raiseonfail:\n            raise", "response": "This function deletes a message from the SQS queue."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef make_ec2_nodes(\n        security_groupid,\n        subnet_id,\n        keypair_name,\n        iam_instance_profile_arn,\n        launch_instances=1,\n        ami='ami-04681a1dbd79675a5',\n        instance='t3.micro',\n        ebs_optimized=True,\n        user_data=None,\n        wait_until_up=True,\n        client=None,\n        raiseonfail=False,\n):\n    \"\"\"This makes new EC2 worker nodes.\n\n    This requires a security group ID attached to a VPC config and subnet, a\n    keypair generated beforehand, and an IAM role ARN for the instance. See:\n\n    https://docs.aws.amazon.com/cli/latest/userguide/tutorial-ec2-ubuntu.html\n\n    Use `user_data` to launch tasks on instance launch.\n\n    Parameters\n    ----------\n\n    security_groupid : str\n        The security group ID of the AWS VPC where the instances will be\n        launched.\n\n    subnet_id : str\n        The subnet ID of the AWS VPC where the instances will be\n        launched.\n\n    keypair_name : str\n        The name of the keypair to be used to allow SSH access to all instances\n        launched here. This corresponds to an already downloaded AWS keypair PEM\n        file.\n\n    iam_instance_profile_arn : str\n        The ARN string corresponding to the AWS instance profile that describes\n        the permissions the launched instances have to access other AWS\n        resources. Set this up in AWS IAM.\n\n    launch_instances : int\n        The number of instances to launch in this request.\n\n    ami : str\n        The Amazon Machine Image ID that describes the OS the instances will use\n        after launch. The default ID is Amazon Linux 2 in the US East region.\n\n    instance : str\n        The instance type to launch. 
See the following URL for a list of IDs:\n        https://aws.amazon.com/ec2/pricing/on-demand/\n\n    ebs_optimized : bool\n        If True, will enable EBS optimization to speed up IO. This is usually\n        True for all instances made available in the last couple of years.\n\n    user_data : str or None\n        This is either the path to a file on disk that contains a shell-script\n        or a string containing a shell-script that will be executed by root\n        right after the instance is launched. Use to automatically set up\n        workers and queues. If None, will not execute anything at instance\n        start up.\n\n    wait_until_up : bool\n        If True, will not return from this function until all launched instances\n        are verified as running by AWS.\n\n    client : boto3.Client or None\n        If None, this function will instantiate a new `boto3.Client` object to\n        use in its operations. Alternatively, pass in an existing `boto3.Client`\n        instance to re-use it here.\n\n    raiseonfail : bool\n        If True, will re-raise whatever Exception caused the operation to fail\n        and break out immediately.\n\n    Returns\n    -------\n\n    dict\n        Returns launched instance info as a dict, keyed by instance ID.\n\n    \"\"\"\n\n    if not client:\n        client = boto3.client('ec2')\n\n    # get the user data from a string or a file\n    # note: boto3 will base64 encode this itself\n    if isinstance(user_data, str) and os.path.exists(user_data):\n        with open(user_data,'r') as infd:\n            udata = infd.read()\n\n    elif isinstance(user_data, str):\n        udata = user_data\n\n    else:\n        udata = (\n            '#!/bin/bash\\necho \"No user data provided. 
'\n            'Launched instance at: %s UTC\"' % datetime.utcnow().isoformat()\n        )\n\n\n    # fire the request\n    try:\n        resp = client.run_instances(\n            ImageId=ami,\n            InstanceType=instance,\n            SecurityGroupIds=[\n                security_groupid,\n            ],\n            SubnetId=subnet_id,\n            UserData=udata,\n            IamInstanceProfile={'Arn':iam_instance_profile_arn},\n            InstanceInitiatedShutdownBehavior='terminate',\n            KeyName=keypair_name,\n            MaxCount=launch_instances,\n            MinCount=launch_instances,\n            EbsOptimized=ebs_optimized,\n        )\n\n        if not resp:\n            LOGERROR('could not launch requested instance')\n            return None\n\n        else:\n\n            instance_dict = {}\n\n            instance_list = resp.get('Instances',[])\n\n            if len(instance_list) > 0:\n\n                for instance in instance_list:\n\n                    LOGINFO('launched instance ID: %s of type: %s at: %s. 
'\n                            'current state: %s'\n                            % (instance['InstanceId'],\n                               instance['InstanceType'],\n                               instance['LaunchTime'].isoformat(),\n                               instance['State']['Name']))\n\n                    instance_dict[instance['InstanceId']] = {\n                        'type':instance['InstanceType'],\n                        'launched':instance['LaunchTime'],\n                        'state':instance['State']['Name'],\n                        'info':instance\n                    }\n\n            # if we're waiting until we're up, then do so\n            if wait_until_up:\n\n                ready_instances = []\n\n                LOGINFO('waiting until launched instances are up...')\n\n                ntries = 5\n                curr_try = 0\n\n                while ( (curr_try < ntries) and\n                        ( len(ready_instances) <\n                          len(list(instance_dict.keys()))) ):\n\n                    resp = client.describe_instances(\n                        InstanceIds=list(instance_dict.keys()),\n                    )\n\n                    if len(resp['Reservations']) > 0:\n                        for resv in resp['Reservations']:\n                            if len(resv['Instances']) > 0:\n                                for instance in resv['Instances']:\n                                    if instance['State']['Name'] == 'running':\n\n                                        ready_instances.append(\n                                            instance['InstanceId']\n                                        )\n\n                                        instance_dict[\n                                            instance['InstanceId']\n                                        ]['state'] = 'running'\n\n                                        instance_dict[\n                                            instance['InstanceId']\n        
                                ]['ip'] = instance['PublicIpAddress']\n\n                                        instance_dict[\n                                            instance['InstanceId']\n                                        ]['info'] = instance\n\n                    # sleep for a bit so we don't hit the API too often\n                    curr_try = curr_try + 1\n                    time.sleep(5.0)\n\n                if len(ready_instances) == len(list(instance_dict.keys())):\n                    LOGINFO('all instances now up.')\n                else:\n                    LOGWARNING(\n                        'reached maximum number of tries for instance status, '\n                        'not all instances may be up.'\n                    )\n\n\n            return instance_dict\n\n    except ClientError as e:\n\n        LOGEXCEPTION('could not launch requested instance')\n        if raiseonfail:\n            raise\n\n        return None\n\n    except Exception as e:\n\n        LOGEXCEPTION('could not launch requested instance')\n        if raiseonfail:\n            raise\n\n        return None", "response": "This function launches new EC2 worker nodes and returns their info as a dict keyed by instance ID."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef delete_ec2_nodes(\n        instance_id_list,\n        client=None\n):\n    \"\"\"This deletes EC2 nodes and terminates the instances.\n\n    Parameters\n    ----------\n\n    instance_id_list : list of str\n        A list of EC2 instance IDs to terminate.\n\n    client : boto3.Client or None\n        If None, this function will instantiate a new `boto3.Client` object to\n        use in its operations. Alternatively, pass in an existing `boto3.Client`\n        instance to re-use it here.\n\n    Returns\n    -------\n\n    dict\n        The response returned by the `terminate_instances` call.\n\n    \"\"\"\n\n    if not client:\n        client = boto3.client('ec2')\n\n    resp = client.terminate_instances(\n        InstanceIds=instance_id_list\n    )\n\n    return resp", "response": "This function deletes EC2 nodes and terminates the instances."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef make_spot_fleet_cluster(\n        security_groupid,\n        subnet_id,\n        keypair_name,\n        iam_instance_profile_arn,\n        spot_fleet_iam_role,\n        target_capacity=20,\n        spot_price=0.4,\n        expires_days=7,\n        allocation_strategy='lowestPrice',\n        instance_types=SPOT_INSTANCE_TYPES,\n        instance_weights=None,\n        instance_ami='ami-04681a1dbd79675a5',\n        instance_user_data=None,\n        instance_ebs_optimized=True,\n        wait_until_up=True,\n        client=None,\n        raiseonfail=False\n):\n    \"\"\"This makes an EC2 spot-fleet cluster.\n\n    This requires a security group ID attached to a VPC config and subnet, a\n    keypair generated beforehand, and an IAM role ARN for the instance. See:\n\n    https://docs.aws.amazon.com/cli/latest/userguide/tutorial-ec2-ubuntu.html\n\n    Use `user_data` to launch tasks on instance launch.\n\n    Parameters\n    ----------\n\n    security_groupid : str\n        The security group ID of the AWS VPC where the instances will be\n        launched.\n\n    subnet_id : str\n        The subnet ID of the AWS VPC where the instances will be\n        launched.\n\n    keypair_name : str\n        The name of the keypair to be used to allow SSH access to all instances\n        launched here. This corresponds to an already downloaded AWS keypair PEM\n        file.\n\n    iam_instance_profile_arn : str\n        The ARN string corresponding to the AWS instance profile that describes\n        the permissions the launched instances have to access other AWS\n        resources. Set this up in AWS IAM.\n\n    spot_fleet_iam_role : str\n        This is the name of AWS IAM role that allows the Spot Fleet Manager to\n        scale up and down instances based on demand and instances failing,\n        etc. 
Set this up in IAM.\n\n    target_capacity : int\n        The number of instances to target in the fleet request. The fleet\n        manager service will attempt to maintain this number over the lifetime\n        of the Spot Fleet Request.\n\n    spot_price : float\n        The bid price in USD for the instances. This is per hour. Keep this at\n        about half the hourly on-demand price of the desired instances to make\n        sure your instances aren't taken away by AWS when it needs capacity.\n\n    expires_days : int\n        The number of days this request is active for. All instances launched by\n        this request will live at least this long and will be terminated\n        automatically after.\n\n    allocation_strategy : {'lowestPrice', 'diversified'}\n        The allocation strategy used by the fleet manager.\n\n    instance_types : list of str\n        List of the instance type to launch. See the following URL for a list of\n        IDs: https://aws.amazon.com/ec2/pricing/on-demand/\n\n    instance_weights : list of float or None\n        If `instance_types` is a list of different instance types, this is the\n        relative weight applied towards launching each instance type. This can\n        be used to launch a mix of instances in a defined ratio among their\n        types. Doing this can make the spot fleet more resilient to AWS taking\n        back the instances if it runs out of capacity.\n\n    instance_ami : str\n        The Amazon Machine Image ID that describes the OS the instances will use\n        after launch. The default ID is Amazon Linux 2 in the US East region.\n\n    instance_user_data : str or None\n        This is either the path to a file on disk that contains a shell-script\n        or a string containing a shell-script that will be executed by root\n        right after the instance is launched. Use to automatically set up\n        workers and queues. 
If None, will not execute anything at instance\n        start up.\n\n    instance_ebs_optimized : bool\n        If True, will enable EBS optimization to speed up IO. This is usually\n        True for all instances made available in the last couple of years.\n\n    wait_until_up : bool\n        If True, will not return from this function until the spot fleet request\n        is acknowledged by AWS.\n\n    client : boto3.Client or None\n        If None, this function will instantiate a new `boto3.Client` object to\n        use in its operations. Alternatively, pass in an existing `boto3.Client`\n        instance to re-use it here.\n\n    raiseonfail : bool\n        If True, will re-raise whatever Exception caused the operation to fail\n        and break out immediately.\n\n    Returns\n    -------\n\n    str or None\n        This is the spot fleet request ID if successful. Otherwise, returns\n        None.\n\n    \"\"\"\n\n    fleetconfig = copy.deepcopy(SPOT_FLEET_CONFIG)\n    fleetconfig['IamFleetRole'] = spot_fleet_iam_role\n    fleetconfig['AllocationStrategy'] = allocation_strategy\n    fleetconfig['TargetCapacity'] = target_capacity\n    fleetconfig['SpotPrice'] = str(spot_price)\n    fleetconfig['ValidUntil'] = (\n        datetime.utcnow() + timedelta(days=expires_days)\n    ).strftime(\n        '%Y-%m-%dT%H:%M:%SZ'\n    )\n\n    # get the user data from a string or a file\n    # we need to base64 encode it here\n    if (isinstance(instance_user_data, str) and\n        os.path.exists(instance_user_data)):\n        with open(instance_user_data,'rb') as infd:\n            udata = base64.b64encode(infd.read()).decode()\n\n    elif isinstance(instance_user_data, str):\n        udata = base64.b64encode(instance_user_data.encode()).decode()\n\n    else:\n        udata = (\n            '#!/bin/bash\\necho \"No user data provided. 
'\n            'Launched instance at: %s UTC\"' % datetime.utcnow().isoformat()\n        )\n        udata = base64.b64encode(udata.encode()).decode()\n\n\n    for ind, itype in enumerate(instance_types):\n\n        thisinstance = SPOT_PERINSTANCE_CONFIG.copy()\n        thisinstance['InstanceType'] = itype\n        thisinstance['ImageId'] = instance_ami\n        thisinstance['SubnetId'] = subnet_id\n        thisinstance['KeyName'] = keypair_name\n        thisinstance['IamInstanceProfile']['Arn'] = iam_instance_profile_arn\n        thisinstance['SecurityGroups'][0] = {'GroupId':security_groupid}\n        thisinstance['UserData'] = udata\n        thisinstance['EbsOptimized'] = instance_ebs_optimized\n\n        # get the instance weights\n        if isinstance(instance_weights, list):\n            thisinstance['WeightedCapacity'] = instance_weights[ind]\n\n        fleetconfig['LaunchSpecifications'].append(thisinstance)\n\n    #\n    # launch the fleet\n    #\n\n    if not client:\n        client = boto3.client('ec2')\n\n    try:\n\n        resp = client.request_spot_fleet(\n            SpotFleetRequestConfig=fleetconfig,\n        )\n\n        if not resp:\n\n            LOGERROR('spot fleet request failed.')\n            return None\n\n        else:\n\n            spot_fleet_reqid = resp['SpotFleetRequestId']\n            LOGINFO('spot fleet requested successfully. 
request ID: %s' %\n                    spot_fleet_reqid)\n\n            if not wait_until_up:\n                return spot_fleet_reqid\n\n            else:\n\n                ntries = 10\n                curr_try = 0\n\n                while curr_try < ntries:\n\n                    resp = client.describe_spot_fleet_requests(\n                        SpotFleetRequestIds=[\n                            spot_fleet_reqid\n                        ]\n                    )\n\n                    curr_state = resp.get('SpotFleetRequestConfigs',[])\n\n                    if len(curr_state) > 0:\n\n                        curr_state = curr_state[0]['SpotFleetRequestState']\n\n                        if curr_state == 'active':\n                            LOGINFO('spot fleet with reqid: %s is now active' %\n                                    spot_fleet_reqid)\n                            break\n\n                    LOGINFO(\n                        'spot fleet not yet active, waiting 15 seconds. '\n                        'try %s/%s' % (curr_try, ntries)\n                    )\n                    curr_try = curr_try + 1\n                    time.sleep(15.0)\n\n                return spot_fleet_reqid\n\n\n    except ClientError as e:\n\n        LOGEXCEPTION('could not launch spot fleet')\n        if raiseonfail:\n            raise\n\n        return None\n\n    except Exception as e:\n\n        LOGEXCEPTION('could not launch spot fleet')\n        if raiseonfail:\n            raise\n\n        return None", "response": "This function creates an EC2 spot-fleet cluster."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef delete_spot_fleet_cluster(\n        spot_fleet_reqid,\n        client=None,\n):\n    \"\"\"\n    This deletes a spot-fleet cluster.\n\n    Parameters\n    ----------\n\n    spot_fleet_reqid : str\n        The fleet request ID returned by `make_spot_fleet_cluster`.\n\n    client : boto3.Client or None\n        If None, this function will instantiate a new `boto3.Client` object to\n        use in its operations. Alternatively, pass in an existing `boto3.Client`\n        instance to re-use it here.\n\n    Returns\n    -------\n\n    dict\n        The response returned by `client.cancel_spot_fleet_requests`.\n\n    \"\"\"\n\n    if not client:\n        client = boto3.client('ec2')\n\n    resp = client.cancel_spot_fleet_requests(\n        SpotFleetRequestIds=[spot_fleet_reqid],\n        TerminateInstances=True\n    )\n\n    return resp", "response": "This function deletes a spot-fleet cluster, terminating its instances, and returns the AWS response."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef gcs_get_file(bucketname,\n                 filename,\n                 local_file,\n                 altexts=None,\n                 client=None,\n                 service_account_json=None,\n                 raiseonfail=False):\n    \"\"\"This gets a single file from a Google Cloud Storage bucket.\n\n    Parameters\n    ----------\n\n    bucketname : str\n        The name of the GCS bucket to download the file from.\n\n    filename : str\n        The full name of the file to download, including all prefixes.\n\n    local_file : str\n        Path to where the downloaded file will be stored.\n\n    altexts : None or list of str\n        If not None, this is a list of alternate extensions to try for the file\n        other than the one provided in `filename`. For example, to get anything\n        that's an .sqlite where .sqlite.gz is expected, use altexts=[''] to\n        strip the .gz.\n\n    client : google.cloud.storage.Client instance\n        The instance of the Client to use to perform the download operation. If\n        this is None, a new Client will be used. If this is None and\n        `service_account_json` points to a downloaded JSON file with GCS\n        credentials, a new Client with the provided credentials will be used. 
If\n        this is not None, the existing Client instance will be used.\n\n    service_account_json : str\n        Path to a downloaded GCS credentials JSON file.\n\n    raiseonfail : bool\n        If True, will re-raise whatever Exception caused the operation to fail\n        and break out immediately.\n\n    Returns\n    -------\n\n    str\n        Path to the downloaded filename or None if the download was\n        unsuccessful.\n\n    \"\"\"\n\n    if not client:\n\n        if (service_account_json is not None and\n            os.path.exists(service_account_json)):\n            client = storage.Client.from_service_account_json(\n                service_account_json\n            )\n        else:\n            client = storage.Client()\n\n    try:\n\n        bucket = client.get_bucket(bucketname)\n        blob = bucket.get_blob(filename)\n        blob.download_to_filename(local_file)\n        return local_file\n\n    except Exception:\n\n        # try the alternate extensions, if any were provided\n        for alt_extension in (altexts or []):\n\n            split_ext = os.path.splitext(filename)\n            check_file = split_ext[0] + alt_extension\n            try:\n                bucket = client.get_bucket(bucketname)\n                blob = bucket.get_blob(check_file)\n                blob.download_to_filename(\n                    local_file.replace(split_ext[-1],\n                                       alt_extension)\n                )\n                return local_file.replace(split_ext[-1],\n                                          alt_extension)\n            except Exception:\n                pass\n\n        # we get here only if the original file and all alternates failed\n        LOGEXCEPTION('could not download gs://%s/%s' % (bucketname, filename))\n\n        if raiseonfail:\n            raise\n\n        return None", "response": "This function downloads a single file from a Google Cloud Storage bucket."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef read_hatlc(hatlc):\n    '''\n    This reads a consolidated HAT LC written by the functions above.\n\n    Returns a dict.\n\n    '''\n\n    lcfname = os.path.basename(hatlc)\n\n    # unzip the files first\n    if '.gz' in lcfname:\n        lcf = gzip.open(hatlc,'rb')\n    elif '.bz2' in lcfname:\n        lcf = bz2.BZ2File(hatlc, 'rb')\n    else:\n        lcf = open(hatlc,'rb')\n\n    if '.fits' in lcfname and HAVEPYFITS:\n\n        hdulist = pyfits.open(lcf)\n        objectinfo = hdulist[0].header\n        objectlc = hdulist[1].data\n        lccols = objectlc.columns.names\n        hdulist.close()\n        lcf.close()\n\n        lcdict = {}\n\n        for col in lccols:\n            lcdict[col] = np.array(objectlc[col])\n\n        lcdict['hatid'] = objectinfo['hatid']\n        lcdict['twomassid'] = objectinfo['2massid']\n        lcdict['ra'] = objectinfo['ra']\n        lcdict['dec'] = objectinfo['dec']\n        lcdict['mags'] = [objectinfo[x] for x in ('vmag','rmag','imag',\n                                                  'jmag','hmag','kmag')]\n        lcdict['ndet'] = objectinfo['ndet']\n        lcdict['hatstations'] = objectinfo['hats']\n        lcdict['filters'] = objectinfo['filters']\n        lcdict['columns'] = lccols\n\n        return lcdict\n\n    elif '.fits' in lcfname and not HAVEPYFITS:\n\n        print(\"can't read %s since we don't have the pyfits module\" % lcfname)\n        return\n\n    elif '.csv' in lcfname or '.hatlc' in lcfname:\n\n        lcflines = lcf.read().decode().split('\\n')\n        lcf.close()\n\n        # now process the read-in LC\n        objectdata = [x for x in lcflines if x.startswith('#')]\n        objectlc = [x for x in lcflines if not x.startswith('#')]\n        objectlc = [x for x in objectlc if len(x) > 1]\n\n        if '.csv' in lcfname:\n            objectlc = [x.split(',') for x in objectlc]\n        else:\n  
          objectlc = [x.split() for x in objectlc]\n\n        # transpose split rows to get columns\n        objectlc = list(zip(*objectlc))\n\n        # read the header to figure out the object's info and column names\n        objectdata = [x.strip('#') for x in objectdata]\n        objectdata = [x.strip() for x in objectdata]\n        objectdata = [x for x in objectdata if len(x) > 0]\n\n        hatid, twomassid = objectdata[0].split(' - ')\n        ra, dec = objectdata[1].split(', ')\n        ra = float(ra.split(' = ')[-1].strip(' deg'))\n        dec = float(dec.split(' = ')[-1].strip(' deg'))\n\n        vmag, rmag, imag, jmag, hmag, kmag = objectdata[2].split(', ')\n        vmag = float(vmag.split(' = ')[-1])\n        rmag = float(rmag.split(' = ')[-1])\n        imag = float(imag.split(' = ')[-1])\n        jmag = float(jmag.split(' = ')[-1])\n        hmag = float(hmag.split(' = ')[-1])\n        kmag = float(kmag.split(' = ')[-1])\n\n        ndet = int(objectdata[3].split(': ')[-1])\n        hatstations = objectdata[4].split(': ')[-1]\n\n        filterhead_ind = objectdata.index('Filters used:')\n        columnhead_ind = objectdata.index('Columns:')\n\n        filters = objectdata[filterhead_ind:columnhead_ind]\n\n        columndefs = objectdata[columnhead_ind+1:]\n\n        columns = []\n        for line in columndefs:\n\n            colnum, colname, coldesc = line.split(' - ')\n            columns.append(colname)\n\n        lcdict = {}\n\n        # now write all the columns to the output dictionary\n        for ind, col in enumerate(columns):\n\n            # this formats everything nicely using our existing column\n            # definitions\n            lcdict[col] = np.array([TEXTLC_OUTPUT_COLUMNS[col][3](x)\n                                    for x in objectlc[ind]])\n\n        # write the object metadata to the output dictionary\n        lcdict['hatid'] = hatid\n        lcdict['twomassid'] = twomassid.replace('2MASS J','')\n        lcdict['ra'] = ra\n     
   lcdict['dec'] = dec\n        lcdict['mags'] = [vmag, rmag, imag, jmag, hmag, kmag]\n        lcdict['ndet'] = ndet\n        lcdict['hatstations'] = hatstations.split(', ')\n        lcdict['filters'] = filters[1:]\n        lcdict['cols'] = columns\n\n        return lcdict", "response": "Reads a consolidated HAT LC file and returns a dict."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nread a pickled fake LC file and return its lcdict.", "response": "def read_fakelc(fakelcfile):\n    '''\n    This just reads a pickled fake LC.\n\n    Parameters\n    ----------\n\n    fakelcfile : str\n        The fake LC file to read.\n\n    Returns\n    -------\n\n    dict\n        This returns an lcdict.\n\n    '''\n\n    try:\n        with open(fakelcfile,'rb') as infd:\n            lcdict = pickle.load(infd)\n    except UnicodeDecodeError:\n        with open(fakelcfile,'rb') as infd:\n            lcdict = pickle.load(infd, encoding='latin1')\n\n    return lcdict"}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\ncalculating precision from the numbers of true positives and false positives.", "response": "def precision(ntp, nfp):\n    '''\n    This calculates precision.\n\n    https://en.wikipedia.org/wiki/Precision_and_recall\n\n    Parameters\n    ----------\n\n    ntp : int\n        The number of true positives.\n\n    nfp : int\n        The number of false positives.\n\n    Returns\n    -------\n\n    float\n        The precision calculated using `ntp/(ntp + nfp)`.\n\n    '''\n\n    if (ntp+nfp) > 0:\n        return ntp/(ntp+nfp)\n    else:\n        return np.nan"}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\ncalculates recall from the numbers of true positives and false negatives.", "response": "def recall(ntp, nfn):\n    '''\n    This calculates recall.\n\n    https://en.wikipedia.org/wiki/Precision_and_recall\n\n    Parameters\n    ----------\n\n    ntp : int\n        The number of true positives.\n\n    nfn : int\n        The number of false negatives.\n\n    Returns\n    -------\n\n    float\n        The recall calculated using `ntp/(ntp + nfn)`.\n\n    '''\n\n    if (ntp+nfn) > 0:\n        return ntp/(ntp+nfn)\n    else:\n        return np.nan"}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\ncalculates the Matthews correlation coefficient from the numbers of true positives, true negatives, false positives, and false negatives.", "response": "def matthews_correl_coeff(ntp, ntn, nfp, nfn):\n    '''\n    This calculates the Matthews correlation coefficient.\n\n    https://en.wikipedia.org/wiki/Matthews_correlation_coefficient\n\n    Parameters\n    ----------\n\n    ntp : int\n        The number of true positives.\n\n    ntn : int\n        The number of true negatives.\n\n    nfp : int\n        The number of false positives.\n\n    nfn : int\n        The number of false negatives.\n\n    Returns\n    -------\n\n    float\n        The Matthews correlation coefficient.\n\n    '''\n\n    mcc_top = (ntp*ntn - nfp*nfn)\n    mcc_bot = msqrt((ntp + nfp)*(ntp + nfn)*(ntn + nfp)*(ntn + nfn))\n\n    if mcc_bot > 0:\n        return mcc_top/mcc_bot\n    else:\n        return np.nan"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef get_recovered_variables_for_magbin(simbasedir,\n                                       magbinmedian,\n                                       stetson_stdev_min=2.0,\n                                       inveta_stdev_min=2.0,\n                                       iqr_stdev_min=2.0,\n                                       statsonly=True):\n    '''This runs variability selection for the given magbinmedian.\n\n    To generate a full recovery matrix over all magnitude bins, run this\n    function for each magbin over the specified stetson_stdev_min and\n    inveta_stdev_min grid.\n\n    Parameters\n    ----------\n\n    simbasedir : str\n        The input directory of fake LCs.\n\n    magbinmedian : float\n        The magbin to run the variable recovery for. This is an item from the\n        dict from `simbasedir/fakelcs-info.pkl: `fakelcinfo['magrms'][magcol]`\n        list for each magcol and designates which magbin to get the recovery\n        stats for.\n\n    stetson_stdev_min : float\n        The minimum sigma above the trend in the Stetson J variability index\n        distribution for this magbin to use to consider objects as variable.\n\n    inveta_stdev_min : float\n        The minimum sigma above the trend in the 1/eta variability index\n        distribution for this magbin to use to consider objects as variable.\n\n    iqr_stdev_min : float\n        The minimum sigma above the trend in the IQR variability index\n        distribution for this magbin to use to consider objects as variable.\n\n    statsonly : bool\n        If this is True, only the final stats will be returned. 
If False, the\n        full arrays used to generate the stats will also be returned.\n\n    Returns\n    -------\n\n    dict\n        The returned dict contains statistics for this magbin and if requested,\n        the full arrays used to calculate the statistics.\n\n    '''\n\n\n    # get the info from the simbasedir\n    with open(os.path.join(simbasedir, 'fakelcs-info.pkl'),'rb') as infd:\n        siminfo = pickle.load(infd)\n\n    objectids = siminfo['objectid']\n    varflags = siminfo['isvariable']\n    sdssr = siminfo['sdssr']\n\n    # get the column defs for the fakelcs\n    timecols = siminfo['timecols']\n    magcols = siminfo['magcols']\n    errcols = siminfo['errcols']\n\n    # register the fakelc pklc as a custom lcproc format\n    # now we should be able to use all lcproc functions correctly\n    fakelc_formatkey = 'fake-%s' % siminfo['lcformat']\n    lcproc.register_lcformat(\n        fakelc_formatkey,\n        '*-fakelc.pkl',\n        timecols,\n        magcols,\n        errcols,\n        'astrobase.lcproc',\n        '_read_pklc',\n        magsarefluxes=siminfo['magsarefluxes']\n    )\n\n    # make the output directory if it doesn't exit\n    outdir = os.path.join(simbasedir, 'recvar-threshold-pkls')\n    if not os.path.exists(outdir):\n        os.mkdir(outdir)\n\n    # run the variability search\n    varfeaturedir = os.path.join(simbasedir, 'varfeatures')\n    varthreshinfof = os.path.join(\n        outdir,\n        'varthresh-magbinmed%.2f-stet%.2f-inveta%.2f.pkl' % (magbinmedian,\n                                                             stetson_stdev_min,\n                                                             inveta_stdev_min)\n    )\n    varthresh = varthreshold.variability_threshold(\n        varfeaturedir,\n        varthreshinfof,\n        lcformat=fakelc_formatkey,\n        min_stetj_stdev=stetson_stdev_min,\n        min_inveta_stdev=inveta_stdev_min,\n        min_iqr_stdev=iqr_stdev_min,\n        verbose=False\n    )\n\n    # get 
the magbins from the varthresh info\n    magbins = varthresh['magbins']\n\n    # get the magbininds\n    magbininds = np.digitize(sdssr, magbins)\n\n    # bin the objects according to these magbins\n    binned_objectids = []\n    binned_actualvars = []\n    binned_actualnotvars = []\n\n    # go through all the mag bins and bin up the objectids, actual variables,\n    # and actual not-variables\n    for mbinind, _magi in zip(np.unique(magbininds),\n                              range(len(magbins)-1)):\n\n        thisbinind = np.where(magbininds == mbinind)\n\n        thisbin_objectids = objectids[thisbinind]\n        thisbin_varflags = varflags[thisbinind]\n\n        thisbin_actualvars = thisbin_objectids[thisbin_varflags]\n        thisbin_actualnotvars = thisbin_objectids[~thisbin_varflags]\n\n        binned_objectids.append(thisbin_objectids)\n        binned_actualvars.append(thisbin_actualvars)\n        binned_actualnotvars.append(thisbin_actualnotvars)\n\n\n    # this is the output dict\n    recdict = {\n        'simbasedir':simbasedir,\n        'timecols':timecols,\n        'magcols':magcols,\n        'errcols':errcols,\n        'magsarefluxes':siminfo['magsarefluxes'],\n        'stetj_min_stdev':stetson_stdev_min,\n        'inveta_min_stdev':inveta_stdev_min,\n        'iqr_min_stdev':iqr_stdev_min,\n        'magbinmedian':magbinmedian,\n    }\n\n\n    # now, for each magcol, find the magbin corresponding to magbinmedian, and\n    # get its stats\n    for magcol in magcols:\n\n        # this is the index of the matching magnitude bin for the magbinmedian\n        # provided\n        magbinind = np.where(\n            np.array(varthresh[magcol]['binned_sdssr_median']) == magbinmedian\n        )\n\n        magbinind = np.asscalar(magbinind[0])\n\n        # get the objectids, actual vars and actual notvars in this magbin\n        thisbin_objectids = binned_objectids[magbinind]\n        thisbin_actualvars = binned_actualvars[magbinind]\n        
thisbin_actualnotvars = binned_actualnotvars[magbinind]\n\n        # stetson recovered variables in this magbin\n        stet_recoveredvars = varthresh[magcol][\n            'binned_objectids_thresh_stetsonj'\n        ][magbinind]\n\n        # calculate TP, FP, TN, FN\n        stet_recoverednotvars = np.setdiff1d(thisbin_objectids,\n                                             stet_recoveredvars)\n\n        stet_truepositives = np.intersect1d(stet_recoveredvars,\n                                            thisbin_actualvars)\n        stet_falsepositives = np.intersect1d(stet_recoveredvars,\n                                             thisbin_actualnotvars)\n        stet_truenegatives = np.intersect1d(stet_recoverednotvars,\n                                            thisbin_actualnotvars)\n        stet_falsenegatives = np.intersect1d(stet_recoverednotvars,\n                                             thisbin_actualvars)\n\n        # calculate stetson recall, precision, Matthews correl coeff\n        stet_recall = recall(stet_truepositives.size,\n                             stet_falsenegatives.size)\n\n        stet_precision = precision(stet_truepositives.size,\n                                   stet_falsepositives.size)\n\n        stet_mcc = matthews_correl_coeff(stet_truepositives.size,\n                                         stet_truenegatives.size,\n                                         stet_falsepositives.size,\n                                         stet_falsenegatives.size)\n\n\n        # inveta recovered variables in this magbin\n        inveta_recoveredvars = varthresh[magcol][\n            'binned_objectids_thresh_inveta'\n        ][magbinind]\n        inveta_recoverednotvars = np.setdiff1d(thisbin_objectids,\n                                               inveta_recoveredvars)\n\n        inveta_truepositives = np.intersect1d(inveta_recoveredvars,\n                                              thisbin_actualvars)\n        inveta_falsepositives 
= np.intersect1d(inveta_recoveredvars,\n                                               thisbin_actualnotvars)\n        inveta_truenegatives = np.intersect1d(inveta_recoverednotvars,\n                                              thisbin_actualnotvars)\n        inveta_falsenegatives = np.intersect1d(inveta_recoverednotvars,\n                                               thisbin_actualvars)\n\n        # calculate inveta recall, precision, Matthews correl coeff\n        inveta_recall = recall(inveta_truepositives.size,\n                               inveta_falsenegatives.size)\n\n        inveta_precision = precision(inveta_truepositives.size,\n                                     inveta_falsepositives.size)\n\n        inveta_mcc = matthews_correl_coeff(inveta_truepositives.size,\n                                           inveta_truenegatives.size,\n                                           inveta_falsepositives.size,\n                                           inveta_falsenegatives.size)\n\n\n        # iqr recovered variables in this magbin\n        iqr_recoveredvars = varthresh[magcol][\n            'binned_objectids_thresh_iqr'\n        ][magbinind]\n        iqr_recoverednotvars = np.setdiff1d(thisbin_objectids,\n                                            iqr_recoveredvars)\n\n        iqr_truepositives = np.intersect1d(iqr_recoveredvars,\n                                           thisbin_actualvars)\n        iqr_falsepositives = np.intersect1d(iqr_recoveredvars,\n                                            thisbin_actualnotvars)\n        iqr_truenegatives = np.intersect1d(iqr_recoverednotvars,\n                                           thisbin_actualnotvars)\n        iqr_falsenegatives = np.intersect1d(iqr_recoverednotvars,\n                                            thisbin_actualvars)\n\n        # calculate iqr recall, precision, Matthews correl coeff\n        iqr_recall = recall(iqr_truepositives.size,\n                            
iqr_falsenegatives.size)\n\n        iqr_precision = precision(iqr_truepositives.size,\n                                  iqr_falsepositives.size)\n\n        iqr_mcc = matthews_correl_coeff(iqr_truepositives.size,\n                                        iqr_truenegatives.size,\n                                        iqr_falsepositives.size,\n                                        iqr_falsenegatives.size)\n\n\n        # calculate the items missed by one method but found by the other\n        # methods\n        stet_missed_inveta_found = np.setdiff1d(inveta_truepositives,\n                                                stet_truepositives)\n        stet_missed_iqr_found = np.setdiff1d(iqr_truepositives,\n                                             stet_truepositives)\n\n        inveta_missed_stet_found = np.setdiff1d(stet_truepositives,\n                                                inveta_truepositives)\n        inveta_missed_iqr_found = np.setdiff1d(iqr_truepositives,\n                                               inveta_truepositives)\n\n        iqr_missed_stet_found = np.setdiff1d(stet_truepositives,\n                                             iqr_truepositives)\n        iqr_missed_inveta_found = np.setdiff1d(inveta_truepositives,\n                                               iqr_truepositives)\n\n        if not statsonly:\n\n            recdict[magcol] = {\n                # stetson J alone\n                'stet_recoveredvars':stet_recoveredvars,\n                'stet_truepositives':stet_truepositives,\n                'stet_falsepositives':stet_falsepositives,\n                'stet_truenegatives':stet_truenegatives,\n                'stet_falsenegatives':stet_falsenegatives,\n                'stet_precision':stet_precision,\n                'stet_recall':stet_recall,\n                'stet_mcc':stet_mcc,\n                # inveta alone\n                'inveta_recoveredvars':inveta_recoveredvars,\n                
'inveta_truepositives':inveta_truepositives,\n                'inveta_falsepositives':inveta_falsepositives,\n                'inveta_truenegatives':inveta_truenegatives,\n                'inveta_falsenegatives':inveta_falsenegatives,\n                'inveta_precision':inveta_precision,\n                'inveta_recall':inveta_recall,\n                'inveta_mcc':inveta_mcc,\n                # iqr alone\n                'iqr_recoveredvars':iqr_recoveredvars,\n                'iqr_truepositives':iqr_truepositives,\n                'iqr_falsepositives':iqr_falsepositives,\n                'iqr_truenegatives':iqr_truenegatives,\n                'iqr_falsenegatives':iqr_falsenegatives,\n                'iqr_precision':iqr_precision,\n                'iqr_recall':iqr_recall,\n                'iqr_mcc':iqr_mcc,\n                # true positive variables missed by one method but picked up by\n                # the others\n                'stet_missed_inveta_found':stet_missed_inveta_found,\n                'stet_missed_iqr_found':stet_missed_iqr_found,\n                'inveta_missed_stet_found':inveta_missed_stet_found,\n                'inveta_missed_iqr_found':inveta_missed_iqr_found,\n                'iqr_missed_stet_found':iqr_missed_stet_found,\n                'iqr_missed_inveta_found':iqr_missed_inveta_found,\n                # bin info\n                'actual_variables':thisbin_actualvars,\n                'actual_nonvariables':thisbin_actualnotvars,\n                'all_objectids':thisbin_objectids,\n                'magbinind':magbinind,\n\n            }\n\n        # if statsonly is set, then we only return the numbers but not the\n        # arrays themselves\n        else:\n\n            recdict[magcol] = {\n                # stetson J alone\n                'stet_recoveredvars':stet_recoveredvars.size,\n                'stet_truepositives':stet_truepositives.size,\n                'stet_falsepositives':stet_falsepositives.size,\n                
'stet_truenegatives':stet_truenegatives.size,\n                'stet_falsenegatives':stet_falsenegatives.size,\n                'stet_precision':stet_precision,\n                'stet_recall':stet_recall,\n                'stet_mcc':stet_mcc,\n                # inveta alone\n                'inveta_recoveredvars':inveta_recoveredvars.size,\n                'inveta_truepositives':inveta_truepositives.size,\n                'inveta_falsepositives':inveta_falsepositives.size,\n                'inveta_truenegatives':inveta_truenegatives.size,\n                'inveta_falsenegatives':inveta_falsenegatives.size,\n                'inveta_precision':inveta_precision,\n                'inveta_recall':inveta_recall,\n                'inveta_mcc':inveta_mcc,\n                # iqr alone\n                'iqr_recoveredvars':iqr_recoveredvars.size,\n                'iqr_truepositives':iqr_truepositives.size,\n                'iqr_falsepositives':iqr_falsepositives.size,\n                'iqr_truenegatives':iqr_truenegatives.size,\n                'iqr_falsenegatives':iqr_falsenegatives.size,\n                'iqr_precision':iqr_precision,\n                'iqr_recall':iqr_recall,\n                'iqr_mcc':iqr_mcc,\n                # true positive variables missed by one method but picked up by\n                # the others\n                'stet_missed_inveta_found':stet_missed_inveta_found.size,\n                'stet_missed_iqr_found':stet_missed_iqr_found.size,\n                'inveta_missed_stet_found':inveta_missed_stet_found.size,\n                'inveta_missed_iqr_found':inveta_missed_iqr_found.size,\n                'iqr_missed_stet_found':iqr_missed_stet_found.size,\n                'iqr_missed_inveta_found':iqr_missed_inveta_found.size,\n                # bin info\n                'actual_variables':thisbin_actualvars.size,\n                'actual_nonvariables':thisbin_actualnotvars.size,\n                'all_objectids':thisbin_objectids.size,\n                
'magbinind':magbinind,\n            }\n\n\n    #\n    # done with per magcol\n    #\n\n    return recdict", "response": "This function runs variability selection for the given magbin and returns a dict containing the recovery statistics for that magbin."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef plot_varind_gridsearch_magbin_results(gridsearch_results):\n    '''This plots the gridsearch results from `variable_index_gridsearch_magbin`.\n\n    Parameters\n    ----------\n\n    gridsearch_results : dict\n        This is the dict produced by `variable_index_gridsearch_magbin` above.\n\n    Returns\n    -------\n\n    dict\n        The returned dict contains filenames of the recovery rate plots made for\n        each variability index. These include plots of the precision, recall,\n        and Matthews Correlation Coefficient over each magbin and a heatmap of\n        these values over the grid points of the variability index stdev values\n        arrays used.\n\n    '''\n\n    # get the result pickle/dict\n    if (isinstance(gridsearch_results, str) and\n        os.path.exists(gridsearch_results)):\n\n        with open(gridsearch_results,'rb') as infd:\n            gridresults = pickle.load(infd)\n\n    elif isinstance(gridsearch_results, dict):\n\n        gridresults = gridsearch_results\n\n    else:\n        LOGERROR('could not understand the input '\n                 'variable index grid-search result dict/pickle')\n        return None\n\n\n\n    plotres = {'simbasedir':gridresults['simbasedir']}\n\n    recgrid = gridresults['recovery']\n    simbasedir = gridresults['simbasedir']\n\n    for magcol in gridresults['magcols']:\n\n        plotres[magcol] = {'best_stetsonj':[],\n                           'best_inveta':[],\n                           'best_iqr':[],\n                           'magbinmedians':gridresults['magbinmedians']}\n\n        # go through all the magbins\n        for magbinind, magbinmedian in enumerate(gridresults['magbinmedians']):\n\n            LOGINFO('plotting results for %s: magbin: %.3f' %\n                    (magcol, magbinmedian))\n\n            stet_mcc = np.array(\n                [x[magcol]['stet_mcc']\n             
    for x in recgrid[magbinind]]\n            )[::(gridresults['inveta_grid'].size *\n                 gridresults['stetson_grid'].size)]\n            stet_precision = np.array(\n                [x[magcol]['stet_precision']\n                 for x in recgrid[magbinind]]\n            )[::(gridresults['inveta_grid'].size *\n                 gridresults['stetson_grid'].size)]\n            stet_recall = np.array(\n                [x[magcol]['stet_recall']\n                 for x in recgrid[magbinind]]\n            )[::(gridresults['inveta_grid'].size *\n                 gridresults['stetson_grid'].size)]\n            stet_missed_inveta_found = np.array(\n                [x[magcol]['stet_missed_inveta_found']\n                 for x in recgrid[magbinind]]\n            )[::(gridresults['inveta_grid'].size *\n                 gridresults['stetson_grid'].size)]\n            stet_missed_iqr_found = np.array(\n                [x[magcol]['stet_missed_iqr_found']\n                 for x in recgrid[magbinind]]\n            )[::(gridresults['inveta_grid'].size *\n                 gridresults['stetson_grid'].size)]\n\n            inveta_mcc = np.array(\n                [x[magcol]['inveta_mcc']\n                 for x in recgrid[magbinind]]\n            )[:(gridresults['iqr_grid'].size *\n                gridresults['stetson_grid'].size)][\n                    ::gridresults['inveta_grid'].size\n                ]\n            inveta_precision = np.array(\n                [x[magcol]['inveta_precision']\n                 for x in recgrid[magbinind]]\n            )[:(gridresults['iqr_grid'].size *\n                gridresults['stetson_grid'].size)][\n                    ::gridresults['inveta_grid'].size\n                ]\n            inveta_recall = np.array(\n                [x[magcol]['inveta_recall']\n                 for x in recgrid[magbinind]]\n            )[:(gridresults['iqr_grid'].size *\n                gridresults['stetson_grid'].size)][\n                    
::gridresults['inveta_grid'].size\n                ]\n            inveta_missed_stet_found = np.array(\n                [x[magcol]['inveta_missed_stet_found']\n                 for x in recgrid[magbinind]]\n            )[:(gridresults['iqr_grid'].size *\n                gridresults['stetson_grid'].size)][\n                    ::gridresults['inveta_grid'].size\n                ]\n            inveta_missed_iqr_found = np.array(\n                [x[magcol]['inveta_missed_iqr_found']\n                 for x in recgrid[magbinind]]\n            )[:(gridresults['iqr_grid'].size *\n                gridresults['stetson_grid'].size)][\n                    ::gridresults['inveta_grid'].size\n                ]\n\n            iqr_mcc = np.array(\n                [x[magcol]['iqr_mcc']\n                 for x in recgrid[magbinind]]\n            )[:(gridresults['iqr_grid'].size *\n                gridresults['stetson_grid'].size)][\n                    :gridresults['inveta_grid'].size\n                ]\n            iqr_precision = np.array(\n                [x[magcol]['iqr_precision']\n                 for x in recgrid[magbinind]]\n            )[:(gridresults['iqr_grid'].size *\n                gridresults['stetson_grid'].size)][\n                    :gridresults['inveta_grid'].size\n                ]\n            iqr_recall = np.array(\n                [x[magcol]['iqr_recall']\n                 for x in recgrid[magbinind]]\n            )[:(gridresults['iqr_grid'].size *\n                gridresults['stetson_grid'].size)][\n                    :gridresults['inveta_grid'].size\n                ]\n            iqr_missed_stet_found = np.array(\n                [x[magcol]['iqr_missed_stet_found']\n                 for x in recgrid[magbinind]]\n            )[:(gridresults['iqr_grid'].size *\n                gridresults['stetson_grid'].size)][\n                    :gridresults['inveta_grid'].size\n                ]\n            iqr_missed_inveta_found = np.array(\n                
[x[magcol]['iqr_missed_inveta_found']\n                 for x in recgrid[magbinind]]\n            )[:(gridresults['iqr_grid'].size *\n                gridresults['stetson_grid'].size)][\n                    :gridresults['inveta_grid'].size\n                ]\n\n\n            fig = plt.figure(figsize=(6.4*5, 4.8*3))\n\n            # FIRST ROW: stetson J plot\n\n            plt.subplot(3,5,1)\n            if np.any(np.isfinite(stet_mcc)):\n                plt.plot(gridresults['stetson_grid'],\n                         stet_mcc)\n                plt.xlabel('stetson J stdev multiplier threshold')\n                plt.ylabel('MCC')\n                plt.title('MCC for stetson J')\n            else:\n                plt.text(0.5,0.5,\n                         'stet MCC values are all nan '\n                         'for this magbin',\n                         transform=plt.gca().transAxes,\n                         horizontalalignment='center',\n                         verticalalignment='center')\n                plt.xticks([])\n                plt.yticks([])\n\n            plt.subplot(3,5,2)\n            if np.any(np.isfinite(stet_precision)):\n                plt.plot(gridresults['stetson_grid'],\n                         stet_precision)\n                plt.xlabel('stetson J stdev multiplier threshold')\n                plt.ylabel('precision')\n                plt.title('precision for stetson J')\n            else:\n                plt.text(0.5,0.5,\n                         'stet precision values are all nan '\n                         'for this magbin',\n                         transform=plt.gca().transAxes,\n                         horizontalalignment='center',\n                         verticalalignment='center')\n                plt.xticks([])\n                plt.yticks([])\n\n            plt.subplot(3,5,3)\n            if np.any(np.isfinite(stet_recall)):\n                plt.plot(gridresults['stetson_grid'],\n                         stet_recall)\n           
     plt.xlabel('stetson J stdev multiplier threshold')\n                plt.ylabel('recall')\n                plt.title('recall for stetson J')\n            else:\n                plt.text(0.5,0.5,\n                         'stet recall values are all nan '\n                         'for this magbin',\n                         transform=plt.gca().transAxes,\n                         horizontalalignment='center',\n                         verticalalignment='center')\n                plt.xticks([])\n                plt.yticks([])\n\n            plt.subplot(3,5,4)\n            if np.any(np.isfinite(stet_missed_inveta_found)):\n                plt.plot(gridresults['stetson_grid'],\n                         stet_missed_inveta_found)\n                plt.xlabel('stetson J stdev multiplier threshold')\n                plt.ylabel('# objects stetson missed but inveta found')\n                plt.title('stetson J missed, inveta found')\n            else:\n                plt.text(0.5,0.5,\n                         'stet-missed/inveta-found values are all nan '\n                         'for this magbin',\n                         transform=plt.gca().transAxes,\n                         horizontalalignment='center',\n                         verticalalignment='center')\n                plt.xticks([])\n                plt.yticks([])\n\n            plt.subplot(3,5,5)\n            if np.any(np.isfinite(stet_missed_iqr_found)):\n                plt.plot(gridresults['stetson_grid'],\n                         stet_missed_iqr_found)\n                plt.xlabel('stetson J stdev multiplier threshold')\n                plt.ylabel('# objects stetson missed but IQR found')\n                plt.title('stetson J missed, IQR found')\n            else:\n                plt.text(0.5,0.5,\n                         'stet-missed/IQR-found values are all nan '\n                         'for this magbin',\n                         transform=plt.gca().transAxes,\n                         
horizontalalignment='center',\n                         verticalalignment='center')\n                plt.xticks([])\n                plt.yticks([])\n\n            # SECOND ROW: inveta plots\n\n            plt.subplot(3,5,6)\n            if np.any(np.isfinite(inveta_mcc)):\n                plt.plot(gridresults['inveta_grid'],\n                         inveta_mcc)\n                plt.xlabel('inveta stdev multiplier threshold')\n                plt.ylabel('MCC')\n                plt.title('MCC for inveta')\n            else:\n                plt.text(0.5,0.5,\n                         'inveta MCC values are all nan '\n                         'for this magbin',\n                         transform=plt.gca().transAxes,\n                         horizontalalignment='center',\n                         verticalalignment='center')\n                plt.xticks([])\n                plt.yticks([])\n\n            plt.subplot(3,5,7)\n            if np.any(np.isfinite(inveta_precision)):\n                plt.plot(gridresults['inveta_grid'],\n                         inveta_precision)\n                plt.xlabel('inveta stdev multiplier threshold')\n                plt.ylabel('precision')\n                plt.title('precision for inveta')\n            else:\n                plt.text(0.5,0.5,\n                         'inveta precision values are all nan '\n                         'for this magbin',\n                         transform=plt.gca().transAxes,\n                         horizontalalignment='center',\n                         verticalalignment='center')\n                plt.xticks([])\n                plt.yticks([])\n\n            plt.subplot(3,5,8)\n            if np.any(np.isfinite(inveta_recall)):\n                plt.plot(gridresults['inveta_grid'],\n                         inveta_recall)\n                plt.xlabel('inveta stdev multiplier threshold')\n                plt.ylabel('recall')\n                plt.title('recall for inveta')\n            else:\n          
      plt.text(0.5,0.5,\n                         'inveta recall values are all nan '\n                         'for this magbin',\n                         transform=plt.gca().transAxes,\n                         horizontalalignment='center',\n                         verticalalignment='center')\n                plt.xticks([])\n                plt.yticks([])\n\n            plt.subplot(3,5,9)\n            if np.any(np.isfinite(inveta_missed_stet_found)):\n                plt.plot(gridresults['inveta_grid'],\n                         inveta_missed_stet_found)\n                plt.xlabel('inveta stdev multiplier threshold')\n                plt.ylabel('# objects inveta missed but stetson found')\n                plt.title('inveta missed, stetson J found')\n            else:\n                plt.text(0.5,0.5,\n                         'inveta-missed-stet-found values are all nan '\n                         'for this magbin',\n                         transform=plt.gca().transAxes,\n                         horizontalalignment='center',\n                         verticalalignment='center')\n                plt.xticks([])\n                plt.yticks([])\n\n            plt.subplot(3,5,10)\n            if np.any(np.isfinite(inveta_missed_iqr_found)):\n                plt.plot(gridresults['inveta_grid'],\n                         inveta_missed_iqr_found)\n                plt.xlabel('inveta stdev multiplier threshold')\n                plt.ylabel('# objects inveta missed but IQR found')\n                plt.title('inveta missed, IQR found')\n            else:\n                plt.text(0.5,0.5,\n                         'inveta-missed-iqr-found values are all nan '\n                         'for this magbin',\n                         transform=plt.gca().transAxes,\n                         horizontalalignment='center',\n                         verticalalignment='center')\n                plt.xticks([])\n                plt.yticks([])\n\n\n            # THIRD ROW: inveta 
plots\n\n            plt.subplot(3,5,11)\n            if np.any(np.isfinite(iqr_mcc)):\n                plt.plot(gridresults['iqr_grid'],\n                         iqr_mcc)\n                plt.xlabel('IQR stdev multiplier threshold')\n                plt.ylabel('MCC')\n                plt.title('MCC for IQR')\n            else:\n                plt.text(0.5,0.5,\n                         'IQR MCC values are all nan '\n                         'for this magbin',\n                         transform=plt.gca().transAxes,\n                         horizontalalignment='center',\n                         verticalalignment='center')\n                plt.xticks([])\n                plt.yticks([])\n\n            plt.subplot(3,5,12)\n            if np.any(np.isfinite(iqr_precision)):\n                plt.plot(gridresults['iqr_grid'],\n                         iqr_precision)\n                plt.xlabel('IQR stdev multiplier threshold')\n                plt.ylabel('precision')\n                plt.title('precision for IQR')\n            else:\n                plt.text(0.5,0.5,\n                         'IQR precision values are all nan '\n                         'for this magbin',\n                         transform=plt.gca().transAxes,\n                         horizontalalignment='center',\n                         verticalalignment='center')\n                plt.xticks([])\n                plt.yticks([])\n\n            plt.subplot(3,5,13)\n            if np.any(np.isfinite(iqr_recall)):\n                plt.plot(gridresults['iqr_grid'],\n                         iqr_recall)\n                plt.xlabel('IQR stdev multiplier threshold')\n                plt.ylabel('recall')\n                plt.title('recall for IQR')\n            else:\n                plt.text(0.5,0.5,\n                         'IQR recall values are all nan '\n                         'for this magbin',\n                         transform=plt.gca().transAxes,\n                         
horizontalalignment='center',\n                         verticalalignment='center')\n                plt.xticks([])\n                plt.yticks([])\n\n            plt.subplot(3,5,14)\n            if np.any(np.isfinite(iqr_missed_stet_found)):\n                plt.plot(gridresults['iqr_grid'],\n                         iqr_missed_stet_found)\n                plt.xlabel('IQR stdev multiplier threshold')\n                plt.ylabel('# objects IQR missed but stetson found')\n                plt.title('IQR missed, stetson J found')\n            else:\n                plt.text(0.5,0.5,\n                         'iqr-missed-stet-found values are all nan '\n                         'for this magbin',\n                         transform=plt.gca().transAxes,\n                         horizontalalignment='center',\n                         verticalalignment='center')\n                plt.xticks([])\n                plt.yticks([])\n\n            plt.subplot(3,5,15)\n            if np.any(np.isfinite(iqr_missed_inveta_found)):\n                plt.plot(gridresults['iqr_grid'],\n                         iqr_missed_inveta_found)\n                plt.xlabel('IQR stdev multiplier threshold')\n                plt.ylabel('# objects IQR missed but inveta found')\n                plt.title('IQR missed, inveta found')\n            else:\n                plt.text(0.5,0.5,\n                         'iqr-missed-inveta-found values are all nan '\n                         'for this magbin',\n                         transform=plt.gca().transAxes,\n                         horizontalalignment='center',\n                         verticalalignment='center')\n                plt.xticks([])\n                plt.yticks([])\n\n\n            plt.subplots_adjust(hspace=0.25,wspace=0.25)\n\n            plt.suptitle('magcol: %s, magbin: %.3f' % (magcol, magbinmedian))\n\n            plotdir = os.path.join(gridresults['simbasedir'],\n                                   'varindex-gridsearch-plots')\n      
      if not os.path.exists(plotdir):\n                os.mkdir(plotdir)\n\n            gridplotf = os.path.join(\n                plotdir,\n                '%s-magbin-%.3f-var-recoverygrid-permagbin.png' %\n                (magcol, magbinmedian)\n            )\n\n            plt.savefig(gridplotf,dpi=100,bbox_inches='tight')\n            plt.close('all')\n\n\n            # get the best values of MCC, recall, precision and their associated\n            # stet, inveta, iqr\n            stet_mcc_maxind = np.where(stet_mcc == np.max(stet_mcc))\n            stet_precision_maxind = np.where(\n                stet_precision == np.max(stet_precision)\n            )\n            stet_recall_maxind = np.where(stet_recall == np.max(stet_recall))\n\n            best_stet_mcc = stet_mcc[stet_mcc_maxind]\n            best_stet_precision = stet_precision[stet_precision_maxind]\n            best_stet_recall = stet_recall[stet_recall_maxind]\n\n            stet_with_best_mcc = gridresults['stetson_grid'][stet_mcc_maxind]\n            stet_with_best_precision = gridresults['stetson_grid'][\n                stet_precision_maxind\n            ]\n            stet_with_best_recall = (\n                gridresults['stetson_grid'][stet_recall_maxind]\n            )\n\n\n            inveta_mcc_maxind = np.where(inveta_mcc == np.max(inveta_mcc))\n            inveta_precision_maxind = np.where(\n                inveta_precision == np.max(inveta_precision)\n            )\n            inveta_recall_maxind = (\n                np.where(inveta_recall == np.max(inveta_recall))\n            )\n\n            best_inveta_mcc = inveta_mcc[inveta_mcc_maxind]\n            best_inveta_precision = inveta_precision[inveta_precision_maxind]\n            best_inveta_recall = inveta_recall[inveta_recall_maxind]\n\n            inveta_with_best_mcc = gridresults['inveta_grid'][inveta_mcc_maxind]\n            inveta_with_best_precision = gridresults['inveta_grid'][\n                inveta_precision_maxind\n            ]\n            
inveta_with_best_recall = gridresults['inveta_grid'][\n                inveta_recall_maxind\n            ]\n\n\n            iqr_mcc_maxind = np.where(iqr_mcc == np.max(iqr_mcc))\n            iqr_precision_maxind = np.where(\n                iqr_precision == np.max(iqr_precision)\n            )\n            iqr_recall_maxind = (\n                np.where(iqr_recall == np.max(iqr_recall))\n            )\n\n            best_iqr_mcc = iqr_mcc[iqr_mcc_maxind]\n            best_iqr_precision = iqr_precision[iqr_precision_maxind]\n            best_iqr_recall = iqr_recall[iqr_recall_maxind]\n\n            iqr_with_best_mcc = gridresults['iqr_grid'][iqr_mcc_maxind]\n            iqr_with_best_precision = gridresults['iqr_grid'][\n                iqr_precision_maxind\n            ]\n            iqr_with_best_recall = gridresults['iqr_grid'][\n                iqr_recall_maxind\n            ]\n\n\n            plotres[magcol][magbinmedian] = {\n                # stetson\n                'stet_grid':gridresults['stetson_grid'],\n                'stet_mcc':stet_mcc,\n                'stet_precision':stet_precision,\n                'stet_recall':stet_recall,\n                'stet_missed_inveta_found':stet_missed_inveta_found,\n                'best_stet_mcc':best_stet_mcc,\n                'stet_with_best_mcc':stet_with_best_mcc,\n                'best_stet_precision':best_stet_precision,\n                'stet_with_best_precision':stet_with_best_precision,\n                'best_stet_recall':best_stet_recall,\n                'stet_with_best_recall':stet_with_best_recall,\n                # inveta\n                'inveta_grid':gridresults['inveta_grid'],\n                'inveta_mcc':inveta_mcc,\n                'inveta_precision':inveta_precision,\n                'inveta_recall':inveta_recall,\n                'inveta_missed_stet_found':inveta_missed_stet_found,\n                'best_inveta_mcc':best_inveta_mcc,\n                'inveta_with_best_mcc':inveta_with_best_mcc,\n          
      'best_inveta_precision':best_inveta_precision,\n                'inveta_with_best_precision':inveta_with_best_precision,\n                'best_inveta_recall':best_inveta_recall,\n                'inveta_with_best_recall':inveta_with_best_recall,\n                # iqr\n                'iqr_grid':gridresults['iqr_grid'],\n                'iqr_mcc':iqr_mcc,\n                'iqr_precision':iqr_precision,\n                'iqr_recall':iqr_recall,\n                'iqr_missed_stet_found':iqr_missed_stet_found,\n                'best_iqr_mcc':best_iqr_mcc,\n                'iqr_with_best_mcc':iqr_with_best_mcc,\n                'best_iqr_precision':best_iqr_precision,\n                'iqr_with_best_precision':iqr_with_best_precision,\n                'best_iqr_recall':best_iqr_recall,\n                'iqr_with_best_recall':iqr_with_best_recall,\n                # plot info\n                'recoveryplot':gridplotf\n            }\n\n            # recommend inveta, stetson index, and iqr for this magbin\n\n            # if there are multiple stets, choose the smallest one\n            if stet_with_best_mcc.size > 1:\n                plotres[magcol]['best_stetsonj'].append(stet_with_best_mcc[0])\n            elif stet_with_best_mcc.size > 0:\n                plotres[magcol]['best_stetsonj'].append(stet_with_best_mcc[0])\n            else:\n                plotres[magcol]['best_stetsonj'].append(np.nan)\n\n            # if there are multiple best invetas, choose the smallest one\n            if inveta_with_best_mcc.size > 1:\n                plotres[magcol]['best_inveta'].append(inveta_with_best_mcc[0])\n            elif inveta_with_best_mcc.size > 0:\n                plotres[magcol]['best_inveta'].append(inveta_with_best_mcc[0])\n            else:\n                plotres[magcol]['best_inveta'].append(np.nan)\n\n            # if there are multiple best iqrs, choose the smallest one\n            if iqr_with_best_mcc.size > 1:\n                
plotres[magcol]['best_iqr'].append(iqr_with_best_mcc[0])\n            elif iqr_with_best_mcc.size > 0:\n                plotres[magcol]['best_iqr'].append(iqr_with_best_mcc[0])\n            else:\n                plotres[magcol]['best_iqr'].append(np.nan)\n\n\n    # write the plotresults to a pickle\n    plotrespicklef = os.path.join(simbasedir,\n                                  'varindex-gridsearch-magbin-results.pkl')\n    with open(plotrespicklef, 'wb') as outfd:\n        pickle.dump(plotres, outfd, pickle.HIGHEST_PROTOCOL)\n\n\n    # recommend the values of stetson J, inveta, and IQR to use\n    for magcol in gridresults['magcols']:\n\n        LOGINFO('best stdev multipliers for each %s magbin:' % magcol)\n        LOGINFO('magbin    inveta    stetson J    IQR')\n\n        for magbin, inveta, stet, iqr in zip(\n                plotres[magcol]['magbinmedians'],\n                plotres[magcol]['best_inveta'],\n                plotres[magcol]['best_stetsonj'],\n                plotres[magcol]['best_iqr']):\n            LOGINFO('%.3f    %.3f    %.3f    %.3f' % (magbin,\n                                                      inveta,\n                                                      stet,\n                                                      iqr))\n\n\n    return plotres", "response": "This function plots the grid-search results from `variable_index_gridsearch_magbin` for each magcol and magbin, pickles the results, and returns a dict containing the plot filenames and the best Stetson J, inveta, and IQR stdev multipliers per magbin."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef run_periodfinding(simbasedir,\n                      pfmethods=('gls','pdm','bls'),\n                      pfkwargs=({},{},{'startp':1.0,'maxtransitduration':0.3}),\n                      getblssnr=False,\n                      sigclip=5.0,\n                      nperiodworkers=10,\n                      ncontrolworkers=4,\n                      liststartindex=None,\n                      listmaxobjects=None):\n    '''This runs periodfinding using several period-finders on a collection of\n    fake LCs.\n\n    As a rough benchmark, 25000 fake LCs with 10000--50000 points per LC take\n    about 26 days in total to run on an invocation of this function using\n    GLS+PDM+BLS and 10 periodworkers and 4 controlworkers (so all 40 'cores') on\n    a 2 x Xeon E5-2660v3 machine.\n\n    Parameters\n    ----------\n\n    pfmethods : sequence of str\n        This is used to specify which periodfinders to run. These must be in the\n        `lcproc.periodsearch.PFMETHODS` dict.\n\n    pfkwargs : sequence of dict\n        This is used to provide optional kwargs to the period-finders.\n\n    getblssnr : bool\n        If this is True, will run BLS SNR calculations for each object and\n        magcol. This takes a while to run, so it's disabled (False) by default.\n\n    sigclip : float or int or sequence of two floats/ints or None\n        If a single float or int, a symmetric sigma-clip will be performed using\n        the number provided as the sigma-multiplier to cut out from the input\n        time-series.\n\n        If a list of two ints/floats is provided, the function will perform an\n        'asymmetric' sigma-clip. The first element in this list is the sigma\n        value to use for fainter flux/mag values; the second element in this\n        list is the sigma value to use for brighter flux/mag values. 
For\n        example, `sigclip=[10., 3.]`, will sigclip out greater than 10-sigma\n        dimmings and greater than 3-sigma brightenings. Here the meaning of\n        \"dimming\" and \"brightening\" is set by *physics* (not the magnitude\n        system), which is why the `magsarefluxes` kwarg must be correctly set.\n\n        If `sigclip` is None, no sigma-clipping will be performed, and the\n        time-series (with non-finite elems removed) will be passed through to\n        the output.\n\n    nperiodworkers : int\n        This is the number of parallel period-finding worker processes to use.\n\n    ncontrolworkers : int\n        This is the number of parallel period-finding control workers to\n        use. Each control worker will launch `nperiodworkers` worker processes.\n\n    liststartindex : int\n        The starting index of processing. This refers to the filename list\n        generated by running `glob.glob` on the fake LCs in `simbasedir`.\n\n    listmaxobjects : int\n        The maximum number of objects to process in this run. 
Use this with\n        `liststartindex` to effectively distribute working on a large list of\n        input light curves over several sessions or machines.\n\n    Returns\n    -------\n\n    str\n        The path to the output summary pickle produced by\n        `lcproc.periodsearch.parallel_pf`\n\n    '''\n\n    # get the info from the simbasedir\n    with open(os.path.join(simbasedir, 'fakelcs-info.pkl'),'rb') as infd:\n        siminfo = pickle.load(infd)\n\n    lcfpaths = siminfo['lcfpath']\n    pfdir = os.path.join(simbasedir,'periodfinding')\n\n    # get the column defs for the fakelcs\n    timecols = siminfo['timecols']\n    magcols = siminfo['magcols']\n    errcols = siminfo['errcols']\n\n    # register the fakelc pklc as a custom lcproc format\n    # now we should be able to use all lcproc functions correctly\n    fakelc_formatkey = 'fake-%s' % siminfo['lcformat']\n    lcproc.register_lcformat(\n        fakelc_formatkey,\n        '*-fakelc.pkl',\n        timecols,\n        magcols,\n        errcols,\n        'astrobase.lcproc',\n        '_read_pklc',\n        magsarefluxes=siminfo['magsarefluxes']\n    )\n\n    if liststartindex:\n        lcfpaths = lcfpaths[liststartindex:]\n\n    if listmaxobjects:\n        lcfpaths = lcfpaths[:listmaxobjects]\n\n    pfinfo = periodsearch.parallel_pf(lcfpaths,\n                                      pfdir,\n                                      lcformat=fakelc_formatkey,\n                                      pfmethods=pfmethods,\n                                      pfkwargs=pfkwargs,\n                                      getblssnr=getblssnr,\n                                      sigclip=sigclip,\n                                      nperiodworkers=nperiodworkers,\n                                       ncontrolworkers=ncontrolworkers)\n\n    with open(os.path.join(simbasedir,\n                           'fakelc-periodsearch.pkl'),'wb') as outfd:\n        pickle.dump(pfinfo, outfd, pickle.HIGHEST_PROTOCOL)\n\n    
return os.path.join(simbasedir,'fakelc-periodsearch.pkl')", "response": "This function runs several period-finders on the collection of fake LCs in `simbasedir` and returns the path to the output summary pickle."}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nrecovers the periodic variable status and info for the simulated PF result.", "response": "def periodicvar_recovery(fakepfpkl,\n                         simbasedir,\n                         period_tolerance=1.0e-3):\n\n    '''Recovers the periodic variable status/info for the simulated PF result.\n\n    - Uses simbasedir and the lcfbasename stored in fakepfpkl to figure out\n      where the LC for this object is.\n    - Gets the actual_varparams, actual_varperiod, actual_vartype,\n      actual_varamplitude elements from the LC.\n    - Figures out if the current objectid is a periodic variable (using\n      actual_vartype).\n    - If it is a periodic variable, gets the canonical period assigned to it.\n    - Checks if the period was recovered in any of the five best periods\n      reported by any of the period-finders, checks if the period recovered was\n      a harmonic of the period.\n    - Returns the objectid, actual period and vartype, recovered period, and\n      recovery status.\n\n\n    Parameters\n    ----------\n\n    fakepfpkl : str\n        This is a periodfinding-<objectid>.pkl[.gz] file produced in the\n        `simbasedir/periodfinding` subdirectory after `run_periodfinding` above\n        is done.\n\n    simbasedir : str\n        The base directory where all of the fake LCs and period-finding results\n        are.\n\n    period_tolerance : float\n        The maximum difference that this function will consider between an\n        actual period (or its aliases) and a recovered period to consider it as\n        a 'recovered' period.\n\n    Returns\n    -------\n\n    dict\n        Returns a dict of period-recovery results.\n\n    '''\n\n    if fakepfpkl.endswith('.gz'):\n        infd = gzip.open(fakepfpkl,'rb')\n    else:\n        infd = open(fakepfpkl,'rb')\n\n    fakepf = pickle.load(infd)\n    infd.close()\n\n    # get info from the fakepf dict\n    
objectid, lcfbasename = fakepf['objectid'], fakepf['lcfbasename']\n    lcfpath = os.path.join(simbasedir,'lightcurves',lcfbasename)\n\n    # if the LC doesn't exist, bail out\n    if not os.path.exists(lcfpath):\n        LOGERROR('light curve for %s does not exist at: %s' % (objectid,\n                                                               lcfpath))\n        return None\n\n    # now, open the fakelc\n    fakelc = lcproc._read_pklc(lcfpath)\n\n    # get the actual_varparams, actual_varperiod, actual_varamplitude\n    actual_varparams, actual_varperiod, actual_varamplitude, actual_vartype = (\n        fakelc['actual_varparams'],\n        fakelc['actual_varperiod'],\n        fakelc['actual_varamplitude'],\n        fakelc['actual_vartype']\n    )\n\n    # get the moments too so we can track LC noise, etc.\n    actual_moments = fakelc['moments']\n\n    # get the magcols for this LC\n    magcols = fakelc['magcols']\n\n    # get the recovered info from each of the available methods\n    pfres = {\n        'objectid':objectid,\n        'simbasedir':simbasedir,\n        'magcols':magcols,\n        'fakelc':os.path.abspath(lcfpath),\n        'fakepf':os.path.abspath(fakepfpkl),\n        'actual_vartype':actual_vartype,\n        'actual_varperiod':actual_varperiod,\n        'actual_varamplitude':actual_varamplitude,\n        'actual_varparams':actual_varparams,\n        'actual_moments':actual_moments,\n        'recovery_periods':[],\n        'recovery_lspvals':[],\n        'recovery_pfmethods':[],\n        'recovery_magcols':[],\n        'recovery_status':[],\n        'recovery_pdiff':[],\n    }\n\n    # populate the pfres dict with the periods, pfmethods, and magcols\n    for magcol in magcols:\n\n        for pfm in lcproc.PFMETHODS:\n\n            if pfm in fakepf[magcol]:\n\n                # only get the unique recovered periods by using\n                # period_tolerance\n                for rpi, rp in enumerate(\n                        
fakepf[magcol][pfm]['nbestperiods']\n                ):\n\n                    if ((not np.any(np.isclose(\n                            rp,\n                            np.array(pfres['recovery_periods']),\n                            rtol=period_tolerance\n                    ))) and np.isfinite(rp)):\n\n                        # populate the recovery periods, pfmethods, and magcols\n                        pfres['recovery_periods'].append(rp)\n                        pfres['recovery_pfmethods'].append(pfm)\n                        pfres['recovery_magcols'].append(magcol)\n\n                        # normalize the periodogram peak value to between\n                        # 0 and 1 so we can put in the results of multiple\n                        # periodfinders on one scale\n                        if pfm == 'pdm':\n\n                            this_lspval = (\n                                np.max(fakepf[magcol][pfm]['lspvals']) -\n                                fakepf[magcol][pfm]['nbestlspvals'][rpi]\n                            )\n\n                        else:\n\n                            this_lspval = (\n                                fakepf[magcol][pfm]['nbestlspvals'][rpi] /\n                                np.max(fakepf[magcol][pfm]['lspvals'])\n                            )\n\n                        # add the normalized lspval to the outdict for\n                        # this object as well. 
later, we'll use this to\n                        # construct a periodogram for objects that were actually\n                        # not variables\n                        pfres['recovery_lspvals'].append(this_lspval)\n\n\n    # convert the recovery_* lists to arrays\n    pfres['recovery_periods'] = np.array(pfres['recovery_periods'])\n    pfres['recovery_lspvals'] = np.array(pfres['recovery_lspvals'])\n    pfres['recovery_pfmethods'] = np.array(pfres['recovery_pfmethods'])\n    pfres['recovery_magcols'] = np.array(pfres['recovery_magcols'])\n\n    #\n    # now figure out recovery status\n    #\n\n    # if this is an actual periodic variable, characterize the recovery\n    if (actual_vartype and\n        actual_vartype in PERIODIC_VARTYPES and\n        np.isfinite(actual_varperiod)):\n\n        if pfres['recovery_periods'].size > 0:\n\n            for ri in range(pfres['recovery_periods'].size):\n\n                pfres['recovery_pdiff'].append(pfres['recovery_periods'][ri] -\n                                               np.asscalar(actual_varperiod))\n\n                # get the alias types\n                pfres['recovery_status'].append(\n                    check_periodrec_alias(actual_varperiod,\n                                          pfres['recovery_periods'][ri],\n                                          tolerance=period_tolerance)\n                )\n\n            # turn the recovery_pdiff/status lists into arrays\n            pfres['recovery_status'] = np.array(pfres['recovery_status'])\n            pfres['recovery_pdiff'] = np.array(pfres['recovery_pdiff'])\n\n            # find the best recovered period and its status\n            rec_absdiff = np.abs(pfres['recovery_pdiff'])\n            best_recp_ind = rec_absdiff == rec_absdiff.min()\n\n            pfres['best_recovered_period'] = (\n                pfres['recovery_periods'][best_recp_ind]\n            )\n            pfres['best_recovered_pfmethod'] = (\n                
pfres['recovery_pfmethods'][best_recp_ind]\n            )\n            pfres['best_recovered_magcol'] = (\n                pfres['recovery_magcols'][best_recp_ind]\n            )\n            pfres['best_recovered_status'] = (\n                pfres['recovery_status'][best_recp_ind]\n            )\n            pfres['best_recovered_pdiff'] = (\n                pfres['recovery_pdiff'][best_recp_ind]\n            )\n\n        else:\n\n            LOGWARNING(\n                'no finite periods recovered from period-finding for %s' %\n                fakepfpkl\n            )\n\n            pfres['recovery_status'] = np.array(['no_finite_periods_recovered'])\n            pfres['recovery_pdiff'] = np.array([np.nan])\n            pfres['best_recovered_period'] = np.array([np.nan])\n            pfres['best_recovered_pfmethod'] = np.array([],dtype=np.unicode_)\n            pfres['best_recovered_magcol'] = np.array([],dtype=np.unicode_)\n            pfres['best_recovered_status'] = np.array([],dtype=np.unicode_)\n            pfres['best_recovered_pdiff'] = np.array([np.nan])\n\n    # if this is not actually a variable, get the recovered period,\n    # etc. anyway. this way, we can see what we need to look out for and avoid\n    # when getting these values for actual objects\n    else:\n\n        pfres['recovery_status'] = np.array(\n            ['not_variable']*pfres['recovery_periods'].size\n        )\n        pfres['recovery_pdiff'] = np.zeros(pfres['recovery_periods'].size)\n\n        pfres['best_recovered_period'] = np.array([np.nan])\n        pfres['best_recovered_pfmethod'] = np.array([],dtype=np.unicode_)\n        pfres['best_recovered_magcol'] = np.array([],dtype=np.unicode_)\n        pfres['best_recovered_status'] = np.array(['not_variable'])\n        pfres['best_recovered_pdiff'] = np.array([np.nan])\n\n    return pfres"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef periodrec_worker(task):\n    '''This is a parallel worker for running period-recovery.\n\n    Parameters\n    ----------\n\n    task : tuple\n        This is used to pass args to the `periodicvar_recovery` function::\n\n            task[0] = period-finding result pickle to work on\n            task[1] = simbasedir\n            task[2] = period_tolerance\n\n    Returns\n    -------\n\n    dict\n        This is the dict produced by the `periodicvar_recovery` function for the\n        input period-finding result pickle.\n\n    '''\n\n    pfpkl, simbasedir, period_tolerance = task\n\n    try:\n        return periodicvar_recovery(pfpkl,\n                                    simbasedir,\n                                    period_tolerance=period_tolerance)\n\n    except Exception as e:\n        LOGEXCEPTION('periodic var recovery failed for %s' % repr(task))\n        return None", "response": "This is a parallel worker that unpacks a (pickle, simbasedir, period_tolerance) task tuple, runs `periodicvar_recovery` on the period-finding result pickle, and returns the resulting dict, or None if an exception is raised."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef parallel_periodicvar_recovery(simbasedir,\n                                  period_tolerance=1.0e-3,\n                                  liststartind=None,\n                                  listmaxobjects=None,\n                                  nworkers=None):\n    '''This is a parallel driver for `periodicvar_recovery`.\n\n    Parameters\n    ----------\n\n    simbasedir : str\n        The base directory where all of the fake LCs and period-finding results\n        are.\n\n    period_tolerance : float\n        The maximum difference that this function will consider between an\n        actual period (or its aliases) and a recovered period to consider it\n        as a 'recovered' period.\n\n    liststartind : int\n        The starting index of processing. This refers to the filename list\n        generated by running `glob.glob` on the period-finding result pickles in\n        `simbasedir/periodfinding`.\n\n    listmaxobjects : int\n        The maximum number of objects to process in this run. 
Use this with\n        `liststartind` to effectively distribute working on a large list of\n        input period-finding result pickles over several sessions or machines.\n\n    nworkers : int\n        This is the number of parallel period-recovery worker processes to use.\n\n    Returns\n    -------\n\n    dict\n        Returns a dict containing all of the period recovery results. This dict\n        is also written to the `periodicvar-recovery.pkl` pickle in `simbasedir`.\n\n    '''\n\n    # figure out the periodfinding pickles directory\n    pfpkldir = os.path.join(simbasedir,'periodfinding')\n\n    if not os.path.exists(pfpkldir):\n        LOGERROR('no \"periodfinding\" subdirectory in %s, can\\'t continue' %\n                 simbasedir)\n        return None\n\n    # find all the periodfinding pickles\n    pfpkl_list = glob.glob(os.path.join(pfpkldir,'*periodfinding*pkl*'))\n\n    if len(pfpkl_list) > 0:\n\n        if liststartind:\n            pfpkl_list = pfpkl_list[liststartind:]\n\n        if listmaxobjects:\n            pfpkl_list = pfpkl_list[:listmaxobjects]\n\n        tasks = [(x, simbasedir, period_tolerance) for x in pfpkl_list]\n\n        pool = mp.Pool(nworkers)\n        results = pool.map(periodrec_worker, tasks)\n        pool.close()\n        pool.join()\n\n        resdict = {x['objectid']:x for x in results if x is not None}\n\n        actual_periodicvars = np.array(\n            [x['objectid'] for x in results\n             if (x is not None and x['actual_vartype'] in PERIODIC_VARTYPES)],\n            dtype=np.unicode_\n        )\n\n        recovered_periodicvars = np.array(\n            [x['objectid'] for x in results\n             if (x is not None and 'actual' in x['best_recovered_status'])],\n            dtype=np.unicode_\n        )\n        alias_twice_periodicvars = np.array(\n            [x['objectid'] for x in results\n             if (x is not None and 'twice' in x['best_recovered_status'])],\n            dtype=np.unicode_\n        )\n        alias_half_periodicvars = np.array(\n      
      [x['objectid'] for x in results\n             if (x is not None and 'half' in x['best_recovered_status'])],\n            dtype=np.unicode_\n        )\n\n        # skip failed workers, which return None\n        all_objectids = [x['objectid'] for x in results if x is not None]\n\n        outdict = {'simbasedir':os.path.abspath(simbasedir),\n                   'objectids':all_objectids,\n                   'period_tolerance':period_tolerance,\n                   'actual_periodicvars':actual_periodicvars,\n                   'recovered_periodicvars':recovered_periodicvars,\n                   'alias_twice_periodicvars':alias_twice_periodicvars,\n                   'alias_half_periodicvars':alias_half_periodicvars,\n                   'details':resdict}\n\n        outfile = os.path.join(simbasedir,'periodicvar-recovery.pkl')\n        with open(outfile, 'wb') as outfd:\n            pickle.dump(outdict, outfd, pickle.HIGHEST_PROTOCOL)\n\n        return outdict\n\n    else:\n\n        LOGERROR(\n            'no periodfinding result pickles found in %s, can\\'t continue' %\n            pfpkldir\n        )\n        return None", "response": "This function is a parallel driver for `periodicvar_recovery`. It runs period recovery on the period-finding result pickles in `simbasedir/periodfinding` using a pool of worker processes, writes the collected results to `periodicvar-recovery.pkl` in `simbasedir`, and returns them as a dict."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef plot_periodicvar_recovery_results(\n        precvar_results,\n        aliases_count_as_recovered=None,\n        magbins=None,\n        periodbins=None,\n        amplitudebins=None,\n        ndetbins=None,\n        minbinsize=1,\n        plotfile_ext='png',\n):\n    '''This plots the results of periodic var recovery.\n\n    This function makes plots for periodicvar recovered fraction as a function\n    of:\n\n    - magbin\n    - periodbin\n    - amplitude of variability\n    - ndet\n\n    with plot lines broken down by:\n\n    - magcol\n    - periodfinder\n    - vartype\n    - recovery status\n\n    The kwargs `magbins`, `periodbins`, `amplitudebins`, and `ndetbins` can be\n    used to set the bin lists as needed. The kwarg `minbinsize` controls how\n    many elements per bin are required to accept a bin in processing its\n    recovery characteristics for mags, periods, amplitudes, and ndets.\n\n    Parameters\n    ----------\n\n    precvar_results : dict or str\n        This is either a dict returned by parallel_periodicvar_recovery or the\n        pickle created by that function.\n\n    aliases_count_as_recovered : list of str or 'all'\n        This is used to set which kinds of aliases this function considers as\n        'recovered' objects. Normally, we require that recovered objects have a\n        recovery status of 'actual' to indicate the actual period was\n        recovered. To change this default behavior, aliases_count_as_recovered\n        can be set to a list of alias status strings that should be considered\n        as 'recovered' objects as well. 
Choose from the following alias types::\n\n          'twice'                    recovered_p = 2.0*actual_p\n          'half'                     recovered_p = 0.5*actual_p\n          'ratio_over_1plus'         recovered_p = actual_p/(1.0+actual_p)\n          'ratio_over_1minus'        recovered_p = actual_p/(1.0-actual_p)\n          'ratio_over_1plus_twice'   recovered_p = actual_p/(1.0+2.0*actual_p)\n          'ratio_over_1minus_twice'  recovered_p = actual_p/(1.0-2.0*actual_p)\n          'ratio_over_1plus_thrice'  recovered_p = actual_p/(1.0+3.0*actual_p)\n          'ratio_over_1minus_thrice' recovered_p = actual_p/(1.0-3.0*actual_p)\n          'ratio_over_minus1'        recovered_p = actual_p/(actual_p - 1.0)\n          'ratio_over_twice_minus1'  recovered_p = actual_p/(2.0*actual_p - 1.0)\n\n        or set `aliases_count_as_recovered='all'` to include all of the above in\n        the 'recovered' periodic var list.\n\n    magbins : np.array\n        The magnitude bins to plot the recovery rate results over. If None, the\n        default mag bins will be used: `np.arange(8.0,16.25,0.25)`.\n\n    periodbins : np.array\n        The period bins to plot the recovery rate results over. If None, the\n        default period bins will be used: `np.arange(0.0,500.0,0.5)`.\n\n    amplitudebins : np.array\n        The variability amplitude bins to plot the recovery rate results\n        over. If None, the default amplitude bins will be used:\n        `np.arange(0.0,2.0,0.05)`.\n\n    ndetbins : np.array\n        The ndet bins to plot the recovery rate results over. 
If None, the\n        default ndet bins will be used: `np.arange(0.0,60000.0,1000.0)`.\n\n    minbinsize : int\n        The minimum number of objects per bin required to plot a bin and its\n        recovery fraction on the plot.\n\n    plotfile_ext : {'png','pdf'}\n        Sets the plot output files' extension.\n\n    Returns\n    -------\n\n    dict\n        A dict containing recovery fraction statistics and the paths to each of\n        the plots made.\n\n    '''\n\n    # get the result pickle/dict\n    if isinstance(precvar_results, str) and os.path.exists(precvar_results):\n\n        with open(precvar_results,'rb') as infd:\n            precvar = pickle.load(infd)\n\n    elif isinstance(precvar_results, dict):\n\n        precvar = precvar_results\n\n    else:\n        LOGERROR('could not understand the input '\n                 'periodic var recovery dict/pickle')\n        return None\n\n    # get the simbasedir and open the fakelc-info.pkl. we'll need the magbins\n    # definition from here.\n    simbasedir = precvar['simbasedir']\n\n    lcinfof = os.path.join(simbasedir,'fakelcs-info.pkl')\n\n    if not os.path.exists(lcinfof):\n        LOGERROR('fakelcs-info.pkl does not exist in %s, can\\'t continue' %\n                 simbasedir)\n        return None\n\n    with open(lcinfof,'rb') as infd:\n        lcinfo = pickle.load(infd)\n\n    # get the magcols, vartypes, sdssr, isvariable flags\n    magcols = lcinfo['magcols']\n    objectid = lcinfo['objectid']\n    ndet = lcinfo['ndet']\n    sdssr = lcinfo['sdssr']\n\n    # get the actual periodic vars\n    actual_periodicvars = precvar['actual_periodicvars']\n\n    # generate lists of objects binned by magbins and periodbins\n    LOGINFO('getting sdssr and ndet for actual periodic vars...')\n\n    # get the sdssr and ndet for all periodic vars\n    periodicvar_sdssr = []\n    periodicvar_ndet = []\n    periodicvar_objectids = []\n\n    for pobj in actual_periodicvars:\n\n        pobjind = objectid == pobj\n        
periodicvar_objectids.append(pobj)\n        periodicvar_sdssr.append(sdssr[pobjind])\n        periodicvar_ndet.append(ndet[pobjind])\n\n\n    periodicvar_sdssr = np.array(periodicvar_sdssr)\n    periodicvar_objectids = np.array(periodicvar_objectids)\n    periodicvar_ndet = np.array(periodicvar_ndet)\n\n    LOGINFO('getting periods, vartypes, '\n            'amplitudes, ndet for actual periodic vars...')\n\n    # get the periods, vartypes, amplitudes for the actual periodic vars\n    periodicvar_periods = [\n        np.asscalar(precvar['details'][x]['actual_varperiod'])\n        for x in periodicvar_objectids\n    ]\n    periodicvar_amplitudes = [\n        np.asscalar(precvar['details'][x]['actual_varamplitude'])\n        for x in periodicvar_objectids\n    ]\n    periodicvar_vartypes = [\n        precvar['details'][x]['actual_vartype'] for x in periodicvar_objectids\n    ]\n\n    #\n    # do the binning\n    #\n\n    # bin by mag\n    LOGINFO('binning actual periodic vars by magnitude...')\n\n    magbinned_sdssr = []\n    magbinned_periodicvars = []\n\n    if not magbins:\n        magbins = PERIODREC_DEFAULT_MAGBINS\n    magbininds = np.digitize(np.ravel(periodicvar_sdssr), magbins)\n\n    for mbinind, magi in zip(np.unique(magbininds),\n                             range(len(magbins)-1)):\n\n        thisbin_periodicvars = periodicvar_objectids[magbininds == mbinind]\n\n        if (thisbin_periodicvars.size > (minbinsize-1)):\n\n            magbinned_sdssr.append((magbins[magi] + magbins[magi+1])/2.0)\n            magbinned_periodicvars.append(thisbin_periodicvars)\n\n\n    # bin by period\n    LOGINFO('binning actual periodic vars by period...')\n\n    periodbinned_periods = []\n    periodbinned_periodicvars = []\n\n    if not periodbins:\n        periodbins = PERIODREC_DEFAULT_PERIODBINS\n    periodbininds = np.digitize(np.ravel(periodicvar_periods), periodbins)\n\n    for pbinind, peri in zip(np.unique(periodbininds),\n                             
range(len(periodbins)-1)):\n\n        thisbin_periodicvars = periodicvar_objectids[periodbininds == pbinind]\n\n        if (thisbin_periodicvars.size > (minbinsize-1)):\n\n            periodbinned_periods.append((periodbins[peri] +\n                                         periodbins[peri+1])/2.0)\n            periodbinned_periodicvars.append(thisbin_periodicvars)\n\n\n    # bin by amplitude of variability\n    LOGINFO('binning actual periodic vars by variability amplitude...')\n\n    amplitudebinned_amplitudes = []\n    amplitudebinned_periodicvars = []\n\n    if not amplitudebins:\n        amplitudebins = PERIODREC_DEFAULT_AMPBINS\n    amplitudebininds = np.digitize(np.ravel(np.abs(periodicvar_amplitudes)),\n                                   amplitudebins)\n\n    for abinind, ampi in zip(np.unique(amplitudebininds),\n                             range(len(amplitudebins)-1)):\n\n        thisbin_periodicvars = periodicvar_objectids[\n            amplitudebininds == abinind\n        ]\n\n        if (thisbin_periodicvars.size > (minbinsize-1)):\n\n            amplitudebinned_amplitudes.append(\n                (amplitudebins[ampi] +\n                 amplitudebins[ampi+1])/2.0\n            )\n            amplitudebinned_periodicvars.append(thisbin_periodicvars)\n\n\n    # bin by ndet\n    LOGINFO('binning actual periodic vars by ndet...')\n\n    ndetbinned_ndets = []\n    ndetbinned_periodicvars = []\n\n    if not ndetbins:\n        ndetbins = PERIODREC_DEFAULT_NDETBINS\n    ndetbininds = np.digitize(np.ravel(periodicvar_ndet), ndetbins)\n\n    for nbinind, ndeti in zip(np.unique(ndetbininds),\n                              range(len(ndetbins)-1)):\n\n        thisbin_periodicvars = periodicvar_objectids[ndetbininds == nbinind]\n\n        if (thisbin_periodicvars.size > (minbinsize-1)):\n\n            ndetbinned_ndets.append(\n                (ndetbins[ndeti] +\n                 ndetbins[ndeti+1])/2.0\n            )\n            
ndetbinned_periodicvars.append(thisbin_periodicvars)\n\n\n    # now figure out what 'recovered' means using the provided\n    # aliases_count_as_recovered kwarg\n    recovered_status = ['actual']\n\n    if isinstance(aliases_count_as_recovered, list):\n\n        for atype in aliases_count_as_recovered:\n            if atype in ALIAS_TYPES:\n                recovered_status.append(atype)\n            else:\n                LOGWARNING('unknown alias type: %s, skipping' % atype)\n\n    elif aliases_count_as_recovered and aliases_count_as_recovered == 'all':\n        for atype in ALIAS_TYPES[1:]:\n            recovered_status.append(atype)\n\n    # find all the matching objects for these recovered statuses\n    recovered_periodicvars = np.array(\n        [precvar['details'][x]['objectid'] for x in precvar['details']\n         if (precvar['details'][x] is not None and\n             precvar['details'][x]['best_recovered_status']\n             in recovered_status)],\n        dtype=np.unicode_\n    )\n\n    LOGINFO('recovered %s/%s periodic variables (frac: %.3f) with '\n            'period recovery status: %s' %\n            (recovered_periodicvars.size,\n             actual_periodicvars.size,\n             float(recovered_periodicvars.size/actual_periodicvars.size),\n             ', '.join(recovered_status)))\n\n\n    # get the objects recovered per bin and overall recovery fractions per bin\n    magbinned_recovered_objects = [\n        np.intersect1d(x,recovered_periodicvars)\n        for x in magbinned_periodicvars\n    ]\n    magbinned_recfrac = np.array([float(x.size/y.size) for x,y\n                                  in zip(magbinned_recovered_objects,\n                                         magbinned_periodicvars)])\n\n    periodbinned_recovered_objects = [\n        np.intersect1d(x,recovered_periodicvars)\n        for x in periodbinned_periodicvars\n    ]\n    periodbinned_recfrac = np.array([float(x.size/y.size) for x,y\n                                     in 
zip(periodbinned_recovered_objects,\n                                            periodbinned_periodicvars)])\n\n    amplitudebinned_recovered_objects = [\n        np.intersect1d(x,recovered_periodicvars)\n        for x in amplitudebinned_periodicvars\n    ]\n    amplitudebinned_recfrac = np.array(\n        [float(x.size/y.size) for x,y\n         in zip(amplitudebinned_recovered_objects,\n                amplitudebinned_periodicvars)]\n    )\n\n    ndetbinned_recovered_objects = [\n        np.intersect1d(x,recovered_periodicvars)\n        for x in ndetbinned_periodicvars\n    ]\n    ndetbinned_recfrac = np.array([float(x.size/y.size) for x,y\n                                   in zip(ndetbinned_recovered_objects,\n                                          ndetbinned_periodicvars)])\n\n\n    # convert the bin medians to arrays\n    magbinned_sdssr = np.array(magbinned_sdssr)\n    periodbinned_periods = np.array(periodbinned_periods)\n    amplitudebinned_amplitudes = np.array(amplitudebinned_amplitudes)\n    ndetbinned_ndets = np.array(ndetbinned_ndets)\n\n    # this is the initial output dict\n    outdict = {\n        'simbasedir':simbasedir,\n        'precvar_results':precvar,\n        'magcols':magcols,\n        'objectids':objectid,\n        'ndet':ndet,\n        'sdssr':sdssr,\n        'actual_periodicvars':actual_periodicvars,\n        'recovered_periodicvars':recovered_periodicvars,\n        'recovery_definition':recovered_status,\n        # mag binned actual periodicvars\n        # note that only bins with nobjects > minbinsize are included\n        'magbins':magbins,\n        'magbinned_mags':magbinned_sdssr,\n        'magbinned_periodicvars':magbinned_periodicvars,\n        'magbinned_recoveredvars':magbinned_recovered_objects,\n        'magbinned_recfrac':magbinned_recfrac,\n        # period binned actual periodicvars\n        # note that only bins with nobjects > minbinsize are included\n        'periodbins':periodbins,\n        
'periodbinned_periods':periodbinned_periods,\n        'periodbinned_periodicvars':periodbinned_periodicvars,\n        'periodbinned_recoveredvars':periodbinned_recovered_objects,\n        'periodbinned_recfrac':periodbinned_recfrac,\n        # amplitude binned actual periodicvars\n        # note that only bins with nobjects > minbinsize are included\n        'amplitudebins':amplitudebins,\n        'amplitudebinned_amplitudes':amplitudebinned_amplitudes,\n        'amplitudebinned_periodicvars':amplitudebinned_periodicvars,\n        'amplitudebinned_recoveredvars':amplitudebinned_recovered_objects,\n        'amplitudebinned_recfrac':amplitudebinned_recfrac,\n        # ndet binned actual periodicvars\n        # note that only bins with nobjects > minbinsize are included\n        'ndetbins':ndetbins,\n        'ndetbinned_ndets':ndetbinned_ndets,\n        'ndetbinned_periodicvars':ndetbinned_periodicvars,\n        'ndetbinned_recoveredvars':ndetbinned_recovered_objects,\n        'ndetbinned_recfrac':ndetbinned_recfrac,\n    }\n\n\n    # figure out which pfmethods were used\n    all_pfmethods = np.unique(\n        np.concatenate(\n            [np.unique(precvar['details'][x]['recovery_pfmethods'])\n             for x in precvar['details']]\n        )\n    )\n\n    # figure out all vartypes\n    all_vartypes = np.unique(\n        [(precvar['details'][x]['actual_vartype'])\n         for x in precvar['details'] if\n         (precvar['details'][x]['actual_vartype'] is not None)]\n    )\n\n    # figure out all alias types\n    all_aliastypes = recovered_status\n\n    # add these to the outdict\n    outdict['aliastypes'] = all_aliastypes\n    outdict['pfmethods'] = all_pfmethods\n    outdict['vartypes'] = all_vartypes\n\n\n    # these are recfracs per-magcol, -vartype, -periodfinder, -aliastype\n    # binned appropriately by mags, periods, amplitudes, and ndet\n    # all of these have the shape as the magcols, aliastypes, pfmethods, and\n    # vartypes lists above.\n\n    
magbinned_per_magcol_recfracs = []\n    magbinned_per_vartype_recfracs = []\n    magbinned_per_pfmethod_recfracs = []\n    magbinned_per_aliastype_recfracs = []\n\n    periodbinned_per_magcol_recfracs = []\n    periodbinned_per_vartype_recfracs = []\n    periodbinned_per_pfmethod_recfracs = []\n    periodbinned_per_aliastype_recfracs = []\n\n    amplitudebinned_per_magcol_recfracs = []\n    amplitudebinned_per_vartype_recfracs = []\n    amplitudebinned_per_pfmethod_recfracs = []\n    amplitudebinned_per_aliastype_recfracs = []\n\n    ndetbinned_per_magcol_recfracs = []\n    ndetbinned_per_vartype_recfracs = []\n    ndetbinned_per_pfmethod_recfracs = []\n    ndetbinned_per_aliastype_recfracs = []\n\n\n    #\n    # finally, we do stuff for the plots!\n    #\n    recplotdir = os.path.join(simbasedir, 'periodic-variable-recovery-plots')\n    if not os.path.exists(recplotdir):\n        os.mkdir(recplotdir)\n\n    # 1. recovery-rate by magbin\n\n    # 1a. plot of overall recovery rate per magbin\n    fig = plt.figure(figsize=(6.4*1.5,4.8*1.5))\n\n    plt.plot(magbinned_sdssr, magbinned_recfrac,marker='.',ms=0.0)\n    plt.xlabel(r'SDSS $r$ magnitude')\n    plt.ylabel('recovered fraction of periodic variables')\n    plt.title('overall recovery fraction by periodic var magnitudes')\n    plt.ylim((0,1))\n    plt.savefig(\n        os.path.join(recplotdir,\n                     'recfrac-binned-magnitudes-overall.%s' % plotfile_ext),\n        dpi=100,\n        bbox_inches='tight'\n    )\n    plt.close('all')\n\n\n    # 1b. 
plot of recovery rate per magbin per magcol\n    fig = plt.figure(figsize=(6.4*1.5,4.8*1.5))\n\n    for magcol in magcols:\n\n        thismagcol_recfracs = []\n\n        for magbin_pv, magbin_rv in zip(magbinned_periodicvars,\n                                        magbinned_recovered_objects):\n\n            thisbin_thismagcol_recvars = [\n                x for x in magbin_rv\n                if (precvar['details'][x]['best_recovered_magcol'] == magcol)\n            ]\n            thisbin_thismagcol_recfrac = (\n                np.array(thisbin_thismagcol_recvars).size /\n                magbin_pv.size\n            )\n            thismagcol_recfracs.append(thisbin_thismagcol_recfrac)\n\n        # now that we have per magcol recfracs, plot them\n        plt.plot(magbinned_sdssr,\n                 np.array(thismagcol_recfracs),\n                 marker='.',\n                 label='magcol: %s' % magcol,\n                 ms=0.0)\n\n        # add this to the outdict array\n        magbinned_per_magcol_recfracs.append(np.array(thismagcol_recfracs))\n\n    # finish up the plot\n    plt.plot(magbinned_sdssr, magbinned_recfrac,\n             marker='.',ms=0.0, label='overall', color='k')\n    plt.xlabel(r'SDSS $r$ magnitude')\n    plt.ylabel('recovered fraction of periodic variables')\n    plt.title('per magcol recovery fraction by periodic var magnitudes')\n    plt.ylim((0,1))\n    plt.legend(markerscale=10.0)\n    plt.savefig(\n        os.path.join(recplotdir,\n                     'recfrac-binned-magnitudes-magcols.%s' % plotfile_ext),\n        dpi=100,\n        bbox_inches='tight'\n    )\n    plt.close('all')\n\n\n    # 1c. 
plot of recovery rate per magbin per periodfinder\n    fig = plt.figure(figsize=(6.4*1.5,4.8*1.5))\n\n    # figure out which pfmethods were used\n    all_pfmethods = np.unique(\n        np.concatenate(\n            [np.unique(precvar['details'][x]['recovery_pfmethods'])\n             for x in precvar['details']]\n        )\n    )\n\n    for pfm in all_pfmethods:\n\n        thispf_recfracs = []\n\n        for magbin_pv, magbin_rv in zip(magbinned_periodicvars,\n                                        magbinned_recovered_objects):\n\n            thisbin_thispf_recvars = [\n                x for x in magbin_rv\n                if (precvar['details'][x]['best_recovered_pfmethod'] == pfm)\n            ]\n            thisbin_thismagcol_recfrac = (\n                np.array(thisbin_thispf_recvars).size /\n                magbin_pv.size\n            )\n            thispf_recfracs.append(thisbin_thismagcol_recfrac)\n\n        # now that we have per magcol recfracs, plot them\n        plt.plot(magbinned_sdssr,\n                 np.array(thispf_recfracs),\n                 marker='.',\n                 label='%s' % pfm.upper(),\n                 ms=0.0)\n\n        # add this to the outdict array\n        magbinned_per_pfmethod_recfracs.append(np.array(thispf_recfracs))\n\n    # finish up the plot\n    plt.plot(magbinned_sdssr, magbinned_recfrac,\n             marker='.',ms=0.0, label='overall', color='k')\n    plt.xlabel(r'SDSS $r$ magnitude')\n    plt.ylabel('recovered fraction of periodic variables')\n    plt.title('per period-finder recovery fraction by periodic var magnitudes')\n    plt.ylim((0,1))\n    plt.legend(markerscale=10.0)\n    plt.savefig(\n        os.path.join(recplotdir,\n                     'recfrac-binned-magnitudes-pfmethod.%s' % plotfile_ext),\n        dpi=100,\n        bbox_inches='tight'\n    )\n    plt.close('all')\n\n\n    # 1d. 
plot of recovery rate per magbin per variable type\n    fig = plt.figure(figsize=(6.4*1.5,4.8*1.5))\n\n    # figure out all vartypes\n    all_vartypes = np.unique(\n        [(precvar['details'][x]['actual_vartype'])\n         for x in precvar['details'] if\n         (precvar['details'][x]['actual_vartype'] is not None)]\n    )\n\n    for vt in all_vartypes:\n\n        thisvt_recfracs = []\n\n        for magbin_pv, magbin_rv in zip(magbinned_periodicvars,\n                                        magbinned_recovered_objects):\n\n            thisbin_thisvt_recvars = [\n                x for x in magbin_rv\n                if (precvar['details'][x]['actual_vartype'] == vt)\n            ]\n            thisbin_thismagcol_recfrac = (\n                np.array(thisbin_thisvt_recvars).size /\n                magbin_pv.size\n            )\n            thisvt_recfracs.append(thisbin_thismagcol_recfrac)\n\n        # now that we have per magcol recfracs, plot them\n        plt.plot(magbinned_sdssr,\n                 np.array(thisvt_recfracs),\n                 marker='.',\n                 label='%s' % vt,\n                 ms=0.0)\n\n        # add this to the outdict array\n        magbinned_per_vartype_recfracs.append(np.array(thisvt_recfracs))\n\n    # finish up the plot\n    plt.plot(magbinned_sdssr, magbinned_recfrac,\n             marker='.',ms=0.0, label='overall', color='k')\n    plt.xlabel(r'SDSS $r$ magnitude')\n    plt.ylabel('recovered fraction of periodic variables')\n    plt.title('per vartype recovery fraction by periodic var magnitudes')\n    plt.ylim((0,1))\n    plt.legend(markerscale=10.0)\n    plt.savefig(\n        os.path.join(recplotdir,\n                     'recfrac-binned-magnitudes-vartype.%s' % plotfile_ext),\n        dpi=100,\n        bbox_inches='tight'\n    )\n    plt.close('all')\n\n\n    # 1e. 
plot of recovery rate per magbin per alias type\n    fig = plt.figure(figsize=(6.4*1.5,4.8*1.5))\n\n    # figure out all alias types\n    all_aliastypes = recovered_status\n\n    for at in all_aliastypes:\n\n        thisat_recfracs = []\n\n        for magbin_pv, magbin_rv in zip(magbinned_periodicvars,\n                                        magbinned_recovered_objects):\n\n            thisbin_thisat_recvars = [\n                x for x in magbin_rv\n                if (precvar['details'][x]['best_recovered_status'][0] == at)\n            ]\n            thisbin_thismagcol_recfrac = (\n                np.array(thisbin_thisat_recvars).size /\n                magbin_pv.size\n            )\n            thisat_recfracs.append(thisbin_thismagcol_recfrac)\n\n        # now that we have per magcol recfracs, plot them\n        plt.plot(magbinned_sdssr,\n                 np.array(thisat_recfracs),\n                 marker='.',\n                 label='%s' % at,\n                 ms=0.0)\n\n        # add this to the outdict array\n        magbinned_per_aliastype_recfracs.append(np.array(thisat_recfracs))\n\n    # finish up the plot\n    plt.plot(magbinned_sdssr, magbinned_recfrac,\n             marker='.',ms=0.0, label='overall', color='k')\n    plt.xlabel(r'SDSS $r$ magnitude')\n    plt.ylabel('recovered fraction of periodic variables')\n    plt.title('per alias-type recovery fraction by periodic var magnitudes')\n    plt.ylim((0,1))\n    plt.legend(markerscale=10.0)\n    plt.savefig(\n        os.path.join(recplotdir,\n                     'recfrac-binned-magnitudes-aliastype.%s' % plotfile_ext),\n        dpi=100,\n        bbox_inches='tight'\n    )\n    plt.close('all')\n\n\n    # 2. recovery-rate by periodbin\n\n    # 2a. 
plot of overall recovery rate per periodbin\n    fig = plt.figure(figsize=(6.4*1.5,4.8*1.5))\n\n    plt.plot(periodbinned_periods, periodbinned_recfrac,\n             marker='.',ms=0.0)\n    plt.xlabel('periodic variable period [days]')\n    plt.ylabel('recovered fraction of periodic variables')\n    plt.title('overall recovery fraction by periodic var periods')\n    plt.ylim((0,1))\n    plt.savefig(\n        os.path.join(recplotdir,\n                     'recfrac-binned-periods-overall.%s' % plotfile_ext),\n        dpi=100,\n        bbox_inches='tight'\n    )\n    plt.close('all')\n\n    # 2b. plot of recovery rate per periodbin per magcol\n    fig = plt.figure(figsize=(6.4*1.5,4.8*1.5))\n\n    for magcol in magcols:\n\n        thismagcol_recfracs = []\n\n        for periodbin_pv, periodbin_rv in zip(periodbinned_periodicvars,\n                                              periodbinned_recovered_objects):\n\n            thisbin_thismagcol_recvars = [\n                x for x in periodbin_rv\n                if (precvar['details'][x]['best_recovered_magcol'] == magcol)\n            ]\n            thisbin_thismagcol_recfrac = (\n                np.array(thisbin_thismagcol_recvars).size /\n                periodbin_pv.size\n            )\n            thismagcol_recfracs.append(thisbin_thismagcol_recfrac)\n\n        # now that we have per magcol recfracs, plot them\n        plt.plot(periodbinned_periods,\n                 np.array(thismagcol_recfracs),\n                 marker='.',\n                 label='magcol: %s' % magcol,\n                 ms=0.0)\n\n        # add this to the outdict array\n        periodbinned_per_magcol_recfracs.append(np.array(thismagcol_recfracs))\n\n    # finish up the plot\n    plt.plot(periodbinned_periods, periodbinned_recfrac,\n             marker='.',ms=0.0, label='overall', color='k')\n    plt.xlabel(r'SDSS $r$ magnitude')\n    plt.ylabel('recovered fraction of periodic variables')\n    plt.title('per magcol recovery fraction by 
periodic var periods')\n    plt.ylim((0,1))\n    plt.legend(markerscale=10.0)\n    plt.savefig(\n        os.path.join(recplotdir,\n                     'recfrac-binned-periods-magcols.%s' % plotfile_ext),\n        dpi=100,\n        bbox_inches='tight'\n    )\n    plt.close('all')\n\n\n    # 2c. plot of recovery rate per periodbin per periodfinder\n    fig = plt.figure(figsize=(6.4*1.5,4.8*1.5))\n\n    # figure out which pfmethods were used\n    all_pfmethods = np.unique(\n        np.concatenate(\n            [np.unique(precvar['details'][x]['recovery_pfmethods'])\n             for x in precvar['details']]\n        )\n    )\n\n    for pfm in all_pfmethods:\n\n        thispf_recfracs = []\n\n        for periodbin_pv, periodbin_rv in zip(periodbinned_periodicvars,\n                                              periodbinned_recovered_objects):\n\n            thisbin_thispf_recvars = [\n                x for x in periodbin_rv\n                if (precvar['details'][x]['best_recovered_pfmethod'] == pfm)\n            ]\n            thisbin_thismagcol_recfrac = (\n                np.array(thisbin_thispf_recvars).size /\n                periodbin_pv.size\n            )\n            thispf_recfracs.append(thisbin_thismagcol_recfrac)\n\n        # now that we have per magcol recfracs, plot them\n        plt.plot(periodbinned_periods,\n                 np.array(thispf_recfracs),\n                 marker='.',\n                 label='%s' % pfm.upper(),\n                 ms=0.0)\n\n        # add this to the outdict array\n        periodbinned_per_pfmethod_recfracs.append(np.array(thispf_recfracs))\n\n    # finish up the plot\n    plt.plot(periodbinned_periods, periodbinned_recfrac,\n             marker='.',ms=0.0, label='overall', color='k')\n    plt.xlabel(r'SDSS $r$ magnitude')\n    plt.ylabel('recovered fraction of periodic variables')\n    plt.title('per period-finder recovery fraction by periodic var periods')\n    plt.ylim((0,1))\n    plt.legend(markerscale=10.0)\n    
plt.savefig(\n        os.path.join(recplotdir,\n                     'recfrac-binned-periods-pfmethod.%s' % plotfile_ext),\n        dpi=100,\n        bbox_inches='tight'\n    )\n    plt.close('all')\n\n\n    # 2d. plot of recovery rate per periodbin per variable type\n    fig = plt.figure(figsize=(6.4*1.5,4.8*1.5))\n\n    # figure out all vartypes\n    all_vartypes = np.unique(\n        [(precvar['details'][x]['actual_vartype'])\n         for x in precvar['details'] if\n         (precvar['details'][x]['actual_vartype'] is not None)]\n    )\n\n    for vt in all_vartypes:\n\n        thisvt_recfracs = []\n\n        for periodbin_pv, periodbin_rv in zip(periodbinned_periodicvars,\n                                              periodbinned_recovered_objects):\n\n            thisbin_thisvt_recvars = [\n                x for x in periodbin_rv\n                if (precvar['details'][x]['actual_vartype'] == vt)\n            ]\n            thisbin_thismagcol_recfrac = (\n                np.array(thisbin_thisvt_recvars).size /\n                periodbin_pv.size\n            )\n            thisvt_recfracs.append(thisbin_thismagcol_recfrac)\n\n        # now that we have per vartype recfracs, plot them\n        plt.plot(periodbinned_periods,\n                 np.array(thisvt_recfracs),\n                 marker='.',\n                 label='%s' % vt,\n                 ms=0.0)\n\n        # add this to the outdict array\n        periodbinned_per_vartype_recfracs.append(np.array(thisvt_recfracs))\n\n    # finish up the plot\n    plt.plot(periodbinned_periods, periodbinned_recfrac,\n             marker='.',ms=0.0, label='overall', color='k')\n    plt.xlabel('periodic variable period [days]')\n    plt.ylabel('recovered fraction of periodic variables')\n    plt.title('per vartype recovery fraction by periodic var periods')\n    plt.ylim((0,1))\n    plt.legend(markerscale=10.0)\n    plt.savefig(\n        os.path.join(recplotdir,\n                     'recfrac-binned-periods-vartype.%s' % 
plotfile_ext),\n        dpi=100,\n        bbox_inches='tight'\n    )\n    plt.close('all')\n\n\n    # 2e. plot of recovery rate per periodbin per alias type\n    fig = plt.figure(figsize=(6.4*1.5,4.8*1.5))\n\n    # figure out all alias types\n    all_aliastypes = recovered_status\n\n    for at in all_aliastypes:\n\n        thisat_recfracs = []\n\n        for periodbin_pv, periodbin_rv in zip(\n                periodbinned_periodicvars,\n                periodbinned_recovered_objects\n        ):\n\n            thisbin_thisat_recvars = [\n                x for x in periodbin_rv\n                if (precvar['details'][x]['best_recovered_status'][0] == at)\n            ]\n            thisbin_thismagcol_recfrac = (\n                np.array(thisbin_thisat_recvars).size /\n                periodbin_pv.size\n            )\n            thisat_recfracs.append(thisbin_thismagcol_recfrac)\n\n        # now that we have per alias type recfracs, plot them\n        plt.plot(periodbinned_periods,\n                 np.array(thisat_recfracs),\n                 marker='.',\n                 label='%s' % at,\n                 ms=0.0)\n\n        # add this to the outdict array\n        periodbinned_per_aliastype_recfracs.append(np.array(thisat_recfracs))\n\n    # finish up the plot\n    plt.plot(periodbinned_periods, periodbinned_recfrac,\n             marker='.',ms=0.0, label='overall', color='k')\n    plt.xlabel('periodic variable period [days]')\n    plt.ylabel('recovered fraction of periodic variables')\n    plt.title('per alias-type recovery fraction by periodic var periods')\n    plt.ylim((0,1))\n    plt.legend(markerscale=10.0)\n    plt.savefig(\n        os.path.join(recplotdir,\n                     'recfrac-binned-periods-aliastype.%s' % plotfile_ext),\n        dpi=100,\n        bbox_inches='tight'\n    )\n    plt.close('all')\n\n\n    # 3. recovery-rate by amplitude bin\n\n    # 3a. 
plot of overall recovery rate per amplitude bin\n    fig = plt.figure(figsize=(6.4*1.5,4.8*1.5))\n\n    plt.plot(amplitudebinned_amplitudes, amplitudebinned_recfrac,\n             marker='.',ms=0.0)\n    plt.xlabel('periodic variable amplitude [mag]')\n    plt.ylabel('recovered fraction of periodic variables')\n    plt.title('overall recovery fraction by periodic var amplitudes')\n    plt.ylim((0,1))\n    plt.savefig(\n        os.path.join(recplotdir,\n                     'recfrac-binned-amplitudes-overall.%s' % plotfile_ext),\n        dpi=100,\n        bbox_inches='tight'\n    )\n    plt.close('all')\n\n    # 3b. plot of recovery rate per amplitude bin per magcol\n    fig = plt.figure(figsize=(6.4*1.5,4.8*1.5))\n\n    for magcol in magcols:\n\n        thismagcol_recfracs = []\n\n        for amplitudebin_pv, amplitudebin_rv in zip(\n                amplitudebinned_periodicvars,\n                amplitudebinned_recovered_objects\n        ):\n\n            thisbin_thismagcol_recvars = [\n                x for x in amplitudebin_rv\n                if (precvar['details'][x]['best_recovered_magcol'] == magcol)\n            ]\n            thisbin_thismagcol_recfrac = (\n                np.array(thisbin_thismagcol_recvars).size /\n                amplitudebin_pv.size\n            )\n            thismagcol_recfracs.append(thisbin_thismagcol_recfrac)\n\n        # now that we have per magcol recfracs, plot them\n        plt.plot(amplitudebinned_amplitudes,\n                 np.array(thismagcol_recfracs),\n                 marker='.',\n                 label='magcol: %s' % magcol,\n                 ms=0.0)\n\n        # add this to the outdict array\n        amplitudebinned_per_magcol_recfracs.append(\n            np.array(thismagcol_recfracs)\n        )\n\n    # finish up the plot\n    plt.plot(amplitudebinned_amplitudes, amplitudebinned_recfrac,\n             marker='.',ms=0.0, label='overall', color='k')\n    plt.xlabel(r'SDSS $r$ magnitude')\n    plt.ylabel('recovered 
fraction of periodic variables')\n    plt.title('per magcol recovery fraction by periodic var amplitudes')\n    plt.ylim((0,1))\n    plt.legend(markerscale=10.0)\n    plt.savefig(\n        os.path.join(recplotdir,\n                     'recfrac-binned-amplitudes-magcols.%s' % plotfile_ext),\n        dpi=100,\n        bbox_inches='tight'\n    )\n    plt.close('all')\n\n\n    # 3c. plot of recovery rate per amplitude bin per periodfinder\n    fig = plt.figure(figsize=(6.4*1.5,4.8*1.5))\n\n    # figure out which pfmethods were used\n    all_pfmethods = np.unique(\n        np.concatenate(\n            [np.unique(precvar['details'][x]['recovery_pfmethods'])\n             for x in precvar['details']]\n        )\n    )\n\n    for pfm in all_pfmethods:\n\n        thispf_recfracs = []\n\n        for amplitudebin_pv, amplitudebin_rv in zip(\n                amplitudebinned_periodicvars,\n                amplitudebinned_recovered_objects\n        ):\n\n            thisbin_thispf_recvars = [\n                x for x in amplitudebin_rv\n                if (precvar['details'][x]['best_recovered_pfmethod'] == pfm)\n            ]\n            thisbin_thismagcol_recfrac = (\n                np.array(thisbin_thispf_recvars).size /\n                amplitudebin_pv.size\n            )\n            thispf_recfracs.append(thisbin_thismagcol_recfrac)\n\n        # now that we have per magcol recfracs, plot them\n        plt.plot(amplitudebinned_amplitudes,\n                 np.array(thispf_recfracs),\n                 marker='.',\n                 label='%s' % pfm.upper(),\n                 ms=0.0)\n\n        # add this to the outdict array\n        amplitudebinned_per_pfmethod_recfracs.append(\n            np.array(thispf_recfracs)\n        )\n\n    # finish up the plot\n    plt.plot(amplitudebinned_amplitudes, amplitudebinned_recfrac,\n             marker='.',ms=0.0, label='overall', color='k')\n    plt.xlabel(r'SDSS $r$ magnitude')\n    plt.ylabel('recovered fraction of periodic 
variables')\n    plt.title('per period-finder recovery fraction by periodic var amplitudes')\n    plt.ylim((0,1))\n    plt.legend(markerscale=10.0)\n    plt.savefig(\n        os.path.join(recplotdir,\n                     'recfrac-binned-amplitudes-pfmethod.%s' % plotfile_ext),\n        dpi=100,\n        bbox_inches='tight'\n    )\n    plt.close('all')\n\n\n    # 3d. plot of recovery rate per amplitude bin per variable type\n    fig = plt.figure(figsize=(6.4*1.5,4.8*1.5))\n\n    # figure out all vartypes\n    all_vartypes = np.unique(\n        [(precvar['details'][x]['actual_vartype'])\n         for x in precvar['details'] if\n         (precvar['details'][x]['actual_vartype'] is not None)]\n    )\n\n    for vt in all_vartypes:\n\n        thisvt_recfracs = []\n\n        for amplitudebin_pv, amplitudebin_rv in zip(\n                amplitudebinned_periodicvars,\n                amplitudebinned_recovered_objects\n        ):\n\n            thisbin_thisvt_recvars = [\n                x for x in amplitudebin_rv\n                if (precvar['details'][x]['actual_vartype'] == vt)\n            ]\n            thisbin_thismagcol_recfrac = (\n                np.array(thisbin_thisvt_recvars).size /\n                amplitudebin_pv.size\n            )\n            thisvt_recfracs.append(thisbin_thismagcol_recfrac)\n\n        # now that we have per magcol recfracs, plot them\n        plt.plot(amplitudebinned_amplitudes,\n                 np.array(thisvt_recfracs),\n                 marker='.',\n                 label='%s' % vt,\n                 ms=0.0)\n\n        # add this to the outdict array\n        amplitudebinned_per_vartype_recfracs.append(\n            np.array(thisvt_recfracs)\n        )\n\n\n    # finish up the plot\n    plt.plot(amplitudebinned_amplitudes, amplitudebinned_recfrac,\n             marker='.',ms=0.0, label='overall', color='k')\n    plt.xlabel(r'SDSS $r$ magnitude')\n    plt.ylabel('recovered fraction of periodic variables')\n    plt.title('per vartype 
recovery fraction by periodic var amplitudes')\n    plt.ylim((0,1))\n    plt.legend(markerscale=10.0)\n    plt.savefig(\n        os.path.join(recplotdir,\n                     'recfrac-binned-amplitudes-vartype.%s' % plotfile_ext),\n        dpi=100,\n        bbox_inches='tight'\n    )\n    plt.close('all')\n\n\n    # 3e. plot of recovery rate per amplitude bin per alias type\n    fig = plt.figure(figsize=(6.4*1.5,4.8*1.5))\n\n    # figure out all alias types\n    all_aliastypes = recovered_status\n\n    for at in all_aliastypes:\n\n        thisat_recfracs = []\n\n        for amplitudebin_pv, amplitudebin_rv in zip(\n                amplitudebinned_periodicvars,\n                amplitudebinned_recovered_objects\n        ):\n\n            thisbin_thisat_recvars = [\n                x for x in amplitudebin_rv\n                if (precvar['details'][x]['best_recovered_status'][0] == at)\n            ]\n            thisbin_thismagcol_recfrac = (\n                np.array(thisbin_thisat_recvars).size /\n                amplitudebin_pv.size\n            )\n            thisat_recfracs.append(thisbin_thismagcol_recfrac)\n\n        # now that we have per alias type recfracs, plot them\n        plt.plot(amplitudebinned_amplitudes,\n                 np.array(thisat_recfracs),\n                 marker='.',\n                 label='%s' % at,\n                 ms=0.0)\n\n        # add this to the outdict array\n        amplitudebinned_per_aliastype_recfracs.append(\n            np.array(thisat_recfracs)\n        )\n\n\n    # finish up the plot\n    plt.plot(amplitudebinned_amplitudes, amplitudebinned_recfrac,\n             marker='.',ms=0.0, label='overall', color='k')\n    plt.xlabel('periodic variable amplitude [mag]')\n    plt.ylabel('recovered fraction of periodic variables')\n    plt.title('per alias-type recovery fraction by periodic var amplitudes')\n    plt.ylim((0,1))\n    plt.legend(markerscale=10.0)\n    plt.savefig(\n        os.path.join(recplotdir,\n                     
'recfrac-binned-amplitudes-aliastype.%s' % plotfile_ext),\n        dpi=100,\n        bbox_inches='tight'\n    )\n    plt.close('all')\n\n\n    # 4. recovery-rate by ndet bin\n\n    # 4a. plot of overall recovery rate per ndet bin\n    fig = plt.figure(figsize=(6.4*1.5,4.8*1.5))\n\n    plt.plot(ndetbinned_ndets, ndetbinned_recfrac,\n             marker='.',ms=0.0)\n    plt.xlabel('periodic variable light curve points')\n    plt.ylabel('recovered fraction of periodic variables')\n    plt.title('overall recovery fraction by periodic var ndet')\n    plt.ylim((0,1))\n    plt.savefig(\n        os.path.join(recplotdir,\n                     'recfrac-binned-ndet-overall.%s' % plotfile_ext),\n        dpi=100,\n        bbox_inches='tight'\n    )\n    plt.close('all')\n\n    # 4b. plot of recovery rate per ndet bin per magcol\n    fig = plt.figure(figsize=(6.4*1.5,4.8*1.5))\n\n    for magcol in magcols:\n\n        thismagcol_recfracs = []\n\n        for ndetbin_pv, ndetbin_rv in zip(ndetbinned_periodicvars,\n                                          ndetbinned_recovered_objects):\n\n            thisbin_thismagcol_recvars = [\n                x for x in ndetbin_rv\n                if (precvar['details'][x]['best_recovered_magcol'] == magcol)\n            ]\n            thisbin_thismagcol_recfrac = (\n                np.array(thisbin_thismagcol_recvars).size /\n                ndetbin_pv.size\n            )\n            thismagcol_recfracs.append(thisbin_thismagcol_recfrac)\n\n        # now that we have per magcol recfracs, plot them\n        plt.plot(ndetbinned_ndets,\n                 np.array(thismagcol_recfracs),\n                 marker='.',\n                 label='magcol: %s' % magcol,\n                 ms=0.0)\n\n        # add this to the outdict array\n        ndetbinned_per_magcol_recfracs.append(\n            np.array(thismagcol_recfracs)\n        )\n\n    # finish up the plot\n    plt.plot(ndetbinned_ndets, ndetbinned_recfrac,\n             marker='.',ms=0.0, 
label='overall', color='k')\n    plt.xlabel(r'SDSS $r$ magnitude')\n    plt.ylabel('recovered fraction of periodic variables')\n    plt.title('per magcol recovery fraction by periodic var ndets')\n    plt.ylim((0,1))\n    plt.legend(markerscale=10.0)\n    plt.savefig(\n        os.path.join(recplotdir,\n                     'recfrac-binned-ndet-magcols.%s' % plotfile_ext),\n        dpi=100,\n        bbox_inches='tight'\n    )\n    plt.close('all')\n\n\n    # 4c. plot of recovery rate per ndet bin per periodfinder\n    fig = plt.figure(figsize=(6.4*1.5,4.8*1.5))\n\n    # figure out which pfmethods were used\n    all_pfmethods = np.unique(\n        np.concatenate(\n            [np.unique(precvar['details'][x]['recovery_pfmethods'])\n             for x in precvar['details']]\n        )\n    )\n\n    for pfm in all_pfmethods:\n\n        thispf_recfracs = []\n\n        for ndetbin_pv, ndetbin_rv in zip(ndetbinned_periodicvars,\n                                          ndetbinned_recovered_objects):\n\n            thisbin_thispf_recvars = [\n                x for x in ndetbin_rv\n                if (precvar['details'][x]['best_recovered_pfmethod'] == pfm)\n            ]\n            thisbin_thismagcol_recfrac = (\n                np.array(thisbin_thispf_recvars).size /\n                ndetbin_pv.size\n            )\n            thispf_recfracs.append(thisbin_thismagcol_recfrac)\n\n        # now that we have per magcol recfracs, plot them\n        plt.plot(ndetbinned_ndets,\n                 np.array(thispf_recfracs),\n                 marker='.',\n                 label='%s' % pfm.upper(),\n                 ms=0.0)\n\n        # add this to the outdict array\n        ndetbinned_per_pfmethod_recfracs.append(\n            np.array(thispf_recfracs)\n        )\n\n    # finish up the plot\n    plt.plot(ndetbinned_ndets, ndetbinned_recfrac,\n             marker='.',ms=0.0, label='overall', color='k')\n    plt.xlabel(r'SDSS $r$ magnitude')\n    plt.ylabel('recovered fraction of 
periodic variables')\n    plt.title('per period-finder recovery fraction by periodic var ndets')\n    plt.ylim((0,1))\n    plt.legend(markerscale=10.0)\n    plt.savefig(\n        os.path.join(recplotdir,\n                     'recfrac-binned-ndet-pfmethod.%s' % plotfile_ext),\n        dpi=100,\n        bbox_inches='tight'\n    )\n    plt.close('all')\n\n\n    # 4d. plot of recovery rate per ndet bin per variable type\n    fig = plt.figure(figsize=(6.4*1.5,4.8*1.5))\n\n    # figure out all vartypes\n    all_vartypes = np.unique(\n        [(precvar['details'][x]['actual_vartype'])\n         for x in precvar['details'] if\n         (precvar['details'][x]['actual_vartype'] in PERIODIC_VARTYPES)]\n    )\n\n    for vt in all_vartypes:\n\n        thisvt_recfracs = []\n\n        for ndetbin_pv, ndetbin_rv in zip(ndetbinned_periodicvars,\n                                          ndetbinned_recovered_objects):\n\n            thisbin_thisvt_recvars = [\n                x for x in ndetbin_rv\n                if (precvar['details'][x]['actual_vartype'] == vt)\n            ]\n            thisbin_thismagcol_recfrac = (\n                np.array(thisbin_thisvt_recvars).size /\n                ndetbin_pv.size\n            )\n            thisvt_recfracs.append(thisbin_thismagcol_recfrac)\n\n        # now that we have per magcol recfracs, plot them\n        plt.plot(ndetbinned_ndets,\n                 np.array(thisvt_recfracs),\n                 marker='.',\n                 label='%s' % vt,\n                 ms=0.0)\n\n        # add this to the outdict array\n        ndetbinned_per_vartype_recfracs.append(\n            np.array(thisvt_recfracs)\n        )\n\n    # finish up the plot\n    plt.plot(ndetbinned_ndets, ndetbinned_recfrac,\n             marker='.',ms=0.0, label='overall', color='k')\n    plt.xlabel(r'SDSS $r$ magnitude')\n    plt.ylabel('recovered fraction of periodic variables')\n    plt.title('per vartype recovery fraction by periodic var ndets')\n    plt.ylim((0,1))\n 
   plt.legend(markerscale=10.0)\n    plt.savefig(\n        os.path.join(recplotdir,\n                     'recfrac-binned-ndet-vartype.%s' % plotfile_ext),\n        dpi=100,\n        bbox_inches='tight'\n    )\n    plt.close('all')\n\n    # 4e. plot of recovery rate per ndet bin per alias type\n    fig = plt.figure(figsize=(6.4*1.5,4.8*1.5))\n\n    # figure out all vartypes\n    all_aliastypes = recovered_status\n\n    for at in all_aliastypes:\n\n        thisat_recfracs = []\n\n        for ndetbin_pv, ndetbin_rv in zip(ndetbinned_periodicvars,\n                                          ndetbinned_recovered_objects):\n\n            thisbin_thisat_recvars = [\n                x for x in ndetbin_rv\n                if (precvar['details'][x]['best_recovered_status'][0] == at)\n            ]\n            thisbin_thismagcol_recfrac = (\n                np.array(thisbin_thisat_recvars).size /\n                ndetbin_pv.size\n            )\n            thisat_recfracs.append(thisbin_thismagcol_recfrac)\n\n        # now that we have per magcol recfracs, plot them\n        plt.plot(ndetbinned_ndets,\n                 np.array(thisat_recfracs),\n                 marker='.',\n                 label='%s' % at,\n                 ms=0.0)\n\n        # add this to the outdict array\n        ndetbinned_per_aliastype_recfracs.append(\n            np.array(thisat_recfracs)\n        )\n\n    # finish up the plot\n    plt.plot(ndetbinned_ndets, ndetbinned_recfrac,\n             marker='.',ms=0.0, label='overall', color='k')\n    plt.xlabel(r'SDSS $r$ magnitude')\n    plt.ylabel('recovered fraction of periodic variables')\n    plt.title('per alias-type recovery fraction by periodic var ndets')\n    plt.ylim((0,1))\n    plt.legend(markerscale=10.0)\n    plt.savefig(\n        os.path.join(recplotdir,\n                     'recfrac-binned-ndet-aliastype.%s' % plotfile_ext),\n        dpi=100,\n        bbox_inches='tight'\n    )\n    plt.close('all')\n\n    # update the lists in the 
outdict\n    outdict['magbinned_per_magcol_recfracs'] = (\n        magbinned_per_magcol_recfracs\n    )\n    outdict['magbinned_per_pfmethod_recfracs'] = (\n        magbinned_per_pfmethod_recfracs\n    )\n    outdict['magbinned_per_vartype_recfracs'] = (\n        magbinned_per_vartype_recfracs\n    )\n    outdict['magbinned_per_aliastype_recfracs'] = (\n        magbinned_per_aliastype_recfracs\n    )\n\n    outdict['periodbinned_per_magcol_recfracs'] = (\n        periodbinned_per_magcol_recfracs\n    )\n    outdict['periodbinned_per_pfmethod_recfracs'] = (\n        periodbinned_per_pfmethod_recfracs\n    )\n    outdict['periodbinned_per_vartype_recfracs'] = (\n        periodbinned_per_vartype_recfracs\n    )\n    outdict['periodbinned_per_aliastype_recfracs'] = (\n        periodbinned_per_aliastype_recfracs\n    )\n\n    outdict['amplitudebinned_per_magcol_recfracs'] = (\n        amplitudebinned_per_magcol_recfracs\n    )\n    outdict['amplitudebinned_per_pfmethod_recfracs'] = (\n        amplitudebinned_per_pfmethod_recfracs\n    )\n    outdict['amplitudebinned_per_vartype_recfracs'] = (\n        amplitudebinned_per_vartype_recfracs\n    )\n    outdict['amplitudebinned_per_aliastype_recfracs'] = (\n        amplitudebinned_per_aliastype_recfracs\n    )\n\n    outdict['ndetbinned_per_magcol_recfracs'] = (\n        ndetbinned_per_magcol_recfracs\n    )\n    outdict['ndetbinned_per_pfmethod_recfracs'] = (\n        ndetbinned_per_pfmethod_recfracs\n    )\n    outdict['ndetbinned_per_vartype_recfracs'] = (\n        ndetbinned_per_vartype_recfracs\n    )\n    outdict['ndetbinned_per_aliastype_recfracs'] = (\n        ndetbinned_per_aliastype_recfracs\n    )\n\n\n    # get the overall recovered vars per pfmethod\n    overall_recvars_per_pfmethod = []\n\n    for pfm in all_pfmethods:\n\n        thispfm_recvars = np.array([\n            x for x in precvar['details'] if\n            ((x in recovered_periodicvars) and\n             
(precvar['details'][x]['best_recovered_pfmethod'] == pfm))\n        ])\n        overall_recvars_per_pfmethod.append(thispfm_recvars)\n\n\n    # get the overall recovered vars per vartype\n    overall_recvars_per_vartype = []\n\n    for vt in all_vartypes:\n\n        thisvt_recvars = np.array([\n            x for x in precvar['details'] if\n            ((x in recovered_periodicvars) and\n             (precvar['details'][x]['actual_vartype'] == vt))\n        ])\n        overall_recvars_per_vartype.append(thisvt_recvars)\n\n\n    # get the overall recovered vars per magcol\n    overall_recvars_per_magcol = []\n\n    for mc in magcols:\n\n        thismc_recvars = np.array([\n            x for x in precvar['details'] if\n            ((x in recovered_periodicvars) and\n             (precvar['details'][x]['best_recovered_magcol'] == mc))\n        ])\n        overall_recvars_per_magcol.append(thismc_recvars)\n\n\n    # get the overall recovered vars per aliastype\n    overall_recvars_per_aliastype = []\n\n    for at in all_aliastypes:\n\n        thisat_recvars = np.array([\n            x for x in precvar['details'] if\n            ((x in recovered_periodicvars) and\n             (precvar['details'][x]['best_recovered_status'][0] == at))\n        ])\n        overall_recvars_per_aliastype.append(thisat_recvars)\n\n    # update the outdict with these\n    outdict['overall_recfrac_per_pfmethod'] = np.array([\n        x.size/actual_periodicvars.size for x in overall_recvars_per_pfmethod\n    ])\n    outdict['overall_recfrac_per_vartype'] = np.array([\n        x.size/actual_periodicvars.size for x in overall_recvars_per_vartype\n    ])\n    outdict['overall_recfrac_per_magcol'] = np.array([\n        x.size/actual_periodicvars.size for x in overall_recvars_per_magcol\n    ])\n    outdict['overall_recfrac_per_aliastype'] = np.array([\n        x.size/actual_periodicvars.size for x in overall_recvars_per_aliastype\n    ])\n\n\n    # 5. 
bar plot of overall recovery rate per pfmethod\n    fig = plt.figure(figsize=(6.4*1.5,4.8*1.5))\n\n    xt = np.arange(len(all_pfmethods))\n    xl = all_pfmethods\n\n    plt.barh(xt, outdict['overall_recfrac_per_pfmethod'], 0.50)\n    plt.yticks(xt, xl)\n    plt.xlabel('overall recovery rate')\n    plt.ylabel('period-finding method')\n    plt.title('overall recovery rate per period-finding method')\n    plt.savefig(\n        os.path.join(recplotdir,\n                     'recfrac-overall-pfmethod.%s' % plotfile_ext),\n        dpi=100,\n        bbox_inches='tight'\n    )\n    plt.close('all')\n\n\n    # 6. bar plot of overall recovery rate per magcol\n    fig = plt.figure(figsize=(6.4*1.5,4.8*1.5))\n\n    xt = np.arange(len(magcols))\n    xl = magcols\n\n    plt.barh(xt, outdict['overall_recfrac_per_magcol'], 0.50)\n    plt.yticks(xt, xl)\n    plt.xlabel('overall recovery rate')\n    plt.ylabel('light curve magnitude column')\n    plt.title('overall recovery rate per light curve magcol')\n    plt.savefig(\n        os.path.join(recplotdir,\n                     'recfrac-overall-magcol.%s' % plotfile_ext),\n        dpi=100,\n        bbox_inches='tight'\n    )\n    plt.close('all')\n\n\n    # 7. bar plot of overall recovery rate per aliastype\n    fig = plt.figure(figsize=(6.4*1.5,4.8*1.5))\n\n    xt = np.arange(len(all_aliastypes))\n    xl = all_aliastypes\n\n    plt.barh(xt, outdict['overall_recfrac_per_aliastype'], 0.50)\n    plt.yticks(xt, xl)\n    plt.xlabel('overall recovery rate')\n    plt.ylabel('period recovery status')\n    plt.title('overall recovery rate per period recovery status')\n    plt.savefig(\n        os.path.join(recplotdir,\n                     'recfrac-overall-aliastype.%s' % plotfile_ext),\n        dpi=100,\n        bbox_inches='tight'\n    )\n    plt.close('all')\n\n\n    # 8. 
bar plot of overall recovery rate per vartype\n    fig = plt.figure(figsize=(6.4*1.5,4.8*1.5))\n\n    xt = np.arange(len(all_vartypes))\n    xl = all_vartypes\n\n    plt.barh(xt, outdict['overall_recfrac_per_vartype'], 0.50)\n    plt.yticks(xt, xl)\n    plt.xlabel('periodic variable type')\n    plt.ylabel('overall recovery rate')\n    plt.title('overall recovery rate per periodic variable type')\n    plt.savefig(\n        os.path.join(recplotdir,\n                     'recfrac-overall-vartype.%s' % plotfile_ext),\n        dpi=100,\n        bbox_inches='tight'\n    )\n    plt.close('all')\n\n\n    # 9. overall recovered period periodogram for objects that aren't actual\n    # periodic variables. this effectively should give us the window function of\n    # the observations\n\n    notvariable_recovered_periods = np.concatenate([\n        precvar['details'][x]['recovery_periods']\n        for x in precvar['details'] if\n        (precvar['details'][x]['actual_vartype'] is None)\n    ])\n    notvariable_recovered_lspvals = np.concatenate([\n        precvar['details'][x]['recovery_lspvals']\n        for x in precvar['details'] if\n        (precvar['details'][x]['actual_vartype'] is None)\n    ])\n\n    sortind = np.argsort(notvariable_recovered_periods)\n    notvariable_recovered_periods = notvariable_recovered_periods[sortind]\n    notvariable_recovered_lspvals = notvariable_recovered_lspvals[sortind]\n\n    outdict['notvariable_recovered_periods'] = notvariable_recovered_periods\n    outdict['notvariable_recovered_lspvals'] = notvariable_recovered_lspvals\n\n    fig = plt.figure(figsize=(6.4*1.5,4.8*1.5))\n    plt.plot(notvariable_recovered_periods,\n             notvariable_recovered_lspvals,\n             ms=1.0,linestyle='none',marker='.')\n    plt.xscale('log')\n    plt.xlabel('recovered periods [days]')\n    plt.ylabel('recovered normalized periodogram power')\n    plt.title('periodogram for actual not-variable objects')\n    plt.savefig(\n        
os.path.join(recplotdir,\n                     'recovered-periodogram-nonvariables.%s' % plotfile_ext),\n        dpi=100,\n        bbox_inches='tight'\n    )\n    plt.close('all')\n\n    # 10. overall recovered period histogram for objects marked\n    # not-variable. this gives us the most common periods\n    fig = plt.figure(figsize=(6.4*1.5,4.8*1.5))\n    plt.hist(notvariable_recovered_periods,bins=np.arange(0.02,300.0,1.0e-3),\n             histtype='step')\n    plt.xscale('log')\n    plt.xlabel('recovered periods [days]')\n    plt.ylabel('number of times periods recovered')\n    plt.title('recovered period histogram for non-variable objects')\n    plt.savefig(\n        os.path.join(recplotdir,\n                     'recovered-period-hist-nonvariables.%s' % plotfile_ext),\n        dpi=100,\n        bbox_inches='tight'\n    )\n    plt.close('all')\n\n\n    # at the end, write the outdict to a pickle and return it\n    outfile = os.path.join(simbasedir, 'periodicvar-recovery-plotresults.pkl')\n    with open(outfile,'wb') as outfd:\n        pickle.dump(outdict, outfd, pickle.HIGHEST_PROTOCOL)\n\n    return outdict", "response": "This function plots the results of periodic var recovery."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef mast_query(service,\n               params,\n               data=None,\n               apiversion='v0',\n               forcefetch=False,\n               cachedir='~/.astrobase/mast-cache',\n               verbose=True,\n               timeout=10.0,\n               refresh=5.0,\n               maxtimeout=90.0,\n               maxtries=3,\n               raiseonfail=False,\n               jitter=5.0):\n    '''This queries the STScI MAST service for catalog data.\n\n    All results are downloaded as JSON files that are written to `cachedir`.\n\n    Parameters\n    ----------\n\n    service : str\n        This is the name of the service to use. See\n        https://mast.stsci.edu/api/v0/_services.html for a list of all available\n        services.\n\n    params : dict\n        This is a dict containing the input params to the service as described\n        on its details page linked in the `service description page on MAST\n        <https://mast.stsci.edu/api/v0/_services.html>`_.\n\n    data : dict or None\n        This contains optional data to upload to the service.\n\n    apiversion : str\n        The API version of the MAST service to use. This sets the URL that this\n        function will call, using `apiversion` as key into the `MAST_URLS` dict\n        above.\n\n    forcefetch : bool\n        If this is True, the query will be retried even if cached results for\n        it exist.\n\n    cachedir : str\n        This points to the directory where results will be downloaded.\n\n    verbose : bool\n        If True, will indicate progress and warn of any issues.\n\n    timeout : float\n        This sets the amount of time in seconds to wait for the service to\n        respond to our initial request.\n\n    refresh : float\n        This sets the amount of time in seconds to wait before checking if the\n        result file is available. 
If the results file isn't available after\n        `refresh` seconds have elapsed, the function will wait for `refresh`\n        seconds continuously, until `maxtimeout` is reached or the results file\n        becomes available.\n\n    maxtimeout : float\n        The maximum amount of time in seconds to wait for a result to become\n        available after submitting our query request.\n\n    maxtries : int\n        The maximum number of tries (across all mirrors tried) to make to either\n        submit the request or download the results, before giving up.\n\n    raiseonfail : bool\n        If this is True, the function will raise an Exception if something goes\n        wrong, instead of returning None.\n\n    jitter : float\n        This is used to control the scale of the random wait in seconds before\n        starting the query. Useful in parallelized situations.\n\n    Returns\n    -------\n\n    dict\n        This returns a dict of the following form::\n\n            {'params':dict of the input params used for the query,\n             'provenance':'cache' or 'new download',\n             'result':path to the file on disk with the downloaded data table}\n\n    '''\n\n    # this matches:\n    # https://mast.stsci.edu/api/v0/class_mashup_1_1_mashup_request.html\n    inputparams = {\n        'format':'json',\n        'params':params,\n        'service':service,\n        'timeout':timeout,\n    }\n\n    if data is not None:\n        inputparams['data'] = data\n\n    # see if the cachedir exists\n    if '~' in cachedir:\n        cachedir = os.path.expanduser(cachedir)\n    if not os.path.exists(cachedir):\n        os.makedirs(cachedir)\n\n    # generate the cachefname and look for it\n    xcachekey = '-'.join([repr(inputparams[x])\n                          for x in sorted(inputparams.keys())])\n    cachekey = hashlib.sha256(xcachekey.encode()).hexdigest()\n\n    cachefname = os.path.join(\n        cachedir,\n        '%s.json' % (cachekey,)\n    )\n    provenance = 
'cache'\n\n    #####################\n    ## RUN A NEW QUERY ##\n    #####################\n\n    # run a new query if forced to or if there is no cached result for it\n    if forcefetch or (not os.path.exists(cachefname)):\n\n        # random.randint needs integer bounds, so cast the float jitter\n        time.sleep(random.randint(1, max(1, int(jitter))))\n\n        provenance = 'new download'\n        waitdone = False\n        timeelapsed = 0.0\n        ntries = 1\n\n        url = MAST_URLS[apiversion]['url']\n        formdata = {'request':json.dumps(inputparams)}\n\n        while (not waitdone) or (ntries < maxtries):\n\n            if timeelapsed > maxtimeout:\n                retdict = None\n                break\n\n            try:\n\n                resp = requests.post(url,\n                                     data=formdata,\n                                     # we'll let the service time us out first\n                                     # if that fails, we'll timeout ourselves\n                                     timeout=timeout+1.0)\n                resp.raise_for_status()\n\n                respjson = resp.json()\n\n                if respjson['status'] == 'COMPLETE':\n\n                    data = respjson['data']\n                    nrows = len(data)\n\n                    if nrows > 0:\n\n                        with open(cachefname, 'w') as outfd:\n                            json.dump(respjson, outfd)\n\n                        retdict = {\n                            'params':inputparams,\n                            'provenance':provenance,\n                            'cachefname':cachefname\n                        }\n                        waitdone = True\n\n                        if verbose:\n                            LOGINFO('query successful. 
nmatches: %s' % nrows)\n\n                        break\n\n                    else:\n\n                        LOGERROR(\n                            'no matching objects found for inputparams: %r' %\n                            inputparams\n                        )\n                        retdict = None\n                        waitdone = True\n                        break\n\n                # if we're still executing after the initial timeout is done\n                elif respjson['status'] == 'EXECUTING':\n\n                    if verbose:\n                        LOGINFO('query is still executing, '\n                                'waiting %s seconds to retry...' % refresh)\n                    waitdone = False\n                    time.sleep(refresh)\n                    timeelapsed = timeelapsed + refresh\n                    retdict = None\n\n                else:\n\n                    LOGERROR('Query failed! Message from service: %s' %\n                             respjson['msg'])\n                    retdict = None\n                    waitdone = True\n                    break\n\n            except requests.exceptions.Timeout as e:\n\n                if verbose:\n                    LOGWARNING('MAST query try timed out, '\n                               'site is probably down. '\n                               'Waiting for %s seconds to try again...' 
%\n                               refresh)\n                waitdone = False\n                time.sleep(refresh)\n                timeelapsed = timeelapsed + refresh\n                retdict = None\n\n            except KeyboardInterrupt as e:\n\n                LOGERROR('MAST request wait aborted for '\n                         '%s' % repr(inputparams))\n                return None\n\n            except Exception as e:\n\n                LOGEXCEPTION('MAST query failed!')\n\n                if raiseonfail:\n                    raise\n\n                return None\n\n            #\n            # increment number of tries at the bottom of the loop\n            #\n            ntries = ntries + 1\n\n        #\n        # done with waiting for completion\n        #\n        if retdict is None:\n\n            LOGERROR('Timed out, errored out, or reached maximum number '\n                     'of tries with no response. Query was: %r' % inputparams)\n            return None\n\n        else:\n\n            return retdict\n\n    # otherwise, get the file from the cache\n    else:\n\n        if verbose:\n            LOGINFO('getting cached MAST query result for '\n                    'request: %s' %\n                    (repr(inputparams)))\n\n        retdict = {\n            'params':inputparams,\n            'provenance':provenance,\n            'cachefname':cachefname\n        }\n\n        return retdict", "response": "This function queries the STScI MAST service for catalog data."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef tic_xmatch(\n        ra,\n        decl,\n        radius_arcsec=5.0,\n        apiversion='v0',\n        forcefetch=False,\n        cachedir='~/.astrobase/mast-cache',\n        verbose=True,\n        timeout=90.0,\n        refresh=5.0,\n        maxtimeout=180.0,\n        maxtries=3,\n        jitter=5.0,\n        raiseonfail=False\n):\n    '''This does a cross-match with TIC.\n\n    Parameters\n    ----------\n\n    ra,decl : np.arrays or lists of floats\n        The coordinates that will be cross-matched against the TIC.\n\n    radius_arcsec : float\n        The cross-match radius in arcseconds.\n\n    apiversion : str\n        The API version of the MAST service to use. This sets the URL that this\n        function will call, using `apiversion` as key into the `MAST_URLS` dict\n        above.\n\n    forcefetch : bool\n        If this is True, the query will be retried even if cached results for\n        it exist.\n\n    cachedir : str\n        This points to the directory where results will be downloaded.\n\n    verbose : bool\n        If True, will indicate progress and warn of any issues.\n\n    timeout : float\n        This sets the amount of time in seconds to wait for the service to\n        respond to our initial request.\n\n    refresh : float\n        This sets the amount of time in seconds to wait before checking if the\n        result file is available. 
If the results file isn't available after\n        `refresh` seconds have elapsed, the function will wait for `refresh`\n        seconds continuously, until `maxtimeout` is reached or the results file\n        becomes available.\n\n    maxtimeout : float\n        The maximum amount of time in seconds to wait for a result to become\n        available after submitting our query request.\n\n    maxtries : int\n        The maximum number of tries (across all mirrors tried) to make to either\n        submit the request or download the results, before giving up.\n\n    jitter : float\n        This is used to control the scale of the random wait in seconds before\n        starting the query. Useful in parallelized situations.\n\n    raiseonfail : bool\n        If this is True, the function will raise an Exception if something goes\n        wrong, instead of returning None.\n\n    Returns\n    -------\n\n    dict\n        This returns a dict of the following form::\n\n            {'params':dict of the input params used for the query,\n             'provenance':'cache' or 'new download',\n             'result':path to the file on disk with the downloaded data table}\n\n    '''\n\n    service = 'Mast.Tic.Crossmatch'\n\n    xmatch_input = {'fields':[{'name':'ra','type':'float'},\n                              {'name':'dec','type':'float'}]}\n    xmatch_input['data'] = [{'ra':x, 'dec':y} for (x,y) in zip(ra, decl)]\n\n    params = {'raColumn':'ra',\n              'decColumn':'dec',\n              'radius':radius_arcsec/3600.0}\n\n    return mast_query(service,\n                      params,\n                      data=xmatch_input,\n                      jitter=jitter,\n                      apiversion=apiversion,\n                      forcefetch=forcefetch,\n                      cachedir=cachedir,\n                      verbose=verbose,\n                      timeout=timeout,\n                      refresh=refresh,\n                      maxtimeout=maxtimeout,\n             
         maxtries=maxtries,\n                      raiseonfail=raiseonfail)", "response": "This function does a cross-match with TIC."}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nmaking a periodogram plot on the first subplot of the input axes.", "response": "def _make_periodogram(axes,\n                      lspinfo,\n                      objectinfo,\n                      findercmap,\n                      finderconvolve,\n                      verbose=True,\n                      findercachedir='~/.astrobase/stamp-cache'):\n    '''Makes periodogram, objectinfo, and finder tile for `checkplot_png` and\n    `twolsp_checkplot_png`.\n\n    Parameters\n    ----------\n\n    axes : matplotlib.axes.Axes object\n        The Axes object which will contain the plot being made.\n\n    lspinfo : dict\n        Dict containing results from a period-finder in `astrobase.periodbase`\n        or a dict that corresponds to that format.\n\n    objectinfo : dict\n        Dict containing basic info about the object being processed.\n\n    findercmap : matplotlib Colormap object or str\n        The Colormap object to use for the finder chart image.\n\n    finderconvolve : astropy.convolution.Kernel object or None\n        If not None, the Kernel object to use for convolving the finder image.\n\n    verbose : bool\n        If True, indicates progress.\n\n    findercachedir : str\n        The directory where the FITS finder images are downloaded and cached.\n\n    Returns\n    -------\n\n    Does not return anything, works on the input Axes object directly.\n\n    '''\n\n    # get the appropriate plot ylabel\n    pgramylabel = PLOTYLABELS[lspinfo['method']]\n\n    # get the periods and lspvals from lspinfo\n    periods = lspinfo['periods']\n    lspvals = lspinfo['lspvals']\n    bestperiod = lspinfo['bestperiod']\n    nbestperiods = lspinfo['nbestperiods']\n    nbestlspvals = lspinfo['nbestlspvals']\n\n    # make the LSP plot on the first subplot\n    axes.plot(periods,lspvals)\n\n    axes.set_xscale('log',basex=10)\n    axes.set_xlabel('Period [days]')\n    
axes.set_ylabel(pgramylabel)\n    plottitle = '%s - %.6f d' % (METHODLABELS[lspinfo['method']],\n                                 bestperiod)\n    axes.set_title(plottitle)\n\n    # show the best five peaks on the plot\n    for bestperiod, bestpeak in zip(nbestperiods,\n                                    nbestlspvals):\n        axes.annotate('%.6f' % bestperiod,\n                      xy=(bestperiod, bestpeak), xycoords='data',\n                      xytext=(0.0,25.0), textcoords='offset points',\n                      arrowprops=dict(arrowstyle=\"->\"),fontsize='14.0')\n\n    # make a grid\n    axes.grid(color='#a9a9a9',\n              alpha=0.9,\n              zorder=0,\n              linewidth=1.0,\n              linestyle=':')\n\n\n    # if objectinfo is present, get things from it\n    if (objectinfo and isinstance(objectinfo, dict) and\n        ('objectid' in objectinfo or 'hatid' in objectinfo) and\n        'ra' in objectinfo and 'decl' in objectinfo and\n        objectinfo['ra'] and objectinfo['decl']):\n\n        if 'objectid' not in objectinfo:\n            objectid = objectinfo['hatid']\n        else:\n            objectid = objectinfo['objectid']\n\n        if verbose:\n            LOGINFO('adding in object information and '\n                    'finder chart for %s at RA: %.3f, DEC: %.3f' %\n                    (objectid, objectinfo['ra'], objectinfo['decl']))\n\n\n        # calculate colors\n        if ('bmag' in objectinfo and 'vmag' in objectinfo and\n            'jmag' in objectinfo and 'kmag' in objectinfo and\n            'sdssi' in objectinfo and\n            objectinfo['bmag'] and objectinfo['vmag'] and\n            objectinfo['jmag'] and objectinfo['kmag'] and\n            objectinfo['sdssi']):\n            bvcolor = objectinfo['bmag'] - objectinfo['vmag']\n            jkcolor = objectinfo['jmag'] - objectinfo['kmag']\n            ijcolor = objectinfo['sdssi'] - objectinfo['jmag']\n        else:\n            bvcolor = None\n            
jkcolor = None\n            ijcolor = None\n\n        if ('teff' in objectinfo and 'gmag' in objectinfo and\n            objectinfo['teff'] and objectinfo['gmag']):\n            # Gaia data input\n            teff_val = objectinfo['teff']\n            gmag = objectinfo['gmag']\n\n\n        # bump the ylim of the LSP plot so that the overplotted finder and\n        # objectinfo can fit in this axes plot\n        lspylim = axes.get_ylim()\n        axes.set_ylim(lspylim[0], lspylim[1]+0.75*(lspylim[1]-lspylim[0]))\n\n        # get the stamp\n        try:\n            dss, dssheader = skyview_stamp(objectinfo['ra'],\n                                           objectinfo['decl'],\n                                           convolvewith=finderconvolve,\n                                           flip=False,\n                                           cachedir=findercachedir,\n                                           verbose=verbose)\n            stamp = dss\n\n            # inset plot it on the current axes\n            inset = inset_axes(axes, width=\"40%\", height=\"40%\", loc=1)\n            inset.imshow(stamp, cmap=findercmap, origin='lower')\n            inset.set_xticks([])\n            inset.set_yticks([])\n            inset.set_frame_on(False)\n\n            # grid lines pointing to the center of the frame\n            inset.axvline(x=150,ymin=0.375,ymax=0.45,linewidth=2.0,color='b')\n            inset.axhline(y=150,xmin=0.375,xmax=0.45,linewidth=2.0,color='b')\n\n        except OSError as e:\n\n            LOGERROR('downloaded FITS appears to be corrupt, retrying...')\n\n            dss, dssheader = skyview_stamp(objectinfo['ra'],\n                                           objectinfo['decl'],\n                                           convolvewith=finderconvolve,\n                                           flip=False,\n                                           forcefetch=True,\n                                           cachedir=findercachedir,\n            
                               verbose=verbose)\n            stamp = dss\n\n            # inset plot it on the current axes\n            inset = inset_axes(axes, width=\"40%\", height=\"40%\", loc=1)\n            inset.imshow(stamp, cmap=findercmap, origin='lower')\n            inset.set_xticks([])\n            inset.set_yticks([])\n            inset.set_frame_on(False)\n\n            # grid lines pointing to the center of the frame\n            inset.axvline(x=150,ymin=0.375,ymax=0.45,linewidth=2.0,color='b')\n            inset.axhline(y=150,xmin=0.375,xmax=0.45,linewidth=2.0,color='b')\n\n\n        except Exception as e:\n            LOGEXCEPTION('could not fetch a DSS stamp for this '\n                         'object %s using coords (%.3f,%.3f)' %\n                         (objectid, objectinfo['ra'], objectinfo['decl']))\n\n        # annotate with objectinfo\n        axes.text(\n            0.05,0.95,\n            '%s' % objectid,\n            ha='left',va='center',transform=axes.transAxes,\n            fontsize=18.0\n        )\n\n        axes.text(\n            0.05,0.91,\n            'RA = %.3f, DEC = %.3f' % (objectinfo['ra'], objectinfo['decl']),\n            ha='left',va='center',transform=axes.transAxes,\n            fontsize=18.0\n        )\n\n        if bvcolor:\n            axes.text(0.05,0.87,\n                      '$B - V$ = %.3f, $V$ = %.3f' % (bvcolor,\n                                                      objectinfo['vmag']),\n                      ha='left',va='center',transform=axes.transAxes,\n                      fontsize=18.0)\n        elif 'vmag' in objectinfo and objectinfo['vmag']:\n            axes.text(0.05,0.87,\n                      '$V$ = %.3f' % (objectinfo['vmag'],),\n                      ha='left',va='center',transform=axes.transAxes,\n                      fontsize=18.0)\n\n        if ijcolor:\n            axes.text(0.05,0.83,\n                      '$i - J$ = %.3f, $J$ = %.3f' % (ijcolor,\n                                    
                  objectinfo['jmag']),\n                      ha='left',va='center',transform=axes.transAxes,\n                      fontsize=18.0)\n        elif 'jmag' in objectinfo and objectinfo['jmag']:\n            axes.text(0.05,0.83,\n                      '$J$ = %.3f' % (objectinfo['jmag'],),\n                      ha='left',va='center',transform=axes.transAxes,\n                      fontsize=18.0)\n\n        if jkcolor:\n            axes.text(0.05,0.79,\n                      '$J - K$ = %.3f, $K$ = %.3f' % (jkcolor,\n                                                      objectinfo['kmag']),\n                      ha='left',va='center',transform=axes.transAxes,\n                      fontsize=18.0)\n        elif 'kmag' in objectinfo and objectinfo['kmag']:\n            axes.text(0.05,0.79,\n                      '$K$ = %.3f' % (objectinfo['kmag'],),\n                      ha='left',va='center',transform=axes.transAxes,\n                      fontsize=18.0)\n\n        if 'sdssr' in objectinfo and objectinfo['sdssr']:\n            axes.text(0.05,0.75,'SDSS $r$ = %.3f' % objectinfo['sdssr'],\n                      ha='left',va='center',transform=axes.transAxes,\n                      fontsize=18.0)\n\n        if ('teff' in objectinfo and 'gmag' in objectinfo and\n            objectinfo['teff'] and objectinfo['gmag']):\n\n            # gaia data available\n            axes.text(0.05,0.87,\n                      r'$G$ = %.1f, $T_\\mathrm{eff}$ = %d' % (\n                          gmag, int(teff_val)),\n                      ha='left',va='center',transform=axes.transAxes,\n                      fontsize=18.0)\n\n        # add in proper motion stuff if available in objectinfo\n        if ('pmra' in objectinfo and objectinfo['pmra'] and\n            'pmdecl' in objectinfo and objectinfo['pmdecl']):\n\n            pm = total_proper_motion(objectinfo['pmra'],\n                                     objectinfo['pmdecl'],\n                                     
objectinfo['decl'])\n\n            axes.text(0.05,0.67,r'$\\mu$ = %.2f mas yr$^{-1}$' % pm,\n                      ha='left',va='center',transform=axes.transAxes,\n                      fontsize=18.0)\n\n            if 'jmag' in objectinfo and objectinfo['jmag']:\n\n                rpm = reduced_proper_motion(objectinfo['jmag'],pm)\n                axes.text(0.05,0.63,'$H_J$ = %.2f' % rpm,\n                          ha='left',va='center',transform=axes.transAxes,\n                          fontsize=18.0)"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef _make_magseries_plot(axes,\n                         stimes,\n                         smags,\n                         serrs,\n                         magsarefluxes=False,\n                         ms=2.0):\n    '''Makes the mag-series plot tile for `checkplot_png` and\n    `twolsp_checkplot_png`.\n\n    axes : matplotlib.axes.Axes object\n        The Axes object where the generated plot will go.\n\n    stimes,smags,serrs : np.array\n        The mag/flux time-series arrays along with associated errors. These\n        should all have been run through nan-stripping and sigma-clipping\n        beforehand.\n\n    magsarefluxes : bool\n        If True, indicates the input time-series is fluxes and not mags so the\n        plot y-axis direction and range can be set appropriately.\n\n    ms : float\n        The `markersize` kwarg to use when making the mag-series plot.\n\n    Returns\n    -------\n\n    Does not return anything, works on the input Axes object directly.\n\n    '''\n\n    scaledplottime = stimes - npmin(stimes)\n\n    axes.plot(scaledplottime,\n              smags,\n              marker='o',\n              ms=ms, ls='None',mew=0,\n              color='green',\n              rasterized=True)\n\n    # flip y axis for mags\n    if not magsarefluxes:\n        plot_ylim = axes.get_ylim()\n        axes.set_ylim((plot_ylim[1], plot_ylim[0]))\n\n    # set the x axis limit\n    axes.set_xlim((npmin(scaledplottime)-1.0,\n                   npmax(scaledplottime)+1.0))\n\n    # make a grid\n    axes.grid(color='#a9a9a9',\n              alpha=0.9,\n              zorder=0,\n              linewidth=1.0,\n              linestyle=':')\n\n    # make the x and y axis labels\n    plot_xlabel = 'JD - %.3f' % npmin(stimes)\n    if magsarefluxes:\n        plot_ylabel = 'flux'\n    else:\n        plot_ylabel = 'magnitude'\n\n    axes.set_xlabel(plot_xlabel)\n    
axes.set_ylabel(plot_ylabel)\n\n    # fix the x and y axis ticks (turns off the offset and uses the full\n    # value of each tick)\n    axes.get_yaxis().get_major_formatter().set_useOffset(False)\n    axes.get_xaxis().get_major_formatter().set_useOffset(False)", "response": "Makes the mag-series plot tile for checkplot_png and twolsp_checkplot_png."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef _make_phased_magseries_plot(axes,\n                                periodind,\n                                stimes, smags, serrs,\n                                varperiod, varepoch,\n                                phasewrap, phasesort,\n                                phasebin, minbinelems,\n                                plotxlim,\n                                lspmethod,\n                                lspmethodind=0,\n                                xliminsetmode=False,\n                                twolspmode=False,\n                                magsarefluxes=False,\n                                verbose=True,\n                                phasems=2.0,\n                                phasebinms=4.0):\n    '''Makes the phased magseries plot tile for the `checkplot_png` and\n    `twolsp_checkplot_png` functions.\n\n    Parameters\n    ----------\n\n    axes : matplotlib.axes.Axes object\n        The Axes object where the generated plot will be written.\n\n    periodind : int\n        The index of the current best period being processed in the lspinfo\n        dict.\n\n    stimes,smags,serrs : np.array\n        The mag/flux time-series arrays along with associated errors. These\n        should all have been run through nan-stripping and sigma-clipping\n        beforehand.\n\n    varperiod : float or None\n        The period to use for this phased light curve plot tile.\n\n    varepoch : 'min' or float or list of lists or None\n        The epoch to use for this phased light curve plot tile. If this is a\n        float, will use the provided value directly. If this is 'min', will\n        automatically figure out the time-of-minimum of the phased light\n        curve. If this is None, will use the minimum value of `stimes` as the\n        epoch of the phased light curve plot. 
If this is a list of lists, will\n        use the provided value of `lspmethodind` to look up the current\n        period-finder method and the provided value of `periodind` to look up\n        the epoch associated with that method and the current period. This is\n        mostly only useful when `twolspmode` is True.\n\n    phasewrap : bool\n        If this is True, the phased time-series will be wrapped around\n        phase 0.0.\n\n    phasesort : bool\n        If True, will sort the phased light curve in order of increasing phase.\n\n    phasebin : float\n        The bin size to use to group together measurements closer than this\n        amount in phase. This is in units of phase. If this is a float, a\n        phase-binned version of the phased light curve will be overplotted on\n        top of the regular phased light curve.\n\n    minbinelems : int\n        The minimum number of elements required per phase bin to include it in\n        the phased LC plot.\n\n    plotxlim : sequence of two floats or None\n        The x-range (min, max) of the phased light curve plot. If None, will be\n        determined automatically.\n\n    lspmethod : str\n        One of the three-letter keys corresponding to period-finder method names\n        in the `astrobase.plotbase.METHODSHORTLABELS` dict. Used to set the plot\n        title correctly.\n\n    lspmethodind : int\n        If `twolspmode` is set, this will be used to look up the correct epoch\n        associated with the current period-finder method and period.\n\n    xliminsetmode : bool\n        If this is True, the generated phased light curve plot will use the\n        values of `plotxlim` as the main plot x-axis limits (i.e. zoomed-in if\n        `plotxlim` is a range smaller than the full phase range), and will show\n        the full phased light curve plot as a smaller inset. 
Useful for\n        planetary transit light curves.\n\n    twolspmode : bool\n        If this is True, will use the `lspmethodind` and `periodind` to look up\n        the correct values of epoch, etc. in the provided `varepoch` list of\n        lists for plotting purposes.\n\n    magsarefluxes : bool\n        If True, indicates the input time-series is fluxes and not mags so the\n        plot y-axis direction and range can be set appropriately.\n\n    verbose : bool\n        If True, indicates progress.\n\n    phasems : float\n        The marker size to use for the main phased light curve plot symbols.\n\n    phasebinms : float\n        The marker size to use for the binned phased light curve plot symbols.\n\n    Returns\n    -------\n\n    Does not return anything, works on the input Axes object directly.\n\n    '''\n\n    plotvarepoch = None\n\n    # figure out the epoch, if it's None, use the min of the time\n    if varepoch is None:\n        plotvarepoch = npmin(stimes)\n\n    # if the varepoch is 'min', then fit a spline to the light curve\n    # phased using the min of the time, find the fit mag minimum and use\n    # the time for that as the varepoch\n    elif isinstance(varepoch, str) and varepoch == 'min':\n\n        try:\n            spfit = spline_fit_magseries(stimes,\n                                         smags,\n                                         serrs,\n                                         varperiod,\n                                         magsarefluxes=magsarefluxes,\n                                         sigclip=None,\n                                         verbose=verbose)\n            plotvarepoch = spfit['fitinfo']['fitepoch']\n            if len(plotvarepoch) != 1:\n                plotvarepoch = varepoch[0]\n\n        except Exception as e:\n\n            LOGERROR('spline fit failed, trying SavGol fit')\n\n            sgfit = savgol_fit_magseries(stimes,\n                                         smags,\n                     
                    serrs,\n                                         varperiod,\n                                         sigclip=None,\n                                         magsarefluxes=magsarefluxes,\n                                         verbose=verbose)\n            plotvarepoch = sgfit['fitinfo']['fitepoch']\n            if len(plotvarepoch) != 1:\n                plotvarepoch = plotvarepoch[0]\n\n        finally:\n\n            if plotvarepoch is None:\n\n                LOGERROR('could not find a min epoch time, '\n                         'using min(times) as the epoch for '\n                         'the phase-folded LC')\n\n                plotvarepoch = npmin(stimes)\n\n    elif isinstance(varepoch, list):\n\n        try:\n\n            if twolspmode:\n\n                thisvarepochlist = varepoch[lspmethodind]\n                plotvarepoch = thisvarepochlist[periodind]\n\n            else:\n                plotvarepoch = varepoch[periodind]\n\n        except Exception as e:\n            LOGEXCEPTION(\n                \"varepoch provided in list form either doesn't match \"\n                \"the length of nbestperiods from the period-finder \"\n                \"result, or something else went wrong. 
using min(times) \"\n                \"as the epoch instead\"\n            )\n            plotvarepoch = npmin(stimes)\n\n    # the final case is to use the provided varepoch directly\n    else:\n        plotvarepoch = varepoch\n\n    if verbose:\n        LOGINFO('plotting phased LC with period %.6f, epoch %.5f' %\n                (varperiod, plotvarepoch))\n\n    # phase the magseries\n    phasedlc = phase_magseries(stimes,\n                               smags,\n                               varperiod,\n                               plotvarepoch,\n                               wrap=phasewrap,\n                               sort=phasesort)\n    plotphase = phasedlc['phase']\n    plotmags = phasedlc['mags']\n\n    # if we're supposed to bin the phases, do so\n    if phasebin:\n\n        binphasedlc = phase_bin_magseries(plotphase,\n                                          plotmags,\n                                          binsize=phasebin,\n                                          minbinelems=minbinelems)\n        binplotphase = binphasedlc['binnedphases']\n        binplotmags = binphasedlc['binnedmags']\n\n\n    # finally, make the phased LC plot\n    axes.plot(plotphase,\n              plotmags,\n              marker='o',\n              ms=phasems, ls='None',mew=0,\n              color='gray',\n              rasterized=True)\n\n    # overlay the binned phased LC plot if we're making one\n    if phasebin:\n        axes.plot(binplotphase,\n                  binplotmags,\n                  marker='o',\n                  ms=phasebinms, ls='None',mew=0,\n                  color='#1c1e57',\n                  rasterized=True)\n\n    # flip y axis for mags\n    if not magsarefluxes:\n        plot_ylim = axes.get_ylim()\n        axes.set_ylim((plot_ylim[1], plot_ylim[0]))\n\n    # set the x axis limit\n    if not plotxlim:\n        axes.set_xlim((npmin(plotphase)-0.1,\n                       npmax(plotphase)+0.1))\n    else:\n        
axes.set_xlim((plotxlim[0],plotxlim[1]))\n\n    # make a grid\n    axes.grid(color='#a9a9a9',\n              alpha=0.9,\n              zorder=0,\n              linewidth=1.0,\n              linestyle=':')\n\n    # make the x and y axis labels\n    plot_xlabel = 'phase'\n    if magsarefluxes:\n        plot_ylabel = 'flux'\n    else:\n        plot_ylabel = 'magnitude'\n\n    axes.set_xlabel(plot_xlabel)\n    axes.set_ylabel(plot_ylabel)\n\n    # fix the yaxis ticks (turns off offset and uses the full\n    # value of the yaxis tick)\n    axes.get_yaxis().get_major_formatter().set_useOffset(False)\n    axes.get_xaxis().get_major_formatter().set_useOffset(False)\n\n    # make the plot title\n    if periodind == 0:\n        plottitle = '%s best period: %.6f d - epoch: %.5f' % (\n            METHODSHORTLABELS[lspmethod],\n            varperiod,\n            plotvarepoch\n        )\n    elif periodind == 1 and not twolspmode:\n        plottitle = '%s best period x 0.5: %.6f d - epoch: %.5f' % (\n            METHODSHORTLABELS[lspmethod],\n            varperiod,\n            plotvarepoch\n        )\n    elif periodind == 2 and not twolspmode:\n        plottitle = '%s best period x 2: %.6f d - epoch: %.5f' % (\n            METHODSHORTLABELS[lspmethod],\n            varperiod,\n            plotvarepoch\n        )\n    elif periodind > 2 and not twolspmode:\n        plottitle = '%s peak %s: %.6f d - epoch: %.5f' % (\n            METHODSHORTLABELS[lspmethod],\n            periodind-1,\n            varperiod,\n            plotvarepoch\n        )\n    elif periodind > 0:\n        plottitle = '%s peak %s: %.6f d - epoch: %.5f' % (\n            METHODSHORTLABELS[lspmethod],\n            periodind+1,\n            varperiod,\n            plotvarepoch\n        )\n\n    axes.set_title(plottitle)\n\n    # if we're making an inset plot showing the full range\n    if (plotxlim and isinstance(plotxlim, (list,tuple)) and\n        len(plotxlim) == 2 and xliminsetmode is True):\n\n        # 
bump the ylim of the plot so that the inset can fit in this axes plot\n        axesylim = axes.get_ylim()\n\n        if magsarefluxes:\n            axes.set_ylim(axesylim[0],\n                          axesylim[1] + 0.5*npabs(axesylim[1]-axesylim[0]))\n        else:\n            axes.set_ylim(axesylim[0],\n                          axesylim[1] - 0.5*npabs(axesylim[1]-axesylim[0]))\n\n        # put the inset axes in\n        inset = inset_axes(axes, width=\"40%\", height=\"40%\", loc=1)\n\n        # make the scatter plot for the phased LC plot\n        inset.plot(plotphase,\n                   plotmags,\n                   marker='o',\n                   ms=2.0, ls='None',mew=0,\n                   color='gray',\n                   rasterized=True)\n\n        # overlay the binned phased LC plot if we're making one\n        if phasebin:\n            inset.plot(binplotphase,\n                       binplotmags,\n                       marker='o',\n                       ms=4.0, ls='None',mew=0,\n                       color='#1c1e57',\n                       rasterized=True)\n\n        # show the full phase coverage\n        if phasewrap:\n            inset.set_xlim(-0.2,0.8)\n        else:\n            inset.set_xlim(-0.1,1.1)\n\n        # flip y axis for mags\n        if not magsarefluxes:\n            inset_ylim = inset.get_ylim()\n            inset.set_ylim((inset_ylim[1], inset_ylim[0]))\n\n        # set the plot title\n        inset.text(0.5,0.1,'full phased light curve',\n                   ha='center',va='center',transform=inset.transAxes)\n        # don't show axes labels or ticks\n        inset.set_xticks([])\n        inset.set_yticks([])", "response": "Makes the phased magseries plot tile for the checkplot_png and twolsp_checkplot_png functions."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\nreturn a precessed set of equatorial coordinates for the object at the specified epoch.", "response": "def precess_coordinates(ra, dec,\n                        epoch_one, epoch_two,\n                        jd=None,\n                        mu_ra=0.0,\n                        mu_dec=0.0,\n                        outscalar=False):\n    '''Precesses target coordinates `ra`, `dec` from `epoch_one` to `epoch_two`.\n\n    This takes into account the jd of the observations, as well as the proper\n    motion of the target mu_ra, mu_dec. Adapted from J. D. Hartman's\n    VARTOOLS/converttime.c [coordprecess].\n\n    Parameters\n    ----------\n\n    ra,dec : float\n        The equatorial coordinates of the object at `epoch_one` to precess in\n        decimal degrees.\n\n    epoch_one : float\n        Origin epoch to precess from to target epoch. This is a float, like:\n        1985.0, 2000.0, etc.\n\n    epoch_two : float\n        Target epoch to precess from origin epoch. This is a float, like:\n        2000.0, 2018.0, etc.\n\n    jd : float\n        The full Julian date to use along with the proper motions in `mu_ra`, and\n        `mu_dec` to handle proper motion along with the coordinate frame\n        precession. If one of `jd`, `mu_ra`, or `mu_dec` is missing, the proper\n        motion will not be used to calculate the final precessed coordinates.\n\n    mu_ra,mu_dec : float\n        The proper motion in mas/yr in right ascension and declination. 
If these\n        are provided along with `jd`, the total proper motion of the object will\n        be taken into account to calculate the final precessed coordinates.\n\n    outscalar : bool\n        If True, converts the output coordinates from one-element np.arrays to\n        scalars.\n\n    Returns\n    -------\n\n    precessed_ra, precessed_dec : float\n        A tuple of precessed equatorial coordinates in decimal degrees at\n        `epoch_two` taking into account proper motion if `jd`, `mu_ra`, and\n        `mu_dec` are provided.\n\n    '''\n\n    raproc, decproc = np.radians(ra), np.radians(dec)\n\n    if ((mu_ra != 0.0) and (mu_dec != 0.0) and jd):\n\n        jd_epoch_one = JD2000 + (epoch_one - epoch_two)*365.25\n        raproc = (\n            raproc +\n            (jd - jd_epoch_one)*mu_ra*MAS_P_YR_TO_RAD_P_DAY/np.cos(decproc)\n        )\n        decproc = decproc + (jd - jd_epoch_one)*mu_dec*MAS_P_YR_TO_RAD_P_DAY\n\n    ca = np.cos(raproc)\n    cd = np.cos(decproc)\n    sa = np.sin(raproc)\n    sd = np.sin(decproc)\n\n    if epoch_one != epoch_two:\n\n        t1 = 1.0e-3 * (epoch_two - epoch_one)\n        t2 = 1.0e-3 * (epoch_one - 2000.0)\n\n        a = ( t1*ARCSEC_TO_RADIANS * (23062.181 + t2*(139.656 + 0.0139*t2) +\n                                      t1*(30.188 - 0.344*t2+17.998*t1)) )\n        b = t1*t1*ARCSEC_TO_RADIANS*(79.280 + 0.410*t2 + 0.205*t1) + a\n        c = (\n            ARCSEC_TO_RADIANS*t1*(20043.109 - t2*(85.33 + 0.217*t2) +\n                                  t1*(-42.665 - 0.217*t2 - 41.833*t2))\n        )\n        sina, sinb, sinc = np.sin(a), np.sin(b), np.sin(c)\n        cosa, cosb, cosc = np.cos(a), np.cos(b), np.cos(c)\n\n        precmatrix = np.matrix([[cosa*cosb*cosc - sina*sinb,\n                                 sina*cosb + cosa*sinb*cosc,\n                                 cosa*sinc],\n                                [-cosa*sinb - sina*cosb*cosc,\n                                 cosa*cosb - sina*sinb*cosc,\n             
                    -sina*sinc],\n                                [-cosb*sinc,\n                                 -sinb*sinc,\n                                 cosc]])\n\n        precmatrix = precmatrix.transpose()\n\n        x = (np.matrix([cd*ca, cd*sa, sd])).transpose()\n\n        x2 = precmatrix * x\n\n        outra = np.arctan2(x2[1],x2[0])\n        outdec = np.arcsin(x2[2])\n\n\n        outradeg = np.rad2deg(outra)\n        outdecdeg = np.rad2deg(outdec)\n\n        if outradeg < 0.0:\n            outradeg = outradeg + 360.0\n\n        if outscalar:\n            return float(outradeg), float(outdecdeg)\n        else:\n            return outradeg, outdecdeg\n\n    else:\n\n        # if the epochs are the same and no proper motion, this will be the same\n        # as the input values. if the epochs are the same, but there IS proper\n        # motion (and a given JD), then these will be perturbed from the input\n        # values of ra, dec by the appropriate amount of motion\n        return np.degrees(raproc), np.degrees(decproc)"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef _single_true(iterable):\n    '''This returns True if only one True-ish element exists in `iterable`.\n\n    Parameters\n    ----------\n\n    iterable : iterable\n\n    Returns\n    -------\n\n    bool\n        True if only one True-ish element exists in `iterable`. False otherwise.\n\n    '''\n\n    # return True if exactly one true found\n    iterator = iter(iterable)\n\n    # consume from \"i\" until first true or it's exhausted\n    has_true = any(iterator)\n\n    # carry on consuming until another true value / exhausted\n    has_another_true = any(iterator)\n\n    return has_true and not has_another_true", "response": "This returns True if exactly one True-ish element exists in iterable, and False otherwise."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef get_epochs_given_midtimes_and_period(\n        t_mid,\n        period,\n        err_t_mid=None,\n        t0_fixed=None,\n        t0_percentile=None,\n        verbose=False\n):\n    '''This calculates the future epochs for a transit, given a period and a\n    starting epoch\n\n    The equation used is::\n\n        t_mid = period*epoch + t0\n\n    Default behavior if no kwargs are used is to define `t0` as the median\n    finite time of the passed `t_mid` array.\n\n    Only one of `err_t_mid` or `t0_fixed` should be passed.\n\n    Parameters\n    ----------\n\n    t_mid : np.array\n        A np.array of transit mid-time measurements\n\n    period : float\n        The period used to calculate epochs, per the equation above. For typical\n        use cases, a period precise to ~1e-5 days is sufficient to get correct\n        epochs.\n\n    err_t_mid : None or np.array\n        If provided, contains the errors of the transit mid-time\n        measurements. The zero-point epoch is then set equal to the average of\n        the transit times, weighted as `1/err_t_mid^2` . This minimizes the\n        covariance between the transit epoch and the period (e.g., Gibson et\n        al. 2013). For standard O-C analysis this is the best method.\n\n    t0_fixed : None or float:\n        If provided, use this t0 as the starting epoch. 
(Overrides all others).\n\n    t0_percentile : None or float\n        If provided, use this percentile of `t_mid` to define `t0`.\n\n    Returns\n    -------\n\n    tuple\n        This is of the form `(integer_epoch_array, t0)`.\n        `integer_epoch_array` is an array of integer epochs (float-type),\n        of length equal to the number of *finite* mid-times passed.\n\n    '''\n\n    kwargarr = np.array([isinstance(err_t_mid,np.ndarray),\n                         t0_fixed,\n                         t0_percentile])\n    if not _single_true(kwargarr) and not np.all(~kwargarr.astype(bool)):\n        raise AssertionError(\n            'can have at most one of err_t_mid, t0_fixed, t0_percentile')\n\n    t_mid = t_mid[np.isfinite(t_mid)]\n    N_midtimes = len(t_mid)\n\n    if t0_fixed:\n        t0 = t0_fixed\n    elif isinstance(err_t_mid,np.ndarray):\n        # get the weighted average. then round it to the nearest transit epoch.\n        t0_avg = np.average(t_mid, weights=1/err_t_mid**2)\n        t0_options = np.arange(min(t_mid), max(t_mid)+period, period)\n        t0 = t0_options[np.argmin(np.abs(t0_options - t0_avg))]\n    else:\n        if not t0_percentile:\n            # if there are an odd number of times, take the median time as\n            # epoch=0.  
elif there are an even number of times, take the lower\n            # of the two middle times as epoch=0.\n            if N_midtimes % 2 == 1:\n                t0 = np.median(t_mid)\n            else:\n                t0 = t_mid[int(N_midtimes/2)]\n        else:\n            t0 = np.sort(t_mid)[int(N_midtimes*t0_percentile/100)]\n\n    epoch = (t_mid - t0)/period\n\n    # do not convert numpy entries to actual ints, because np.nan is float type\n    int_epoch = np.round(epoch, 0)\n\n    if verbose:\n        LOGINFO('epochs before rounding')\n        LOGINFO('\\n{:s}'.format(repr(epoch)))\n        LOGINFO('epochs after rounding')\n        LOGINFO('\\n{:s}'.format(repr(int_epoch)))\n\n    return int_epoch, t0", "response": "This function calculates the integer epochs and the reference time t0 for a transit, given a set of mid-times and a period."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef unixtime_to_jd(unix_time):\n    '''This converts UNIX time in seconds to a Julian date in UTC (JD_UTC).\n\n    Parameters\n    ----------\n\n    unix_time : float\n        A UNIX time in decimal seconds since the 1970 UNIX epoch.\n\n    Returns\n    -------\n\n    jd : float\n        The Julian date corresponding to the provided UNIX time.\n\n    '''\n\n    # use astropy's time module\n    jdutc = astime.Time(unix_time, format='unix', scale='utc')\n    return jdutc.jd", "response": "This converts a UNIX time in seconds to a Julian date in UTC."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef jd_to_datetime(jd, returniso=False):\n    '''This converts a UTC JD to a Python `datetime` object or ISO date string.\n\n    Parameters\n    ----------\n\n    jd : float\n        The Julian date measured at UTC.\n\n    returniso : bool\n        If False, returns a naive Python `datetime` object corresponding to\n        `jd`. If True, returns the ISO format string corresponding to the date\n        and time at UTC from `jd`.\n\n    Returns\n    -------\n\n    datetime or str\n        Depending on the value of `returniso`.\n\n    '''\n\n    tt = astime.Time(jd, format='jd', scale='utc')\n\n    if returniso:\n        return tt.iso\n    else:\n        return tt.datetime", "response": "This converts a UTC JD to a Python datetime object or ISO date string."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef jd_corr(jd,\n            ra, dec,\n            obslon=None,\n            obslat=None,\n            obsalt=None,\n            jd_type='bjd'):\n    '''Returns BJD_TDB or HJD_TDB for input JD_UTC.\n\n    The equation used is::\n\n        BJD_TDB = JD_UTC + JD_to_TDB_corr + romer_delay\n\n    where:\n\n    - JD_to_TDB_corr is the difference between UTC and TDB JDs\n    - romer_delay is the delay caused by finite speed of light from Earth-Sun\n\n    This is based on the code at:\n\n    https://mail.scipy.org/pipermail/astropy/2014-April/003148.html\n\n    Note that this does not correct for:\n\n    1. precession of coordinates if the epoch is not 2000.0\n    2. precession of coordinates if the target has a proper motion\n    3. Shapiro delay\n    4. Einstein delay\n\n    Parameters\n    ----------\n\n    jd : float or array-like\n        The Julian date(s) measured at UTC.\n\n    ra,dec : float\n        The equatorial coordinates of the object in decimal degrees.\n\n    obslon,obslat,obsalt : float or None\n        The longitude, latitude of the observatory in decimal degrees and\n        altitude of the observatory in meters. 
If these are not provided, the\n        corrected JD will be calculated with respect to the center of the Earth.\n\n    jd_type : {'bjd','hjd'}\n        Conversion type to perform, either to Barycentric Julian Date ('bjd')\n        or to Heliocentric Julian Date ('hjd').\n\n    Returns\n    -------\n\n    float or np.array\n        The converted BJD or HJD.\n\n    '''\n\n    if not HAVEKERNEL:\n        LOGERROR('no JPL kernel available, can\\'t continue!')\n        return\n\n    # Source unit-vector\n    ## Assume coordinates in ICRS\n    ## Set distance to unit (kilometers)\n\n    # convert the angles to radians\n    rarad = np.radians(ra)\n    decrad = np.radians(dec)\n    cosra = np.cos(rarad)\n    sinra = np.sin(rarad)\n    cosdec = np.cos(decrad)\n    sindec = np.sin(decrad)\n\n    # this assumes that the target is very far away\n    src_unitvector = np.array([cosdec*cosra,cosdec*sinra,sindec])\n\n    # Convert epochs to astropy.time.Time\n    ## Assume JD(UTC)\n    if (obslon is None) or (obslat is None) or (obsalt is None):\n        t = astime.Time(jd, scale='utc', format='jd')\n    else:\n        t = astime.Time(jd, scale='utc', format='jd',\n                        location=('%.5fd' % obslon,\n                                  '%.5fd' % obslat,\n                                  obsalt))\n\n    # Get Earth-Moon barycenter position\n    ## NB: jplephem uses Barycentric Dynamical Time, e.g. 
JD(TDB)\n    ## and gives positions relative to solar system barycenter\n    barycenter_earthmoon = jplkernel[0,3].compute(t.tdb.jd)\n\n    # Get Moon position vectors from the center of Earth to the Moon\n    # this means we get the following vectors from the ephemerides\n    # Earth Barycenter (3) -> Moon (301)\n    # Earth Barycenter (3) -> Earth (399)\n    # so the final vector is [3,301] - [3,399]\n    # units are in km\n    moonvector = (jplkernel[3,301].compute(t.tdb.jd) -\n                  jplkernel[3,399].compute(t.tdb.jd))\n\n    # Compute Earth position vectors (this is for the center of the earth with\n    # respect to the solar system barycenter)\n    # all these units are in km\n    pos_earth = (barycenter_earthmoon - moonvector * 1.0/(1.0+EMRAT))\n\n    if jd_type == 'bjd':\n\n        # Compute BJD correction\n        ## Assume source vectors parallel at Earth and Solar System\n        ## Barycenter\n        ## i.e. source is at infinity\n        # the romer_delay correction is (r.dot.n)/c where:\n        # r is the vector from SSB to earth center\n        # n is the unit vector from\n        correction_seconds = np.dot(pos_earth.T, src_unitvector)/CLIGHT_KPS\n        correction_days = correction_seconds/SEC_P_DAY\n\n    elif jd_type == 'hjd':\n\n        # Compute HJD correction via Sun ephemeris\n\n        # this is the position vector of the center of the sun in km\n        # Solar System Barycenter (0) -> Sun (10)\n        pos_sun = jplkernel[0,10].compute(t.tdb.jd)\n\n        # this is the vector from the center of the sun to the center of the\n        # earth\n        sun_earth_vec = pos_earth - pos_sun\n\n        # calculate the heliocentric correction\n        correction_seconds = np.dot(sun_earth_vec.T, src_unitvector)/CLIGHT_KPS\n        correction_days = correction_seconds/SEC_P_DAY\n\n    # TDB is the appropriate time scale for these ephemerides\n    new_jd = t.tdb.jd + correction_days\n\n    return new_jd", "response": "Returns the 
BJD_TDB or HJD_TDB for input Julian date."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef _lclist_parallel_worker(task):\n    '''This is a parallel worker for makelclist.\n\n    Parameters\n    ----------\n\n    task : tuple\n        This is a tuple containing the following items:\n\n        task[0] = lcf\n        task[1] = columns\n        task[2] = lcformat\n        task[3] = lcformatdir\n        task[4] = lcndetkey\n\n    Returns\n    -------\n\n    dict or None\n        This contains all of the info for the object processed in this LC read\n        operation. If this fails, returns None\n\n    '''\n\n    lcf, columns, lcformat, lcformatdir, lcndetkey = task\n\n    # get the bits needed for lcformat handling\n    # NOTE: we re-import things in this worker function because sometimes\n    # functions can't be pickled correctly for passing them to worker functions\n    # in a processing pool\n    try:\n        formatinfo = get_lcformat(lcformat,\n                                  use_lcformat_dir=lcformatdir)\n        if formatinfo:\n            (dfileglob, readerfunc,\n             dtimecols, dmagcols, derrcols,\n             magsarefluxes, normfunc) = formatinfo\n        else:\n            LOGERROR(\"can't figure out the light curve format\")\n            return None\n    except Exception as e:\n        LOGEXCEPTION(\"can't figure out the light curve format\")\n        return None\n\n    # we store the full path of the light curve\n    lcobjdict = {'lcfname':os.path.abspath(lcf)}\n\n    try:\n\n        # read the light curve in\n        lcdict = readerfunc(lcf)\n\n        # this should handle lists/tuples being returned by readerfunc\n        # we assume that the first element is the actual lcdict\n        # FIXME: figure out how to not need this assumption\n        if ( (isinstance(lcdict, (list, tuple))) and\n             (isinstance(lcdict[0], dict)) ):\n            lcdict = lcdict[0]\n\n        # insert all of the columns\n        for colkey 
in columns:\n\n            if '.' in colkey:\n                getkey = colkey.split('.')\n            else:\n                getkey = [colkey]\n\n            try:\n                thiscolval = _dict_get(lcdict, getkey)\n            except Exception as e:\n                LOGWARNING('column %s does not exist for %s' %\n                           (colkey, lcf))\n                thiscolval = np.nan\n\n            # update the lcobjdict with this value\n            lcobjdict[getkey[-1]] = thiscolval\n\n    except Exception as e:\n\n        LOGEXCEPTION('could not figure out columns for %s' % lcf)\n\n        # insert all of the columns as nans\n        for colkey in columns:\n\n            if '.' in colkey:\n                getkey = colkey.split('.')\n            else:\n                getkey = [colkey]\n\n            thiscolval = np.nan\n\n            # update the lcobjdict with this value\n            lcobjdict[getkey[-1]] = thiscolval\n\n    # now get the actual ndets; this excludes nans and infs\n    for dk in lcndetkey:\n\n        try:\n\n            if '.' in dk:\n                getdk = dk.split('.')\n            else:\n                getdk = [dk]\n\n            ndetcol = _dict_get(lcdict, getdk)\n            actualndets = ndetcol[np.isfinite(ndetcol)].size\n            lcobjdict['%s.ndet' % getdk[-1]] = actualndets\n\n        except Exception as e:\n            lcobjdict['%s.ndet' % getdk[-1]] = np.nan\n\n\n    return lcobjdict", "response": "This is a parallel worker for makelclist. It reads a single light curve and returns a dict containing the requested column values and the number of finite detections for each ndet key, or None if the light curve format cannot be determined."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef make_lclist(basedir,\n                outfile,\n                use_list_of_filenames=None,\n                lcformat='hat-sql',\n                lcformatdir=None,\n                fileglob=None,\n                recursive=True,\n                columns=['objectid',\n                         'objectinfo.ra',\n                         'objectinfo.decl',\n                         'objectinfo.ndet'],\n                makecoordindex=('objectinfo.ra','objectinfo.decl'),\n                field_fitsfile=None,\n                field_wcsfrom=None,\n                field_scale=ZScaleInterval(),\n                field_stretch=LinearStretch(),\n                field_colormap=plt.cm.gray_r,\n                field_findersize=None,\n                field_pltopts={'marker':'o',\n                               'markersize':10.0,\n                               'markerfacecolor':'none',\n                               'markeredgewidth':2.0,\n                               'markeredgecolor':'red'},\n                field_grid=False,\n                field_gridcolor='k',\n                field_zoomcontain=True,\n                maxlcs=None,\n                nworkers=NCPUS):\n\n    '''This generates a light curve catalog for all light curves in a directory.\n\n    Given a base directory where all the files are, and a light curve format,\n    this will find all light curves, pull out the keys in each lcdict requested\n    in the `columns` kwarg for each object, and write them to the requested\n    output pickle file. These keys should be pointers to scalar values\n    (i.e. something like `objectinfo.ra` is OK, but something like 'times' won't\n    work because it's a vector).\n\n    Generally, this works with light curve reading functions that produce\n    lcdicts as detailed in the docstring for `lcproc.register_lcformat`. 
Once\n    you've registered your light curve reader functions using the\n    `lcproc.register_lcformat` function, pass in the `formatkey` associated with\n    your light curve format, and this function will be able to read all light\n    curves in that format as well as the object information stored in their\n    `objectinfo` dict.\n\n    Parameters\n    ----------\n\n    basedir : str or list of str\n        If this is a str, points to a single directory to search for light\n        curves. If this is a list of str, it must be a list of directories to\n        search for light curves. All of these will be searched to find light\n        curve files matching either your light curve format's default fileglob\n        (when you registered your LC format), or a specific fileglob that you\n        can pass in using the `fileglob` kwarg here. If the `recursive` kwarg\n        is set, the provided directories will be searched recursively.\n\n        If `use_list_of_filenames` is not None, it will override this argument\n        and the function will take those light curves as the list of files it\n        must process instead of whatever is specified in `basedir`.\n\n    outfile : str\n        This is the name of the output file to write. This will be a pickle\n        file, so a good convention to use for this name is something like\n        'my-lightcurve-catalog.pkl'.\n\n    use_list_of_filenames : list of str or None\n        Use this kwarg to override whatever is provided in `basedir` and\n        directly pass in a list of light curve files to process. This can speed\n        up this function by a lot because no searches on disk will be performed\n        to find light curve files matching `basedir` and `fileglob`.\n\n    lcformat : str\n        This is the `formatkey` associated with your light curve format, which\n        you previously passed in to the `lcproc.register_lcformat`\n        function. 
This will be used to look up how to find and read the light\n        curves specified in `basedir` or `use_list_of_filenames`.\n\n    lcformatdir : str or None\n        If this is provided, gives the path to a directory where you've stored\n        your lcformat description JSONs, other than the usual directories lcproc\n        knows to search for them in. Use this along with `lcformat` to specify\n        an LC format JSON file that's not currently registered with lcproc.\n\n    fileglob : str or None\n        If provided, is a string that is a valid UNIX filename glob. Used to\n        override the default fileglob for this LC format when searching for\n        light curve files in `basedir`.\n\n    recursive : bool\n        If True, the directories specified in `basedir` will be searched\n        recursively for all light curve files that match the default fileglob\n        for this LC format or a specific one provided in `fileglob`.\n\n    columns : list of str\n        This is a list of keys in the lcdict produced by your light curve reader\n        function that contain object information, which will be extracted and\n        put into the output light curve catalog. 
It's highly recommended that\n        your LC reader function produce a lcdict that contains at least the\n        default keys shown here.\n\n        The lcdict keys to extract are specified by using an address scheme:\n\n        - First level dict keys can be specified directly:\n          e.g., 'objectid' will extract lcdict['objectid']\n        - Keys at other levels can be specified by using a period to indicate\n          the level:\n\n          - e.g., 'objectinfo.ra' will extract lcdict['objectinfo']['ra']\n          - e.g., 'objectinfo.varinfo.features.stetsonj' will extract\n            lcdict['objectinfo']['varinfo']['features']['stetsonj']\n\n    makecoordindex : list of two str or None\n        This is used to specify which lcdict keys contain the right ascension\n        and declination coordinates for this object. If these are provided, the\n        output light curve catalog will have a kdtree built on all object\n        coordinates, which enables fast spatial searches and cross-matching to\n        external catalogs by `checkplot` and `lcproc` functions.\n\n    field_fitsfile : str or None\n        If this is not None, it should be the path to a FITS image containing\n        the objects these light curves are for. If this is provided,\n        `make_lclist` will use the WCS information in the FITS itself if\n        `field_wcsfrom` is None (or from a WCS header file pointed to by\n        `field_wcsfrom`) to obtain x and y pixel coordinates for all of the\n        objects in the field. 
A finder chart will also be made using\n        `astrobase.plotbase.fits_finder_chart` using the corresponding\n        `field_scale`, `_stretch`, `_colormap`, `_findersize`, `_pltopts`,\n        `_grid`, and `_gridcolor` kwargs for that function, reproduced here to\n        enable customization of the finder chart plot.\n\n    field_wcsfrom : str or None\n        If `wcsfrom` is None, the WCS to transform the RA/Dec to pixel x/y will\n        be taken from the FITS header of `fitsfile`. If this is not None, it\n        must be a FITS or similar file that contains a WCS header in its first\n        extension.\n\n    field_scale : astropy.visualization.Interval object\n        `scale` sets the normalization for the FITS pixel values. This is an\n        astropy.visualization Interval object.\n        See http://docs.astropy.org/en/stable/visualization/normalization.html\n        for details on `scale` and `stretch` objects.\n\n    field_stretch : astropy.visualization.Stretch object\n        `stretch` sets the stretch function for mapping FITS pixel values to\n        output pixel values. This is an astropy.visualization Stretch object.\n        See http://docs.astropy.org/en/stable/visualization/normalization.html\n        for details on `scale` and `stretch` objects.\n\n    field_colormap : matplotlib Colormap object\n        `colormap` is a matplotlib color map object to use for the output image.\n\n    field_findersize : None or tuple of two ints\n        If `findersize` is None, the output image size will be set by the NAXIS1\n        and NAXIS2 keywords in the input `fitsfile` FITS header. Otherwise,\n        `findersize` must be a tuple with the intended x and y size of the image\n        in inches (all output images will use a DPI = 100).\n\n    field_pltopts : dict\n        `field_pltopts` controls how the overlay points will be plotted. This is\n        a dict with standard matplotlib marker, etc. kwargs as key-val pairs,\n        e.g. 
'markersize', 'markerfacecolor', etc. The default options make red\n        outline circles at the location of each object in the overlay.\n\n    field_grid : bool\n        `grid` sets if a grid will be made on the output image.\n\n    field_gridcolor : str\n        `gridcolor` sets the color of the grid lines. This is a usual matplotlib\n        color spec string.\n\n    field_zoomcontain : bool\n        `field_zoomcontain` controls if the finder chart will be zoomed to\n        just contain the overlaid points. Everything outside the footprint of\n        these points will be discarded.\n\n    maxlcs : int or None\n        This sets how many light curves to process in the input LC list\n        generated by searching for LCs in `basedir` or in the list provided as\n        `use_list_of_filenames`.\n\n    nworkers : int\n        This sets the number of parallel workers to launch to collect\n        information from the light curves.\n\n    Returns\n    -------\n\n    str\n        Returns the path to the generated light curve catalog pickle file.\n\n    '''\n\n    try:\n        formatinfo = get_lcformat(lcformat,\n                                  use_lcformat_dir=lcformatdir)\n        if formatinfo:\n            (dfileglob, readerfunc,\n             dtimecols, dmagcols, derrcols,\n             magsarefluxes, normfunc) = formatinfo\n        else:\n            LOGERROR(\"can't figure out the light curve format\")\n            return None\n    except Exception as e:\n        LOGEXCEPTION(\"can't figure out the light curve format\")\n        return None\n\n    if not fileglob:\n        fileglob = dfileglob\n\n    # this is to get the actual ndet\n    # set to the magnitudes column\n    lcndetkey = dmagcols\n\n    if isinstance(use_list_of_filenames, list):\n\n        matching = use_list_of_filenames\n\n    else:\n\n        # handle the case where basedir is a list of directories\n        if isinstance(basedir, list):\n\n            matching = []\n\n            for 
bdir in basedir:\n\n                # now find the files\n                LOGINFO('searching for %s light curves in %s ...' % (lcformat,\n                                                                     bdir))\n\n                if recursive is False:\n                    matching.extend(glob.glob(os.path.join(bdir, fileglob)))\n\n                else:\n                    # use recursive glob for Python 3.5+\n                    if sys.version_info[:2] > (3,4):\n\n                        matching.extend(glob.glob(os.path.join(bdir,\n                                                               '**',\n                                                               fileglob),\n                                                  recursive=True))\n\n                    # otherwise, use os.walk and glob\n                    else:\n\n                        # use os.walk to go through the directories\n                        walker = os.walk(bdir)\n\n                        for root, dirs, _files in walker:\n                            for sdir in dirs:\n                                searchpath = os.path.join(root,\n                                                          sdir,\n                                                          fileglob)\n                                foundfiles = glob.glob(searchpath)\n\n                                if foundfiles:\n                                    matching.extend(foundfiles)\n\n\n        # otherwise, handle the usual case of one basedir to search in\n        else:\n\n            # now find the files\n            LOGINFO('searching for %s light curves in %s ...' 
%\n                    (lcformat, basedir))\n\n            if recursive is False:\n                matching = glob.glob(os.path.join(basedir, fileglob))\n\n            else:\n                # use recursive glob for Python 3.5+\n                if sys.version_info[:2] > (3,4):\n\n                    matching = glob.glob(os.path.join(basedir,\n                                                      '**',\n                                                      fileglob),recursive=True)\n\n                # otherwise, use os.walk and glob\n                else:\n\n                    # use os.walk to go through the directories\n                    walker = os.walk(basedir)\n                    matching = []\n\n                    for root, dirs, _files in walker:\n                        for sdir in dirs:\n                            searchpath = os.path.join(root,\n                                                      sdir,\n                                                      fileglob)\n                            foundfiles = glob.glob(searchpath)\n\n                            if foundfiles:\n                                matching.extend(foundfiles)\n\n    #\n    # now that we have all the files, process them\n    #\n    if matching and len(matching) > 0:\n\n        LOGINFO('found %s light curves' % len(matching))\n\n        # cut down matching to maxlcs\n        if maxlcs:\n            matching = matching[:maxlcs]\n\n        # prepare the output dict\n        lclistdict = {\n            'basedir':basedir,\n            'lcformat':lcformat,\n            'fileglob':fileglob,\n            'recursive':recursive,\n            'columns':columns,\n            'makecoordindex':makecoordindex,\n            'nfiles':len(matching),\n            'objects': {\n            }\n        }\n\n        # columns that will always be present in the output lclistdict\n        derefcols = ['lcfname']\n        derefcols.extend(['%s.ndet' % x.split('.')[-1] for x in lcndetkey])\n\n        
for dc in derefcols:\n            lclistdict['objects'][dc] = []\n\n        # fill in the rest of the lclist columns from the columns kwarg\n        for col in columns:\n\n            # dereference the column\n            thiscol = col.split('.')\n            thiscol = thiscol[-1]\n            lclistdict['objects'][thiscol] = []\n            derefcols.append(thiscol)\n\n        # start collecting info\n        LOGINFO('collecting light curve info...')\n\n        tasks = [(x, columns, lcformat, lcformatdir, lcndetkey)\n                 for x in matching]\n\n        with ProcessPoolExecutor(max_workers=nworkers) as executor:\n            results = executor.map(_lclist_parallel_worker, tasks)\n\n        results = [x for x in results]\n\n        # update the columns in the overall dict from the results of the\n        # parallel map\n        for result in results:\n            for xcol in derefcols:\n                lclistdict['objects'][xcol].append(result[xcol])\n\n        executor.shutdown()\n\n        # done with collecting info\n        # turn all of the lists in the lclistdict into arrays\n        for col in lclistdict['objects']:\n            lclistdict['objects'][col] = np.array(lclistdict['objects'][col])\n\n        # handle duplicate objectids with different light curves\n\n        uniques, counts = np.unique(lclistdict['objects']['objectid'],\n                                    return_counts=True)\n\n        duplicated_objectids = uniques[counts > 1]\n\n        if duplicated_objectids.size > 0:\n\n            # redo the objectid array so it has a bit larger dtype so the extra\n            # tag can fit into the field\n            dt = lclistdict['objects']['objectid'].dtype.str\n            dt = '<U%s' % (\n                int(dt.replace('<','').replace('U','').replace('S','')) + 3\n            )\n            lclistdict['objects']['objectid'] = np.array(\n                lclistdict['objects']['objectid'],\n                dtype=dt\n            )\n\n         
   for objid in duplicated_objectids:\n\n                objid_inds = np.where(\n                    lclistdict['objects']['objectid'] == objid\n                )\n\n                # mark the duplicates, assume the first instance is the actual\n                # one\n                for ncounter, nind in enumerate(objid_inds[0][1:]):\n                    lclistdict['objects']['objectid'][nind] = '%s-%s' % (\n                        lclistdict['objects']['objectid'][nind],\n                        ncounter+2\n                    )\n                    LOGWARNING(\n                        'tagging duplicated instance %s of objectid: '\n                        '%s as %s-%s, lightcurve: %s' %\n                        (ncounter+2, objid, objid, ncounter+2,\n                         lclistdict['objects']['lcfname'][nind])\n                    )\n\n        # if we're supposed to make a spatial index, do so\n        if (makecoordindex and\n            isinstance(makecoordindex, (list, tuple)) and\n            len(makecoordindex) == 2):\n\n            try:\n\n                # deref the column names\n                racol, declcol = makecoordindex\n                racol = racol.split('.')[-1]\n                declcol = declcol.split('.')[-1]\n\n                # get the ras and decls\n                objra, objdecl = (lclistdict['objects'][racol],\n                                  lclistdict['objects'][declcol])\n\n                # get the xyz unit vectors from ra,decl\n                # since i had to remind myself:\n                # https://en.wikipedia.org/wiki/Equatorial_coordinate_system\n                cosdecl = np.cos(np.radians(objdecl))\n                sindecl = np.sin(np.radians(objdecl))\n                cosra = np.cos(np.radians(objra))\n                sinra = np.sin(np.radians(objra))\n                xyz = np.column_stack((cosra*cosdecl,sinra*cosdecl, sindecl))\n\n                # generate the kdtree\n                kdt = 
sps.cKDTree(xyz,copy_data=True)\n\n                # put the tree into the dict\n                lclistdict['kdtree'] = kdt\n\n                LOGINFO('kdtree generated for (ra, decl): (%s, %s)' %\n                        (makecoordindex[0], makecoordindex[1]))\n\n            except Exception as e:\n                LOGEXCEPTION('could not make kdtree for (ra, decl): (%s, %s)' %\n                             (makecoordindex[0], makecoordindex[1]))\n                raise\n\n        # generate the xy pairs if fieldfits is not None\n        if field_fitsfile and os.path.exists(field_fitsfile):\n\n            # read in the FITS file\n            if field_wcsfrom is None:\n\n                hdulist = pyfits.open(field_fitsfile)\n                hdr = hdulist[0].header\n                hdulist.close()\n\n                w = WCS(hdr)\n                wcsok = True\n\n            elif os.path.exists(field_wcsfrom):\n\n                w = WCS(field_wcsfrom)\n                wcsok = True\n\n            else:\n\n                LOGERROR('could not determine WCS info for input FITS: %s' %\n                         field_fitsfile)\n                wcsok = False\n\n            if wcsok:\n\n                # first, transform the ra/decl to x/y and put these in the\n                # lclist output dict\n                radecl = np.column_stack((objra, objdecl))\n                lclistdict['objects']['framexy'] = w.all_world2pix(\n                    radecl,\n                    1\n                )\n\n                # next, we'll make a PNG plot for the finder\n                finder_outfile = os.path.join(\n                    os.path.dirname(outfile),\n                    os.path.splitext(os.path.basename(outfile))[0] + '.png'\n                )\n\n                finder_png = fits_finder_chart(\n                    field_fitsfile,\n                    finder_outfile,\n                    wcsfrom=field_wcsfrom,\n                    scale=field_scale,\n                    
stretch=field_stretch,\n                    colormap=field_colormap,\n                    findersize=field_findersize,\n                    overlay_ra=objra,\n                    overlay_decl=objdecl,\n                    overlay_pltopts=field_pltopts,\n                    overlay_zoomcontain=field_zoomcontain,\n                    grid=field_grid,\n                    gridcolor=field_gridcolor\n                )\n\n                if finder_png is not None:\n                    LOGINFO('generated a finder PNG '\n                            'with an object position overlay '\n                            'for this LC list: %s' % finder_png)\n\n\n        # write the pickle\n        with open(outfile,'wb') as outfd:\n            pickle.dump(lclistdict, outfd, protocol=pickle.HIGHEST_PROTOCOL)\n\n        LOGINFO('done. LC info -> %s' % outfile)\n        return outfile\n\n    else:\n\n        LOGERROR('no files found in %s matching %s' % (basedir, fileglob))\n        return None", "response": "This function generates a light curve catalog for all light curves in a directory."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef filter_lclist(lc_catalog,\n                  objectidcol='objectid',\n                  racol='ra',\n                  declcol='decl',\n                  xmatchexternal=None,\n                  xmatchdistarcsec=3.0,\n                  externalcolnums=(0,1,2),\n                  externalcolnames=['objectid','ra','decl'],\n                  externalcoldtypes='U20,f8,f8',\n                  externalcolsep=None,\n                  externalcommentchar='#',\n                  conesearch=None,\n                  conesearchworkers=1,\n                  columnfilters=None,\n                  field_fitsfile=None,\n                  field_wcsfrom=None,\n                  field_scale=ZScaleInterval(),\n                  field_stretch=LinearStretch(),\n                  field_colormap=plt.cm.gray_r,\n                  field_findersize=None,\n                  field_pltopts={'marker':'o',\n                                 'markersize':10.0,\n                                 'markerfacecolor':'none',\n                                 'markeredgewidth':2.0,\n                                 'markeredgecolor':'red'},\n                  field_grid=False,\n                  field_gridcolor='k',\n                  field_zoomcontain=True,\n                  copylcsto=None):\n\n    '''This is used to perform cone-search, cross-match, and column-filter\n    operations on a light curve catalog generated by `make_lclist`.\n\n    Uses the output of `make_lclist` above. This function returns a list of\n    light curves matching various criteria specified by the `xmatchexternal`,\n    `conesearch`, and `columnfilters` kwargs. Use this function to generate\n    input lists for other lcproc functions,\n    e.g. 
`lcproc.lcvfeatures.parallel_varfeatures`,\n    `lcproc.periodfinding.parallel_pf`, and `lcproc.lcbin.parallel_timebin`,\n    among others.\n\n    The operations are applied in this order if more than one is specified:\n    `xmatchexternal` -> `conesearch` -> `columnfilters`. All results from these\n    operations are joined using a logical AND operation.\n\n    Parameters\n    ----------\n\n    objectidcol : str\n        This is the name of the object ID column in the light curve catalog.\n\n    racol : str\n        This is the name of the RA column in the light curve catalog.\n\n    declcol : str\n        This is the name of the Dec column in the light curve catalog.\n\n    xmatchexternal : str or None\n        If provided, this is the filename of a text file containing objectids,\n        ras and decs to match the objects in the light curve catalog to by their\n        positions.\n\n    xmatchdistarcsec : float\n        This is the distance in arcseconds to use when cross-matching to the\n        external catalog in `xmatchexternal`.\n\n    externalcolnums : sequence of int\n        This is a list of the zero-indexed column numbers of columns to extract\n        from the external catalog file.\n\n    externalcolnames : sequence of str\n        This is a list of names of columns that will be extracted from the\n        external catalog file. This is the same length as\n        `externalcolnums`. These must contain the names provided as the\n        `objectid`, `ra`, and `decl` column names so this function knows which\n        column numbers correspond to those columns and can use them to set up\n        the cross-match.\n\n    externalcoldtypes : str\n        This is a CSV string containing numpy dtype definitions for all columns\n        listed to extract from the external catalog file. 
The number of dtype\n        definitions should be equal to the number of columns to extract.\n\n    externalcolsep : str or None\n        The column separator to use when extracting columns from the external\n        catalog file. If None, any whitespace between columns is used as the\n        separator.\n\n    externalcommentchar : str\n        The character indicating that a line in the external catalog file is to\n        be ignored.\n\n    conesearch : list of float\n\n        This is used to specify cone-search parameters. It should be a three\n        element list:\n\n        [center_ra_deg, center_decl_deg, search_radius_deg]\n\n    conesearchworkers : int\n        The number of parallel workers to launch for the cone-search operation.\n\n    columnfilters : list of str\n\n        This is a list of strings indicating any filters to apply on each column\n        in the light curve catalog. All column filters are applied in the\n        specified sequence and are combined with a logical AND operator. The\n        format of each filter string should be:\n\n        '<lc_catalog column>|<operator>|<operand>'\n\n        where:\n\n        - <lc_catalog column> is a column in the lc_catalog pickle file\n        - <operator> is one of: 'lt', 'gt', 'le', 'ge', 'eq', 'ne', which\n          correspond to the usual operators: <, >, <=, >=, ==, != respectively.\n        - <operand> is a float, int, or string.\n\n    field_fitsfile : str or None\n        If this is not None, it should be the path to a FITS image containing\n        the objects these light curves are for. If this is provided,\n        `make_lclist` will use the WCS information in the FITS itself if\n        `field_wcsfrom` is None (or from a WCS header file pointed to by\n        `field_wcsfrom`) to obtain x and y pixel coordinates for all of the\n        objects in the field. 
A finder chart will also be made using\n        `astrobase.plotbase.fits_finder_chart` using the corresponding\n        `field_scale`, `_stretch`, `_colormap`, `_findersize`, `_pltopts`,\n        `_grid`, and `_gridcolor` kwargs for that function, reproduced here to\n        enable customization of the finder chart plot.\n\n    field_wcsfrom : str or None\n        If `wcsfrom` is None, the WCS to transform the RA/Dec to pixel x/y will\n        be taken from the FITS header of `fitsfile`. If this is not None, it\n        must be a FITS or similar file that contains a WCS header in its first\n        extension.\n\n    field_scale : astropy.visualization.Interval object\n        `scale` sets the normalization for the FITS pixel values. This is an\n        astropy.visualization Interval object.\n        See http://docs.astropy.org/en/stable/visualization/normalization.html\n        for details on `scale` and `stretch` objects.\n\n    field_stretch : astropy.visualization.Stretch object\n        `stretch` sets the stretch function for mapping FITS pixel values to\n        output pixel values. This is an astropy.visualization Stretch object.\n        See http://docs.astropy.org/en/stable/visualization/normalization.html\n        for details on `scale` and `stretch` objects.\n\n    field_colormap : matplotlib Colormap object\n        `colormap` is a matplotlib color map object to use for the output image.\n\n    field_findersize : None or tuple of two ints\n        If `findersize` is None, the output image size will be set by the NAXIS1\n        and NAXIS2 keywords in the input `fitsfile` FITS header. Otherwise,\n        `findersize` must be a tuple with the intended x and y size of the image\n        in inches (all output images will use a DPI = 100).\n\n    field_pltopts : dict\n        `field_pltopts` controls how the overlay points will be plotted. This is\n        a dict with standard matplotlib marker, etc. kwargs as key-val pairs,\n        e.g. 
'markersize', 'markerfacecolor', etc. The default options make red\n        outline circles at the location of each object in the overlay.\n\n    field_grid : bool\n        `grid` sets if a grid will be made on the output image.\n\n    field_gridcolor : str\n        `gridcolor` sets the color of the grid lines. This is a usual matplotlib\n        color spec string.\n\n    field_zoomcontain : bool\n        `field_zoomcontain` controls if the finder chart will be zoomed to\n        just contain the overlaid points. Everything outside the footprint of\n        these points will be discarded.\n\n    copylcsto : str\n        If this is provided, it is interpreted as a directory target to copy\n        all the light curves that match the specified conditions.\n\n    Returns\n    -------\n\n    tuple\n        Returns a two-element tuple: (matching_object_lcfiles, matching_objectids)\n        if conesearch and/or column filters are used. If `xmatchexternal` is\n        also used, a three-element tuple is returned: (matching_object_lcfiles,\n        matching_objectids, extcat_matched_objectids).\n\n    '''\n\n    with open(lc_catalog,'rb') as infd:\n        lclist = pickle.load(infd)\n\n    # generate numpy arrays of the matching object indexes. 
we do it this way so\n    # we can AND everything at the end, instead of having to look up the objects\n    # at these indices and running the columnfilter on them\n    xmatch_matching_index = np.full_like(lclist['objects'][objectidcol],\n                                         False,\n                                         dtype=np.bool)\n    conesearch_matching_index = np.full_like(lclist['objects'][objectidcol],\n                                             False,\n                                             dtype=np.bool)\n\n    # do the xmatch first\n    ext_matches = []\n    ext_matching_objects = []\n\n    if (xmatchexternal and\n        isinstance(xmatchexternal, str) and\n        os.path.exists(xmatchexternal)):\n\n        try:\n\n            # read in the external file\n            extcat = np.genfromtxt(xmatchexternal,\n                                   usecols=externalcolnums,\n                                   delimiter=externalcolsep,\n                                   names=externalcolnames,\n                                   dtype=externalcoldtypes,\n                                   comments=externalcommentchar)\n\n            ext_cosdecl = np.cos(np.radians(extcat['decl']))\n            ext_sindecl = np.sin(np.radians(extcat['decl']))\n            ext_cosra = np.cos(np.radians(extcat['ra']))\n            ext_sinra = np.sin(np.radians(extcat['ra']))\n\n            ext_xyz = np.column_stack((ext_cosra*ext_cosdecl,\n                                       ext_sinra*ext_cosdecl,\n                                       ext_sindecl))\n            ext_xyzdist = 2.0 * np.sin(np.radians(xmatchdistarcsec/3600.0)/2.0)\n\n            # get our kdtree\n            our_kdt = lclist['kdtree']\n\n            # get the external kdtree\n            ext_kdt = sps.cKDTree(ext_xyz)\n\n            # do a query_ball_tree\n            extkd_matchinds = ext_kdt.query_ball_tree(our_kdt, ext_xyzdist)\n\n            for extind, mind in enumerate(extkd_matchinds):\n   
             if len(mind) > 0:\n                    ext_matches.append(mind[0])\n\n                    # get the whole matching row for the ext objects recarray\n                    ext_matching_objects.append(extcat[extind])\n\n            ext_matches = np.array(ext_matches)\n\n            if ext_matches.size > 0:\n\n                # update the xmatch_matching_index\n                xmatch_matching_index[ext_matches] = True\n\n                LOGINFO('xmatch: objects matched to %s within %.1f arcsec: %s' %\n                        (xmatchexternal, xmatchdistarcsec, ext_matches.size))\n\n            else:\n\n                LOGERROR(\"xmatch: no objects were cross-matched to external \"\n                         \"catalog spec: %s, can't continue\" % xmatchexternal)\n                return None, None, None\n\n\n        except Exception as e:\n\n            LOGEXCEPTION('could not match to external catalog spec: %s' %\n                         repr(xmatchexternal))\n            raise\n\n\n    # do the cone search next\n    if (conesearch and\n        isinstance(conesearch, (list, tuple)) and\n        len(conesearch) == 3):\n\n        try:\n\n            racenter, declcenter, searchradius = conesearch\n            cosdecl = np.cos(np.radians(declcenter))\n            sindecl = np.sin(np.radians(declcenter))\n            cosra = np.cos(np.radians(racenter))\n            sinra = np.sin(np.radians(racenter))\n\n            # this is the search distance in xyz unit vectors\n            xyzdist = 2.0 * np.sin(np.radians(searchradius)/2.0)\n\n            # get the kdtree\n            our_kdt = lclist['kdtree']\n\n            # look up the coordinates\n            kdtindices = our_kdt.query_ball_point([cosra*cosdecl,\n                                                   sinra*cosdecl,\n                                                   sindecl],\n                                                  xyzdist,\n                                                  
n_jobs=conesearchworkers)\n\n            if kdtindices and len(kdtindices) > 0:\n\n                LOGINFO('cone search: objects within %.4f deg '\n                        'of (%.3f, %.3f): %s' %\n                        (searchradius, racenter, declcenter, len(kdtindices)))\n\n                # update the conesearch_matching_index\n                matchingind = kdtindices\n                conesearch_matching_index[np.array(matchingind)] = True\n\n            # we fail immediately if we found nothing. this assumes the user\n            # cares more about the cone-search than the regular column filters\n            else:\n\n                LOGERROR(\"cone-search: no objects were found within \"\n                         \"%.4f deg of (%.3f, %.3f): %s, can't continue\" %\n                         (searchradius, racenter, declcenter, len(kdtindices)))\n                return None, None\n\n\n        except Exception as e:\n\n            LOGEXCEPTION('cone-search: could not run a cone-search, '\n                         'is there a kdtree present in %s?' 
% lc_catalog)\n            raise\n\n\n    # now that we're done with cone-search, do the column filtering\n    allfilterinds = []\n    if columnfilters and isinstance(columnfilters, list):\n\n        # go through each filter\n        for cfilt in columnfilters:\n\n            try:\n\n                fcol, foperator, foperand = cfilt.split('|')\n                foperator = FILTEROPS[foperator]\n\n                # generate the evalstring\n                filterstr = (\n                    \"np.isfinite(lclist['objects']['%s']) & \"\n                    \"(lclist['objects']['%s'] %s %s)\"\n                ) % (fcol, fcol, foperator, foperand)\n                filterind = eval(filterstr)\n\n                ngood = lclist['objects'][objectidcol][filterind].size\n                LOGINFO('filter: %s -> objects matching: %s ' % (cfilt, ngood))\n\n                allfilterinds.append(filterind)\n\n            except Exception as e:\n\n                LOGEXCEPTION('filter: could not understand filter spec: %s'\n                             % cfilt)\n                LOGWARNING('filter: not applying this broken filter')\n\n\n    # now that we have all the filter indices good to go\n    # logical-AND all the things\n\n    # make sure we only do filtering if we were told to do so\n    if (xmatchexternal or conesearch or columnfilters):\n\n        filterstack = []\n        if xmatchexternal:\n            filterstack.append(xmatch_matching_index)\n        if conesearch:\n            filterstack.append(conesearch_matching_index)\n        if columnfilters:\n            filterstack.extend(allfilterinds)\n\n        finalfilterind = np.column_stack(filterstack)\n        finalfilterind = np.all(finalfilterind, axis=1)\n\n        # get the filtered object light curves and object names\n        filteredobjectids = lclist['objects'][objectidcol][finalfilterind]\n        filteredlcfnames = lclist['objects']['lcfname'][finalfilterind]\n\n    else:\n\n        filteredobjectids = 
lclist['objects'][objectidcol]\n        filteredlcfnames = lclist['objects']['lcfname']\n\n\n    # if we're told to make a finder chart with the selected objects\n    if field_fitsfile is not None and os.path.exists(field_fitsfile):\n\n        # get the RA and DEC of the matching objects\n        matching_ra = lclist['objects'][racol][finalfilterind]\n        matching_decl = lclist['objects'][declcol][finalfilterind]\n\n        matching_postfix = []\n\n        if xmatchexternal is not None:\n            matching_postfix.append(\n                'xmatch_%s' %\n                os.path.splitext(os.path.basename(xmatchexternal))[0]\n            )\n        if conesearch is not None:\n            matching_postfix.append('conesearch_RA%.3f_DEC%.3f_RAD%.5f' %\n                                    tuple(conesearch))\n\n        if columnfilters is not None:\n            for cfi, cf in enumerate(columnfilters):\n                if cfi == 0:\n                    matching_postfix.append('filter_%s_%s_%s' %\n                                            tuple(cf.split('|')))\n                else:\n                    matching_postfix.append('_and_%s_%s_%s' %\n                                            tuple(cf.split('|')))\n\n        if len(matching_postfix) > 0:\n            matching_postfix = '-%s' % '_'.join(matching_postfix)\n        else:\n            matching_postfix = ''\n\n        # next, we'll make a PNG plot for the finder\n        finder_outfile = os.path.join(\n            os.path.dirname(lc_catalog),\n            '%s%s.png' %\n            (os.path.splitext(os.path.basename(lc_catalog))[0],\n             matching_postfix)\n        )\n\n        finder_png = fits_finder_chart(\n            field_fitsfile,\n            finder_outfile,\n            wcsfrom=field_wcsfrom,\n            scale=field_scale,\n            stretch=field_stretch,\n            colormap=field_colormap,\n            findersize=field_findersize,\n            overlay_ra=matching_ra,\n            
overlay_decl=matching_decl,\n            overlay_pltopts=field_pltopts,\n            field_zoomcontain=field_zoomcontain,\n            grid=field_grid,\n            gridcolor=field_gridcolor\n        )\n\n        if finder_png is not None:\n            LOGINFO('generated a finder PNG '\n                    'with an object position overlay '\n                    'for this filtered LC list: %s' % finder_png)\n\n\n\n    # if copylcsto is not None, copy LCs over to it\n    if copylcsto is not None:\n\n        if not os.path.exists(copylcsto):\n            os.mkdir(copylcsto)\n\n        if TQDM:\n            lciter = tqdm(filteredlcfnames)\n        else:\n            lciter = filteredlcfnames\n\n        LOGINFO('copying matching light curves to %s' % copylcsto)\n\n        for lc in lciter:\n            shutil.copy(lc, copylcsto)\n\n\n    LOGINFO('done. objects matching all filters: %s' % filteredobjectids.size)\n\n    if xmatchexternal and len(ext_matching_objects) > 0:\n        return filteredlcfnames, filteredobjectids, ext_matching_objects\n    else:\n        return filteredlcfnames, filteredobjectids", "response": "This function filters a light curve catalog by applying cone-search, cross-match, and column-filter functions to it."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef _cpinfo_key_worker(task):\n    '''This wraps `checkplotlist.checkplot_infokey_worker`.\n\n    This is used to get the correct dtype for each element in retrieved results.\n\n    Parameters\n    ----------\n\n    task : tuple\n        task[0] = cpfile\n        task[1] = keyspeclist (infokeys kwarg from `add_cpinfo_to_lclist`)\n\n    Returns\n    -------\n\n    dict\n        All of the requested keys from the checkplot are returned along with\n        their values in a dict.\n\n    '''\n\n    cpfile, keyspeclist = task\n\n    keystoget = [x[0] for x in keyspeclist]\n    nonesubs = [x[-2] for x in keyspeclist]\n    nansubs = [x[-1] for x in keyspeclist]\n\n    # reform the keystoget into a list of lists\n    for i, k in enumerate(keystoget):\n\n        thisk = k.split('.')\n        if sys.version_info[:2] < (3,4):\n            thisk = [(int(x) if x.isdigit() else x) for x in thisk]\n        else:\n            thisk = [(int(x) if x.isdecimal() else x) for x in thisk]\n\n        keystoget[i] = thisk\n\n    # add in the objectid as well to match to the object catalog later\n    keystoget.insert(0,['objectid'])\n    nonesubs.insert(0, '')\n    nansubs.insert(0,'')\n\n    # get all the keys we need\n    vals = checkplot_infokey_worker((cpfile, keystoget))\n\n    # if they have some Nones, nans, etc., reform them as expected\n    for val, nonesub, nansub, valind in zip(vals, nonesubs,\n                                            nansubs, range(len(vals))):\n\n        if val is None:\n            outval = nonesub\n        elif isinstance(val, float) and not np.isfinite(val):\n            outval = nansub\n        elif isinstance(val, (list, tuple)):\n            outval = ', '.join(val)\n        else:\n            outval = val\n\n        vals[valind] = outval\n\n    return vals", "response": "This wraps checkplotlist.checkplot_infokey_worker to get the correct dtype for each element in retrieved results."}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nupdating the proxy with the given change.", "response": "def _update_proxy(self, change):\n        \"\"\" An observer which sends the state change to the proxy.\n\n        \"\"\"\n        if change['type'] == 'container':\n            #: Only update what's needed\n            self.proxy.update_points(change)\n        else:\n            super(MapPolyline, self)._update_proxy(change)"}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nupdate the proxy with the change.", "response": "def _update_proxy(self, change):\n        \"\"\" An observer which sends the state change to the proxy.\n\n        \"\"\"\n        if change['type'] == 'container':\n            #: Only update what's needed\n            self.proxy.update_points(change)\n        else:\n            super(MapPolygon, self)._update_proxy(change)"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\nhandling changes from atom ContainerLists", "response": "def handle_change(self, change):\n        \"\"\" Handle changes from atom ContainerLists \"\"\"\n        op = change['operation']\n        if op == 'append':\n            self.add(len(change['value']), LatLng(*change['item']))\n        elif op == 'insert':\n            self.add(change['index'], LatLng(*change['item']))\n        elif op == 'extend':\n            points = [LatLng(*p) for p in change['items']]\n            self.addAll([bridge.encode(c) for c in points])\n        elif op == '__setitem__':\n            self.set(change['index'], LatLng(*change['newitem']))\n        elif op == 'pop':\n            self.remove(change['index'])\n        else:\n            raise NotImplementedError(\n                \"Unsupported change operation {}\".format(op))"}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\ncreates the underlying widget.", "response": "def create_widget(self):\n        \"\"\" Create the underlying widget.\n\n        \"\"\"\n        self.init_options()\n\n        #: Retrieve the actual map\n        MapFragment.newInstance(self.options).then(\n            self.on_map_fragment_created)\n\n        # Holder for the fragment\n        self.widget = FrameLayout(self.get_context())\n\n        # I wrote this a few days ago and already forget how this hack works...\n        # lol We can't simply get a map reference using getMapAsync in the\n        # return value like we normally do with a normal call function return\n        # value. The bridge design was modified to store an object that cannot\n        # be decoded normally (via a standard Bridge.Packer) by saving the new\n        # object in the cache returning the id of the handler or proxy that\n        # invoked it. This way we can manually create a new id and pass that\n        # \"future reference-able\" object as our listener. At which point the\n        # bridge will create a reference entry in the cache for us with the id\n        # of the object we gave it. Once in the cache we can use it like any\n        # bridge object we created.\n        self.map = GoogleMap(__id__=bridge.generate_id())"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef init_options(self):\n        self.options = GoogleMapOptions()\n        d = self.declaration\n        self.set_map_type(d.map_type)\n        if d.ambient_mode:\n            self.set_ambient_mode(d.ambient_mode)\n        if (d.camera_position or d.camera_zoom or\n                d.camera_tilt or d.camera_bearing):\n            self.update_camera()\n        if d.map_bounds:\n            self.set_map_bounds(d.map_bounds)\n        if not d.show_compass:\n            self.set_show_compass(d.show_compass)\n        if not d.show_zoom_controls:\n            self.set_show_zoom_controls(d.show_zoom_controls)\n        if not d.show_toolbar:\n            self.set_show_toolbar(d.show_toolbar)\n        if d.lite_mode:\n            self.set_lite_mode(d.lite_mode)\n        if not d.rotate_gestures:\n            self.set_rotate_gestures(d.rotate_gestures)\n        if not d.scroll_gestures:\n            self.set_scroll_gestures(d.scroll_gestures)\n        if not d.tilt_gestures:\n            self.set_tilt_gestures(d.tilt_gestures)\n        if not d.zoom_gestures:\n            self.set_zoom_gestures(d.zoom_gestures)\n        if d.min_zoom:\n            self.set_min_zoom(d.min_zoom)\n        if d.max_zoom:\n            self.set_max_zoom(d.max_zoom)", "response": "Initialize the underlying map options."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef init_map(self):\n        d = self.declaration\n        if d.show_location:\n            self.set_show_location(d.show_location)\n        if d.show_traffic:\n            self.set_show_traffic(d.show_traffic)\n        if d.show_indoors:\n            self.set_show_indoors(d.show_indoors)\n        if d.show_buildings:\n            self.set_show_buildings(d.show_buildings)\n\n        #: Local ref access is faster\n        mapview = self.map\n        mid = mapview.getId()\n\n        #: Connect signals\n        #: Camera\n        mapview.onCameraChange.connect(self.on_camera_changed)\n        mapview.onCameraMoveStarted.connect(self.on_camera_move_started)\n        mapview.onCameraMoveCanceled.connect(self.on_camera_move_stopped)\n        mapview.onCameraIdle.connect(self.on_camera_move_stopped)\n        mapview.setOnCameraChangeListener(mid)\n        mapview.setOnCameraMoveStartedListener(mid)\n        mapview.setOnCameraMoveCanceledListener(mid)\n        mapview.setOnCameraIdleListener(mid)\n\n        #: Clicks\n        mapview.onMapClick.connect(self.on_map_clicked)\n        mapview.setOnMapClickListener(mid)\n        mapview.onMapLongClick.connect(self.on_map_long_clicked)\n        mapview.setOnMapLongClickListener(mid)\n\n        #: Markers\n        mapview.onMarkerClick.connect(self.on_marker_clicked)\n        mapview.setOnMarkerClickListener(self.map.getId())\n        mapview.onMarkerDragStart.connect(self.on_marker_drag_start)\n        mapview.onMarkerDrag.connect(self.on_marker_drag)\n        mapview.onMarkerDragEnd.connect(self.on_marker_drag_end)\n        mapview.setOnMarkerDragListener(mid)\n\n        #: Info window\n        mapview.onInfoWindowClick.connect(self.on_info_window_clicked)\n        mapview.onInfoWindowLongClick.connect(self.on_info_window_long_clicked)\n        mapview.onInfoWindowClose.connect(self.on_info_window_closed)\n        
mapview.setOnInfoWindowClickListener(mid)\n        mapview.setOnInfoWindowCloseListener(mid)\n        mapview.setOnInfoWindowLongClickListener(mid)\n\n        #: Polys\n        mapview.onPolygonClick.connect(self.on_poly_clicked)\n        mapview.onPolylineClick.connect(self.on_poly_clicked)\n        mapview.setOnPolygonClickListener(mid)\n        mapview.setOnPolylineClickListener(mid)\n\n        #: Circle\n        mapview.onCircleClick.connect(self.on_circle_clicked)\n        mapview.setOnCircleClickListener(mid)", "response": "Initialize the mapview with the current values."}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\ninitializing the info window adapter.", "response": "def init_info_window_adapter(self):\n        \"\"\" Initialize the info window adapter. Should only be done if one of \n        the markers defines a custom view.\n        \"\"\"\n        adapter = self.adapter\n        if adapter:\n            return  #: Already initialized\n        adapter = GoogleMap.InfoWindowAdapter()\n        adapter.getInfoContents.connect(self.on_info_window_contents_requested)\n        adapter.getInfoWindow.connect(self.on_info_window_requested)\n        self.map.setInfoWindowAdapter(adapter)"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef on_map_fragment_created(self, obj_id):\n        self.fragment = MapFragment(__id__=obj_id)\n\n        #: Setup callback so we know when the map is ready\n        self.map.onMapReady.connect(self.on_map_ready)\n        self.fragment.getMapAsync(self.map.getId())\n\n        context = self.get_context()\n\n        def on_transaction(id):\n            trans = FragmentTransaction(__id__=id)\n            trans.add(self.widget.getId(), self.fragment)\n            trans.commit()\n\n        def on_fragment_manager(id):\n            fm = FragmentManager(__id__=id)\n            fm.beginTransaction().then(on_transaction)\n\n        context.widget.getSupportFragmentManager().then(on_fragment_manager)", "response": "Create the fragment and pull the map reference when it's loaded."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef on_map_clicked(self, pos):\n        d = self.declaration\n        d.clicked({\n            'click': 'short',\n            'position': tuple(pos)\n        })", "response": "Called when the map is clicked"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef on_map_long_clicked(self, pos):\n        d = self.declaration\n        d.clicked({\n            'click': 'long',\n            'position': tuple(pos)\n        })", "response": "Called when the map is long-clicked."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef destroy(self):\n        marker = self.marker\n        parent = self.parent()\n        if marker:\n            if parent:\n                del parent.markers[marker.__id__]\n            marker.remove()\n        super(AndroidMapItemBase, self).destroy()", "response": "Remove the marker if it was added to the map when destroying"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef child_added(self, child):\n        if child.widget:\n            # TODO: Should we keep count and remove the adapter if not all\n            # markers request it?\n            self.parent().init_info_window_adapter()\n        super(AndroidMapMarker, self).child_added(child)", "response": "Called when a child is added to the marker; initializes the parent map's info window adapter if the child has a widget."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef on_marker(self, marker):\n        mid, pos = marker\n        self.marker = Marker(__id__=mid)\n        mapview = self.parent()\n        # Save ref\n        mapview.markers[mid] = self\n\n        # Required so the packer can pass the id\n        self.marker.setTag(mid)\n\n        # If we have a child widget we must configure the map to use the\n        # custom adapter\n        for w in self.child_widgets():\n            mapview.init_info_window_adapter()\n            break\n\n        d = self.declaration\n        if d.show_info:\n            self.set_show_info(d.show_info)\n\n        #: Can free the options now\n        del self.options", "response": "Called when a marker is added to the mapview"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef on_marker(self, mid):\n        self.marker = Circle(__id__=mid)\n        self.parent().markers[mid] = self\n\n        #: Required so the packer can pass the id\n        self.marker.setTag(mid)\n\n        d = self.declaration\n        if d.clickable:\n            self.set_clickable(d.clickable)\n\n        #: Can free the options now\n        del self.options", "response": "Called when the circle's id is received; stores the Circle reference and registers it with the parent map."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef fit_transform(self, raw_documents, y=None):\n        documents = super(CountVectorizer, self).fit_transform(\n            raw_documents=raw_documents, y=y)\n        self.n = len(raw_documents)\n        m = (self.transform(raw_documents) > 0).astype(int)\n        m = m.sum(axis=0).A1\n        self.period_ = m\n        self.df_ = m / self.n\n        return documents", "response": "Fits the vocabulary dictionary and returns the term-document matrix, also recording each term's document count (period_) and document frequency (df_)."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\nadd data to the flow", "response": "def data(self, X=None, y=None, sentences=None):\n        \"\"\"\n        Add data to flow\n\n        \"\"\"\n        self.X = X\n        self.y = y\n        self.sentences = sentences"}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nadd a transformer to the flow and apply it to data in flow", "response": "def transform(self, transformer):\n        \"\"\"\n        Add transformer to flow and apply transformer to data in flow\n\n        Parameters\n        ----------\n        transformer : Transformer\n            a transformer to transform data\n        \"\"\"\n        self.transformers.append(transformer)\n        from languageflow.transformer.tagged import TaggedTransformer\n\n        if isinstance(transformer, TaggedTransformer):\n            self.X, self.y = transformer.transform(self.sentences)\n        if isinstance(transformer, TfidfVectorizer):\n            self.X = transformer.fit_transform(self.X)\n        if isinstance(transformer, CountVectorizer):\n            self.X = transformer.fit_transform(self.X)\n        if isinstance(transformer, NumberRemover):\n            self.X = transformer.transform(self.X)\n\n        if isinstance(transformer, MultiLabelBinarizer):\n            self.y = transformer.fit_transform(self.y)"}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\ntraining model with transformed data", "response": "def train(self):\n        \"\"\"\n        Train model with transformed data\n        \"\"\"\n        for i, model in enumerate(self.models):\n            N = [int(i * len(self.y)) for i in self.lc_range]\n            for n in N:\n                X = self.X[:n]\n                y = self.y[:n]\n                e = Experiment(X, y, model.estimator, self.scores,\n                               self.validation_method)\n                e.log_folder = self.log_folder\n                e.train()"}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nexport the model and transformers to export_folder", "response": "def export(self, model_name, export_folder):\n        \"\"\"\n        Export model and transformers to export_folder\n\n        Parameters\n        ----------\n        model_name: string\n            name of model to export\n        export_folder: string\n            folder to store exported model and transformers\n        \"\"\"\n        for transformer in self.transformers:\n            if isinstance(transformer, MultiLabelBinarizer):\n                joblib.dump(transformer,\n                            join(export_folder, \"label.transformer.bin\"),\n                            protocol=2)\n            if isinstance(transformer, TfidfVectorizer):\n                joblib.dump(transformer,\n                            join(export_folder, \"tfidf.transformer.bin\"),\n                            protocol=2)\n            if isinstance(transformer, CountVectorizer):\n                joblib.dump(transformer,\n                            join(export_folder, \"count.transformer.bin\"),\n                            protocol=2)\n            if isinstance(transformer, NumberRemover):\n                joblib.dump(transformer,\n                            join(export_folder, \"number.transformer.bin\"),\n                            protocol=2)\n        model = [model for model in self.models if model.name == model_name][0]\n        e = Experiment(self.X, self.y, model.estimator, None)\n        model_filename = join(export_folder, \"model.bin\")\n        e.export(model_filename)"}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nlearn vocabulary and idf return term - document matrix.", "response": "def fit_transform(self, raw_documents, y=None):\n        \"\"\"Learn vocabulary and idf, return term-document matrix.\n        This is equivalent to fit followed by transform, but more efficiently\n        implemented.\n        Parameters\n        ----------\n        raw_documents : iterable\n            an iterable which yields either str, unicode or file objects\n        Returns\n        -------\n        X : sparse matrix, [n_samples, n_features]\n            Tf-idf-weighted document-term matrix.\n        \"\"\"\n        documents = super(TfidfVectorizer, self).fit_transform(\n            raw_documents=raw_documents, y=y)\n        count = CountVectorizer(encoding=self.encoding,\n                                decode_error=self.decode_error,\n                                strip_accents=self.strip_accents,\n                                lowercase=self.lowercase,\n                                preprocessor=self.preprocessor,\n                                tokenizer=self.tokenizer,\n                                stop_words=self.stop_words,\n                                token_pattern=self.token_pattern,\n                                ngram_range=self.ngram_range,\n                                analyzer=self.analyzer,\n                                max_df=self.max_df,\n                                min_df=self.min_df,\n                                max_features=self.max_features,\n                                vocabulary=self.vocabulary_,\n                                binary=self.binary,\n                                dtype=self.dtype)\n        count.fit_transform(raw_documents=raw_documents, y=y)\n        self.period_ = count.period_\n        self.df_ = count.df_\n        self.n = count.n\n        return documents"}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nfit linear model with Stochastic Gradient Descent.", "response": "def fit(self, X, y, coef_init=None, intercept_init=None,\n            sample_weight=None):\n        \"\"\"Fit linear model with Stochastic Gradient Descent.\n\n        Parameters\n        ----------\n        X : {array-like, sparse matrix}, shape (n_samples, n_features)\n            Training data\n\n        y : numpy array, shape (n_samples,)\n            Target values\n\n        coef_init : array, shape (n_classes, n_features)\n            The initial coefficients to warm-start the optimization.\n\n        intercept_init : array, shape (n_classes,)\n            The initial intercept to warm-start the optimization.\n\n        sample_weight : array-like, shape (n_samples,), optional\n            Weights applied to individual samples.\n            If not provided, uniform weights are assumed. These weights will\n            be multiplied with class_weight (passed through the\n            constructor) if class_weight is specified\n\n        Returns\n        -------\n        self : returns an instance of self.\n        \"\"\"\n        super(SGDClassifier, self).fit(X, y, coef_init, intercept_init,\n                                       sample_weight)"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef print_cm(cm, labels, hide_zeroes=False, hide_diagonal=False, hide_threshold=None):\n    columnwidth = max([len(x) for x in labels] + [5])  # 5 is value length\n    empty_cell = \" \" * columnwidth\n    # Print header\n    print(\"    \" + empty_cell, end=\" \")\n    for label in labels:\n        print(\"%{0}s\".format(columnwidth) % label, end=\" \")\n    print()\n    # Print rows\n    for i, label1 in enumerate(labels):\n        print(\"    %{0}s\".format(columnwidth) % label1, end=\" \")\n        for j in range(len(labels)):\n            cell = \"%{0}.1f\".format(columnwidth) % cm[i, j]\n            if hide_zeroes:\n                cell = cell if float(cm[i, j]) != 0 else empty_cell\n            if hide_diagonal:\n                cell = cell if i != j else empty_cell\n            if hide_threshold:\n                cell = cell if cm[i, j] > hide_threshold else empty_cell\n            print(cell, end=\" \")\n        print()", "response": "Pretty-prints a confusion matrix with row and column labels, optionally hiding zero cells, diagonal cells, or cells at or below a threshold."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef load_big_file(f):\n    logger.info(f'loading file {f}')\n    with open(f, 'r+b') as f_in:\n        # mmap seems to be much more memory efficient\n        bf = mmap.mmap(f_in.fileno(), 0)\n        f_in.close()\n    return bf", "response": "Workaround for loading a big pickle file: memory-maps the file and returns the mmap object."}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nconvert a url into a filename in a reversible way.", "response": "def url_to_filename(url: str, etag: str = None) -> str:\n    \"\"\"\n    Converts a url into a filename in a reversible way.\n    If `etag` is specified, add it on the end, separated by a period\n    (which necessarily won't appear in the base64-encoded filename).\n    Get rid of the quotes in the etag, since Windows doesn't like them.\n    \"\"\"\n    url_bytes = url.encode('utf-8')\n    b64_bytes = base64.b64encode(url_bytes)\n    decoded = b64_bytes.decode('utf-8')\n\n    if etag:\n        # Remove quotes from etag\n        etag = etag.replace('\"', '')\n        return f\"{decoded}.{etag}\"\n    else:\n        return decoded"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef filename_to_url(filename: str) -> Tuple[str, str]:\n    try:\n        # If there is an etag, it's everything after the first period\n        decoded, etag = filename.split(\".\", 1)\n    except ValueError:\n        # Otherwise, use None\n        decoded, etag = filename, None\n\n    filename_bytes = decoded.encode('utf-8')\n    url_bytes = base64.b64decode(filename_bytes)\n    return url_bytes.decode('utf-8'), etag", "response": "Converts a base64-encoded filename back into the original URL and its ETag (None if the filename has no ETag suffix)."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\ngiving something that might be a URL or a local path return the path to the cached file.", "response": "def cached_path(url_or_filename: str, cache_dir: Path) -> Path:\n    \"\"\"\n    Given something that might be a URL (or might be a local path),\n    determine which. If it's a URL, download the file and cache it, and\n    return the path to the cached file. If it's already a local path,\n    make sure the file exists and then return the path.\n    \"\"\"\n    dataset_cache = Path(CACHE_ROOT) / cache_dir\n\n    parsed = urlparse(url_or_filename)\n\n    if parsed.scheme in ('http', 'https'):\n        # URL, so get it from the cache (downloading if necessary)\n        return get_from_cache(url_or_filename, dataset_cache)\n    elif parsed.scheme == '' and Path(url_or_filename).exists():\n        # File, and it exists.\n        return Path(url_or_filename)\n    elif parsed.scheme == '':\n        # File, but it doesn't exist.\n        raise FileNotFoundError(\"file {} not found\".format(url_or_filename))\n    else:\n        # Something unknown\n        raise ValueError(\"unable to parse {} as a URL or as a local path\".format(url_or_filename))"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\ndownloading a file from local cache and return the path to the cached file.", "response": "def get_from_cache(url: str, cache_dir: Path = None) -> Path:\n    \"\"\"\n    Given a URL, look for the corresponding dataset in the local cache.\n    If it's not there, download it. Then return the path to the cached file.\n    \"\"\"\n    cache_dir.mkdir(parents=True, exist_ok=True)\n\n    filename = re.sub(r'.+/', '', url)\n    # get cache path to put the file\n    cache_path = cache_dir / filename\n    if cache_path.exists():\n        return cache_path\n\n    # make HEAD request to check ETag\n    response = requests.head(url)\n\n    if response.status_code != 200:\n        if \"www.dropbox.com\" in url:\n            # dropbox return code 301, so we ignore this error\n            pass\n        else:\n            raise IOError(\"HEAD request failed for url {}\".format(url))\n\n    # add ETag to filename if it exists\n    # etag = response.headers.get(\"ETag\")\n\n    if not cache_path.exists():\n        # Download to temporary file, then copy to cache dir once finished.\n        # Otherwise you get corrupt cache entries if the download gets interrupted.\n        fd, temp_filename = tempfile.mkstemp()\n        logger.info(\"%s not found in cache, downloading to %s\", url, temp_filename)\n\n        # GET file object\n        req = requests.get(url, stream=True)\n        content_length = req.headers.get('Content-Length')\n        total = int(content_length) if content_length is not None else None\n        progress = Tqdm.tqdm(unit=\"B\", total=total)\n        with open(temp_filename, 'wb') as temp_file:\n            for chunk in req.iter_content(chunk_size=1024):\n                if chunk: # filter out keep-alive new chunks\n                    progress.update(len(chunk))\n                    temp_file.write(chunk)\n\n        progress.close()\n\n        logger.info(\"copying %s to cache at %s\", 
temp_filename, cache_path)\n        shutil.copyfile(temp_filename, str(cache_path))\n        logger.info(\"removing temp file %s\", temp_filename)\n        os.close(fd)\n        os.remove(temp_filename)\n\n    return cache_path"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef fit(self, X, y):\n        trainer = pycrfsuite.Trainer(verbose=True)\n        for xseq, yseq in zip(X, y):\n            trainer.append(xseq, yseq)\n\n        trainer.set_params(self.params)\n        if self.filename:\n            filename = self.filename\n        else:\n            filename = 'model.tmp'\n        trainer.train(filename)\n        tagger = pycrfsuite.Tagger()\n        tagger.open(filename)\n        self.estimator = tagger", "response": "Fit CRF according to X, y."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\npredicting class labels for samples in X.", "response": "def predict(self, X):\n        \"\"\"Predict class labels for samples in X.\n\n        Parameters\n        ----------\n        X : {array-like, sparse matrix}, shape = [n_samples, n_features]\n            Samples.\n        \"\"\"\n        if isinstance(X[0], list):\n            return [self.estimator.tag(x) for x in X]\n        return self.estimator.tag(X)"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef serve(self, port=62000):\n\n        from http.server import HTTPServer, CGIHTTPRequestHandler\n        os.chdir(self.log_folder)\n\n        httpd = HTTPServer(('', port), CGIHTTPRequestHandler)\n        print(\"Starting LanguageBoard on port: \" + str(httpd.server_port))\n        webbrowser.open('http://0.0.0.0:{}'.format(port))\n        httpd.serve_forever()", "response": "Start the language board web application."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef predict(self, X):\n        x = X\n        if not isinstance(X, list):\n            x = [X]\n        y = self.estimator.predict(x)\n        y = [item[0] for item in y]\n        y = [self._remove_prefix(label) for label in y]\n        if not isinstance(X, list):\n            y = y[0]\n        return y", "response": "Predicts the most likely label for a raw text or for each text in a list of raw texts, with the label prefix removed. A single text returns a single label; a list returns a list of labels."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef log(model_folder, binary_file=\"count.transformer.bin\",\n            log_folder=\"analyze\"):\n        \"\"\"\n        Parameters\n        ----------\n        model_folder : string\n            folder contains binaries file of model\n        binary_file : string\n            file path to count transformer binary file\n        log_folder : string\n            log folder\n        \"\"\"\n        file = join(model_folder, binary_file)\n        vectorizer = joblib.load(file)\n        output = []\n\n        for token in vectorizer.vocabulary_:\n            index = vectorizer.vocabulary_[token]\n            ngram = len(token.split(\" \"))\n            output.append({\n                \"token\": token,\n                \"ngram\": ngram,\n                \"period\": vectorizer.period_[index].item(),\n                \"df\": vectorizer.df_[index],\n            })\n        output = sorted(output, key=lambda item: item[\"df\"])\n        content = json.dumps(output, ensure_ascii=False)\n        write(join(log_folder, \"count.json\"), content)", "response": "Loads the count transformer from the model folder and writes each token's n-gram length, period, and document frequency to count.json in the log folder, sorted by document frequency."}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\nfits KimCNNClassifier according to X y", "response": "def fit(self, X, y):\n        \"\"\"Fit KimCNNClassifier according to X, y\n\n        Parameters\n        ----------\n        X : list of string\n            each item is a raw text\n        y : list of string\n            each item is a label\n        \"\"\"\n        ####################\n        # Data Loader\n        ####################\n        word_vector_transformer = WordVectorTransformer(padding='max')\n        X = word_vector_transformer.fit_transform(X)\n        X = LongTensor(X)\n        self.word_vector_transformer = word_vector_transformer\n\n        y_transformer = LabelEncoder()\n        y = y_transformer.fit_transform(y)\n        y = torch.from_numpy(y)\n        self.y_transformer = y_transformer\n\n        dataset = CategorizedDataset(X, y)\n        dataloader = DataLoader(dataset,\n                                batch_size=self.batch_size,\n                                shuffle=True,\n                                num_workers=4)\n\n        ####################\n        # Model\n        ####################\n        KERNEL_SIZES = self.kernel_sizes\n        NUM_KERNEL = self.num_kernel\n        EMBEDDING_DIM = self.embedding_dim\n\n        model = TextCNN(\n            vocab_size=word_vector_transformer.get_vocab_size(),\n            embedding_dim=EMBEDDING_DIM,\n            output_size=len(self.y_transformer.classes_),\n            kernel_sizes=KERNEL_SIZES,\n            num_kernel=NUM_KERNEL)\n        if USE_CUDA:\n            model = model.cuda()\n\n        ####################\n        # Train\n        ####################\n        EPOCH = self.epoch\n        LR = self.lr\n\n        loss_function = nn.CrossEntropyLoss()\n        optimizer = optim.Adam(model.parameters(), lr=LR)\n\n        for epoch in range(EPOCH):\n            losses = []\n            for i, data in 
enumerate(dataloader):\n                X, y = data\n                X, y = Variable(X), Variable(y)\n\n                optimizer.zero_grad()\n                model.train()\n                output = model(X)\n\n                loss = loss_function(output, y)\n                losses.append(loss.data.tolist()[0])\n                loss.backward()\n\n                optimizer.step()\n\n                if i % 100 == 0:\n                    print(\"[%d/%d] mean_loss : %0.2f\" % (\n                        epoch, EPOCH, np.mean(losses)))\n                    losses = []\n        self.model = model"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef config_sources(app, environment, cluster, configs_dirs, app_dir,\n                   local=False, build=False):\n    \"\"\"Return the config files for an environment & cluster specific app.\"\"\"\n    sources = [\n        # Machine-specific\n        (configs_dirs, 'hostname'),\n        (configs_dirs, 'hostname-local'),\n        (configs_dirs, 'hostname-build'),\n        # Global\n        (configs_dirs, 'common'),\n        # Environment + Cluster\n        (configs_dirs, 'common-%s' % environment),\n        (configs_dirs, 'common-%s-%s' % (environment, cluster)),\n        (configs_dirs, 'common-local'),\n        (configs_dirs, 'common-build'),\n        # Machine-specific overrides\n        (configs_dirs, 'common-overrides'),\n        # Application-specific\n        ([app_dir], '%s-default' % app),\n        ([app_dir], '%s-%s' % (app, environment)),\n        ([app_dir], '%s-%s-%s' % (app, environment, cluster)),\n        (configs_dirs, app),\n        (configs_dirs, '%s-%s' % (app, environment)),\n        (configs_dirs, '%s-%s-%s' % (app, environment, cluster)),\n        ([app_dir], '%s-local' % app),\n        ([app_dir], '%s-build' % app),\n        (configs_dirs, '%s-local' % app),\n        (configs_dirs, '%s-build' % app),\n        # Machine-specific application override\n        (configs_dirs, '%s-overrides' % app),\n    ]\n\n    # Filter out build sources if not requested\n    if not build:\n        sources = [source for source in sources\n                   if not source[1].endswith('-build')]\n        # Filter out local sources if not build and not local\n        if not local:\n            sources = [source for source in sources\n                       if not source[1].endswith('-local')]\n\n    return available_sources(sources)", "response": "Return the config files for an application environment & cluster specific app."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\nyielding the sources that are present in the source directory.", "response": "def available_sources(sources):\n    \"\"\"Yield the sources that are present.\"\"\"\n    for dirs, name in sources:\n        for directory in dirs:\n            fn = os.path.join(directory, name) + '.py'\n            if os.path.isfile(fn):\n                yield fn"}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nmerge the configuration sources and return the resulting DotDict.", "response": "def smush_config(sources, initial=None):\n    \"\"\"Merge the configuration sources and return the resulting DotDict.\"\"\"\n    if initial is None:\n        initial = {}\n    config = DotDict(initial)\n\n    for fn in sources:\n        log.debug('Merging %s', fn)\n        mod = get_config_module(fn)\n        config = mod.update(config)\n        log.debug('Current config:\\n%s', json.dumps(config, indent=4,\n                                                    cls=LenientJSONEncoder))\n    return config"}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nmerge dictionary d2 into d1 overriding entries in d1 with values from d2.", "response": "def merge_dicts(d1, d2, _path=None):\n    \"\"\"\n    Merge dictionary d2 into d1, overriding entries in d1 with values from d2.\n\n    d1 is mutated.\n\n    _path is for internal, recursive use.\n    \"\"\"\n    if _path is None:\n        _path = ()\n    if isinstance(d1, dict) and isinstance(d2, dict):\n        for k, v in d2.items():\n            if isinstance(v, MissingValue) and v.name is None:\n                v.name = '.'.join(_path + (k,))\n\n            if isinstance(v, DeletedValue):\n                d1.pop(k, None)\n            elif k not in d1:\n                if isinstance(v, dict):\n                    d1[k] = merge_dicts({}, v, _path + (k,))\n                else:\n                    d1[k] = v\n            else:\n                if isinstance(d1[k], dict) and isinstance(v, dict):\n                    d1[k] = merge_dicts(d1[k], v, _path + (k,))\n                elif isinstance(d1[k], list) and isinstance(v, list):\n                    # Lists are only supported as leaves\n                    d1[k] += v\n                elif isinstance(d1[k], MissingValue):\n                    d1[k] = v\n                elif d1[k] is None:\n                    d1[k] = v\n                elif type(d1[k]) == type(v):\n                    d1[k] = v\n                else:\n                    raise TypeError('Refusing to replace a %s with a %s'\n                                    % (type(d1[k]), type(v)))\n    else:\n        raise TypeError('Cannot merge a %s with a %s' % (type(d1), type(d2)))\n\n    return d1"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef filter_dict(unfiltered, filter_keys):\n    filtered = DotDict()\n    for k in filter_keys:\n        filtered[k] = unfiltered[k]\n    return filtered", "response": "Return a subset of a dictionary using the specified keys."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef _convert_item(self, obj):\n        if isinstance(obj, dict) and not isinstance(obj, DotDict):\n            obj = DotDict(obj)\n        elif isinstance(obj, list):\n            # must mutate and not just reassign, otherwise it will\n            # just use original object mutable/immutable\n            for i, item in enumerate(obj):\n                if isinstance(item, dict) and not isinstance(item, DotDict):\n                    obj[i] = DotDict(item)\n        return obj", "response": "Convert obj into a DotDict or list of DotDict.\n\n        Directly nested lists aren't supported."}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nreturning a config subset using the filter defined in the deploy config.", "response": "def filter_config(config, deploy_config):\n    \"\"\"Return a config subset using the filter defined in the deploy config.\"\"\"\n    if not os.path.isfile(deploy_config):\n        return DotDict()\n    config_module = get_config_module(deploy_config)\n    return config_module.filter(config)"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef seeded_auth_token(client, service, seed):\n    hash_func = hashlib.md5()\n    token = ','.join((client, service, seed)).encode('utf-8')\n    hash_func.update(token)\n    return hash_func.hexdigest()", "response": "Return an auth token based on the client + service + seed tuple."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef write_config(config, app_dir, filename='configuration.json'):\n    path = os.path.join(app_dir, filename)\n    with open(path, 'w') as f:\n        json.dump(\n            config, f, indent=4, cls=DetectMissingEncoder,\n            separators=(',', ': '))", "response": "Write the configuration to the application directory."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nimport the config file as yoconfigurator.configs.<config_basename>.", "response": "def get_config_module(config_pathname):\n    \"\"\"Imports the config file to yoconfigurator.configs.<config_basename>.\"\"\"\n    configs_mod = 'yoconfigurator.configs'\n    if configs_mod not in sys.modules:\n        sys.modules[configs_mod] = types.ModuleType(configs_mod)\n    module_name = os.path.basename(config_pathname).rsplit('.', 1)[0]\n    module_name = configs_mod + '.' + module_name\n    return _load_module(module_name, config_pathname)"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef validate_date(date_text):\n    try:\n        if int(date_text) < 0:\n            return True\n    except ValueError:\n        pass\n\n    try:\n        datetime.strptime(date_text, '%Y-%m-%d')\n        return True\n    except ValueError:\n        pass\n\n    raise ValueError('Dates must be negative integers or YYYY-MM-DD in the past.')", "response": "Return True if date_text is a negative integer or a YYYY-MM-DD date; raise ValueError otherwise."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef get_download_total(rows):\n    headers = rows.pop(0)\n    index = headers.index('download_count')\n    total_downloads = sum(int(row[index]) for row in rows)\n\n    rows.insert(0, headers)\n    return total_downloads, index", "response": "Return the total downloads and the index of the downloads column."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\nadding a final row to rows showing the total downloads", "response": "def add_download_total(rows):\n    \"\"\"Add a final row to rows showing the total downloads\"\"\"\n    total_row = [\"\"] * len(rows[0])\n    total_row[0] = \"Total\"\n    total_downloads, downloads_column = get_download_total(rows)\n    total_row[downloads_column] = str(total_downloads)\n    rows.append(total_row)\n\n    return rows"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\ngetting pypinfo for a single object.", "response": "def pypinfo(\n    ctx,\n    project,\n    fields,\n    auth,\n    run,\n    json,\n    indent,\n    timeout,\n    limit,\n    days,\n    start_date,\n    end_date,\n    where,\n    order,\n    all_installers,\n    percent,\n    markdown,\n):\n    \"\"\"Valid fields are:\\n\n    project | version | file | pyversion | percent3 | percent2 | impl | impl-version |\\n\n    openssl | date | month | year | country | installer | installer-version |\\n\n    setuptools-version | system | system-release | distro | distro-version | cpu\n    \"\"\"\n    if auth:\n        set_credentials(auth)\n        click.echo('Credentials location set to \"{}\".'.format(get_credentials()))\n        return\n\n    if project is None and not fields:\n        click.echo(ctx.get_help())\n        return\n\n    parsed_fields = []\n    for field in fields:\n        parsed = FIELD_MAP.get(field)\n        if parsed is None:\n            raise ValueError('\"{}\" is an unsupported field.'.format(field))\n        parsed_fields.append(parsed)\n\n    order_name = order\n    order = FIELD_MAP.get(order)\n    if order:\n        order_name = order.name\n        parsed_fields.insert(0, order)\n\n    built_query = build_query(\n        project,\n        parsed_fields,\n        limit=limit,\n        days=days,\n        start_date=start_date,\n        end_date=end_date,\n        where=where,\n        order=order_name,\n        pip=not all_installers,\n    )\n\n    if run:\n        client = create_client(get_credentials())\n        query_job = client.query(built_query, job_config=create_config())\n        query_rows = query_job.result(timeout=timeout // 1000)\n\n        # Cached\n        from_cache = not not query_job.cache_hit\n\n        # Processed\n        bytes_processed = query_job.total_bytes_processed or 0\n        processed_amount, processed_unit = convert_units(bytes_processed)\n\n   
     # Billed\n        bytes_billed = query_job.total_bytes_billed or 0\n        billed_amount, billed_unit = convert_units(bytes_billed)\n\n        # Cost\n        billing_tier = query_job.billing_tier or 1\n        estimated_cost = Decimal(TIER_COST * billing_tier) / TB * Decimal(bytes_billed)\n        estimated_cost = str(estimated_cost.quantize(TO_CENTS, rounding=ROUND_UP))\n\n        rows = parse_query_result(query_job, query_rows)\n\n        if percent:\n            rows = add_percentages(rows, include_sign=not json)\n\n        # Only for tables, and if more than the header row + a single data row\n        if len(rows) > 2 and not json:\n            rows = add_download_total(rows)\n\n        if not json:\n            click.echo('Served from cache: {}'.format(from_cache))\n            click.echo('Data processed: {:.2f} {}'.format(processed_amount, processed_unit))\n            click.echo('Data billed: {:.2f} {}'.format(billed_amount, billed_unit))\n            click.echo('Estimated cost: ${}'.format(estimated_cost))\n\n            click.echo()\n            click.echo(tabulate(rows, markdown))\n        else:\n            query_info = {\n                'cached': from_cache,\n                'bytes_processed': bytes_processed,\n                'bytes_billed': bytes_billed,\n                'estimated_cost': estimated_cost,\n            }\n            click.echo(format_json(rows, query_info, indent))\n    else:\n        click.echo(built_query)"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nfinding and patching an entry in the soup.", "response": "def find_and_patch_entry(soup, entry):\n    \"\"\"\n    Modify soup so Dash.app can generate TOCs on the fly.\n    \"\"\"\n    link = soup.find(\"a\", {\"class\": \"headerlink\"}, href=\"#\" + entry.anchor)\n    tag = soup.new_tag(\"a\")\n    tag[\"name\"] = APPLE_REF_TEMPLATE.format(entry.type, entry.name)\n    if link:\n        link.parent.insert(0, tag)\n        return True\n    elif entry.anchor.startswith(\"module-\"):\n        soup.h1.parent.insert(0, tag)\n        return True\n    else:\n        return False"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef inv_entry_to_path(data):\n    path_tuple = data[2].split(\"#\")\n    if len(path_tuple) > 1:\n        path_str = \"#\".join((path_tuple[0], path_tuple[-1]))\n    else:\n        path_str = data[2]\n    return path_str", "response": "Convert an intersphinx inventory entry to a path string."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef main(\n    source,\n    force,\n    name,\n    quiet,\n    verbose,\n    destination,\n    add_to_dash,\n    add_to_global,\n    icon,\n    index_page,\n    enable_js,\n    online_redirect_url,\n    parser,\n):\n    \"\"\"\n    Convert docs from SOURCE to Dash.app's docset format.\n    \"\"\"\n    try:\n        logging.config.dictConfig(\n            create_log_config(verbose=verbose, quiet=quiet)\n        )\n    except ValueError as e:\n        click.secho(e.args[0], fg=\"red\")\n        raise SystemExit(1)\n\n    if icon:\n        icon_data = icon.read()\n        if not icon_data.startswith(PNG_HEADER):\n            log.error(\n                '\"{}\" is not a valid PNG image.'.format(\n                    click.format_filename(icon.name)\n                )\n            )\n            raise SystemExit(1)\n    else:\n        icon_data = None\n\n    source, dest, name = setup_paths(\n        source,\n        destination,\n        name=name,\n        add_to_global=add_to_global,\n        force=force,\n    )\n    if parser is None:\n        parser = parsers.get_doctype(source)\n        if parser is None:\n            log.error(\n                '\"{}\" does not contain a known documentation format.'.format(\n                    click.format_filename(source)\n                )\n            )\n            raise SystemExit(errno.EINVAL)\n    docset = prepare_docset(\n        source, dest, name, index_page, enable_js, online_redirect_url\n    )\n    doc_parser = parser(doc_path=docset.docs)\n    log.info(\n        (\n            \"Converting \"\n            + click.style(\"{parser_name}\", bold=True)\n            + ' docs from \"{src}\" to \"{dst}\".'\n        ).format(\n            parser_name=parser.name,\n            src=click.format_filename(source, shorten=True),\n            dst=click.format_filename(dest),\n        )\n    )\n\n    with docset.db_conn:\n        
log.info(\"Parsing documentation...\")\n        toc = patch_anchors(doc_parser, show_progressbar=not quiet)\n        for entry in doc_parser.parse():\n            docset.db_conn.execute(\n                \"INSERT INTO searchIndex VALUES (NULL, ?, ?, ?)\",\n                entry.as_tuple(),\n            )\n            toc.send(entry)\n        count = docset.db_conn.execute(\n            \"SELECT COUNT(1) FROM searchIndex\"\n        ).fetchone()[0]\n        log.info(\n            (\n                \"Added \"\n                + click.style(\"{count:,}\", fg=\"green\" if count > 0 else \"red\")\n                + \" index entries.\"\n            ).format(count=count)\n        )\n        toc.close()\n\n    if icon_data:\n        add_icon(icon_data, dest)\n\n    if add_to_dash or add_to_global:\n        log.info(\"Adding to Dash.app...\")\n        os.system('open -a dash \"{}\"'.format(dest))", "response": "Convert docs from SOURCE to Dash.app's docset format."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\ncreate a log config dictionary.", "response": "def create_log_config(verbose, quiet):\n    \"\"\"\n    We use logging's levels as an easy-to-use verbosity controller.\n    \"\"\"\n    if verbose and quiet:\n        raise ValueError(\n            \"Supplying both --quiet and --verbose makes no sense.\"\n        )\n    elif verbose:\n        level = logging.DEBUG\n    elif quiet:\n        level = logging.ERROR\n    else:\n        level = logging.INFO\n\n    logger_cfg = {\"handlers\": [\"click_handler\"], \"level\": level}\n\n    return {\n        \"version\": 1,\n        \"formatters\": {\"click_formatter\": {\"format\": \"%(message)s\"}},\n        \"handlers\": {\n            \"click_handler\": {\n                \"level\": level,\n                \"class\": \"doc2dash.__main__.ClickEchoHandler\",\n                \"formatter\": \"click_formatter\",\n            }\n        },\n        \"loggers\": {\"doc2dash\": logger_cfg, \"__main__\": logger_cfg},\n    }"}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\ndetermines source and destination using the options.", "response": "def setup_paths(source, destination, name, add_to_global, force):\n    \"\"\"\n    Determine source and destination using the options.\n    \"\"\"\n    if source[-1] == \"/\":\n        source = source[:-1]\n    if not name:\n        name = os.path.split(source)[-1]\n    elif name.endswith(\".docset\"):\n        name = name.replace(\".docset\", \"\")\n    if add_to_global:\n        destination = DEFAULT_DOCSET_PATH\n    dest = os.path.join(destination or \"\", name + \".docset\")\n    dst_exists = os.path.lexists(dest)\n    if dst_exists and force:\n        shutil.rmtree(dest)\n    elif dst_exists:\n        log.error(\n            'Destination path \"{}\" already exists.'.format(\n                click.format_filename(dest)\n            )\n        )\n        raise SystemExit(errno.EEXIST)\n    return source, dest, name"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef prepare_docset(\n    source, dest, name, index_page, enable_js, online_redirect_url\n):\n    \"\"\"\n    Create boilerplate files & directories and copy vanilla docs inside.\n\n    Return a tuple of path to resources and connection to sqlite db.\n    \"\"\"\n    resources = os.path.join(dest, \"Contents\", \"Resources\")\n    docs = os.path.join(resources, \"Documents\")\n    os.makedirs(resources)\n\n    db_conn = sqlite3.connect(os.path.join(resources, \"docSet.dsidx\"))\n    db_conn.row_factory = sqlite3.Row\n    db_conn.execute(\n        \"CREATE TABLE searchIndex(id INTEGER PRIMARY KEY, name TEXT, \"\n        \"type TEXT, path TEXT)\"\n    )\n    db_conn.commit()\n\n    plist_path = os.path.join(dest, \"Contents\", \"Info.plist\")\n    plist_cfg = {\n        \"CFBundleIdentifier\": name,\n        \"CFBundleName\": name,\n        \"DocSetPlatformFamily\": name.lower(),\n        \"DashDocSetFamily\": \"python\",\n        \"isDashDocset\": True,\n        \"isJavaScriptEnabled\": enable_js,\n    }\n    if index_page is not None:\n        plist_cfg[\"dashIndexFilePath\"] = index_page\n    if online_redirect_url is not None:\n        plist_cfg[\"DashDocSetFallbackURL\"] = online_redirect_url\n\n    write_plist(plist_cfg, plist_path)\n\n    shutil.copytree(source, docs)\n\n    return DocSet(path=dest, docs=docs, plist=plist_path, db_conn=db_conn)", "response": "Create the docset's boilerplate files and directories, copy the source docs inside, and return a DocSet holding the paths and the SQLite connection."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef add_icon(icon_data, dest):\n    with open(os.path.join(dest, \"icon.png\"), \"wb\") as f:\n        f.write(icon_data)", "response": "Add icon to docset"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef patch_anchors(parser, show_progressbar):\n    files = defaultdict(list)\n    try:\n        while True:\n            pentry = (yield)\n            try:\n                fname, anchor = pentry.path.split(\"#\")\n                files[fname].append(\n                    TOCEntry(name=pentry.name, type=pentry.type, anchor=anchor)\n                )\n            except ValueError:\n                # pydoctor has no anchors for e.g. classes\n                pass\n    except GeneratorExit:\n        pass\n\n    def patch_files(files):\n        for fname, entries in files:\n            full_path = os.path.join(parser.doc_path, fname)\n            with codecs.open(full_path, mode=\"r\", encoding=\"utf-8\") as fp:\n                soup = BeautifulSoup(fp, \"html.parser\")\n                for entry in entries:\n                    if not parser.find_and_patch_entry(soup, entry):\n                        log.debug(\n                            \"Can't find anchor {} in {}.\".format(\n                                entry.anchor, click.format_filename(fname)\n                            )\n                        )\n            with open(full_path, mode=\"wb\") as fp:\n                fp.write(soup.encode(\"utf-8\"))\n\n    if show_progressbar is True:\n        with click.progressbar(\n            files.items(),\n            width=0,\n            length=len(files),\n            label=\"Adding table of contents meta data...\",\n        ) as pbar:\n            patch_files(pbar)\n    else:\n        patch_files(files.items())", "response": "Consume parse entries and patch the anchor names of the TOCs."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef has_file_with(path, filename, content):\n    try:\n        with open(os.path.join(path, filename), \"rb\") as f:\n            return content in f.read()\n    except IOError as e:\n        if e.errno == errno.ENOENT:\n            return False\n        else:\n            raise", "response": "Check whether filename in path contains the string content."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef run_cell(self, cell):\n        globals = self.ipy_shell.user_global_ns\n        locals = self.ipy_shell.user_ns\n        globals.update({\n            \"__ipy_scope__\": None,\n        })\n        try:\n            with redirect_stdout(self.stdout):\n                self.run(cell, globals, locals)\n        except:\n            self.code_error = True\n            if self.options.debug:\n                raise BdbQuit\n        finally:\n            self.finalize()", "response": "Run the IPython code using the IPython globals and locals of the cell."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef get_stack_data(self, frame, traceback, event_type):\n        heap_data = Heap(self.options)\n        stack_data = StackFrames(self.options)\n        stack_frames, cur_frame_ind = self.get_stack(frame, traceback)\n\n        for frame_ind, (frame, lineno) in enumerate(stack_frames):\n            skip_this_stack = False\n\n            # Skip the self.run calling frame (first frame)\n            if frame_ind == 0:\n                continue\n\n            # Skip stack after a certain stack frame depth\n            if len(stack_data) > self.options.depth:\n                skip_this_stack = True\n                break\n\n            # Skip stack when frames dont belong to the current notebook or\n            # current cell, I.e. frames in another global scope altogether\n            # or frames in other cells\n            if (not self.is_notebook_frame(frame) or\n                    self.is_other_cell_frame(frame)):\n                if not self.options.step_all:\n                    skip_this_stack = True\n                    break\n                lineno = 0  # So line markers dont display for these frames\n            else:\n                lineno += 1  # Because cell magic is actually line 1\n\n            # Filter out ignored names from the frame locals\n            user_locals = filter_dict(\n                frame.f_locals,\n                ignore_vars + list(self.ipy_shell.user_ns_hidden.keys())\n            )\n\n            # Add frame and heap data\n            stack_data.add(frame, lineno, event_type, user_locals)\n            heap_data.add(user_locals)\n\n        if not skip_this_stack and not stack_data.is_empty():\n            self.trace_history.append(\n                    stack_data,\n                    heap_data,\n                    self.stdout.getvalue()\n            )", "response": "Get the stack frames data at each of the hooks above ( Ie. 
for each iteration of the Python code)."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef filter_dict(d, exclude):\n    ret = {}\n    for key, value in d.items():\n        if key not in exclude:\n            ret.update({key: value})\n    return ret", "response": "Return a new dict with the specified keys excluded from the original dict."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\nredirect stdout to a new stream for the duration of a with block.", "response": "from contextlib import contextmanager\nimport sys\n\n\n@contextmanager\ndef redirect_stdout(new_stdout):\n    \"\"\"Redirect the stdout\n\n    Args:\n        new_stdout (io.StringIO): New stdout to use instead\n    \"\"\"\n    old_stdout, sys.stdout = sys.stdout, new_stdout\n    try:\n        yield None\n    finally:\n        sys.stdout = old_stdout"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef format(obj, options):\n    formatters = {\n        float_types: lambda x: '{:.{}g}'.format(x, options.digits),\n    }\n    for _types, fmtr in formatters.items():\n        if isinstance(obj, _types):\n            return fmtr(obj)\n    try:\n        if six.PY2 and isinstance(obj, six.string_types):\n            return str(obj.encode('utf-8'))\n        return str(obj)\n    except:\n        return 'OBJECT'", "response": "Return a string representation of the Python object obj."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef get_type_info(obj):\n    if isinstance(obj, primitive_types):\n        return ('primitive', type(obj).__name__)\n    if isinstance(obj, sequence_types):\n        return ('sequence', type(obj).__name__)\n    if isinstance(obj, array_types):\n        return ('array', type(obj).__name__)\n    if isinstance(obj, key_value_types):\n        return ('key-value', type(obj).__name__)\n    if isinstance(obj, types.ModuleType):\n        return ('module', type(obj).__name__)\n    if isinstance(obj, (types.FunctionType, types.MethodType)):\n        return ('function', type(obj).__name__)\n    if isinstance(obj, type):\n        if hasattr(obj, '__dict__'):\n            return ('class', obj.__name__)\n    if isinstance(type(obj), type):\n        if hasattr(obj, '__dict__'):\n            cls_name = type(obj).__name__\n            if cls_name == 'classobj':\n                cls_name = obj.__name__\n                return ('class', '{}'.format(cls_name))\n            if cls_name == 'instance':\n                cls_name = obj.__class__.__name__\n            return ('instance', '{} instance'.format(cls_name))\n\n    return ('unknown', type(obj).__name__)", "response": "Get type information for a Python object as a (category, name) tuple."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef refresh(self):\n        self.accounts = self.accounts or []\n        idx = 0\n        for _acc in self._backend.accounts():\n            _acc.wallet = self\n            try:\n                if self.accounts[idx]:\n                    continue\n            except IndexError:\n                pass\n            self.accounts.append(_acc)\n            idx += 1", "response": "Reloads the wallet and its accounts."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nreturn private spend key.", "response": "def spend_key(self):\n        \"\"\"\n        Returns private spend key. None if wallet is view-only.\n\n        :rtype: str or None\n        \"\"\"\n        key = self._backend.spend_key()\n        if key == numbers.EMPTY_KEY:\n            return None\n        return key"}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\ncreate a new account, append it to the wallet's account list and return it.", "response": "def new_account(self, label=None):\n        \"\"\"\n        Creates new account, appends it to the :class:`Wallet`'s account list and returns it.\n\n        :param label: account label as `str`\n        :rtype: :class:`Account`\n        \"\"\"\n        acc, addr = self._backend.new_account(label=label)\n        assert acc.index == len(self.accounts)\n        self.accounts.append(acc)\n        return acc"}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\nreturns the number of confirmations for given transaction or payment.", "response": "def confirmations(self, txn_or_pmt):\n        \"\"\"\n        Returns the number of confirmations for given\n        :class:`Transaction <monero.transaction.Transaction>` or\n        :class:`Payment <monero.transaction.Payment>` object.\n\n        :rtype: int\n        \"\"\"\n        if isinstance(txn_or_pmt, Payment):\n            txn = txn_or_pmt.transaction\n        else:\n            txn = txn_or_pmt\n        try:\n            return max(0, self.height() - txn.height)\n        except TypeError:\n            return 0"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\ncalculate the sub-address for the given account index and address index within the account.", "response": "def get_address(self, major, minor):\n        \"\"\"\n        Calculates sub-address for account index (`major`) and address index within\n        the account (`minor`).\n\n        :rtype: :class:`BaseAddress <monero.address.BaseAddress>`\n        \"\"\"\n        # ensure indexes are within uint32\n        if major < 0 or major >= 2**32:\n            raise ValueError('major index {} is outside uint32 range'.format(major))\n        if minor < 0 or minor >= 2**32:\n            raise ValueError('minor index {} is outside uint32 range'.format(minor))\n        master_address = self.address()\n        if major == minor == 0:\n            return master_address\n        master_svk = unhexlify(self.view_key())\n        master_psk = unhexlify(self.address().spend_key())\n        # m = Hs(\"SubAddr\\0\" || master_svk || major || minor)\n        hsdata = b''.join([\n                b'SubAddr\\0', master_svk,\n                struct.pack('<I', major), struct.pack('<I', minor)])\n        m = keccak_256(hsdata).digest()\n        # D = master_psk + m * B\n        D = ed25519.add_compressed(\n                ed25519.decodepoint(master_psk),\n                ed25519.scalarmult(ed25519.B, ed25519.decodeint(m)))\n        # C = master_svk * D\n        C = ed25519.scalarmult(D, ed25519.decodeint(master_svk))\n        netbyte = bytearray([\n                42 if master_address.is_mainnet() else \\\n                63 if master_address.is_testnet() else 36])\n        data = netbyte + ed25519.encodepoint(D) + ed25519.encodepoint(C)\n        checksum = keccak_256(data).digest()[:4]\n        return address.SubAddress(base58.encode(hexlify(data + checksum)))"}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nsend a transfer from the default account.", "response": "def transfer(self, address, amount,\n            priority=prio.NORMAL, payment_id=None, unlock_time=0,\n            relay=True):\n        \"\"\"\n        Sends a transfer from the default account. Returns a list of resulting transactions.\n\n        :param address: destination :class:`Address <monero.address.Address>` or subtype\n        :param amount: amount to send\n        :param priority: transaction priority, implies fee. The priority can be a number\n                    from 1 to 4 (unimportant, normal, elevated, priority) or a constant\n                    from `monero.prio`.\n        :param payment_id: ID for the payment (must be None if\n                        :class:`IntegratedAddress <monero.address.IntegratedAddress>`\n                        is used as the destination)\n        :param unlock_time: the extra unlock delay\n        :param relay: if `True`, the wallet will relay the transaction(s) to the network\n                        immediately; when `False`, it will only return the transaction(s)\n                        so they might be broadcasted later\n        :rtype: list of :class:`Transaction <monero.transaction.Transaction>`\n        \"\"\"\n        return self.accounts[0].transfer(\n                address,\n                amount,\n                priority=priority,\n                payment_id=payment_id,\n                unlock_time=unlock_time,\n                relay=relay)"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef transfer_multiple(self, destinations,\n            priority=prio.NORMAL, payment_id=None, unlock_time=0,\n            relay=True):\n        \"\"\"\n        Sends a batch of transfers from the default account. Returns a list of resulting\n        transactions.\n\n        :param destinations: a list of destination and amount pairs: [(address, amount), ...]\n        :param priority: transaction priority, implies fee. The priority can be a number\n                    from 1 to 4 (unimportant, normal, elevated, priority) or a constant\n                    from `monero.prio`.\n        :param payment_id: ID for the payment (must be None if\n                        :class:`IntegratedAddress <monero.address.IntegratedAddress>`\n                        is used as a destination)\n        :param unlock_time: the extra unlock delay\n        :param relay: if `True`, the wallet will relay the transaction(s) to the network\n                        immediately; when `False`, it will only return the transaction(s)\n                        so they might be broadcasted later\n        :rtype: list of :class:`Transaction <monero.transaction.Transaction>`\n        \"\"\"\n        return self.accounts[0].transfer_multiple(\n                destinations,\n                priority=priority,\n                payment_id=payment_id,\n                unlock_time=unlock_time,\n                relay=relay)", "response": "Send a batch of transfers from the default account."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef balance(self, unlocked=False):\n        return self._backend.balances(account=self.index)[1 if unlocked else 0]", "response": "Returns the balance of this account: the unlocked balance if `unlocked` is True, otherwise the total balance."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\ncreate a new address.", "response": "def new_address(self, label=None):\n        \"\"\"\n        Creates a new address.\n\n        :param label: address label as `str`\n        :rtype: :class:`SubAddress <monero.address.SubAddress>`\n        \"\"\"\n        return self._backend.new_address(account=self.index, label=label)"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\nsend a transfer to the specified address.", "response": "def transfer(self, address, amount,\n            priority=prio.NORMAL, payment_id=None, unlock_time=0,\n            relay=True):\n        \"\"\"\n        Sends a transfer. Returns a list of resulting transactions.\n\n        :param address: destination :class:`Address <monero.address.Address>` or subtype\n        :param amount: amount to send\n        :param priority: transaction priority, implies fee. The priority can be a number\n                    from 1 to 4 (unimportant, normal, elevated, priority) or a constant\n                    from `monero.prio`.\n        :param payment_id: ID for the payment (must be None if\n                        :class:`IntegratedAddress <monero.address.IntegratedAddress>`\n                        is used as the destination)\n        :param unlock_time: the extra unlock delay\n        :param relay: if `True`, the wallet will relay the transaction(s) to the network\n                        immediately; when `False`, it will only return the transaction(s)\n                        so they might be broadcasted later\n        :rtype: list of :class:`Transaction <monero.transaction.Transaction>`\n        \"\"\"\n        return self._backend.transfer(\n            [(address, amount)],\n            priority,\n            payment_id,\n            unlock_time,\n            account=self.index,\n            relay=relay)"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef transfer_multiple(self, destinations,\n            priority=prio.NORMAL, payment_id=None, unlock_time=0,\n            relay=True):\n        \"\"\"\n        Sends a batch of transfers. Returns a list of resulting transactions.\n\n        :param destinations: a list of destination and amount pairs:\n                    [(:class:`Address <monero.address.Address>`, `Decimal`), ...]\n        :param priority: transaction priority, implies fee. The priority can be a number\n                    from 1 to 4 (unimportant, normal, elevated, priority) or a constant\n                    from `monero.prio`.\n        :param payment_id: ID for the payment (must be None if\n                        :class:`IntegratedAddress <monero.address.IntegratedAddress>`\n                        is used as the destination)\n        :param unlock_time: the extra unlock delay\n        :param relay: if `True`, the wallet will relay the transaction(s) to the network\n                        immediately; when `False`, it will only return the transaction(s)\n                        so they might be broadcasted later\n        :rtype: list of :class:`Transaction <monero.transaction.Transaction>`\n        \"\"\"\n        return self._backend.transfer(\n            destinations,\n            priority,\n            payment_id,\n            unlock_time,\n            account=self.index,\n            relay=relay)", "response": "Send a batch of transfers."}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\nconverts Monero decimal to atomic integer of piconero.", "response": "def to_atomic(amount):\n    \"\"\"Convert Monero decimal to atomic integer of piconero.\"\"\"\n    if not isinstance(amount, (Decimal, float) + _integer_types):\n        # Fill the '{}' placeholder so the offending value appears in the message\n        raise ValueError(\"Amount '{}' doesn't have numeric type. Only Decimal, int, long and \"\n                \"float (not recommended) are accepted as amounts.\".format(amount))\n    return int(amount * 10**12)"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\nconfirm that the checksum (last word) of a mnemonic word string matches the computed checksum.", "response": "def _validate_checksum(self):\n        \"\"\"Given a mnemonic word string, confirm seed checksum (last word) matches the computed checksum.\n\n        :rtype: bool\n        \"\"\"\n        phrase = self.phrase.split(\" \")\n        if self.word_list.get_checksum(self.phrase) == phrase[-1]:\n            return True\n        raise ValueError(\"Invalid checksum\")"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\nreturning the master address represented by the seed.", "response": "def public_address(self, net='mainnet'):\n        \"\"\"Returns the master :class:`Address <monero.address.Address>` represented by the seed.\n\n        :param net: the network, one of 'mainnet', 'testnet', 'stagenet'. Default is 'mainnet'.\n\n        :rtype: :class:`Address <monero.address.Address>`\n        \"\"\"\n        if net not in ('mainnet', 'testnet', 'stagenet'):\n            raise ValueError(\n                \"Invalid net argument. Must be one of ('mainnet', 'testnet', 'stagenet').\")\n        netbyte = 18 if net == 'mainnet' else 53 if net == 'testnet' else 24\n        data = \"{:x}{:s}{:s}\".format(netbyte, self.public_spend_key(), self.public_view_key())\n        h = keccak_256()\n        h.update(unhexlify(data))\n        checksum = h.hexdigest()\n        return address(base58.encode(data + checksum[0:8]))"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\ndiscover the proper class and return instance for a given Monero address.", "response": "def address(addr, label=None):\n    \"\"\"Discover the proper class and return instance for a given Monero address.\n\n    :param addr: the address as a string-like object\n    :param label: a label for the address (defaults to `None`)\n\n    :rtype: :class:`Address`, :class:`SubAddress` or :class:`IntegratedAddress`\n    \"\"\"\n    addr = str(addr)\n    if _ADDR_REGEX.match(addr):\n        netbyte = bytearray(unhexlify(base58.decode(addr)))[0]\n        if netbyte in Address._valid_netbytes:\n            return Address(addr, label=label)\n        elif netbyte in SubAddress._valid_netbytes:\n            return SubAddress(addr, label=label)\n        raise ValueError(\"Invalid address netbyte {nb:x}. Allowed values are: {allowed}\".format(\n            nb=netbyte,\n            allowed=\", \".join(map(\n                lambda b: '%02x' % b,\n                sorted(Address._valid_netbytes + SubAddress._valid_netbytes)))))\n    elif _IADDR_REGEX.match(addr):\n        return IntegratedAddress(addr)\n    raise ValueError(\"Address must be either 95 or 106 characters long base58-encoded string, \"\n        \"is {addr} ({len} chars length)\".format(addr=addr, len=len(addr)))"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\nintegrate the payment id into the address.", "response": "def with_payment_id(self, payment_id=0):\n        \"\"\"Integrates payment id into the address.\n\n        :param payment_id: int, hexadecimal string or :class:`PaymentID <monero.numbers.PaymentID>`\n                    (max 64-bit long)\n\n        :rtype: `IntegratedAddress`\n        :raises: `TypeError` if the payment id is too long\n        \"\"\"\n        payment_id = numbers.PaymentID(payment_id)\n        if not payment_id.is_short():\n            raise TypeError(\"Payment ID {0} has more than 64 bits and cannot be integrated\".format(payment_id))\n        prefix = 54 if self.is_testnet() else 25 if self.is_stagenet() else 19\n        data = bytearray([prefix]) + self._decoded[1:65] + struct.pack('>Q', int(payment_id))\n        checksum = bytearray(keccak_256(data).digest()[:4])\n        return IntegratedAddress(base58.encode(hexlify(data + checksum)))"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef base_address(self):\n        prefix = 53 if self.is_testnet() else 24 if self.is_stagenet() else 18\n        data = bytearray([prefix]) + self._decoded[1:65]\n        checksum = keccak_256(data).digest()[:4]\n        return Address(base58.encode(hexlify(data + checksum)))", "response": "Returns the base address without payment id."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef encode(hex):\n    '''Encode hexadecimal string as base58 (ex: encoding a Monero address).'''\n    data = _hexToBin(hex)\n    l_data = len(data)\n\n    if l_data == 0:\n        return \"\"\n\n    full_block_count = l_data // __fullBlockSize\n    last_block_size = l_data % __fullBlockSize\n    res_size = full_block_count * __fullEncodedBlockSize + __encodedBlockSizes[last_block_size]\n\n    res = bytearray([__alphabet[0]] * res_size)\n\n    for i in range(full_block_count):\n        res = encode_block(data[(i*__fullBlockSize):(i*__fullBlockSize+__fullBlockSize)], res, i * __fullEncodedBlockSize)\n\n    if last_block_size > 0:\n        res = encode_block(data[(full_block_count*__fullBlockSize):(full_block_count*__fullBlockSize+last_block_size)], res, full_block_count * __fullEncodedBlockSize)\n\n    return bytes(res).decode('ascii')", "response": "Encode hexadecimal string as base58."}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\ndecoding a base58 string into hexadecimal form.", "response": "def decode(enc):\n    '''Decode a base58 string (ex: a Monero address) into hexadecimal form.'''\n    enc = bytearray(enc, encoding='ascii')\n    l_enc = len(enc)\n\n    if l_enc == 0:\n        return \"\"\n\n    full_block_count = l_enc // __fullEncodedBlockSize\n    last_block_size = l_enc % __fullEncodedBlockSize\n    try:\n        last_block_decoded_size = __encodedBlockSizes.index(last_block_size)\n    except ValueError:\n        raise ValueError(\"Invalid encoded length: %d\" % l_enc)\n\n    data_size = full_block_count * __fullBlockSize + last_block_decoded_size\n\n    data = bytearray(data_size)\n    for i in range(full_block_count):\n        data = decode_block(enc[(i*__fullEncodedBlockSize):(i*__fullEncodedBlockSize+__fullEncodedBlockSize)], data, i * __fullBlockSize)\n\n    if last_block_size > 0:\n        data = decode_block(enc[(full_block_count*__fullEncodedBlockSize):(full_block_count*__fullEncodedBlockSize+last_block_size)], data, full_block_count * __fullBlockSize)\n\n    return _binToHex(data)"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\nconverting hexadecimal string to mnemonic word representation with checksum.", "response": "def encode(cls, hex):\n        \"\"\"Convert hexadecimal string to mnemonic word representation with checksum.\n        \"\"\"\n        out = []\n        for i in range(len(hex) // 8):\n            word = endian_swap(hex[8*i:8*i+8])\n            x = int(word, 16)\n            w1 = x % cls.n\n            w2 = (x // cls.n + w1) % cls.n\n            w3 = (x // cls.n // cls.n + w2) % cls.n\n            out += [cls.word_list[w1], cls.word_list[w2], cls.word_list[w3]]\n        checksum = cls.get_checksum(\" \".join(out))\n        out.append(checksum)\n        return \" \".join(out)"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef decode(cls, phrase):\n        phrase = phrase.split(\" \")\n        out = \"\"\n        for i in range(len(phrase) // 3):\n            word1, word2, word3 = phrase[3*i:3*i+3]\n            w1 = cls.word_list.index(word1)\n            w2 = cls.word_list.index(word2) % cls.n\n            w3 = cls.word_list.index(word3) % cls.n\n            x = w1 + cls.n *((w2 - w1) % cls.n) + cls.n * cls.n * ((w3 - w2) % cls.n)\n            out += endian_swap(\"%08x\" % x)\n        return out", "response": "Decode a mnemonic word phrase into its hexadecimal string representation."}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nreturn the computed checksum word for a given mnemonic word string.", "response": "def get_checksum(cls, phrase):\n        \"\"\"Given a mnemonic word string, return a string of the computed checksum.\n\n        :rtype: str\n        \"\"\"\n        phrase_split = phrase.split(\" \")\n        if len(phrase_split) < 12:\n            raise ValueError(\"Invalid mnemonic phrase\")\n        if len(phrase_split) > 13:\n            # Standard format\n            phrase = phrase_split[:24]\n        else:\n            # MyMonero format\n            phrase = phrase_split[:12]\n        wstr = \"\".join(word[:cls.unique_prefix_length] for word in phrase)\n        wstr = bytearray(wstr.encode('utf-8'))\n        z = ((crc32(wstr) & 0xffffffff) ^ 0xffffffff ) >> 0\n        z2 = ((z ^ 0xffffffff) >> 0) % len(phrase)\n        return phrase_split[z2]"}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nsend a transaction to all peers.", "response": "def send_transaction(self, tx, relay=True):\n        \"\"\"\n        Sends a transaction generated by a :class:`Wallet <monero.wallet.Wallet>`.\n\n        :param tx: :class:`Transaction <monero.transaction.Transaction>`\n        :param relay: whether to relay the transaction to peers. If `False`, the daemon will have\n                to mine the transaction itself in order to have it included in the blockchain.\n        \"\"\"\n        return self._backend.send_transaction(tx.blob, relay=relay)"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\ncalling one in a while loop to allow user to pick many options. Returns a list of chosen options.", "response": "def many(prompt, *args, **kwargs):\n    \"\"\"Calls `pick` in a while loop to allow user to pick many\n    options. Returns a list of chosen options.\n    \"\"\"\n    def get_options(options, chosen):\n        return [options[i] for i, c in enumerate(chosen) if c]\n\n    def get_verbose_options(verbose_options, chosen):\n        no, yes = ' ', '\u2714'\n        if sys.version_info < (3, 3):\n            no, yes = ' ', '@'\n        opts = ['{} {}'.format(yes if c else no, verbose_options[i]) for i, c in enumerate(chosen)]\n        return opts + ['{}{}'.format('  ', kwargs.get('done', 'done...'))]\n\n    options, verbose_options = prepare_options(args)\n    chosen = [False] * len(options)\n    index = kwargs.get('idx', 0)\n\n    default = kwargs.get('default', None)\n    if isinstance(default, list):\n        for idx in default:\n            chosen[idx] = True\n    if isinstance(default, int):\n        chosen[default] = True\n\n    while True:\n        try:\n            index = one(prompt, *get_verbose_options(verbose_options, chosen), return_index=True, idx=index)\n        except QuestionnaireGoBack:\n            if any(chosen):\n                raise QuestionnaireGoBack(0)\n            else:\n                raise QuestionnaireGoBack\n        if index == len(options):\n            return get_options(options, chosen)\n        chosen[index] = not chosen[index]"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef prepare_options(options):\n    options_, verbose_options = [], []\n    for option in options:\n        if is_string(option):\n            options_.append(option)\n            verbose_options.append(option)\n        else:\n            options_.append(option[0])\n            verbose_options.append(option[1])\n    return options_, verbose_options", "response": "Create options and verbose options from the strings and non-string iterables in the options array."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef raw(prompt, *args, **kwargs):\n    go_back = kwargs.get('go_back', '<')\n    type_ = kwargs.get('type', str)\n    default = kwargs.get('default', '')\n    with stdout_redirected(sys.stderr):\n        while True:\n            try:\n                if kwargs.get('secret', False):\n                    answer = getpass.getpass(prompt)\n                elif sys.version_info < (3, 0):\n                    answer = raw_input(prompt)\n                else:\n                    answer = input(prompt)\n\n                if not answer:\n                    answer = default\n\n                if answer == go_back:\n                    raise QuestionnaireGoBack\n                return type_(answer)\n            except ValueError:\n                eprint('\\n`{}` is not a valid `{}`\\n'.format(answer, type_))", "response": "Calls raw input to allow user to input an arbitrary string."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef stdout_redirected(to):\n    stdout = sys.stdout\n\n    stdout_fd = fileno(stdout)\n    # copy stdout_fd before it is overwritten\n    with os.fdopen(os.dup(stdout_fd), 'wb') as copied:\n        stdout.flush()  # flush library buffers that dup2 knows nothing about\n        try:\n            os.dup2(fileno(to), stdout_fd)  # $ exec >&to\n        except ValueError:  # filename\n            with open(to, 'wb') as to_file:\n                os.dup2(to_file.fileno(), stdout_fd)  # $ exec > to\n        try:\n            yield stdout  # allow code to be run with the redirected stdout\n        finally:\n            # restore stdout to its previous value\n            stdout.flush()\n            os.dup2(copied.fileno(), stdout_fd)", "response": "A context manager that redirects stdout to a file."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef get_operator(self, op):\n        if op in self.OPERATORS:\n            return self.OPERATORS.get(op)\n        try:\n            n_args = len(inspect.getargspec(op)[0])\n            if n_args != 2:\n                raise TypeError\n        except:\n            eprint('Error: invalid operator function. Operators must accept two args.')\n            raise\n        else:\n            return op", "response": "Returns the operator function for `op`: the built-in from `OPERATORS` if `op` is a known key, otherwise `op` itself after checking that it accepts exactly two arguments."}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nassign a prompter to this question.", "response": "def assign_prompter(self, prompter):\n        \"\"\"If you want to change the core prompters registry, you can\n        override this method in a Question subclass.\n        \"\"\"\n        if is_string(prompter):\n            if prompter not in prompters:\n                eprint(\"Error: '{}' is not a core prompter\".format(prompter))\n                sys.exit()\n            self.prompter = prompters[prompter]\n        else:\n            self.prompter = prompter"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef add(self, *args, **kwargs):\n        if 'question' in kwargs and isinstance(kwargs['question'], Question):\n            question = kwargs['question']\n        else:\n            question = Question(*args, **kwargs)\n        self.questions.setdefault(question.key, []).append(question)\n        return question", "response": "Add a Question instance to the questions dict."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef ask(self, error=None):\n        q = self.next_question\n        if q is None:\n            return\n\n        try:\n            answer = q.prompter(self.get_prompt(q, error), *q.prompter_args, **q.prompter_kwargs)\n        except QuestionnaireGoBack as e:\n            steps = e.args[0] if e.args else 1\n            if steps == 0:\n                self.ask()  # user can redo current question even if `can_go_back` is `False`\n                return\n            self.go_back(steps)\n        else:\n            if q._validate:\n                error = q._validate(answer)\n                if error:\n                    self.ask(error)\n                    return\n            if q._transform:\n                answer = q._transform(answer)\n            self.answers[q.key] = answer\n            return answer", "response": "Asks the next question in the questionnaire, then validates, transforms, stores and returns the answer. Returns None if there are no questions left, if the user goes back, or if validation fails and the question is asked again."}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\nreturns the next question in the questionnaire or None if there are no questions left.", "response": "def next_question(self):\n        \"\"\"Returns the next `Question` in the questionnaire, or `None` if there\n        are no questions left. Returns first question for whose key there is no\n        answer and for which condition is satisfied, or for which there is no\n        condition.\n        \"\"\"\n        for key, questions in self.questions.items():\n            if key in self.answers:\n                continue\n            for question in questions:\n                if self.check_condition(question._condition):\n                    return question\n        return None"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\nmove n questions back in the questionnaire by removing the last n answers.", "response": "def go_back(self, n=1):\n        \"\"\"Move `n` questions back in the questionnaire by removing the last `n`\n        answers.\n        \"\"\"\n        if not self.can_go_back:\n            return\n        N = max(len(self.answers)-abs(n), 0)\n        self.answers = OrderedDict(islice(self.answers.items(), N))"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef format_answers(self, fmt='obj'):\n        fmts = ('obj', 'array', 'plain')\n        if fmt not in fmts:\n            eprint(\"Error: '{}' not in {}\".format(fmt, fmts))\n            return\n\n        def stringify(val):\n            if type(val) in (list, tuple):\n                return ', '.join(str(e) for e in val)\n            return val\n\n        if fmt == 'obj':\n            return json.dumps(self.answers)\n        elif fmt == 'array':\n            answers = [[k, v] for k, v in self.answers.items()]\n            return json.dumps(answers)\n        elif fmt == 'plain':\n            answers = '\\n'.join('{}: {}'.format(k, stringify(v)) for k, v in self.answers.items())\n            return answers", "response": "Formats the answers depending on fmt."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\nadding a new intent to the cache", "response": "def add_intent(self, name, lines, reload_cache=False):\n        \"\"\"\n        Creates a new intent, optionally checking the cache first\n\n        Args:\n            name (str): The associated name of the intent\n            lines (list<str>): All the sentences that should activate the intent\n            reload_cache: Whether to ignore cached intent if exists\n        \"\"\"\n        self.intents.add(name, lines, reload_cache)\n        self.padaos.add_intent(name, lines)\n        self.must_train = True"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef add_entity(self, name, lines, reload_cache=False):\n        Entity.verify_name(name)\n        self.entities.add(Entity.wrap_name(name), lines, reload_cache)\n        self.padaos.add_entity(name, lines)\n        self.must_train = True", "response": "Adds an entity that matches the given lines to the internal list of entities."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef load_entity(self, name, file_name, reload_cache=False):\n        Entity.verify_name(name)\n        self.entities.load(Entity.wrap_name(name), file_name, reload_cache)\n        with open(file_name) as f:\n            self.padaos.add_entity(name, f.read().split('\\n'))\n        self.must_train = True", "response": "Loads an entity from a file."}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nloads an intent from a file and adds it to the cache", "response": "def load_intent(self, name, file_name, reload_cache=False):\n        \"\"\"\n        Loads an intent, optionally checking the cache first\n\n        Args:\n            name (str): The associated name of the intent\n            file_name (str): The location of the intent file\n            reload_cache (bool): Whether to refresh all of cache\n        \"\"\"\n        self.intents.load(name, file_name, reload_cache)\n        with open(file_name) as f:\n            self.padaos.add_intent(name, f.read().split('\\n'))\n        self.must_train = True"}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\nunloads an intent from the cache", "response": "def remove_intent(self, name):\n        \"\"\"Unload an intent\"\"\"\n        self.intents.remove(name)\n        self.padaos.remove_intent(name)\n        self.must_train = True"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nunloading an entity from the cache.", "response": "def remove_entity(self, name):\n        \"\"\"Unload an entity\"\"\"\n        self.entities.remove(name)\n        self.padaos.remove_entity(name)"}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\ntrains all the loaded intents that need to be updated from file", "response": "def train(self, debug=True, force=False, single_thread=False, timeout=20):\n        \"\"\"\n        Trains all the loaded intents that need to be updated\n        If a cache file exists with the same hash as the intent file,\n        the intent will not be trained and just loaded from file\n\n        Args:\n            debug (bool): Whether to print a message to stdout each time a new intent is trained\n            force (bool): Whether to force training if already finished\n            single_thread (bool): Whether to force running in a single thread\n            timeout (float): Seconds before cancelling training\n        Returns:\n            bool: True if training succeeded without timeout\n        \"\"\"\n        if not self.must_train and not force:\n            return\n        self.padaos.compile()\n        self.train_thread = Thread(target=self._train, kwargs=dict(\n            debug=debug,\n            single_thread=single_thread,\n            timeout=timeout\n        ), daemon=True)\n        self.train_thread.start()\n        self.train_thread.join(timeout)\n\n        self.must_train = False\n        return not self.train_thread.is_alive()"}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\ntrains in a subprocess, which provides a timeout and guarantees everything shuts down properly", "response": "def train_subprocess(self, *args, **kwargs):\n        \"\"\"\n        Trains in a subprocess, which provides a timeout and guarantees everything shuts down properly\n\n        Args:\n            See <train>\n        Returns:\n            bool: True for success, False if timed out\n        \"\"\"\n        ret = call([\n            sys.executable, '-m', 'padatious', 'train', self.cache_dir,\n            '-d', json.dumps(self.serialized_args),\n            '-a', json.dumps(args),\n            '-k', json.dumps(kwargs),\n        ])\n        if ret == 2:\n            raise TypeError('Invalid train arguments: {} {}'.format(args, kwargs))\n        data = self.serialized_args\n        self.clear()\n        self.apply_training_args(data)\n        self.padaos.compile()\n        if ret == 0:\n            self.must_train = False\n            return True\n        elif ret == 10:  # timeout\n            return False\n        else:\n            raise ValueError('Training failed and returned code: {}'.format(ret))"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef calc_intents(self, query):\n        if self.must_train:\n            self.train()\n        intents = {} if self.train_thread and self.train_thread.is_alive() else {\n            i.name: i for i in self.intents.calc_intents(query, self.entities)\n        }\n        sent = tokenize(query)\n        for perfect_match in self.padaos.calc_intents(query):\n            name = perfect_match['name']\n            intents[name] = MatchData(name, sent, matches=perfect_match['entities'], conf=1.0)\n        return list(intents.values())", "response": "Tests all the intents against the query and returns data on how well each one matched."}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\ntests all the intents against the query and returns the best intent", "response": "def calc_intent(self, query):\n        \"\"\"\n        Tests all the intents against the query and returns\n        match data of the best intent\n\n        Args:\n            query (str): Input sentence to test against intents\n        Returns:\n            MatchData: Best intent match\n        \"\"\"\n        matches = self.calc_intents(query)\n        if len(matches) == 0:\n            return MatchData('', '')\n        best_match = max(matches, key=lambda x: x.conf)\n        best_matches = (match for match in matches if match.conf == best_match.conf)\n        return min(best_matches, key=lambda x: sum(map(len, x.matches.values())))"}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\nreturns a list of all sub - sentences expanded in the order of the time they were created.", "response": "def expand(self):\r\n        \"\"\"\r\n        Creates a combination of all sub-sentences.\r\n\r\n        Returns:\r\n            List<List<str>>: A list with all subsentence expansions combined in\r\n                                every possible way\r\n        \"\"\"\r\n        old_expanded = [[]]\r\n        for sub in self._tree:\r\n            sub_expanded = sub.expand()\r\n            new_expanded = []\r\n            while len(old_expanded) > 0:\r\n                sentence = old_expanded.pop()\r\n                for new in sub_expanded:\r\n                    new_expanded.append(sentence + new)\r\n            old_expanded = new_expanded\r\n        return old_expanded"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nreturn all of the options as separated sub-sentences.", "response": "def expand(self):\r\n        \"\"\"\r\n        Returns all of its options as separated sub-sentences.\r\n\r\n        Returns:\r\n            List<List<str>>: A list containing the sentences created by all\r\n                                expansions of its sub-sentences\r\n        \"\"\"\r\n        options = []\r\n        for option in self._tree:\r\n            options.extend(option.expand())\r\n        return options"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef _parse_expr(self):\r\n        # List of all generated sentences\r\n        sentence_list = []\r\n        # Currently active sentence\r\n        cur_sentence = []\r\n        sentence_list.append(Sentence(cur_sentence))\r\n        # Determine which form the current expression has\r\n        while self._current_position < len(self.tokens):\r\n            cur = self.tokens[self._current_position]\r\n            self._current_position += 1\r\n            if cur == '(':\r\n                # Parse the subexpression\r\n                subexpr = self._parse_expr()\r\n                # Check if the subexpression only has one branch\r\n                # -> If so, append \"(\" and \")\" and add it as is\r\n                normal_brackets = False\r\n                if len(subexpr.tree()) == 1:\r\n                    normal_brackets = True\r\n                    cur_sentence.append(Word('('))\r\n                # add it to the sentence\r\n                cur_sentence.append(subexpr)\r\n                if normal_brackets:\r\n                    cur_sentence.append(Word(')'))\r\n            elif cur == '|':\r\n                # Begin parsing a new sentence\r\n                cur_sentence = []\r\n                sentence_list.append(Sentence(cur_sentence))\r\n            elif cur == ')':\r\n                # End parsing the current subexpression\r\n                break\r\n            # TODO anything special about {sth}?\r\n            else:\r\n                cur_sentence.append(Word(cur))\r\n        return Options(sentence_list)", "response": "Parse the tokens from the current position and return an Options object holding the alternative sentences of the expression."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef _train_and_save(obj, cache, data, print_updates):\n    obj.train(data)\n    if print_updates:\n        print('Regenerated ' + obj.name + '.')\n    obj.save(cache)", "response": "Internal pickleable function used to train and save objects in another process"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\nwrapping SkillName:entity into SkillName:{entity}", "response": "def wrap_name(name):\n        \"\"\"Wraps SkillName:entity into SkillName:{entity}\"\"\"\n        if ':' in name:\n            parts = name.split(':')\n            intent_name, ent_name = parts[0], parts[1:]\n            return intent_name + ':{' + ':'.join(ent_name) + '}'\n        else:\n            return '{' + name + '}'"}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\nreturns a unique id for the given list of lines", "response": "def lines_hash(lines):\n    \"\"\"\n    Creates a unique binary id for the given lines\n    Args:\n        lines (list<str>): List of strings that should be collectively hashed\n    Returns:\n        bytearray: Binary hash\n    \"\"\"\n    x = xxh32()\n    for i in lines:\n        x.update(i.encode())\n    return x.digest()"}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\ntaking a single sentence and returns a list of individual significant units ie. this is a sentence.", "response": "def tokenize(sentence):\n    \"\"\"\n    Converts a single sentence into a list of individual significant units\n    Args:\n        sentence (str): Input string ie. 'This is a sentence.'\n    Returns:\n        list<str>: List of tokens ie. ['this', 'is', 'a', 'sentence']\n    \"\"\"\n    tokens = []\n\n    class Vars:\n        start_pos = -1\n        last_type = 'o'\n\n    def update(c, i):\n        if c.isalpha() or c in '-{}':\n            t = 'a'\n        elif c.isdigit() or c == '#':\n            t = 'n'\n        elif c.isspace():\n            t = 's'\n        else:\n            t = 'o'\n\n        if t != Vars.last_type or t == 'o':\n            if Vars.start_pos >= 0:\n                token = sentence[Vars.start_pos:i].lower()\n                if token not in '.!?':\n                    tokens.append(token)\n            Vars.start_pos = -1 if t == 's' else i\n        Vars.last_type = t\n\n    for i, char in enumerate(sentence):\n        update(char, i)\n    update(' ', len(sentence))\n    return tokens"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nchecking for duplicate inputs and sets the output to the max of the two outputs", "response": "def resolve_conflicts(inputs, outputs):\n    \"\"\"\n    Checks for duplicate inputs and if there are any,\n    remove one and set the output to the max of the two outputs\n    Args:\n        inputs (list<list<float>>): Array of input vectors\n        outputs (list<list<float>>): Array of output vectors\n    Returns:\n        tuple<inputs, outputs>: The modified inputs and outputs\n    \"\"\"\n    data = {}\n    for inp, out in zip(inputs, outputs):\n        tup = tuple(inp)\n        if tup in data:\n            data[tup].append(out)\n        else:\n            data[tup] = [out]\n\n    inputs, outputs = [], []\n    for inp, outs in data.items():\n        inputs.append(list(inp))\n        combined = [0] * len(outs[0])\n        for i in range(len(combined)):\n            combined[i] = max(j[i] for j in outs)\n        outputs.append(combined)\n    return inputs, outputs"}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nre-apply type annotations from .pyi stubs to your codebase.", "response": "def main(src, pyi_dir, target_dir, incremental, quiet, replace_any, hg, traceback):\n    \"\"\"Re-apply type annotations from .pyi stubs to your codebase.\"\"\"\n    Config.incremental = incremental\n    Config.replace_any = replace_any\n    returncode = 0\n    for src_entry in src:\n        for file, error, exc_type, tb in retype_path(\n            Path(src_entry),\n            pyi_dir=Path(pyi_dir),\n            targets=Path(target_dir),\n            src_explicitly_given=True,\n            quiet=quiet,\n            hg=hg,\n        ):\n            print(f'error: {file}: {error}', file=sys.stderr)\n            if traceback:\n                print('Traceback (most recent call last):', file=sys.stderr)\n                for line in tb:\n                    print(line, file=sys.stderr, end='')\n                print(f'{exc_type.__name__}: {error}', file=sys.stderr)\n            returncode += 1\n    if not src and not quiet:\n        print('warning: no sources given', file=sys.stderr)\n\n    # According to http://tldp.org/LDP/abs/html/index.html starting with 126\n    # we have special returncodes.\n    sys.exit(min(returncode, 125))"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef retype_file(src, pyi_dir, targets, *, quiet=False, hg=False):\n    with tokenize.open(src) as src_buffer:\n        src_encoding = src_buffer.encoding\n        src_node = lib2to3_parse(src_buffer.read())\n    try:\n        with open((pyi_dir / src.name).with_suffix('.pyi')) as pyi_file:\n            pyi_txt = pyi_file.read()\n    except FileNotFoundError:\n        if not quiet:\n            print(\n                f'warning: .pyi file for source {src} not found in {pyi_dir}',\n                file=sys.stderr,\n            )\n    else:\n        pyi_ast = ast3.parse(pyi_txt)\n        assert isinstance(pyi_ast, ast3.Module)\n        reapply_all(pyi_ast.body, src_node)\n    fix_remaining_type_comments(src_node)\n    targets.mkdir(parents=True, exist_ok=True)\n    with open(targets / src.name, 'w', encoding=src_encoding) as target_file:\n        target_file.write(lib2to3_unparse(src_node, hg=hg))\n    return targets / src.name", "response": "Retype src finding types in pyi_dir and save in targets."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef lib2to3_parse(src_txt):\n    grammar = pygram.python_grammar_no_print_statement\n    drv = driver.Driver(grammar, pytree.convert)\n    if src_txt[-1] != '\\n':\n        nl = '\\r\\n' if '\\r\\n' in src_txt[:1024] else '\\n'\n        src_txt += nl\n    try:\n        result = drv.parse_string(src_txt, True)\n    except ParseError as pe:\n        lineno, column = pe.context[1]\n        lines = src_txt.splitlines()\n        try:\n            faulty_line = lines[lineno - 1]\n        except IndexError:\n            faulty_line = \"<line number missing in source>\"\n        raise ValueError(f\"Cannot parse: {lineno}:{column}: {faulty_line}\") from None\n\n    if isinstance(result, Leaf):\n        result = Node(syms.file_input, [result])\n\n    return result", "response": "Given a string with source return the lib2to3 Node."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\ngiving a lib2to3 node return its string representation.", "response": "def lib2to3_unparse(node, *, hg=False):\n    \"\"\"Given a lib2to3 node, return its string representation.\"\"\"\n    code = str(node)\n    if hg:\n        from retype_hgext import apply_job_security\n        code = apply_job_security(code)\n    return code"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nreapplying the typed_ast node into the lib2to3 tree.", "response": "def reapply_all(ast_node, lib2to3_node):\n    \"\"\"Reapplies the typed_ast node into the lib2to3 tree.\n\n    Also does post-processing. This is done in reverse order to enable placing\n    TypeVars and aliases that depend on one another.\n    \"\"\"\n    late_processing = reapply(ast_node, lib2to3_node)\n    for lazy_func in reversed(late_processing):\n        lazy_func()"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef fix_remaining_type_comments(node):\n    assert node.type == syms.file_input\n\n    last_n = None\n    for n in node.post_order():\n        if last_n is not None:\n            if n.type == token.NEWLINE and is_assignment(last_n):\n                fix_variable_annotation_type_comment(n, last_n)\n            elif n.type == syms.funcdef and last_n.type == syms.suite:\n                fix_signature_annotation_type_comment(n, last_n, offset=1)\n            elif n.type == syms.async_funcdef and last_n.type == syms.suite:\n                fix_signature_annotation_type_comment(n, last_n, offset=2)\n        last_n = n", "response": "Converts type comments in node to proper annotated assignments."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef get_function_signature(fun, *, is_method=False):\n    args = fun.args\n    returns = fun.returns\n    if fun.type_comment:\n        try:\n            args_tc, returns_tc = parse_signature_type_comment(fun.type_comment)\n            if returns and returns_tc:\n                raise ValueError(\n                    \"using both a type annotation and a type comment is not allowed\"\n                )\n            returns = returns_tc\n            copy_arguments_to_annotations(args, args_tc, is_method=is_method)\n        except (SyntaxError, ValueError) as exc:\n            raise ValueError(\n                f\"Annotation problem in function {fun.name!r}: \" +\n                f\"{fun.lineno}:{fun.col_offset + 1}: {exc}\"\n            )\n    copy_type_comments_to_annotations(args)\n\n    return args, returns", "response": "Returns the signature of a function."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nparsing the fugly signature type comment into AST nodes.", "response": "def parse_signature_type_comment(type_comment):\n    \"\"\"Parse the fugly signature type comment into AST nodes.\n\n    Caveats: ASTifying **kwargs is impossible with the current grammar so we\n    hack it into unary subtraction (to differentiate from Starred in vararg).\n\n    For example from:\n    \"(str, int, *int, **Any) -> 'SomeReturnType'\"\n\n    To:\n    ([ast3.Name, ast.Name, ast3.Name, ast.Name], ast3.Str)\n    \"\"\"\n    try:\n        result = ast3.parse(type_comment, '<func_type>', 'func_type')\n    except SyntaxError:\n        raise ValueError(f\"invalid function signature type comment: {type_comment!r}\")\n\n    assert isinstance(result, ast3.FunctionType)\n    if len(result.argtypes) == 1:\n        argtypes = result.argtypes[0]\n    else:\n        argtypes = result.argtypes\n    return argtypes, result.returns"}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nparses a type comment string into AST nodes.", "response": "def parse_type_comment(type_comment):\n    \"\"\"Parse a type comment string into AST nodes.\"\"\"\n    try:\n        result = ast3.parse(type_comment, '<type_comment>', 'eval')\n    except SyntaxError:\n        raise ValueError(f\"invalid type comment: {type_comment!r}\") from None\n\n    assert isinstance(result, ast3.Expression)\n    return result.body"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef parse_arguments(arguments):\n    arguments = f\"def f{arguments}: ...\"\n    try:\n        result = ast3.parse(arguments, '<arguments>', 'exec')\n    except SyntaxError:\n        raise ValueError(f\"invalid arguments: {arguments!r}\") from None\n\n    assert isinstance(result, ast3.Module)\n    assert len(result.body) == 1\n    assert isinstance(result.body[0], ast3.FunctionDef)\n    args = result.body[0].args\n    copy_type_comments_to_annotations(args)\n    return args", "response": "Parse a string with function arguments into an AST node."}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\ncopy AST nodes from type_comment into the ast3.arguments in args.", "response": "def copy_arguments_to_annotations(args, type_comment, *, is_method=False):\n    \"\"\"Copies AST nodes from `type_comment` into the ast3.arguments in `args`.\n\n    Does validation of argument count (allowing for untyped self/cls)\n    and type (vararg and kwarg).\n    \"\"\"\n    if isinstance(type_comment, ast3.Ellipsis):\n        return\n\n    expected = len(args.args)\n    if args.vararg:\n        expected += 1\n    expected += len(args.kwonlyargs)\n    if args.kwarg:\n        expected += 1\n    actual = len(type_comment) if isinstance(type_comment, list) else 1\n    if expected != actual:\n        if is_method and expected - actual == 1:\n            pass  # fine, we're just skipping `self`, `cls`, etc.\n        else:\n            raise ValueError(\n                f\"number of arguments in type comment doesn't match; \" +\n                f\"expected {expected}, found {actual}\"\n            )\n\n    if isinstance(type_comment, list):\n        next_value = type_comment.pop\n    else:\n        # If there's just one value, only one of the loops and ifs below will\n        # be populated. We ensure this with the expected/actual length check\n        # above.\n        _tc = type_comment\n\n        def next_value(index: int = 0) -> ast3.expr:\n            return _tc\n\n    for arg in args.args[expected - actual:]:\n        ensure_no_annotation(arg.annotation)\n        arg.annotation = next_value(0)\n\n    if args.vararg:\n        ensure_no_annotation(args.vararg.annotation)\n        args.vararg.annotation = next_value(0)\n\n    for arg in args.kwonlyargs:\n        ensure_no_annotation(arg.annotation)\n        arg.annotation = next_value(0)\n\n    if args.kwarg:\n        ensure_no_annotation(args.kwarg.annotation)\n        args.kwarg.annotation = next_value(0)"}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\ncopying argument type comments from the legacy long form to annotations in the entire function signature.", "response": "def copy_type_comments_to_annotations(args):\n    \"\"\"Copies argument type comments from the legacy long form to annotations\n    in the entire function signature.\n    \"\"\"\n    for arg in args.args:\n        copy_type_comment_to_annotation(arg)\n\n    if args.vararg:\n        copy_type_comment_to_annotation(args.vararg)\n\n    for arg in args.kwonlyargs:\n        copy_type_comment_to_annotation(arg)\n\n    if args.kwarg:\n        copy_type_comment_to_annotation(args.kwarg)"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef maybe_replace_any_if_equal(name, expected, actual):\n    is_equal = expected == actual\n    if not is_equal and Config.replace_any:\n        actual_str = minimize_whitespace(str(actual))\n        if actual_str and actual_str[0] in {'\"', \"'\"}:\n            actual_str = actual_str[1:-1]\n        is_equal = actual_str in {'Any', 'typing.Any', 't.Any'}\n\n    if not is_equal:\n        expected_annotation = minimize_whitespace(str(expected))\n        actual_annotation = minimize_whitespace(str(actual))\n        raise ValueError(\n            f\"incompatible existing {name}. \" +\n            f\"Expected: {expected_annotation!r}, actual: {actual_annotation!r}\"\n        )\n\n    return expected or actual", "response": "Return the type given in expected and actual. Raise ValueError if expected is not equal to actual."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\nremove the legacy signature type comment leaving other comments if any.", "response": "def remove_function_signature_type_comment(body):\n    \"\"\"Removes the legacy signature type comment, leaving other comments if any.\"\"\"\n    for node in body.children:\n        if node.type == token.INDENT:\n            prefix = node.prefix.lstrip()\n            if prefix.startswith('# type: '):\n                node.prefix = '\\n'.join(prefix.split('\\n')[1:])\n            break"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef flatten_some(children):\n    for node in children:\n        if node.type in (syms.try_stmt, syms.suite):\n            yield from flatten_some(node.children)\n        else:\n            yield node", "response": "Yields the given nodes or leaves, recursively flattening the children of try statements and suites."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef pop_param(params):\n    default = None\n\n    name = params.pop(0)\n    if name in (_star, _dstar):\n        default = params.pop(0)\n        if default == _comma:\n            return name, default\n\n    try:\n        remainder = params.pop(0)\n        if remainder == _eq:\n            default = params.pop(0)\n            remainder = params.pop(0)\n        if remainder != _comma:\n            raise ValueError(f\"unexpected token: {remainder}\")\n\n    except IndexError:\n        pass\n    return name, default", "response": "Pops the next parameter from params and returns a (name, default) tuple, consuming the trailing comma if present."}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nreturning the offset after which a statement can be inserted to the body.", "response": "def get_offset_and_prefix(body, skip_assignments=False):\n    \"\"\"Returns the offset after which a statement can be inserted to the `body`.\n\n    This offset is calculated to come after all imports, and maybe existing\n    (possibly annotated) assignments if `skip_assignments` is True.\n\n    Also returns the indentation prefix that should be applied to the inserted\n    node.\n    \"\"\"\n    assert body.type in (syms.file_input, syms.suite)\n\n    _offset = 0\n    prefix = ''\n    for _offset, child in enumerate(body.children):\n        if child.type == syms.simple_stmt:\n            stmt = child.children[0]\n            if stmt.type == syms.expr_stmt:\n                expr = stmt.children\n                if not skip_assignments:\n                    break\n\n                if (\n                    len(expr) != 2 or\n                    expr[0].type != token.NAME or\n                    expr[1].type != syms.annassign or\n                    _eq in expr[1].children\n                ):\n                    break\n\n            elif stmt.type not in (syms.import_name, syms.import_from, token.STRING):\n                break\n\n        elif child.type == token.INDENT:\n            assert isinstance(child, Leaf)\n            prefix = child.value\n        elif child.type != token.NEWLINE:\n            break\n\n    prefix, child.prefix = child.prefix, prefix\n    return _offset, prefix"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef fix_line_numbers(body):\n    maxline = 0\n    for node in body.pre_order():\n        maxline += node.prefix.count('\\n')\n        if isinstance(node, Leaf):\n            node.lineno = maxline\n            maxline += str(node.value).count('\\n')", "response": "Recomputes all line numbers based on the number of newline characters."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef new(n, prefix=None):\n\n    if isinstance(n, Leaf):\n        return Leaf(n.type, n.value, prefix=n.prefix if prefix is None else prefix)\n\n    # this is hacky, we assume complex nodes are just being reused once from the\n    # original AST.\n    n.parent = None\n    if prefix is not None:\n        n.prefix = prefix\n    return n", "response": "lib2to3's AST requires unique objects as children."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef apply_job_security(code):\n    buf = io.BytesIO(code.encode('utf8'))\n    tokens = tokenize.tokenize(buf.readline)\n    # NOTE: by setting the fullname to `mercurial.pycompat` below, we're\n    # ensuring that hg-specific pycompat imports aren't inserted to the code.\n    data = tokenize.untokenize(replacetokens(list(tokens), 'mercurial.pycompat'))\n    return cast(str, data.decode('utf8'))", "response": "Treat input code like Python 2."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef _load_info(self):\n        '''Get user info for GBDX S3, put into instance vars for convenience.\n\n        Args:\n            None.\n\n        Returns:\n            Dictionary with S3 access key, S3 secret key, S3 session token,\n            user bucket and user prefix (dict).\n        '''\n\n        url = '%s/prefix?duration=36000' % self.base_url\n        r = self.gbdx_connection.get(url)\n        r.raise_for_status()\n        return r.json()", "response": "Get user info for GBDX S3 put into instance vars for convenience."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef download(self, location, local_dir='.'):\n        '''Download content from bucket/prefix/location.\n           Location can be a directory or a file (e.g., my_dir or my_dir/my_image.tif)\n           If location is a directory, all files in the directory are\n           downloaded. If it is a file, then that file is downloaded.\n\n           Args:\n               location (str): S3 location within prefix.\n               local_dir (str): Local directory where file(s) will be stored.\n                                Default is here.\n        '''\n\n        self.logger.debug('Getting S3 info')\n        bucket = self.info['bucket']\n        prefix = self.info['prefix']\n\n        self.logger.debug('Connecting to S3')\n        s3conn = self.client\n\n        # remove head and/or trail backslash from location\n        location = location.strip('/')\n\n        self.logger.debug('Downloading contents')\n        objects = s3conn.list_objects(Bucket=bucket, Prefix=(prefix+'/'+location))\n        if 'Contents' not in objects:\n            raise ValueError('Download target {}/{}/{} was not found or inaccessible.'.format(bucket, prefix, location))\n        for s3key in objects['Contents']:\n            key = s3key['Key']\n    \n            # skip directory keys\n            if not key or key.endswith('/'):\n                continue\n\n            # get path to each file\n            filepath = key.replace(prefix+'/'+location, '', 1).lstrip('/')\n            filename = key.split('/')[-1]\n            \n            #self.logger.debug(filename)\n            file_dir = filepath.split('/')[:-1]\n            file_dir = '/'.join(file_dir)\n            full_dir = os.path.join(local_dir, file_dir)\n\n            # make sure directory exists\n            if not os.path.isdir(full_dir):\n                os.makedirs(full_dir)\n\n            # download file\n            s3conn.download_file(bucket, key, os.path.join(full_dir, filename))\n\n        self.logger.debug('Done!')", "response": "Download content from bucket / prefix / location."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\ndeleting content in bucket / prefix / location.", "response": "def delete(self, location):\n        '''Delete content in bucket/prefix/location.\n           Location can be a directory or a file (e.g., my_dir or my_dir/my_image.tif)\n           If location is a directory, all files in the directory are deleted.\n           If it is a file, then that file is deleted.\n\n           Args:\n               location (str): S3 location within prefix. Can be a directory or\n                               a file (e.g., my_dir or my_dir/my_image.tif).\n        '''\n        bucket = self.info['bucket']\n        prefix = self.info['prefix']\n\n        self.logger.debug('Connecting to S3')\n        s3conn = self.client \n\n        # remove leading and/or trailing slash from location\n        if location[0] == '/':\n            location = location[1:]\n        if location[-1] == '/':\n            location = location[:-1]\n\n        self.logger.debug('Deleting contents')\n\n        for s3key in s3conn.list_objects(Bucket=bucket, Prefix=(prefix+'/'+location))['Contents']:\n            s3conn.delete_object(Bucket=bucket, Key=s3key['Key'])\n\n        self.logger.debug('Done!')"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef upload(self, local_file, s3_path=None):\n        '''\n        Upload files to your DG S3 bucket/prefix.\n\n        Args:\n            local_file (str): a path to a local file to upload, directory structures are not mirrored\n            s3_path: a key (location) on s3 to upload the file to\n\n        Returns:\n            str: s3 path file was saved to\n\n        Examples:\n            >>> upload('path/to/image.tif')\n            'mybucket/myprefix/image.tif'\n\n            >>> upload('./images/image.tif')\n            'mybucket/myprefix/image.tif'\n\n            >>> upload('./images/image.tif', s3_path='images/image.tif')\n            'mybucket/myprefix/images/image.tif'\n        '''\n        if not os.path.exists(local_file):\n            raise Exception(local_file + \" does not exist.\")\n            \n        if s3_path is None:\n            s3_path = os.path.basename(local_file)\n\n        bucket = self.info['bucket']\n        prefix = self.info['prefix']\n\n        self.logger.debug('Connecting to S3')\n        s3conn = self.client \n        self.logger.debug('Uploading file {}'.format(local_file))\n        s3conn.upload_file(local_file, bucket, prefix+'/'+s3_path)\n        self.logger.debug('Done!')\n        return '{}/{}/{}'.format(bucket, prefix, s3_path)", "response": "Uploads a local file to the DG S3 bucket and returns the path to the file uploaded to the DG S3 bucket."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef rgb(self, **kwargs):\n        ''' Convert the image to a 3 band RGB for plotting\n        \n        This method shares the same arguments as plot(). It will perform visual adjustment on the\n        image and prepare the data for plotting in MatplotLib. Values are converted to an\n        appropriate precision and the axis order is changed to put the band axis last.\n        '''\n\n        if \"bands\" in kwargs:\n            use_bands = kwargs[\"bands\"]\n            assert len(use_bands) == 3, 'Plot method only supports single or 3-band outputs'\n            del kwargs[\"bands\"]\n        else:\n            use_bands = self._rgb_bands\n        if kwargs.get('blm') == True:\n            return self.histogram_match(use_bands, **kwargs)\n        # if not specified or DRA'ed, default to a 2-98 stretch\n        if \"histogram\" not in kwargs:\n            if \"stretch\" not in kwargs:\n                if not self.options.get('dra'):\n                    kwargs['stretch'] = [2,98]\n            return self.histogram_stretch(use_bands, **kwargs)\n        elif kwargs[\"histogram\"] == \"equalize\":\n            return self.histogram_equalize(use_bands, **kwargs)\n        elif kwargs[\"histogram\"] == \"match\":\n            return self.histogram_match(use_bands, **kwargs)\n        elif kwargs[\"histogram\"] == \"minmax\":\n            return self.histogram_stretch(use_bands, stretch=[0, 100], **kwargs)\n        # DRA'ed images should be left alone if not explicitly adjusted\n        elif kwargs[\"histogram\"] == \"ignore\" or self.options.get('dra'):\n            data = self._read(self[use_bands,...], **kwargs)\n            return np.rollaxis(data, 0, 3)\n        else:\n            raise KeyError('Unknown histogram parameter, use \"equalize\", \"match\", \"minmax\", or \"ignore\"')", "response": "Convert the image to a 3-band RGB array for plotting."}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\nequalizes the histogram and normalizes the value range", "response": "def histogram_equalize(self, use_bands, **kwargs):\n        ''' Equalize the histogram and normalize the value range\n            Equalization is on all three bands, not per-band'''\n        data = self._read(self[use_bands,...], **kwargs)\n        data = np.rollaxis(data.astype(np.float32), 0, 3)\n        flattened = data.flatten()\n        if 0 in data:\n            masked = np.ma.masked_values(data, 0).compressed()\n            image_histogram, bin_edges = np.histogram(masked, 256)\n        else:\n            image_histogram, bin_edges = np.histogram(flattened, 256)\n        bins = (bin_edges[:-1] + bin_edges[1:]) / 2.0\n        cdf = image_histogram.cumsum() \n        cdf = cdf / float(cdf[-1])\n        image_equalized = np.interp(flattened, bins, cdf).reshape(data.shape)\n        if 'stretch' in kwargs or 'gamma' in kwargs:\n            return self._histogram_stretch(image_equalized, **kwargs)\n        else:\n            return image_equalized"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef histogram_match(self, use_bands, blm_source=None, **kwargs):\n        ''' Match the histogram to existing imagery '''\n        assert has_rio, \"To match image histograms please install rio_hist\"\n        data = self._read(self[use_bands,...], **kwargs)\n        data = np.rollaxis(data.astype(np.float32), 0, 3)\n        if 0 in data:\n            data = np.ma.masked_values(data, 0)\n        bounds = self._reproject(box(*self.bounds), from_proj=self.proj, to_proj=\"EPSG:4326\").bounds\n        if blm_source == 'browse':\n            from gbdxtools.images.browse_image import BrowseImage\n            ref = BrowseImage(self.cat_id, bbox=bounds).read()\n        else:\n            from gbdxtools.images.tms_image import TmsImage\n            tms = TmsImage(zoom=self._calc_tms_zoom(self.affine[0]), bbox=bounds, **kwargs)\n            ref = np.rollaxis(tms.read(), 0, 3)\n        out = np.dstack([rio_match(data[:,:,idx], ref[:,:,idx].astype(np.double)/255.0)\n                        for idx in range(data.shape[-1])])\n        if 'stretch' in kwargs or 'gamma' in kwargs:\n            return self._histogram_stretch(out, **kwargs)\n        else:\n            return out", "response": "Match the histogram to existing imagery"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef _histogram_stretch(self, data, **kwargs):\n        ''' perform a contrast stretch and/or gamma adjustment '''\n        limits = {}\n        # get the image min-max statistics\n        for x in range(3):\n            band = data[:,:,x]\n            try:\n                limits[x] = np.percentile(band, kwargs.get(\"stretch\", [0,100]))\n            except IndexError:\n                # this band has no dynamic range and cannot be stretched\n                return data\n        # compute the stretch\n        for x in range(3):\n            band = data[:,:,x]\n            if 0 in band:\n                band = np.ma.masked_values(band, 0).compressed()\n            top = limits[x][1]\n            bottom = limits[x][0]\n            if top != bottom: # catch divide by zero\n                data[:,:,x] = (data[:,:,x] - bottom) / float(top - bottom) * 255.0\n        data = np.clip(data, 0, 255).astype(\"uint8\")\n        # gamma adjust\n        if \"gamma\" in kwargs:\n            invGamma = 1.0 / kwargs['gamma']\n            lut = np.array([((i / 255.0) ** invGamma) * 255\n\t\t            for i in np.arange(0, 256)]).astype(\"uint8\")\n            data = np.take(lut, data)\n        return data", "response": "perform a contrast stretch and gamma adjustment"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef ndvi(self, **kwargs):\n        data = self._read(self[self._ndvi_bands,...]).astype(np.float32)\n        return (data[0,:,:] - data[1,:,:]) / (data[0,:,:] + data[1,:,:])", "response": "Calculates Normalized Difference Vegetation Index using NIR and Red of an image."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef ndwi(self):\n        data = self._read(self[self._ndwi_bands,...]).astype(np.float32)\n        return (data[1,:,:] - data[0,:,:]) / (data[0,:,:] + data[1,:,:])", "response": "Calculates the Normalized Difference Water Index from the two bands listed in self._ndwi_bands, as (band 1 - band 0) / (band 0 + band 1) on float32 data."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\nplot the image with MatplotLib and return the image.", "response": "def plot(self, spec=\"rgb\", **kwargs):\n        ''' Plot the image with MatplotLib\n\n        Plot sizing includes default borders and spacing. If the image is shown in Jupyter the outside whitespace will be automatically cropped to save size, resulting in a smaller sized image than expected.\n\n        Histogram options:\n            * 'equalize': performs histogram equalization on the image.\n            * 'minmax': stretch the pixel range to the minimum and maximum input pixel values. Equivalent to stretch=[0,100].\n            * 'match': match the histogram to the Maps API imagery. Pass the additional keyword blm_source='browse' to match to the Browse Service (image thumbnail) instead.\n            * 'ignore': Skip dynamic range adjustment, in the event the image is already correctly balanced and the values are in the correct range.\n            \n        Gamma values greater than 1 will brighten the image midtones, values less than 1 will darken the midtones.\n\n        Plots generated with the histogram options of 'match' and 'equalize' can be combined with the stretch and gamma options. The stretch and gamma adjustments will be applied after the histogram adjustments.\n        \n        Args:\n            w (float or int): width of plot in inches at 72 dpi, default is 10\n            h (float or int): height of plot in inches at 72 dpi, default is 10\n            title (str): Title to use on the plot\n            fontsize (int): Size of title font, default is 22. Size is measured in points.\n            bands (list): bands to use for plotting, such as bands=[4,2,1]. Defaults to the image's natural RGB bands. This option is useful for generating pseudocolor images when passed a list of three bands. 
If only a single band is provided, a colormapped plot will be generated instead.\n            cmap (str): Matplotlib colormap name to use for single band images. Default is cmap='Greys_r'.\n            histogram (str): either 'equalize', 'minmax', 'match', or 'ignore'\n            stretch (list): stretch the histogram between two percentile values, default is [2,98]\n            gamma (float): adjust image gamma, default is 1.0\n        '''\n\n        if self.shape[0] == 1 or (\"bands\" in kwargs and len(kwargs[\"bands\"]) == 1):\n            if \"cmap\" in kwargs:\n                cmap = kwargs[\"cmap\"]\n                del kwargs[\"cmap\"]\n            else:\n                cmap = \"Greys_r\"\n            self._plot(tfm=self._single_band, cmap=cmap, **kwargs)\n        else:\n            if spec == \"rgb\" and self._has_token(**kwargs):\n                self._plot(tfm=self.rgb, **kwargs)\n            else:\n                self._plot(tfm=getattr(self, spec), **kwargs)"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef get_images_by_catid_and_aoi(self, catid, aoi_wkt):\n\n        self.logger.debug('Retrieving IDAHO metadata')\n\n        # use the footprint to get the IDAHO id\n        url = '%s/search' % self.base_url\n\n        body = {\"filters\": [\"catalogID = '%s'\" % catid],\n                \"types\": [\"IDAHOImage\"],\n                \"searchAreaWkt\": aoi_wkt}\n\n        r = self.gbdx_connection.post(url, data=json.dumps(body))\n\n        r.raise_for_status()\n        if r.status_code == 200:\n            results = r.json()\n            numresults = len(results['results'])\n            self.logger.debug('%s IDAHO images found associated with catid %s'\n                              % (numresults, catid))\n\n            return results", "response": "Get the IDAHO images associated with a given catalog ID and the well-known text (WKT) of the area of interest."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef get_images_by_catid(self, catid):\n\n        self.logger.debug('Retrieving IDAHO metadata')\n\n        # get the footprint of the catid's strip\n        footprint = self.catalog.get_strip_footprint_wkt(catid)\n\n        # try to convert from multipolygon to polygon:\n        try:\n            footprint = from_wkt(footprint).geoms[0].wkt\n        except:\n            pass\n\n        if not footprint:\n            self.logger.debug(\"\"\"Cannot get IDAHO metadata for strip %s,\n                                 footprint not found\"\"\" % catid)\n            return None\n\n        return self.get_images_by_catid_and_aoi(catid=catid,\n                                                aoi_wkt=footprint)", "response": "Get the full catalog search response for IDAHO images associated with a given source catalog ID."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef describe_images(self, idaho_image_results):\n\n        results = idaho_image_results['results']\n\n        # filter only idaho images:\n        results = [r for r in results if 'IDAHOImage' in r['type']]\n        self.logger.debug('Describing %s IDAHO images.' % len(results))\n\n        # figure out which catids are represented in this set of images\n        catids = set([r['properties']['catalogID'] for r in results])\n\n        description = {}\n\n        for catid in catids:\n            # images associated with a single catid\n            description[catid] = {}\n            description[catid]['parts'] = {}\n            images = [r for r in results if r['properties']['catalogID'] == catid]\n            for image in images:\n                description[catid]['sensorPlatformName'] = image['properties']['sensorPlatformName']\n                part = int(image['properties']['vendorDatasetIdentifier'].split(':')[1][-3:])\n                color = image['properties']['colorInterpretation']\n                bucket = image['properties']['tileBucketName']\n                identifier = image['identifier']\n                boundstr = image['properties']['footprintWkt']\n\n                try:\n                    description[catid]['parts'][part]\n                except:\n                    description[catid]['parts'][part] = {}\n\n                description[catid]['parts'][part][color] = {}\n                description[catid]['parts'][part][color]['id'] = identifier\n                description[catid]['parts'][part][color]['bucket'] = bucket\n                description[catid]['parts'][part][color]['boundstr'] = boundstr\n\n        return description", "response": "Describe the result set of a catalog search for IDAHO images."}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\ndownloading a native resolution orthorectified chip in tif format from a user - specified catalog id.", "response": "def get_chip(self, coordinates, catid, chip_type='PAN', chip_format='TIF', filename='chip.tif'):\n        \"\"\"Downloads a native resolution, orthorectified chip in tif format\n        from a user-specified catalog id.\n\n        Args:\n            coordinates (list): Rectangle coordinates in order West, South, East, North.\n                                West and East are longitudes, North and South are latitudes.\n                                The maximum chip size is (2048 pix)x(2048 pix)\n            catid (str): The image catalog id.\n            chip_type (str): 'PAN' (panchromatic), 'MS' (multispectral), 'PS' (pansharpened).\n                             'MS' is 4 or 8 bands depending on sensor.\n            chip_format (str): 'TIF' or 'PNG'\n            filename (str): Where to save chip.\n\n        Returns:\n            True if chip is successfully downloaded; else False.\n        \"\"\"\n\n        def t2s1(t):\n            # Tuple to string 1\n            return str(t).strip('(,)').replace(',', '')\n\n        def t2s2(t):\n            # Tuple to string 2\n            return str(t).strip('(,)').replace(' ', '')\n\n        if len(coordinates) != 4:\n            print('Wrong coordinate entry')\n            return False\n\n        W, S, E, N = coordinates\n        box = ((W, S), (W, N), (E, N), (E, S), (W, S))\n        box_wkt = 'POLYGON ((' + ','.join([t2s1(corner) for corner in box]) + '))'\n\n        # get IDAHO images which intersect box\n        results = self.get_images_by_catid_and_aoi(catid=catid, aoi_wkt=box_wkt)\n        description = self.describe_images(results)\n\n        pan_id, ms_id, num_bands = None, None, 0\n        for catid, images in description.items():\n            for partnum, part in images['parts'].items():\n                if 'PAN' in 
part.keys():\n                    pan_id = part['PAN']['id']\n                    bucket = part['PAN']['bucket']\n                if 'WORLDVIEW_8_BAND' in part.keys():\n                    ms_id = part['WORLDVIEW_8_BAND']['id']\n                    num_bands = 8\n                    bucket = part['WORLDVIEW_8_BAND']['bucket']\n                elif 'RGBN' in part.keys():\n                    ms_id = part['RGBN']['id']\n                    num_bands = 4\n                    bucket = part['RGBN']['bucket']\n\n        # specify band information\n        band_str = ''\n        if chip_type == 'PAN':\n            band_str = pan_id + '?bands=0'\n        elif chip_type == 'MS':\n            band_str = ms_id + '?'\n        elif chip_type == 'PS':\n            if num_bands == 8:\n                band_str = ms_id + '?bands=4,2,1&panId=' + pan_id\n            elif num_bands == 4:\n                band_str = ms_id + '?bands=0,1,2&panId=' + pan_id\n\n        # specify location information\n        location_str = '&upperLeft={}&lowerRight={}'.format(t2s2((W, N)), t2s2((E, S)))\n\n        service_url = 'https://idaho.geobigdata.io/v1/chip/bbox/' + bucket + '/'\n        url = service_url + band_str + location_str\n        url += '&format=' + chip_format + '&token=' + self.gbdx_connection.access_token\n        r = requests.get(url)\n\n        if r.status_code == 200:\n            with open(filename, 'wb') as f:\n                f.write(r.content)\n                return True\n        else:\n            print('Cannot download chip')\n            return False"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\ngetting a list of urls and bounding boxes corresponding to idaho images for a given catalog id.", "response": "def get_tms_layers(self,\n                       catid,\n                       bands='4,2,1',\n                       gamma=1.3,\n                       highcutoff=0.98,\n                       lowcutoff=0.02,\n                       brightness=1.0,\n                       contrast=1.0):\n        \"\"\"Get list of urls and bounding boxes corresponding to idaho images for a given catalog id.\n\n        Args:\n           catid (str): Catalog id\n           bands (str): Bands to display, separated by commas (0-7).\n           gamma (float): gamma coefficient. This is for on-the-fly pansharpening.\n           highcutoff (float): High cut off coefficient (0.0 to 1.0). This is for on-the-fly pansharpening.\n           lowcutoff (float): Low cut off coefficient (0.0 to 1.0). This is for on-the-fly pansharpening.\n           brightness (float): Brightness coefficient (0.0 to 1.0). This is for on-the-fly pansharpening.\n           contrast (float): Contrast coefficient (0.0 to 1.0). 
This is for on-the-fly pansharpening.\n\n        Returns:\n           urls (list): TMS urls.\n           bboxes (list of tuples): Each tuple is (W, S, E, N) where (W,S,E,N) are the bounds of the corresponding idaho part.\n        \"\"\"\n\n        description = self.describe_images(self.get_images_by_catid(catid))\n        service_url = 'http://idaho.geobigdata.io/v1/tile/'\n\n        urls, bboxes = [], []\n        for catid, images in description.items():\n            for partnum, part in images['parts'].items():\n                if 'PAN' in part.keys():\n                    pan_id = part['PAN']['id']\n                if 'WORLDVIEW_8_BAND' in part.keys():\n                    ms_id = part['WORLDVIEW_8_BAND']['id']\n                    ms_partname = 'WORLDVIEW_8_BAND'\n                elif 'RGBN' in part.keys():\n                    ms_id = part['RGBN']['id']\n                    ms_partname = 'RGBN'\n\n                if ms_id:\n                    if pan_id:\n                        band_str = ms_id + '/{z}/{x}/{y}?bands=' + bands + '&panId=' + pan_id\n                    else:\n                        band_str = ms_id + '/{z}/{x}/{y}?bands=' + bands\n                    bbox = from_wkt(part[ms_partname]['boundstr']).bounds\n                elif not ms_id and pan_id:\n                    band_str = pan_id + '/{z}/{x}/{y}?bands=0'\n                    bbox = from_wkt(part['PAN']['boundstr']).bounds\n                else:\n                    continue\n\n                bboxes.append(bbox)\n\n                # Get the bucket. 
It has to be the same for all entries in the part.\n                bucket = part[list(part.keys())[0]]['bucket']\n\n                # Get the token\n                token = self.gbdx_connection.access_token\n\n                # Assemble url\n                url = (service_url + bucket + '/'\n                       + band_str\n                       + \"\"\"&gamma={}\n                                        &highCutoff={}\n                                        &lowCutoff={}\n                                        &brightness={}\n                                        &contrast={}\n                                        &token={}\"\"\".format(gamma,\n                                                            highcutoff,\n                                                            lowcutoff,\n                                                            brightness,\n                                                            contrast,\n                                                            token))\n                urls.append(url)\n\n        return urls, bboxes"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef create_leaflet_viewer(self, idaho_image_results, filename):\n\n        description = self.describe_images(idaho_image_results)\n        if len(description) > 0:\n            functionstring = ''\n            for catid, images in description.items():\n                for partnum, part in images['parts'].items():\n\n                    num_images = len(list(part.keys()))\n                    partname = None\n                    if num_images == 1:\n                        # there is only one image, use the PAN\n                        partname = [p for p in list(part.keys())][0]\n                        pan_image_id = ''\n                    elif num_images == 2:\n                        # there are two images in this part, use the multi (or pansharpen)\n                        partname = [p for p in list(part.keys()) if p is not 'PAN'][0]\n                        pan_image_id = part['PAN']['id']\n\n                    if not partname:\n                        self.logger.debug(\"Cannot find part for idaho image.\")\n                        continue\n\n                    bandstr = {\n                        'RGBN': '0,1,2',\n                        'WORLDVIEW_8_BAND': '4,2,1',\n                        'PAN': '0'\n                    }.get(partname, '0,1,2')\n\n                    part_boundstr_wkt = part[partname]['boundstr']\n                    part_polygon = from_wkt(part_boundstr_wkt)\n                    bucketname = part[partname]['bucket']\n                    image_id = part[partname]['id']\n                    W, S, E, N = part_polygon.bounds\n\n                    functionstring += \"addLayerToMap('%s','%s',%s,%s,%s,%s,'%s');\\n\" % (\n                        bucketname, image_id, W, S, E, N, pan_image_id)\n\n            __location__ = os.path.realpath(\n                os.path.join(os.getcwd(), os.path.dirname(__file__)))\n            try:\n     
           with open(os.path.join(__location__, 'leafletmap_template.html'), 'r') as htmlfile:\n                    data = htmlfile.read().decode(\"utf8\")\n            except AttributeError:\n                with open(os.path.join(__location__, 'leafletmap_template.html'), 'r') as htmlfile:\n                    data = htmlfile.read()\n\n            data = data.replace('FUNCTIONSTRING', functionstring)\n            data = data.replace('CENTERLAT', str(S))\n            data = data.replace('CENTERLON', str(W))\n            data = data.replace('BANDS', bandstr)\n            data = data.replace('TOKEN', self.gbdx_connection.access_token)\n\n            with codecs.open(filename, 'w', 'utf8') as outputfile:\n                self.logger.debug(\"Saving %s\" % filename)\n                outputfile.write(data)\n        else:\n            print('No items returned.')", "response": "Create a leaflet viewer html file for viewing IDAHO images."}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\nchecks to see if an image has been ordered or not.", "response": "def is_ordered(cat_id):\n    \"\"\"\n      Checks to see if a CatalogID has been ordered or not.\n\n      Args:\n        cat_id (str): The catalog ID from the platform catalog.\n      Returns:\n        ordered (bool): Whether or not the image has been ordered\n    \"\"\"\n    url = 'https://rda.geobigdata.io/v1/stripMetadata/{}'.format(cat_id)\n    auth = Auth()\n    r = _req_with_retries(auth.gbdx_connection, url)\n    if r is not None:\n        return r.status_code == 200\n    return False"}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\nchecks to see if an image can be atmospherically compensated or not.", "response": "def can_acomp(cat_id):\n    \"\"\"\n      Checks to see if a CatalogID can be atmospherically compensated or not.\n\n      Args:\n        cat_id (str): The catalog ID from the platform catalog.\n      Returns:\n        available (bool): Whether or not the image can be acomp'd\n    \"\"\"\n    url = 'https://rda.geobigdata.io/v1/stripMetadata/{}/capabilities'.format(cat_id)\n    auth = Auth()\n    r = _req_with_retries(auth.gbdx_connection, url)\n    try: \n        data = r.json()\n        return data['acompVersion'] is not None\n    except:\n        return False"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef deprecate_module_attr(mod, deprecated):\n    deprecated = set(deprecated)\n    class Wrapper(object):\n        def __getattr__(self, attr):\n            if attr in deprecated:\n                warnings.warn(\"Property {} is deprecated\".format(attr), GBDXDeprecation)\n\n            return getattr(mod, attr)\n\n        def __setattr__(self, attr, value):\n            if attr in deprecated:\n                warnings.warn(\"Property {} is deprecated\".format(attr), GBDXDeprecation)\n            return setattr(mod, attr, value)\n    return Wrapper()", "response": "Returns a wrapped object that warns about deprecated accesses"}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\nfigures out, given a name, whether a multiplex port prefixes this name and returns it. Otherwise it returns None.", "response": "def get_matching_multiplex_port(self,name):\n        \"\"\"\n        Given a name, figure out if a multiplex port prefixes this name and return it.  Otherwise return None.\n        \"\"\"\n\n        # short circuit:  if the attribute name already exists return none\n        # if name in self._portnames: return None\n        # if not len([p for p in self._portnames if name.startswith(p) and name != p]): return None\n\n        matching_multiplex_ports = [self.__getattribute__(p) for p in self._portnames\n            if name.startswith(p)\n            and name != p\n            and hasattr(self, p)\n            and self.__getattribute__(p).is_multiplex\n        ]\n\n        for port in matching_multiplex_ports:\n            return port\n\n        return None"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef set(self, **kwargs):\n        for port_name, port_value in kwargs.items():\n            # Support both port and port.value\n            if hasattr(port_value, 'value'):\n                port_value = port_value.value\n\n            self.inputs.__setattr__(port_name, port_value)", "response": "Set input values on task"}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\nsaves output data from any task in this workflow to S3", "response": "def savedata(self, output, location=None):\n        '''\n        Save output data from any task in this workflow to S3\n\n        Args:\n               output: Reference task output (e.g. task.outputs.output1).\n\n               location (optional): Subfolder under which the output will be saved.\n                                    It will be placed under the account directory in gbd-customer-data bucket:\n                                    s3://gbd-customer-data/{account_id}/{location}\n                                    Leave blank to save to: workflow_output/{workflow_id}/{task_name}/{port_name}\n\n        Returns:\n            None\n        '''\n\n        output.persist = True\n        if location:\n            output.persist_location = location"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef list_workflow_outputs(self):\n        '''\n        Get a list of outputs from the workflow that are saved to S3. To get resolved locations call workflow status.\n        Args:\n            None\n\n        Returns:\n            list\n        '''\n        workflow_outputs = []\n        for task in self.tasks:\n            for output_port_name in task.outputs._portnames:\n                if task.outputs.__getattribute__(output_port_name).persist:\n                    workflow_outputs.append(task.name + ':' + output_port_name)\n\n        return workflow_outputs", "response": "Get a list of outputs from the workflow that are saved to S3. To get resolved locations call workflow status."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\ngenerating the workflow json for launching the workflow against the gbdx api.", "response": "def generate_workflow_description(self):\n        '''\n        Generate workflow json for launching the workflow against the gbdx api\n\n        Args:\n            None\n\n        Returns:\n            json string\n        '''\n        if not self.tasks:\n            raise WorkflowError('Workflow contains no tasks, and cannot be executed.')\n\n        self.definition = self.workflow_skeleton()\n\n        if self.batch_values:\n            self.definition[\"batch_values\"] = self.batch_values\n\n        all_input_port_values = [t.inputs.__getattribute__(input_port_name).value for t in self.tasks for\n                                 input_port_name in t.inputs._portnames]\n        for task in self.tasks:\n            # only include multiplex output ports in this task if other tasks refer to them in their inputs.\n            # 1. find the multplex output port_names in this task\n            # 2. see if they are referred to in any other tasks inputs\n            # 3. 
If not, exclude them from the workflow_def\n            output_multiplex_ports_to_exclude = []\n            multiplex_output_port_names = [portname for portname in task.outputs._portnames if\n                                           task.outputs.__getattribute__(portname).is_multiplex]\n            for p in multiplex_output_port_names:\n                output_port_reference = 'source:' + task.name + ':' + p\n                if output_port_reference not in all_input_port_values:\n                    output_multiplex_ports_to_exclude.append(p)\n\n            task_def = task.generate_task_workflow_json(\n                output_multiplex_ports_to_exclude=output_multiplex_ports_to_exclude)\n            self.definition['tasks'].append(task_def)\n\n        if self.callback:\n            self.definition['callback'] = self.callback\n\n        return self.definition"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef execute(self):\n        '''\n        Execute the workflow.\n\n        Args:\n            None\n\n        Returns:\n            Workflow_id\n        '''\n        # if not self.tasks:\n        #     raise WorkflowError('Workflow contains no tasks, and cannot be executed.')\n\n        # for task in self.tasks:\n        #     self.definition['tasks'].append( task.generate_task_workflow_json() )\n\n        self.generate_workflow_description()\n\n        # hit batch workflow endpoint if batch values\n        if self.batch_values:\n            self.id = self.workflow.launch_batch_workflow(self.definition)\n\n        # use regular workflow endpoint if no batch values\n        else:\n            self.id = self.workflow.launch(self.definition)\n\n        return self.id", "response": "Execute the workflow.\n\n        Args:\n            None\n\n        Returns:\n            Workflow_id"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef task_ids(self):\n        '''\n        Get the task IDs of a running workflow\n\n        Args:\n            None\n\n        Returns:\n            List of task IDs\n        '''\n        if not self.id:\n            raise WorkflowError('Workflow is not running.  Cannot get task IDs.')\n\n        if self.batch_values:\n            raise NotImplementedError(\"Query Each Workflow Id within the Batch Workflow for task IDs.\")\n\n        wf = self.workflow.get(self.id)\n\n        return [task['id'] for task in wf['tasks']]", "response": "Get the task IDs of a running workflow."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef cancel(self):\n        '''\n        Cancel a running workflow.\n\n        Args:\n            None\n\n        Returns:\n            None\n        '''\n        if not self.id:\n            raise WorkflowError('Workflow is not running.  Cannot cancel.')\n\n        if self.batch_values:\n            self.workflow.batch_workflow_cancel(self.id)\n        else:\n            self.workflow.cancel(self.id)", "response": "Cancel a running workflow."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef stdout(self):\n        ''' Get stdout from all the tasks of a workflow.\n\n        Returns:\n            (list): tasks with their stdout\n        \n        Example:\n            >>> workflow.stdout\n            [\n                {\n                    \"id\": \"4488895771403082552\",\n                    \"taskType\": \"AOP_Strip_Processor\",\n                    \"name\": \"Task1\",\n                    \"stdout\": \"............\"\n                }\n            ]\n\n        '''\n        if not self.id:\n            raise WorkflowError('Workflow is not running.  Cannot get stdout.')\n        if self.batch_values:\n            raise NotImplementedError(\"Query Each Workflow Id within the Batch Workflow for stdout.\")\n\n        wf = self.workflow.get(self.id)\n\n        stdout_list = []\n        for task in wf['tasks']:\n            stdout_list.append(\n                {\n                    'id': task['id'],\n                    'taskType': task['taskType'],\n                    'name': task['name'],\n                    'stdout': self.workflow.get_stdout(self.id, task['id'])\n                }\n            )\n\n        return stdout_list", "response": "Get all the tasks with their stdout."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\nget stderr from all the tasks of a workflow.", "response": "def stderr(self):\n        '''Get stderr from all the tasks of a workflow.\n\n        Returns:\n            (list): tasks with their stderr\n\n        Example:\n            >>> workflow.stderr\n            [\n                {\n                    \"id\": \"4488895771403082552\",\n                    \"taskType\": \"AOP_Strip_Processor\",\n                    \"name\": \"Task1\",\n                    \"stderr\": \"............\"\n                }\n            ]\n\n        '''\n\n        if not self.id:\n            raise WorkflowError('Workflow is not running.  Cannot get stderr.')\n        if self.batch_values:\n            raise NotImplementedError(\"Query Each Workflow Id within the Batch Workflow for stderr.\")\n\n        wf = self.workflow.get(self.id)\n\n        stderr_list = []\n        for task in wf['tasks']:\n            stderr_list.append(\n                {\n                    'id': task['id'],\n                    'taskType': task['taskType'],\n                    'name': task['name'],\n                    'stderr': self.workflow.get_stderr(self.id, task['id'])\n                }\n            )\n\n        return stderr_list"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef layers(self):\n        layers = [self._layer_def(style) for style in self.styles]\n        return layers", "response": "Renders the list of layers to add to the map."}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nreturn a pyproj projection for the specified projection code.", "response": "def get_proj(prj_code):\n    \"\"\"\n      Helper method for handling projection codes that are unknown to pyproj\n\n      Args:\n          prj_code (str): an EPSG projection code\n\n      Returns:\n          projection: a pyproj projection\n    \"\"\"\n    if prj_code in CUSTOM_PRJ:\n        proj = pyproj.Proj(CUSTOM_PRJ[prj_code])\n    else:\n        proj = pyproj.Proj(init=prj_code)\n    return proj"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\nshowing a slippy map preview of the image. Requires iPython.", "response": "def preview(image, **kwargs):\n    ''' Show a slippy map preview of the image. Requires iPython.\n\n    Args:\n        image (image): image object to display\n        zoom (int): zoom level to intialize the map, default is 16\n        center (list): center coordinates to initialize the map, defaults to center of image\n        bands (list): bands of image to display, defaults to the image's default RGB bands\n    '''\n    \n    try:\n        from IPython.display import Javascript, HTML, display\n        from gbdxtools.rda.interface import RDA\n        from gbdxtools import Interface\n        gbdx = Interface()\n    except:\n        print(\"IPython is required to produce maps.\")\n        return\n\n    zoom = kwargs.get(\"zoom\", 16)\n    bands = kwargs.get(\"bands\")\n    if bands is None:\n        bands = image._rgb_bands\n    wgs84_bounds = kwargs.get(\"bounds\", list(loads(image.metadata[\"image\"][\"imageBoundsWGS84\"]).bounds))\n    center = kwargs.get(\"center\", list(shape(image).centroid.bounds[0:2]))\n    \n    if image.proj != 'EPSG:4326':\n        code = image.proj.split(':')[1]\n        conn = gbdx.gbdx_connection\n        proj_info = conn.get('https://ughlicoordinates.geobigdata.io/ughli/v1/projinfo/{}'.format(code)).json()\n        tfm = partial(pyproj.transform, pyproj.Proj(init='EPSG:4326'), pyproj.Proj(init=image.proj))\n        bounds = list(ops.transform(tfm, box(*wgs84_bounds)).bounds)\n    else:\n        proj_info = {}\n        bounds = wgs84_bounds\n    # Applying DRA to a DRA'ed image looks bad, skip if already in graph\n    if not image.options.get('dra'):\n        rda = RDA()\n        # Need some simple DRA to get the image in range for display.\n        dra = rda.HistogramDRA(image)\n        image = dra.aoi(bbox=image.bounds)\n    graph_id = image.rda_id\n    node_id = 
image.rda.graph()['nodes'][0]['id']\n    map_id = \"map_{}\".format(str(int(time.time())))\n    scales = ','.join(['1'] * len(bands))\n    offsets = ','.join(['0'] * len(bands))\n\n    display(HTML(Template('''\n       <div id=\"$map_id\"/>\n       <link href='https://openlayers.org/en/v4.6.4/css/ol.css' rel='stylesheet' />\n       <script src=\"https://cdn.polyfill.io/v2/polyfill.min.js?features=requestAnimationFrame,Element.prototype.classList,URL\"></script>\n       <style>body{margin:0;padding:0;}#$map_id{position:relative;top:0;bottom:0;width:100%;height:400px;}</style>\n       <style></style>\n    ''').substitute({\"map_id\": map_id})))\n\n    js = Template(\"\"\"\n        require.config({\n            paths: {\n                oljs: 'https://cdnjs.cloudflare.com/ajax/libs/openlayers/4.6.4/ol',\n                proj4: 'https://cdnjs.cloudflare.com/ajax/libs/proj4js/2.4.4/proj4'\n            }\n        });\n\n        require(['oljs', 'proj4'], function(oljs, proj4) {\n            oljs.proj.setProj4(proj4)\n            var md = $md;\n            var georef = $georef;\n            var graphId = '$graphId';\n            var nodeId = '$nodeId';\n            var extents = $bounds;\n\n            var x1 = md.minTileX * md.tileXSize;\n            var y1 = ((md.minTileY + md.numYTiles) * md.tileYSize + md.tileYSize);\n            var x2 = ((md.minTileX + md.numXTiles) * md.tileXSize + md.tileXSize);\n            var y2 = md.minTileY * md.tileYSize;\n            var tileLayerResolutions = [georef.scaleX];\n\n            var url = '$url' + '/tile/';\n            url += graphId + '/' + nodeId;\n            url += \"/{x}/{y}.png?token=$token&display_bands=$bands&display_scales=$scales&display_offsets=$offsets\";\n\n            var proj = '$proj';\n            var projInfo = $projInfo;\n\n            if ( proj !== 'EPSG:4326' ) {\n                var proj4def = projInfo[\"proj4\"];\n                proj4.defs(proj, proj4def);\n                var area = 
projInfo[\"area_of_use\"];\n                var bbox = [area[\"area_west_bound_lon\"], area[\"area_south_bound_lat\"],\n                            area[\"area_east_bound_lon\"], area[\"area_north_bound_lat\"]]\n                var projection = oljs.proj.get(proj);\n                var fromLonLat = oljs.proj.getTransform('EPSG:4326', projection);\n                var extent = oljs.extent.applyTransform(\n                    [bbox[0], bbox[1], bbox[2], bbox[3]], fromLonLat);\n                projection.setExtent(extent);\n            } else {\n                var projection = oljs.proj.get(proj);\n            }\n\n            var rda = new oljs.layer.Tile({\n              title: 'RDA',\n              opacity: 1,\n              extent: extents,\n              source: new oljs.source.TileImage({\n                      crossOrigin: null,\n                      projection: projection,\n                      extent: extents,\n\n                      tileGrid: new oljs.tilegrid.TileGrid({\n                          extent: extents,\n                          origin: [extents[0], extents[3]],\n                          resolutions: tileLayerResolutions,\n                          tileSize: [md.tileXSize, md.tileYSize],\n                      }),\n                      tileUrlFunction: function (coordinate) {\n                          if (coordinate === null) return undefined;\n                          const x = coordinate[1] + md.minTileX;\n                          const y = -(coordinate[2] + 1 - md.minTileY);\n                          if (x < md.minTileX || x > md.maxTileX) return undefined;\n                          if (y < md.minTileY || y > md.maxTileY) return undefined;\n                          return url.replace('{x}', x).replace('{y}', y);\n                      }\n                  })\n            });\n\n            var map = new oljs.Map({\n              layers: [ rda ],\n              target: '$map_id',\n              view: new oljs.View({\n                
projection: projection,\n                center: $center,\n                zoom: $zoom\n              })\n            });\n        });\n    \"\"\").substitute({\n        \"map_id\": map_id,\n        \"proj\": image.proj,\n        \"projInfo\": json.dumps(proj_info),\n        \"graphId\": graph_id,\n        \"bounds\": bounds,\n        \"bands\": \",\".join(map(str, bands)),\n        \"nodeId\": node_id,\n        \"md\": json.dumps(image.metadata[\"image\"]),\n        \"georef\": json.dumps(image.metadata[\"georef\"]),\n        \"center\": center,\n        \"zoom\": zoom,\n        \"token\": gbdx.gbdx_connection.access_token,\n        \"scales\": scales,\n        \"offsets\": offsets,\n        \"url\": VIRTUAL_RDA_URL\n    })\n    display(Javascript(js))"}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\ncomputes the gain and offset for each band of the specified image metadata", "response": "def calc_toa_gain_offset(meta):\n    \"\"\"\n    Compute (gain, offset) tuples for each band of the specified image metadata\n    \"\"\"\n    # Set satellite index to look up cal factors\n    sat_index = meta['satid'].upper() + \"_\" + meta['bandid'].upper()\n\n    # Set scale for at sensor radiance\n    # Eq is:\n    # L = GAIN * DN * (ACF/EBW) + Offset\n    # ACF abscal factor from meta data\n    # EBW effectiveBandwidth from meta data\n    # Gain provided by abscal from const\n    # Offset provided by abscal from const\n    acf = np.asarray(meta['abscalfactor'])  # Should be nbands length\n    ebw = np.asarray(meta['effbandwidth'])  # Should be nbands length\n    gain = np.asarray(constants.DG_ABSCAL_GAIN[sat_index])\n    scale = (acf / ebw) * gain\n    offset = np.asarray(constants.DG_ABSCAL_OFFSET[sat_index])\n\n    e_sun_index = meta['satid'].upper() + \"_\" + meta['bandid'].upper()\n    e_sun = np.asarray(constants.DG_ESUN[e_sun_index])\n    sun = ephem.Sun()\n    img_obs = ephem.Observer()\n    img_obs.lon = meta['latlonhae'][1]\n    img_obs.lat = meta['latlonhae'][0]\n    img_obs.elevation = meta['latlonhae'][2]\n    img_obs.date = datetime.datetime.fromtimestamp(meta['img_datetime_obj_utc']['$date'] / 1000.0).strftime(\n        '%Y-%m-%d %H:%M:%S.%f')\n    sun.compute(img_obs)\n    d_es = sun.earth_distance\n\n    # Pull sun elevation from the image metadata\n    # theta_s can be zenith or elevation - the calc below will use either\n    # a cos or sin respectively\n    # theta_s = float(self.meta_dg.IMD.IMAGE.MEANSUNEL)\n    theta_s = 90 - float(meta['mean_sun_el'])\n    scale2 = (d_es ** 2 * np.pi) / (e_sun * np.cos(np.deg2rad(theta_s)))\n\n    # Return scaled data\n    # Radiance = Scale * Image + offset, Reflectance = Radiance * Scale2\n    return zip(scale, scale2, offset)"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nmaterializing images into gbdx user buckets in s3.", "response": "def materialize(self, node=None, bounds=None, callback=None, out_format='TILE_STREAM', **kwargs):\n        \"\"\"\n          Materializes images into gbdx user buckets in s3.\n          Note: This method is only available to RDA based image classes. \n\n          Args:\n            node (str): the node in the graph to materialize\n            bounds (list): optional bbox for cropping what gets materialized in s3\n            out_format (str): VECTOR_TILE, VECTOR, TIF, TILE_STREAM\n            callback (str): a callback url like an `sns://`\n          Returns:\n            job_id (str): the job_id of the materialization \n        \"\"\"\n        kwargs.update({\n          \"node\": node,\n          \"bounds\": bounds,\n          \"callback\": callback,\n          \"out_format\": out_format\n        })\n        return self.rda._materialize(**kwargs)"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef list(self):\n        r = self.gbdx_connection.get(self._base_url)\n        raise_for_status(r)\n\n        return r.json()['tasks']", "response": "Lists available and visible GBDX tasks."}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\nregisters a new GBDX task.", "response": "def register(self, task_json=None, json_filename=None):\n        \"\"\"Registers a new GBDX task.\n\n        Args:\n            task_json (dict): Dictionary representing task definition.\n            json_filename (str): A full path of a file with json representing the task definition.\n            Only one out of task_json and json_filename should be provided.\n        Returns:\n            Response (str).\n        \"\"\"\n        if not task_json and not json_filename:\n            raise Exception(\"Both task json and filename can't be none.\")\n\n        if task_json and json_filename:\n            raise Exception(\"Both task json and filename can't be provided.\")\n\n        if json_filename:\n            # Use a context manager so the file handle is always closed\n            with open(json_filename, 'r') as f:\n                task_json = json.load(f)\n\n        r = self.gbdx_connection.post(self._base_url, json=task_json)\n        raise_for_status(r)\n\n        return r.text"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef get_definition(self, task_name):\n        r = self.gbdx_connection.get(self._base_url + '/' + task_name)\n        raise_for_status(r)\n\n        return r.json()", "response": "Gets the definition of a registered GBDX task."}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\ndelete a GBDX task.", "response": "def delete(self, task_name):\n        \"\"\"Deletes a GBDX task.\n\n        Args:\n            task_name (str): Task name.\n\n        Returns:\n            Response (str).\n        \"\"\"\n        r = self.gbdx_connection.delete(self._base_url + '/' + task_name)\n        raise_for_status(r)\n\n        return r.text"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef update(self, task_name, task_json):\n        r = self.gbdx_connection.put(self._base_url + '/' + task_name, json=task_json)\n        raise_for_status(r)\n\n        return r.json()", "response": "Updates a GBDX task."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef ingest_vectors(self, output_port_value):\n        ''' append two required tasks to the given output to ingest to VS\n        '''\n        # append two tasks to self['definition']['tasks']\n        ingest_task = Task('IngestItemJsonToVectorServices')\n        ingest_task.inputs.items = output_port_value\n        ingest_task.impersonation_allowed = True\n\n        stage_task = Task('StageDataToS3')\n        stage_task.inputs.destination = 's3://{vector_ingest_bucket}/{recipe_id}/{run_id}/{task_name}'\n        stage_task.inputs.data = ingest_task.outputs.result.value\n\n        self.definition['tasks'].append(ingest_task.generate_task_workflow_json())\n        self.definition['tasks'].append(stage_task.generate_task_workflow_json())", "response": "Appends the two tasks required to ingest the given output port value into Vector Services (VS)."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef get(self, recipe_id):\n        '''\n        Retrieves an AnswerFactory Recipe by id\n\n        Args:\n            recipe_id The id of the recipe\n\n        Returns:\n            A JSON representation of the recipe\n        '''\n        self.logger.debug('Retrieving recipe by id: ' + recipe_id)\n        url = '%(base_url)s/recipe/%(recipe_id)s' % {\n            'base_url': self.base_url, 'recipe_id': recipe_id\n        }\n        r = self.gbdx_connection.get(url)\n        r.raise_for_status()\n        return r.json()", "response": "Retrieves an AnswerFactory Recipe by id"}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nsaves an AnswerFactory Recipe object to the server.", "response": "def save(self, recipe):\n        '''\n        Saves an AnswerFactory Recipe\n\n        Args:\n            recipe (dict): Dictionary specifying a recipe\n\n        Returns:\n            AnswerFactory Recipe id\n        '''\n        # test if this is a create vs. an update\n        if 'id' in recipe and recipe['id'] is not None:\n            # update -> use put op\n            self.logger.debug(\"Updating existing recipe: \" + json.dumps(recipe))\n            url = '%(base_url)s/recipe/json/%(recipe_id)s' % {\n                'base_url': self.base_url, 'recipe_id': recipe['id']\n            }\n            r = self.gbdx_connection.put(url, json=recipe)\n            try:\n                r.raise_for_status()\n            except:\n                print(r.text)\n                raise\n            return recipe['id']\n        else:\n            # create -> use post op\n            self.logger.debug(\"Creating new recipe: \" + json.dumps(recipe))\n            url = '%(base_url)s/recipe/json' % {\n                'base_url': self.base_url\n            }\n            r = self.gbdx_connection.post(url, json=recipe)\n            try:\n                r.raise_for_status()\n            except:\n                print(r.text)\n                raise\n            recipe_json = r.json()\n            return recipe_json['id']"}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\nsaves an AnswerFactory Project.", "response": "def save(self, project):\n        '''\n        Saves an AnswerFactory Project\n\n        Args:\n            project (dict): Dictionary specifying an AnswerFactory Project.\n\n        Returns:\n            AnswerFactory Project id\n        '''\n\n        # test if this is a create vs. an update\n        if 'id' in project and project['id'] is not None:\n            # update -> use put op\n            self.logger.debug('Updating existing project: ' + json.dumps(project))\n            url = '%(base_url)s/%(project_id)s' % {\n                'base_url': self.base_url, 'project_id': project['id']\n            }\n            r = self.gbdx_connection.put(url, json=project)\n            try:\n                r.raise_for_status()\n            except:\n                print(r.text)\n                raise\n            # updates only get the Accepted response -> return the original project id\n            return project['id']\n        else:\n            self.logger.debug('Creating new project: ' + json.dumps(project))\n            # create -> use post op\n            url = self.base_url\n            r = self.gbdx_connection.post(url, json=project)\n            try:\n                r.raise_for_status()\n            except:\n                print(r.text)\n                raise\n            project_json = r.json()\n            # create returns the saved project -> return the project id that's saved\n            return project_json['id']"}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\ndeletes a project by id.", "response": "def delete(self, project_id):\n        '''\n        Deletes a project by id\n\n        Args:\n             project_id: The project id to delete\n\n        Returns:\n             Nothing\n        '''\n        self.logger.debug('Deleting project by id: ' + project_id)\n        url = '%(base_url)s/%(project_id)s' % {\n            'base_url': self.base_url, 'project_id': project_id\n        }\n        r = self.gbdx_connection.delete(url)\n        r.raise_for_status()"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef paint(self):\n        # TODO Figure out why I can't use some of these props\n        snippet = {\n            'line-opacity': VectorStyle.get_style_value(self.opacity),\n            'line-color': VectorStyle.get_style_value(self.color),\n            #'line-cap': self.cap,\n            #'line-join': self.join,\n            'line-width': VectorStyle.get_style_value(self.width),\n            #'line-gap-width': self.gap_width,\n            #'line-blur': self.blur,\n        }\n        if self.translate:\n            snippet['line-translate'] = self.translate\n\n        if self.dasharray:\n            snippet['line-dasharray'] = VectorStyle.get_style_value(self.dasharray)\n\n        return snippet", "response": "Returns a dict that can be converted to a mapbox-gl line paint entry."}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nreturn a javascript snippet suitable for use as a mapbox-gl fill paint entry", "response": "def paint(self):\n        \"\"\"\n        Renders a javascript snippet suitable for use as a mapbox-gl fill paint entry\n\n        Returns:\n            A dict that can be converted to a mapbox-gl javascript paint snippet\n        \"\"\"\n        snippet = {\n            'fill-opacity': VectorStyle.get_style_value(self.opacity),\n            'fill-color': VectorStyle.get_style_value(self.color),\n            'fill-outline-color': VectorStyle.get_style_value(self.outline_color)\n        }\n        if self.translate:\n            snippet['fill-translate'] = self.translate\n\n        return snippet"}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nreturning a javascript snippet suitable for use as a mapbox-gl fill-extrusion paint entry", "response": "def paint(self):\n        \"\"\"\n        Renders a javascript snippet suitable for use as a mapbox-gl fill-extrusion paint entry\n\n        Returns:\n            A dict that can be converted to a mapbox-gl javascript paint snippet\n        \"\"\"\n        snippet = {\n            'fill-extrusion-opacity': VectorStyle.get_style_value(self.opacity),\n            'fill-extrusion-color': VectorStyle.get_style_value(self.color),\n            'fill-extrusion-base': VectorStyle.get_style_value(self.base),\n            'fill-extrusion-height': VectorStyle.get_style_value(self.height)\n        }\n        if self.translate:\n            snippet['fill-extrusion-translate'] = self.translate\n\n        return snippet"}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\nreturns a javascript snippet suitable for use as a mapbox-gl heatmap paint entry", "response": "def paint(self):\n        \"\"\"\n        Renders a javascript snippet suitable for use as a mapbox-gl heatmap paint entry\n\n        Returns:\n            A dict that can be converted to a mapbox-gl javascript paint snippet\n        \"\"\"\n        snippet = {\n            'heatmap-radius': VectorStyle.get_style_value(self.radius),\n            'heatmap-opacity': VectorStyle.get_style_value(self.opacity),\n            'heatmap-color': VectorStyle.get_style_value(self.color),\n            'heatmap-intensity': VectorStyle.get_style_value(self.intensity),\n            'heatmap-weight': VectorStyle.get_style_value(self.weight)\n        }\n\n        return snippet"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\ncreate a new item in the vector service.", "response": "def create(self,vectors):\n        \"\"\" Create a vectors in the vector service.\n\n        Args:\n            vectors: A single geojson vector or a list of geojson vectors. Item_type and ingest_source are required.\n\n        Returns:\n            (list): IDs of the vectors created\n\n        Example:\n            >>> vectors.create(\n            ...     {\n            ...         \"type\": \"Feature\",\n            ...         \"geometry\": {\n            ...             \"type\": \"Point\",\n            ...             \"coordinates\": [1.0,1.0]\n            ...         },\n            ...         \"properties\": {\n            ...             \"text\" : \"item text\",\n            ...             \"name\" : \"item name\",\n            ...             \"item_type\" : \"type\",\n            ...             \"ingest_source\" : \"source\",\n            ...             \"attributes\" : {\n            ...                 \"latitude\" : 1,\n            ...                 \"institute_founded\" : \"2015-07-17\",\n            ...                 \"mascot\" : \"moth\"\n            ...             }\n            ...         }\n            ...     }\n            ... 
)\n\n        \"\"\"\n        if type(vectors) is dict:\n            vectors = [vectors]\n\n        # validate they all have item_type and ingest_source in properties\n        for vector in vectors:\n            if not 'properties' in list(vector.keys()):\n                raise Exception('Vector does not contain \"properties\" field.')\n\n            if not 'item_type' in list(vector['properties'].keys()):\n                raise Exception('Vector does not contain \"item_type\".')\n\n            if not 'ingest_source' in list(vector['properties'].keys()):\n                raise Exception('Vector does not contain \"ingest_source\".')\n\n        r = self.gbdx_connection.post(self.create_url, data=json.dumps(vectors))\n        r.raise_for_status()\n        return r.json()"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef create_from_wkt(self, wkt, item_type, ingest_source, **attributes):\n        '''\n        Create a single vector in the vector service\n\n        Args:\n            wkt (str): wkt representation of the geometry\n            item_type (str): item_type of the vector\n            ingest_source (str): source of the vector\n            attributes: a set of key-value pairs of attributes\n\n        Returns:\n            id (str): string identifier of the vector created\n        '''\n        # verify the \"depth\" of the attributes is single layer\n\n        geojson = load_wkt(wkt).__geo_interface__\n        vector = {\n            'type': \"Feature\",\n            'geometry': geojson,\n            'properties': {\n                'item_type': item_type,\n                'ingest_source': ingest_source,\n                'attributes': attributes\n            }\n        }\n\n        return self.create(vector)[0]", "response": "Create a single vector in the vector service from a WKT string"}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\nretrieves a vector. Not usually necessary because searching is the best way to find & get stuff.", "response": "def get(self, ID, index='vector-web-s'):\n        '''Retrieves a vector.  Not usually necessary because searching is the best way to find & get stuff.\n\n        Args:\n            ID (str): ID of the vector object\n            index (str): Optional.  Index the object lives in.  defaults to 'vector-web-s'\n\n        Returns:\n            record (dict): A dict object identical to the json representation of the catalog record\n        '''\n\n        url = self.get_url % index\n        r = self.gbdx_connection.get(url + ID)\n        r.raise_for_status()\n        return r.json()"}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nperform a query using the QUERY API.", "response": "def query(self, searchAreaWkt, query, count=100, ttl='5m', index=default_index):\n        '''\n        Perform a vector services query using the QUERY API\n        (https://gbdxdocs.digitalglobe.com/docs/vs-query-list-vector-items-returns-default-fields)\n\n        Args:\n            searchAreaWkt: WKT Polygon of area to search\n            query: Elastic Search query\n            count: Maximum number of results to return\n            ttl: Amount of time for each temporary vector page to exist\n\n        Returns:\n            List of vector results\n    \n        '''\n        if count < 1000:\n            # issue a single page query\n            search_area_polygon = from_wkt(searchAreaWkt)\n            left, lower, right, upper = search_area_polygon.bounds\n\n            params = {\n                \"q\": query,\n                \"count\": min(count,1000),\n                \"left\": left,\n                \"right\": right,\n                \"lower\": lower,\n                \"upper\": upper\n            }\n\n            url = self.query_index_url % index if index else self.query_url\n            r = self.gbdx_connection.get(url, params=params)\n            r.raise_for_status()\n            return r.json()\n        else:\n            return list(self.query_iteratively(searchAreaWkt, query, count, ttl, index))"}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nperforms a query using the QUERY API.", "response": "def query_iteratively(self, searchAreaWkt, query, count=100, ttl='5m', index=default_index):\n        '''\n        Perform a vector services query using the QUERY API\n        (https://gbdxdocs.digitalglobe.com/docs/vs-query-list-vector-items-returns-default-fields)\n\n        Args:\n            searchAreaWkt: WKT Polygon of area to search\n            query: Elastic Search query\n            count: Maximum number of results to return\n            ttl: Amount of time for each temporary vector page to exist\n\n        Returns:\n            generator of vector results\n    \n        '''\n\n        search_area_polygon = from_wkt(searchAreaWkt)\n        left, lower, right, upper = search_area_polygon.bounds\n\n        params = {\n            \"q\": query,\n            \"count\": min(count,1000),\n            \"ttl\": ttl,\n            \"left\": left,\n            \"right\": right,\n            \"lower\": lower,\n            \"upper\": upper\n        }\n\n        # initialize paging request\n        url = self.query_index_page_url % index if index else self.query_page_url\n        r = self.gbdx_connection.get(url, params=params)\n        r.raise_for_status()\n        page = r.json()\n        paging_id = page['next_paging_id']\n        item_count = int(page['item_count'])\n        data = page['data']\n\n\n        num_results = 0\n        for vector in data:\n          num_results += 1\n          if num_results > count: break\n          yield vector\n\n        if num_results == count:\n          return\n\n\n        # get vectors from each page\n        while paging_id and item_count > 0 and num_results < count:\n\n          headers = {'Content-Type':'application/x-www-form-urlencoded'}\n          data = {\n              \"pagingId\": paging_id,\n              \"ttl\": ttl\n          }\n\n          r = 
self.gbdx_connection.post(self.page_url, headers=headers, data=data)\n          r.raise_for_status()\n          page = r.json()\n          paging_id = page['next_paging_id']\n          item_count = int(page['item_count'])\n          data = page['data']\n\n          for vector in data:\n              num_results += 1\n              if num_results > count: break\n              yield vector"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\naggregate results of a query into buckets defined by the 'agg_def' parameter. The aggregations are represented by dicts containing a 'name' key and a 'terms' key holding a list of the aggregation buckets. Each bucket element is a dict containing a 'term' key containing the term used for this bucket, a 'count' key containing the count of items that match this bucket, and an 'aggregations' key containing any child aggregations. Args: searchAreaWkt (str): wkt representation of the geometry agg_def (str or AggregationDef): the aggregation definitions query (str): a valid Elasticsearch query string to constrain the items going into the aggregation start_date (str): either an ISO-8601 date string or a 'now' expression (e.g. \"now-6d\" or just \"now\") end_date (str): either an ISO-8601 date string or a 'now' expression (e.g. \"now-6d\" or just \"now\") count (int): the number of buckets to include in the aggregations (the top N will be returned) index (str): the index (or alias or wildcard index expression) to run aggregations against, set to None for the entire set of vector indexes Returns: results (list): A (usually single-element) list of dict objects containing the aggregation results.", "response": "def aggregate_query(self, searchAreaWkt, agg_def, query=None, start_date=None, end_date=None, count=10, index=default_index):\n        \"\"\"Aggregates results of a query into buckets defined by the 'agg_def' parameter.  
The aggregations are\n        represented by dicts containing a 'name' key and a 'terms' key holding a list of the aggregation buckets.\n        Each bucket element is a dict containing a 'term' key containing the term used for this bucket, a 'count' key\n        containing the count of items that match this bucket, and an 'aggregations' key containing any child\n        aggregations.\n\n        Args:\n            searchAreaWkt (str): wkt representation of the geometry\n            agg_def (str or AggregationDef): the aggregation definitions\n            query (str): a valid Elasticsearch query string to constrain the items going into the aggregation\n            start_date (str): either an ISO-8601 date string or a 'now' expression (e.g. \"now-6d\" or just \"now\")\n            end_date (str): either an ISO-8601 date string or a 'now' expression (e.g. \"now-6d\" or just \"now\")\n            count (int): the number of buckets to include in the aggregations (the top N will be returned)\n            index (str): the index (or alias or wildcard index expression) to run aggregations against, set to None for the entire set of vector indexes\n\n        Returns:\n            results (list): A (usually single-element) list of dict objects containing the aggregation results.\n        \"\"\"\n\n        geojson = load_wkt(searchAreaWkt).__geo_interface__\n        aggs_str = str(agg_def) # could be string or AggregationDef\n\n        params = {\n            \"count\": count,\n            \"aggs\": aggs_str\n        }\n\n        if query:\n            params['query'] = query\n        if start_date:\n            params['start_date'] = start_date\n        if end_date:\n            params['end_date'] = end_date\n\n        url = self.aggregations_by_index_url % index if index else self.aggregations_url\n\n        r = self.gbdx_connection.post(url, params=params, json=geojson)\n        r.raise_for_status()\n\n        return r.json(object_pairs_hook=OrderedDict)['aggregations']"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nrender a mapbox gl map from a vector service query", "response": "def tilemap(self, query, styles={}, bbox=[-180,-90,180,90], zoom=16, \n                      api_key=os.environ.get('MAPBOX_API_KEY', None), \n                      image=None, image_bounds=None,\n                      index=\"vector-user-provided\", name=\"GBDX_Task_Output\", **kwargs):\n        \"\"\"\n          Renders a mapbox gl map from a vector service query\n        \"\"\"\n        try:\n            from IPython.display import display\n        except:\n            print(\"IPython is required to produce maps.\")\n            return\n\n        assert api_key is not None, \"No Mapbox API Key found. You can either pass in a token or set the MAPBOX_API_KEY environment variable.\"\n\n        wkt = box(*bbox).wkt\n        features = self.query(wkt, query, index=index)\n\n        union = cascaded_union([shape(f['geometry']) for f in features])\n        lon, lat = union.centroid.coords[0]\n        url = 'https://vector.geobigdata.io/insight-vector/api/mvt/{z}/{x}/{y}?';\n        url += 'q={}&index={}'.format(query, index);\n\n        if styles is not None and not isinstance(styles, list):\n            styles = [styles]\n\n        map_id = \"map_{}\".format(str(int(time.time())))\n        map_data = VectorTileLayer(url, source_name=name, styles=styles, **kwargs)\n        image_layer = self._build_image_layer(image, image_bounds)\n\n        template = BaseTemplate(map_id, **{\n            \"lat\": lat,\n            \"lon\": lon,\n            \"zoom\": zoom,\n            \"datasource\": json.dumps(map_data.datasource),\n            \"layers\": json.dumps(map_data.layers),\n            \"image_layer\": image_layer,\n            \"mbkey\": api_key,\n            \"token\": self.gbdx_connection.access_token\n        })\n        \n        template.inject()"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef map(self, features=None, query=None, styles=None,\n                  bbox=[-180,-90,180,90], zoom=10, center=None, \n                  image=None, image_bounds=None, cmap='viridis',\n                  api_key=os.environ.get('MAPBOX_API_KEY', None), **kwargs):\n        \"\"\"\n          Renders a mapbox gl map from a vector service query or a list of geojson features\n\n          Args:\n            features (list): a list of geojson features\n            query (str): a VectorServices query \n            styles (list): a list of VectorStyles to apply to the features  \n            bbox (list): a bounding box to query for features ([minx, miny, maxx, maxy])\n            zoom (int): the initial zoom level of the map\n            center (list): a list of [lat, lon] used to center the map\n            api_key (str): a valid Mapbox API key\n            image (dict): a CatalogImage or a ndarray\n            image_bounds (list): a list of bounds for image positioning \n            Use outside of GBDX Notebooks requires a MapBox API key, sign up for free at https://www.mapbox.com/pricing/\n            Pass the key using the `api_key` keyword or set an environmental variable called `MAPBOX API KEY`\n            cmap (str): MatPlotLib colormap to use for rendering single band images (default: viridis)\n        \"\"\"\n        try:\n            from IPython.display import display\n        except:\n            print(\"IPython is required to produce maps.\")\n            return\n\n        assert api_key is not None, \"No Mapbox API Key found. You can either pass in a key or set the MAPBOX_API_KEY environment variable. 
Use outside of GBDX Notebooks requires a MapBox API key, sign up for free at https://www.mapbox.com/pricing/\"\n        if features is None and query is not None:\n            wkt = box(*bbox).wkt\n            features = self.query(wkt, query, index=None)\n        elif features is None and query is None and image is None:\n            print('Must provide either a list of features or a query or an image')\n            return\n\n        if styles is not None and not isinstance(styles, list):\n            styles = [styles]\n\n        geojson = {\"type\":\"FeatureCollection\", \"features\": features}\n\n        if center is None and features is not None:\n            union = cascaded_union([shape(f['geometry']) for f in features])\n            lon, lat = union.centroid.coords[0]\n        elif center is None and image is not None:\n            try:\n                lon, lat = shape(image).centroid.coords[0]\n            except:\n                lon, lat = box(*image_bounds).centroid.coords[0]\n        else:\n            lat, lon = center\n\n        map_id = \"map_{}\".format(str(int(time.time())))\n        map_data = VectorGeojsonLayer(geojson, styles=styles, **kwargs)\n        image_layer = self._build_image_layer(image, image_bounds, cmap)\n\n        template = BaseTemplate(map_id, **{\n            \"lat\": lat, \n            \"lon\": lon, \n            \"zoom\": zoom,\n            \"datasource\": json.dumps(map_data.datasource),\n            \"layers\": json.dumps(map_data.layers),\n            \"image_layer\": image_layer,\n            \"mbkey\": api_key,\n            \"token\": 'dummy'\n        })\n        template.inject()", "response": "Renders a Mapbox gl map from a vector service query or a list of geojson features"}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nread data from a dask array and returns the computed ndarray matching the given bands", "response": "def read(self, bands=None, **kwargs):\n        \"\"\"Reads data from a dask array and returns the computed ndarray matching the given bands\n\n        Args:\n            bands (list): band indices to read from the image. Returns bands in the order specified in the list of bands.\n\n        Returns:\n            ndarray: a numpy array of image data\n        \"\"\"\n        arr = self\n        if bands is not None:\n            arr = self[bands, ...]\n        return arr.compute(scheduler=threaded_get)"}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nget a random window of a given shape from within an image.", "response": "def randwindow(self, window_shape):\n        \"\"\"Get a random window of a given shape from within an image\n\n        Args:\n            window_shape (tuple): The desired shape of the returned image as (height, width) in pixels.\n\n        Returns:\n            image: a new image object of the specified shape and same type\n        \"\"\"\n        row = random.randrange(window_shape[0], self.shape[1])\n        col = random.randrange(window_shape[1], self.shape[2])\n        return self[:, row-window_shape[0]:row, col-window_shape[1]:col]"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef iterwindows(self, count=64, window_shape=(256, 256)):\n        if count is None:\n            while True:\n                yield self.randwindow(window_shape)\n        else:\n            for i in xrange(count):\n                yield self.randwindow(window_shape)", "response": "Iterate over random windows of an image, yielding `count` windows or looping indefinitely when `count` is None."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef window_at(self, geom, window_shape):\n        # Centroids of the input geometry may not be centered on the object.\n        # For a covering image we use the bounds instead.\n        # This is also a workaround for issue 387.\n        y_size, x_size = window_shape[0], window_shape[1]\n        bounds = box(*geom.bounds)\n        px = ops.transform(self.__geo_transform__.rev, bounds).centroid\n        miny, maxy = int(px.y - y_size/2), int(px.y + y_size/2)\n        minx, maxx = int(px.x - x_size/2), int(px.x + x_size/2)\n        _, y_max, x_max = self.shape\n        if minx < 0 or miny < 0 or maxx > x_max or maxy > y_max:\n            raise ValueError(\"Input geometry resulted in a window outside of the image\")\n        return self[:, miny:maxy, minx:maxx]", "response": "Return a subsetted window of a given size centered on a geometry object."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef window_cover(self, window_shape, pad=True):\n        size_y, size_x = window_shape[0], window_shape[1]\n        _ndepth, _nheight, _nwidth = self.shape\n        nheight, _m = divmod(_nheight, size_y)\n        nwidth, _n = divmod(_nwidth, size_x)\n\n        img = self\n        if pad is True:\n            new_height, new_width = _nheight, _nwidth\n            if _m != 0:\n                new_height = (nheight + 1) * size_y\n            if _n != 0:\n                new_width = (nwidth + 1) * size_x\n            if (new_height, new_width) != (_nheight, _nwidth):\n                bounds = box(0, 0, new_width, new_height)\n                geom = ops.transform(self.__geo_transform__.fwd, bounds)\n                img = self[geom]\n\n        row_lims = range(0, img.shape[1], size_y)\n        col_lims = range(0, img.shape[2], size_x)\n        for maxy, maxx in product(row_lims, col_lims):\n            reg = img[:, maxy:(maxy + size_y), maxx:(maxx + size_x)]\n            if pad is False:\n                if reg.shape[1:] == window_shape:\n                    yield reg\n            else:\n                yield reg", "response": "Iterate over a grid of windows of a specified shape covering an image."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef pxbounds(self, geom, clip=False):\n\n        try:\n            if isinstance(geom, dict):\n                if 'geometry' in geom:\n                    geom = shape(geom['geometry'])\n                else:\n                    geom = shape(geom)\n            elif isinstance(geom, BaseGeometry):\n                geom = shape(geom)\n            else:\n                geom = wkt.loads(geom)\n        except:\n            raise TypeError (\"Invalid geometry object\")\n\n        # if geometry doesn't overlap the image, return an error\n        if geom.disjoint(shape(self)):\n            raise ValueError(\"Geometry outside of image bounds\")\n        # clip to pixels within the image\n        (xmin, ymin, xmax, ymax) = ops.transform(self.__geo_transform__.rev, geom).bounds\n        _nbands, ysize, xsize = self.shape\n        if clip:\n            xmin = max(xmin, 0)\n            ymin = max(ymin, 0)\n            xmax = min(xmax, xsize)\n            ymax = min(ymax, ysize)\n\n        return (xmin, ymin, xmax, ymax)", "response": "Returns the bounds of a geometry in the image's pixel coordinates as (xmin, ymin, xmax, ymax), optionally clipped to the image extent. Raises TypeError for an invalid geometry and ValueError if the geometry does not overlap the image."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef geotiff(self, **kwargs):\n\n        if 'proj' not in kwargs:\n            kwargs['proj'] = self.proj\n        return to_geotiff(self, **kwargs)", "response": "Creates a geotiff file on the filesystem from the image data, defaulting to the image's own projection when `proj` is not given."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\nwarping the image into a new image with the same shape as the image.", "response": "def warp(self, dem=None, proj=\"EPSG:4326\", **kwargs):\n        \"\"\"Delayed warp across an entire AOI or Image\n\n        Creates a new dask image by deferring calls to the warp_geometry on chunks\n\n        Args:\n            dem (ndarray): optional. A DEM for warping to specific elevation planes\n            proj (str): optional. An EPSG proj string to project the image data into (\"EPSG:32612\")\n\n        Returns:\n            daskarray: a warped image as deferred image array\n        \"\"\"\n        try:\n            img_md = self.rda.metadata[\"image\"]\n            x_size = img_md[\"tileXSize\"]\n            y_size = img_md[\"tileYSize\"]\n        except (AttributeError, KeyError):\n            x_size = kwargs.get(\"chunk_size\", 256)\n            y_size = kwargs.get(\"chunk_size\", 256)\n\n        # Create an affine transform to convert between real-world and pixels\n        if self.proj is None:\n            from_proj = \"EPSG:4326\"\n        else:\n            from_proj = self.proj\n\n        try:\n            # NOTE: this only works on images that have rda rpcs metadata\n            center = wkt.loads(self.rda.metadata[\"image\"][\"imageBoundsWGS84\"]).centroid\n            g = box(*(center.buffer(self.rda.metadata[\"rpcs\"][\"gsd\"] / 2).bounds))\n            tfm = partial(pyproj.transform, pyproj.Proj(init=\"EPSG:4326\"), pyproj.Proj(init=proj))\n            gsd = kwargs.get(\"gsd\", ops.transform(tfm, g).area ** 0.5)\n            current_bounds = wkt.loads(self.rda.metadata[\"image\"][\"imageBoundsWGS84\"]).bounds\n        except (AttributeError, KeyError, TypeError):\n            tfm = partial(pyproj.transform, pyproj.Proj(init=self.proj), pyproj.Proj(init=proj))\n            gsd = kwargs.get(\"gsd\", (ops.transform(tfm, shape(self)).area / (self.shape[1] * self.shape[2])) ** 0.5 )\n        
    current_bounds = self.bounds\n\n        tfm = partial(pyproj.transform, pyproj.Proj(init=from_proj), pyproj.Proj(init=proj))\n        itfm = partial(pyproj.transform, pyproj.Proj(init=proj), pyproj.Proj(init=from_proj))\n        output_bounds = ops.transform(tfm, box(*current_bounds)).bounds\n        gtf = Affine.from_gdal(output_bounds[0], gsd, 0.0, output_bounds[3], 0.0, -1 * gsd)\n\n        ll = ~gtf * (output_bounds[:2])\n        ur = ~gtf * (output_bounds[2:])\n        x_chunks = int((ur[0] - ll[0]) / x_size) + 1\n        y_chunks = int((ll[1] - ur[1]) / y_size) + 1\n\n        num_bands = self.shape[0]\n\n        try:\n            dtype = RDA_TO_DTYPE[img_md[\"dataType\"]]\n        except:\n            dtype = 'uint8'\n\n        daskmeta = {\n            \"dask\": {},\n            \"chunks\": (num_bands, y_size, x_size),\n            \"dtype\": dtype,\n            \"name\": \"warp-{}\".format(self.name),\n            \"shape\": (num_bands, y_chunks * y_size, x_chunks * x_size)\n        }\n\n        def px_to_geom(xmin, ymin):\n            xmax = int(xmin + x_size)\n            ymax = int(ymin + y_size)\n            bounds = list((gtf * (xmin, ymax)) + (gtf * (xmax, ymin)))\n            return box(*bounds)\n\n        full_bounds = box(*output_bounds)\n\n        dasks = []\n        if isinstance(dem, GeoDaskImage):\n            if dem.proj != proj:\n                dem = dem.warp(proj=proj, dem=dem)\n            dasks.append(dem.dask)\n\n        for y in xrange(y_chunks):\n            for x in xrange(x_chunks):\n                xmin = x * x_size\n                ymin = y * y_size\n                geometry = px_to_geom(xmin, ymin)\n                daskmeta[\"dask\"][(daskmeta[\"name\"], 0, y, x)] = (self._warp, geometry, gsd, dem, proj, dtype, 5)\n        daskmeta[\"dask\"], _ = optimization.cull(HighLevelGraph.merge(daskmeta[\"dask\"], *dasks), list(daskmeta[\"dask\"].keys()))\n\n        gi = mapping(full_bounds)\n        gt = AffineTransform(gtf, proj)\n    
    image = GeoDaskImage(daskmeta, __geo_interface__ = gi, __geo_transform__ = gt)\n        return image[box(*output_bounds)]"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\nfind supported geometry types parses them and returns the bbox", "response": "def _parse_geoms(self, **kwargs):\n        \"\"\" Finds supported geometry types, parses them and returns the bbox \"\"\"\n        bbox = kwargs.get('bbox', None)\n        wkt_geom = kwargs.get('wkt', None)\n        geojson = kwargs.get('geojson', None)\n        if bbox is not None:\n            g = box(*bbox)\n        elif wkt_geom is not None:\n            g = wkt.loads(wkt_geom)\n        elif geojson is not None:\n            g = shape(geojson)\n        else:\n            return None\n        if self.proj is None:\n            return g\n        else:\n            return self._reproject(g, from_proj=kwargs.get('from_proj', 'EPSG:4326'))"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef load_url(url, shape=(8, 256, 256)):\n    thread_id = threading.current_thread().ident\n    _curl = _curl_pool[thread_id]\n    _curl.setopt(_curl.URL, url)\n    _curl.setopt(pycurl.NOSIGNAL, 1)\n    _, ext = os.path.splitext(urlparse(url).path)\n    with NamedTemporaryFile(prefix=\"gbdxtools\", suffix=\".\"+ext, delete=False) as temp: # TODO: apply correct file extension\n        _curl.setopt(_curl.WRITEDATA, temp.file)\n        _curl.perform()\n        code = _curl.getinfo(pycurl.HTTP_CODE)\n        try:\n            if(code != 200):\n                raise TypeError(\"Request for {} returned unexpected error code: {}\".format(url, code))\n            arr = np.rollaxis(imread(temp), 2, 0)\n        except Exception as e:\n            print(e)\n            temp.seek(0)\n            print(temp.read())\n            arr = np.zeros(shape, dtype=np.uint8)\n            _curl.close()\n            del _curl_pool[thread_id]\n        finally:\n            temp.file.flush()\n            temp.close()\n            os.remove(temp.name)\n        return arr", "response": "Loads a geotiff url inside a thread and returns as an ndarray"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nconvert mercator bbox to tile index limits", "response": "def _tile_coords(self, bounds):\n        \"\"\" convert mercator bbox to tile index limits \"\"\"\n        tfm = partial(pyproj.transform,\n                      pyproj.Proj(init=\"epsg:3857\"),\n                      pyproj.Proj(init=\"epsg:4326\"))\n        bounds = ops.transform(tfm, box(*bounds)).bounds\n\n        # because tiles have a common corner, the tiles that cover a\n        # given tile includes the adjacent neighbors.\n        # https://github.com/mapbox/mercantile/issues/84#issuecomment-413113791\n\n        west, south, east, north = bounds\n        epsilon = 1.0e-10\n        if east != west and north != south:\n            # 2D bbox\n            # shrink the bounds a small amount so that\n            # shapes/tiles round trip.\n            west += epsilon\n            south += epsilon\n            east -= epsilon\n            north -= epsilon\n\n        params = [west, south, east, north, [self.zoom_level]]\n        tile_coords = [(tile.x, tile.y) for tile in mercantile.tiles(*params)]\n        xtiles, ytiles = zip(*tile_coords)\n        minx = min(xtiles)\n        miny = min(ytiles)\n        maxx = max(xtiles) \n        maxy = max(ytiles)\n        return minx, miny, maxx, maxy"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef get(self, key, default=None):\n        if key not in self._ports:\n            self._ports[key] = self._port_template(key)\n        return self._vals.get(key, default)", "response": "Return the value stored for a port key, or the default if no value is set. The port itself is created from the port template if it does not exist yet."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef load_url(url, token, shape=(8, 256, 256)):\n    _, ext = os.path.splitext(urlparse(url).path)\n    success = False\n    for i in xrange(MAX_RETRIES):\n        thread_id = threading.current_thread().ident\n        _curl = _curl_pool[thread_id]\n        _curl.setopt(_curl.URL, url)\n        _curl.setopt(pycurl.NOSIGNAL, 1)\n        _curl.setopt(pycurl.HTTPHEADER, ['Authorization: Bearer {}'.format(token)])\n        with NamedTemporaryFile(prefix=\"gbdxtools\", suffix=ext, delete=False) as temp: # TODO: apply correct file extension\n            _curl.setopt(_curl.WRITEDATA, temp.file)\n            _curl.perform()\n            code = _curl.getinfo(pycurl.HTTP_CODE)\n            try:\n                if(code != 200):\n                    raise TypeError(\"Request for {} returned unexpected error code: {}\".format(url, code))\n                temp.file.flush()\n                temp.close()\n                arr = imread(temp.name)\n                if len(arr.shape) == 3:\n                    arr = np.rollaxis(arr, 2, 0)\n                else:\n                    arr = np.expand_dims(arr, axis=0)\n                success = True\n                return arr\n            except Exception as e:\n                _curl.close()\n                del _curl_pool[thread_id]\n            finally:\n                temp.close()\n                os.remove(temp.name)\n\n    if success is False:\n        raise TypeError(\"Request for {} returned unexpected error code: {}\".format(url, code))\n    return arr", "response": "Loads a geotiff url inside a thread and returns as an ndarray"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef launch(self, workflow):\n\n        # hit workflow api\n        try:\n            r = self.gbdx_connection.post(self.workflows_url, json=workflow)\n            try:\n                r.raise_for_status()\n            except:\n                print(\"GBDX API Status Code: %s\" % r.status_code)\n                print(\"GBDX API Response: %s\" % r.text)\n                r.raise_for_status()\n            workflow_id = r.json()['id']\n            return workflow_id\n        except TypeError:\n            self.logger.debug('Workflow not launched!')", "response": "Launches GBDX workflow.\n\n        Args:\n            workflow (dict): Dictionary specifying workflow tasks.\n\n        Returns:\n            Workflow id (str)."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef status(self, workflow_id):\n        self.logger.debug('Get status of workflow: ' + workflow_id)\n        url = '%(wf_url)s/%(wf_id)s' % {\n            'wf_url': self.workflows_url, 'wf_id': workflow_id\n        }\n        r = self.gbdx_connection.get(url)\n        r.raise_for_status()\n        return r.json()['state']", "response": "Checks workflow status.\n\n        Args:\n            workflow_id (str): Workflow id.\n\n        Returns:\n            The 'state' field of the workflow record."}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\ngets the stdout of a particular task.", "response": "def get_stdout(self, workflow_id, task_id):\n        \"\"\"Get stdout for a particular task.\n\n         Args:\n             workflow_id (str): Workflow id.\n             task_id (str): Task id.\n\n         Returns:\n             Stdout of the task (string).\n        \"\"\"\n        url = '%(wf_url)s/%(wf_id)s/tasks/%(task_id)s/stdout' % {\n            'wf_url': self.workflows_url, 'wf_id': workflow_id, 'task_id': task_id\n        }\n        r = self.gbdx_connection.get(url)\n        r.raise_for_status()\n\n        return r.text"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef cancel(self, workflow_id):\n        self.logger.debug('Canceling workflow: ' + workflow_id)\n        url = '%(wf_url)s/%(wf_id)s/cancel' % {\n            'wf_url': self.workflows_url, 'wf_id': workflow_id\n        }\n        r = self.gbdx_connection.post(url, data='')\n        r.raise_for_status()", "response": "Cancels a running workflow."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef launch_batch_workflow(self, batch_workflow):\n\n        # hit workflow api\n        url = '%(base_url)s/batch_workflows' % {\n            'base_url': self.base_url\n        }\n        try:\n            r = self.gbdx_connection.post(url, json=batch_workflow)\n            batch_workflow_id = r.json()['batch_workflow_id']\n            return batch_workflow_id\n        except TypeError as e:\n            self.logger.debug('Batch Workflow not launched, reason: {0}'.format(e))", "response": "Launches GBDX batch workflow."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef batch_workflow_status(self, batch_workflow_id):\n        self.logger.debug('Get status of batch workflow: ' + batch_workflow_id)\n        url = '%(base_url)s/batch_workflows/%(batch_id)s' % {\n            'base_url': self.base_url, 'batch_id': batch_workflow_id\n        }\n        r = self.gbdx_connection.get(url)\n\n        return r.json()", "response": "Checks GBDX batch workflow status."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef batch_workflow_cancel(self, batch_workflow_id):\n        self.logger.debug('Cancel batch workflow: ' + batch_workflow_id)\n        url = '%(base_url)s/batch_workflows/%(batch_id)s/cancel' % {\n            'base_url': self.base_url, 'batch_id': batch_workflow_id\n        }\n        r = self.gbdx_connection.post(url)\n\n        return r.json()", "response": "Cancels GBDX batch workflow."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef search(self, lookback_h=12, owner=None, state=\"all\"):\n        postdata = {\n            \"lookback_h\": lookback_h,\n            \"state\": state\n        }\n\n        if owner is not None:\n            postdata['owner'] = owner\n\n        url = \"{}/workflows/search\".format(self.base_url)\n        headers = {'Content-Type':'application/json'}\n        r = self.gbdx_connection.post(url, headers=headers, data=json.dumps(postdata))\n        return r.json()", "response": "Searches GBDX workflows within the last `lookback_h` hours, optionally filtered by owner and state, and returns the JSON response."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nordering images from GBDX.", "response": "def order(self, image_catalog_ids, batch_size=100, callback=None):\n        '''Orders images from GBDX.\n\n           Args:\n               image_catalog_ids (str or list): A single catalog id or a list of \n                                                catalog ids.\n               batch_size (int): The image_catalog_ids will be split into \n                                 batches of batch_size. The ordering API max \n                                 batch size is 100, if batch_size is greater \n                                 than 100 it will be truncated.\n               callback (str): A url to call when ordering is completed.\n\n           Returns:\n               order_ids (str or list): If one batch, returns a string. If more\n                                        than one batch, returns a list of order ids, \n                                        one for each batch.\n        '''\n        def _order_single_batch(url_, ids, results_list):\n            data = json.dumps(ids) if callback is None else json.dumps({\"acquisitionIds\": ids, \"callback\": callback})\n            r = self.gbdx_connection.post(url_, data=data)\n            r.raise_for_status()\n            order_id = r.json().get(\"order_id\")\n            if order_id:\n                results_list.append(order_id)\n\n        self.logger.debug('Place order')\n        url = ('%s/order' if callback is None else '%s/ordercb') % self.base_url\n\n        batch_size = min(100, batch_size)\n        \n        if not isinstance(image_catalog_ids, list):\n            image_catalog_ids = [image_catalog_ids]\n\n        sanitized_ids = list(set((id for id in (_id.strip() for _id in image_catalog_ids) if id)))\n\n        res = []\n        # Use itertool batch recipe\n        acq_ids_by_batch = zip(*([iter(sanitized_ids)] * batch_size))\n        for ids_batch in acq_ids_by_batch:\n            
_order_single_batch(url, ids_batch, res)\n\n        # Order reminder\n        remain_count = len(sanitized_ids) % batch_size\n        if remain_count > 0:\n            _order_single_batch(url, sanitized_ids[-remain_count:], res)\n\n        if len(res) == 1:\n            return res[0]\n        elif len(res)>1:\n            return res"}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nchecks imagery order status.", "response": "def status(self, order_id):\n        '''Checks imagery order status. There can be more than one image per\n           order and this function returns the status of all images\n           within the order.\n\n           Args:\n               order_id (str): The id of the order placed.\n\n           Returns:\n               List of dictionaries, one per image. Each dictionary consists\n               of the keys 'acquisition_id', 'location' and 'state'.\n        '''\n\n        self.logger.debug('Get status of order ' + order_id)\n        url = '%(base_url)s/order/%(order_id)s' % {\n            'base_url': self.base_url, 'order_id': order_id\n        }\n        r = self.gbdx_connection.get(url)\n        r.raise_for_status()\n        return r.json().get(\"acquisitions\", {})"}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\nchecks the heartbeat of the ordering API Returns True or False", "response": "def heartbeat(self):\n        '''\n        Check the heartbeat of the ordering API\n\n        Args: None\n\n        Returns:  True or False\n        '''\n        url = '%s/heartbeat' % self.base_url\n        # Auth is not required to hit the heartbeat\n        r = requests.get(url) \n\n        try:\n            return r.json() == \"ok\"\n        except:\n            return False"}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nretrieves the strip footprint WKT string given a cat ID.", "response": "def get(self, catID, includeRelationships=False):\n        '''Retrieves the strip footprint WKT string given a cat ID.\n\n        Args:\n            catID (str): The source catalog ID from the platform catalog.\n            includeRelationships (bool): whether to include graph links to related objects.  Default False.\n\n        Returns:\n            record (dict): A dict object identical to the json representation of the catalog record\n        '''\n        url = '%(base_url)s/record/%(catID)s' % {\n            'base_url': self.base_url, 'catID': catID\n        }\n        r = self.gbdx_connection.get(url)\n        r.raise_for_status()\n        return r.json()"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef get_strip_metadata(self, catID):\n        '''Retrieves the strip catalog metadata given a cat ID.\n\n        Args:\n            catID (str): The source catalog ID from the platform catalog.\n\n        Returns:\n            metadata (dict): A metadata dictionary .\n\n            TODO: have this return a class object with interesting information exposed.\n        '''\n\n        self.logger.debug('Retrieving strip catalog metadata')\n        url = '%(base_url)s/record/%(catID)s?includeRelationships=false' % {\n            'base_url': self.base_url, 'catID': catID\n        }\n        r = self.gbdx_connection.get(url)\n        if r.status_code == 200:\n            return r.json()['properties']\n        elif r.status_code == 404:\n            self.logger.debug('Strip not found: %s' % catID)\n            r.raise_for_status()\n        else:\n            self.logger.debug('There was a problem retrieving catid: %s' % catID)\n            r.raise_for_status()", "response": "Retrieves the strip catalog metadata given a cat ID."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef get_address_coords(self, address):\n        ''' Use the google geocoder to get latitude and longitude for an address string\n\n        Args:\n            address: any address string\n\n        Returns:\n            A tuple of (lat,lng)\n        '''\n        url = \"https://maps.googleapis.com/maps/api/geocode/json?&address=\" + address\n        r = requests.get(url)\n        r.raise_for_status()\n        results = r.json()['results']\n        lat = results[0]['geometry']['location']['lat']\n        lng = results[0]['geometry']['location']['lng']\n        return lat, lng", "response": "Use the google geocoder to get latitude and longitude for an address string"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef search_address(self, address, filters=None, startDate=None, endDate=None, types=None):\n        ''' Perform a catalog search over an address string\n\n        Args:\n            address: any address string\n            filters: Array of filters.  Optional.  Example:\n            [\n                \"(sensorPlatformName = 'WORLDVIEW01' OR sensorPlatformName ='QUICKBIRD02')\",\n                \"cloudCover < 10\",\n                \"offNadirAngle < 10\"\n            ]\n            startDate: string.  Optional.  Example: \"2004-01-01T00:00:00.000Z\"\n            endDate: string.  Optional.  Example: \"2004-01-01T00:00:00.000Z\"\n            types: Array of types to search for.  Optional.  Example (and default):  [\"Acquisition\"]\n\n        Returns:\n            catalog search resultset\n        '''\n        lat, lng = self.get_address_coords(address)\n        return self.search_point(lat,lng, filters=filters, startDate=startDate, endDate=endDate, types=types)", "response": "Perform a catalog search over an address string"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef search_point(self, lat, lng, filters=None, startDate=None, endDate=None, types=None, type=None):\n        ''' Perform a catalog search over a specific point, specified by lat,lng\n\n        Args:\n            lat: latitude\n            lng: longitude\n            filters: Array of filters.  Optional.  Example:\n            [\n                \"(sensorPlatformName = 'WORLDVIEW01' OR sensorPlatformName ='QUICKBIRD02')\",\n                \"cloudCover < 10\",\n                \"offNadirAngle < 10\"\n            ]\n            startDate: string.  Optional.  Example: \"2004-01-01T00:00:00.000Z\"\n            endDate: string.  Optional.  Example: \"2004-01-01T00:00:00.000Z\"\n            types: Array of types to search for.  Optional.  Example (and default):  [\"Acquisition\"]\n\n        Returns:\n            catalog search resultset\n        '''\n        searchAreaWkt = \"POLYGON ((%s %s, %s %s, %s %s, %s %s, %s %s))\" % (lng, lat,lng,lat,lng,lat,lng,lat,lng,lat)\n        return self.search(searchAreaWkt=searchAreaWkt, filters=filters, startDate=startDate, endDate=endDate, types=types)", "response": "Perform a catalog search over a specific point."}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\nfinds and return the S3 location of the data associated with a catalog ID.", "response": "def get_data_location(self, catalog_id):\n        \"\"\"\n        Find and return the S3 data location given a catalog_id.\n\n        Args:\n            catalog_id: The catalog ID\n\n        Returns:\n            A string containing the s3 location of the data associated with a catalog ID.  Returns\n            None if the catalog ID is not found, or if there is no data yet associated with it.\n        \"\"\"\n\n        try:\n            record = self.get(catalog_id)\n        except:\n            return None\n\n        # Handle Landsat8\n        if 'Landsat8' in record['type'] and 'LandsatAcquisition' in record['type']:\n            bucket = record['properties']['bucketName']\n            prefix = record['properties']['bucketPrefix']\n            return 's3://' + bucket + '/' + prefix\n\n        # Handle DG Acquisition\n        if 'DigitalGlobeAcquisition' in record['type']:\n            o = Ordering()\n            res = o.location([catalog_id])\n            return res['acquisitions'][0]['location']\n\n        return None"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef search(self, searchAreaWkt=None, filters=None, startDate=None, endDate=None, types=None):\n        ''' Perform a catalog search\n\n        Args:\n            searchAreaWkt: WKT Polygon of area to search.  Optional.\n            filters: Array of filters.  Optional.  Example:\n            [\n                \"(sensorPlatformName = 'WORLDVIEW01' OR sensorPlatformName ='QUICKBIRD02')\",\n                \"cloudCover < 10\",\n                \"offNadirAngle < 10\"\n            ]\n            startDate: string.  Optional.  Example: \"2004-01-01T00:00:00.000Z\"\n            endDate: string.  Optional.  Example: \"2004-01-01T00:00:00.000Z\"\n            types: Array of types to search for.  Optional.  Example (and default):  [\"Acquisition\"]\n\n        Returns:\n            catalog search resultset\n        '''\n        # Default to search for Acquisition type objects.\n        if not types:\n            types = ['Acquisition']\n\n        # validation:  we must have either a WKT or one-week of time window\n        if startDate:\n            startDateTime = datetime.datetime.strptime(startDate, '%Y-%m-%dT%H:%M:%S.%fZ')\n\n        if endDate:\n            endDateTime = datetime.datetime.strptime(endDate, '%Y-%m-%dT%H:%M:%S.%fZ')\n\n        if startDate and endDate:\n            diff = endDateTime - startDateTime\n            if diff.days < 0:\n                raise Exception(\"startDate must come before endDate.\")\n\n        postdata = {\n            \"searchAreaWkt\": searchAreaWkt,\n            \"types\": types,\n            \"startDate\": startDate,\n            \"endDate\": endDate,\n        }\n\n        if filters:\n            postdata['filters'] = filters\n\n        if searchAreaWkt:\n            postdata['searchAreaWkt'] = searchAreaWkt\n\n        url = '%(base_url)s/search' % {\n            'base_url': self.base_url\n        }\n        headers = 
{'Content-Type':'application/json'}\n        r = self.gbdx_connection.post(url, headers=headers, data=json.dumps(postdata))\n        r.raise_for_status()\n        results = r.json()['results']\n\n        return results", "response": "Perform a catalog search for a set of acquisition types."}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\nreturns the most recent images from a catalog resultset.", "response": "def get_most_recent_images(self, results, types=[], sensors=[], N=1):\n        ''' Return the N most recent images\n\n        Args:\n            results: a catalog resultset, as returned from a search\n            types: array of types you want. optional.\n            sensors: array of sensornames. optional.\n            N: number of recent images to return.  defaults to 1.\n\n        Returns:\n            list of up to N catalog items, most recent first, or None if results is empty\n\n        '''\n        if not len(results):\n            return None\n\n        # filter on type\n        if types:\n            results = [r for r in results if r['type'] in types]\n\n        # filter on sensor\n        if sensors:\n            results = [r for r in results if r['properties'].get('sensorPlatformName') in sensors]\n\n\n        # sort by date:\n        #sorted(results, key=results.__getitem__('properties').get('timestamp'))\n        newlist = sorted(results, key=lambda k: k['properties'].get('timestamp'), reverse=True)\n        return newlist[:N]"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef get_bytes_from_blob(val) -> bytes:\n    if isinstance(val, bytes):\n        return val\n    elif isinstance(val, memoryview):\n        return val.tobytes()\n    else:\n        raise TypeError('invalid type for get bytes')", "response": "Get bytes from blob"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef pagination_calc(items_count, page_size, cur_page=1, nearby=2):\n    if type(cur_page) == str:\n        # noinspection PyUnresolvedReferences\n        cur_page = int(cur_page) if cur_page.isdigit() else 1\n    elif type(cur_page) == int:\n        if cur_page <= 0:\n            cur_page = 1\n    else:\n        cur_page = 1\n\n    page_count = 1 if page_size == -1 else int(math.ceil(items_count / page_size))\n    items_length = nearby * 2 + 1\n\n    # if first page in page items, first_page is None,\n    # it means the \"go to first page\" button should not be available.\n    first_page = None\n    last_page = None\n\n    prev_page = cur_page - 1 if cur_page != 1 else None\n    next_page = cur_page + 1 if cur_page != page_count else None\n\n    if page_count <= items_length:\n        items = range(1, page_count + 1)\n    elif cur_page <= nearby:\n        # start of items\n        items = range(1, items_length + 1)\n        last_page = True\n    elif cur_page >= page_count - nearby:\n        # end of items\n        items = range(page_count - items_length + 1, page_count + 1)\n        first_page = True\n    else:\n        items = range(cur_page - nearby, cur_page + nearby + 1)\n        first_page, last_page = True, True\n\n    if first_page:\n        first_page = 1\n    if last_page:\n        last_page = page_count\n\n    return {\n        'cur_page': cur_page,\n        'prev_page': prev_page,\n        'next_page': next_page,\n\n        'first_page': first_page,\n        'last_page': last_page,\n        'page_numbers': list(items),\n\n        'info': {\n            'page_size': page_size,\n            'page_count': page_count,\n            'items_count': items_count,\n        }\n    }", "response": "Compute pagination data (current, previous, next, first and last page, plus the nearby page numbers to display) from the item count and page size."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nadd a common check to the list of events that need to be executed before query .", "response": "def add_common_check(self, actions, table, func):\n        \"\"\"\n        emitted before query\n        :param actions:\n        :param table:\n        :param func:\n        :return:\n        \"\"\"\n        self.common_checks.append([table, actions, func])\n\n        \"\"\"def func(ability, user, action, available_columns: list):\n            pass\n        \"\"\""}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\nadds a record check to the list of record checks that can be performed for the given table and action.", "response": "def add_record_check(self, actions, table, func):\n        # emitted after query\n        # table: 'table_name'\n        # column: ('table_name', 'column_name')\n        assert isinstance(table, str), '`table` must be table name'\n        for i in actions:\n            assert i not in (A.QUERY, A.CREATE), \"meaningless action check with record: [%s]\" % i\n\n        self.record_checks.append([table, actions, func])\n\n        \"\"\"def func(ability, user, action, record: DataRecord, available_columns: list):\n            pass\n        \"\"\""}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef _parse_permission(self, obj):\n        if isinstance(obj, str):\n            if obj == '*':\n                return A.ALL\n            elif obj in A.ALL:\n                return obj,\n            else:\n                logger.warning('Invalid permission action: %s', obj)\n        elif isinstance(obj, (list, tuple)):\n            for i in obj:\n                if i not in A.ALL:\n                    logger.warning('Invalid permission action: %s', i)\n            return obj\n        elif isinstance(obj, dict):\n            return self._parse_permission(obj.get('*'))", "response": "Extract the permission actions from obj\n        :param obj:\n        :return: [A.QUERY, A.WRITE, ...]"}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nchecks if user can access table with given columns.", "response": "def can_with_columns(self, user, action, table, columns):\n        \"\"\"\n        Filter columns according to permissions.\n        Note: as long as at least one condition passes the permission check, some conditions remain after filtering, so no error is raised in the end.\n        If no condition passes the check, a permission error is raised.\n\n        :param user:\n        :param action: the action\n        :param table: table name\n        :param columns: list of column names\n        :return: list of available columns\n        \"\"\"\n        # TODO: this process could be cached\n        # global\n\n        global_data = self.rules.get('*')\n        global_actions = self._parse_permission(global_data)\n        if global_actions and action in global_actions:\n            available = list(columns)\n        else:\n            available = []\n\n        # table\n        table_data = self.rules.get(table)\n        table_actions = self._parse_permission(table_data)\n\n        if table_actions and action in table_actions:\n            available = list(columns)\n\n        # column\n        if type(table_data) == dict:\n            # this means there are detailed per-column permission settings; otherwise the type is list\n            for column in columns:\n                column_actions = self._parse_permission(table_data.get(column))\n                if column_actions is not None:\n                    if action in column_actions:\n                        # permitted: try to add the column to the list\n                        
if column not in available:\n                            available.append(column)\n                    else:\n                        # not permitted: remove the column from the list\n                        if column in available:\n                            available.remove(column)\n\n        for check in self.common_checks:\n            if check[0] == table and action in check[1]:\n                ret = check[-1](self, user, action, available)\n                if isinstance(ret, (tuple, set, list)):\n                    # a returned list overrides the current value\n                    available = list(ret)\n                elif ret == '*':\n                    # returning * makes all columns available\n                    available = list(columns)\n                elif ret is False:\n                    # returning False clears the list\n                    available = []\n                if not available: break\n\n        return available"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef can_with_record(self, user, action, record: DataRecord, *, available=None):\n        assert action not in (A.QUERY, A.CREATE), \"meaningless action check with record: [%s]\" % action\n\n        # first select the rules that apply to this table and action\n        rules = []\n        for rule in self.record_checks:\n            if record.table == rule[0] and action in rule[1]:\n                rules.append(rule)\n\n        # run the checks one by one\n        if available is None: available = self.can_with_columns(user, action, record.table, record.keys())\n        else: available = list(available)\n        bak = available.copy()\n\n        for rule in rules:\n            ret = rule[-1](self, user, action, record, available)\n            if isinstance(ret, (tuple, set, list)):\n                available = list(ret)\n            elif ret == '*':\n                available = list(bak)\n            elif not ret:\n                available = []\n\n        return available", "response": "Return the list of columns of the given record that the user may access for the given action, after applying the matching record checks."}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\ngets ip address of client", "response": "async def get_ip(self) -> Union[IPv4Address, IPv6Address]:\n        \"\"\"\n        get ip address of client\n        :return:\n        \"\"\"\n        xff = await self.get_x_forwarded_for()\n        if xff: return xff[0]\n        ip_addr = self._request.transport.get_extra_info('peername')[0]\n        return ip_address(ip_addr)"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef finish(self, code, data=NotImplemented):\n        if data is NotImplemented:\n            data = RETCODE.txt_cn.get(code, None)\n        self.ret_val = {'code': code, 'data': data}  # for access in inhreads method\n        self.response = web.json_response(self.ret_val, dumps=json_ex_dumps)\n        logger.debug('finish: %s' % self.ret_val)\n        self._finish_end()", "response": "Set response as {'code': xxx, 'data': xxx}\n        :param code:\n        :param data:\n        :return:"}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\nsets raw response :param body: :param status: :param content_type: :return:", "response": "def finish_raw(self, body: bytes, status: int = 200, content_type: Optional[str] = None):\n        \"\"\"\n        Set raw response\n        :param body:\n        :param status:\n        :param content_type:\n        :return:\n        \"\"\"\n        self.ret_val = body\n        self.response = web.Response(body=body, status=status, content_type=content_type)\n        logger.debug('finish: raw body(%d bytes)' % len(body))\n        self._finish_end()"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\nadd soft foreign key to the tables", "response": "def add_soft_foreign_key(cls, column, table_name, alias=None):\n        \"\"\"\n        the column stores foreign table's primary key but isn't a foreign key (to avoid constraint)\n        warning: if the table not exists, will crash when query with loadfk\n        :param column: table's column\n        :param table_name: foreign table name\n        :param alias: table name's alias. Default is as same as table name.\n        :return: True, None\n        \"\"\"\n        if column in cls.fields:\n            table = SQLForeignKey(table_name, column, cls.fields[column], True)\n\n            if alias:\n                if alias in cls.foreign_keys_table_alias:\n                    logger.warning(\"This alias of table is already exists, overwriting: %s.%s to %s\" %\n                                   (cls.__name__, column, table_name))\n                cls.foreign_keys_table_alias[alias] = table\n\n            if column not in cls.foreign_keys:\n                cls.foreign_keys[column] = [table]\n            else:\n                if not alias:\n                    logger.warning(\"The soft foreign key will not work, an alias required: %s.%s to %r\" %\n                                   (cls.__name__, column, table_name))\n                cls.foreign_keys[column].append(table)\n            return True"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef current_request_role(self) -> [int, str]:\n        role_val = self.headers.get('Role')\n        return int(role_val) if role_val and role_val.isdigit() else role_val", "response": "Get the current role requested by client."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\nasync def load_fk(self, info: SQLQueryInfo, records: Iterable[DataRecord]) -> Union[List, Iterable]:\n\n        # if not items, items is probably [], so return itself.\n        # if not items: return items\n\n        # 1. get tables' instances\n        # table_map = {}\n        # for column in info['loadfk'].keys():\n        #     tbl_name = self.foreign_keys[column][0]\n        #     table_map[column] = self.app.tables[tbl_name]\n\n        # 2. get query parameters\n        async def check(data, records):\n            for column, fkvalues_lst in data.items():\n                for fkvalues in fkvalues_lst:\n                    pks = []\n                    all_ni = True\n                    vcls = self.app.tables[fkvalues['table']]\n\n                    for i in records:\n                        val = i.get(column, NotImplemented)\n                        if val != NotImplemented:\n                            all_ni = False\n                        pks.append(val)\n\n                    if all_ni:\n                        logger.debug(\"load foreign key failed, do you have read permission to the column %r?\" % column)\n                        continue\n\n                    # 3. 
query foreign keys\n                    v = vcls(self.app, self._request)  # fake view\n                    await v._prepare()\n                    info2 = SQLQueryInfo()\n                    info2.set_select(ALL_COLUMNS)\n                    info2.add_condition(PRIMARY_KEY, SQL_OP.IN, pks)\n                    info2.bind(v)\n\n                    # ability = vcls.permission.request_role(self.current_user, fkvalues['role'])\n                    # info2.check_query_permission_full(self.current_user, fktable, ability)\n\n                    try:\n                        fk_records, count = await v._sql.select_page(info2, size=-1)\n                    except RecordNotFound:\n                        # no value found for the foreign key; perhaps all of them are null, which is common\n                        continue\n\n                    # if not fk_records: continue\n                    await v.check_records_permission(info2, fk_records)\n\n                    fk_dict = {}\n                    for i in fk_records:\n                        # primary key: record\n                        fk_dict[i[vcls.primary_key]] = i\n\n                    column_to_set = fkvalues.get('as', column) or column\n                    for _, record in enumerate(records):\n                        k = record.get(column, NotImplemented)\n                        if k in fk_dict:\n                            record[column_to_set] = fk_dict[k]\n\n                    if fkvalues['loadfk']:\n                        await check(fkvalues['loadfk'], fk_records)\n\n        await check(info.loadfk, records)\n        return records", "response": "Load the records referenced by foreign keys, attach them to the given records, and return the records."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\ncall and check result of handle_query", "response": "async def _call_handle(self, func, *args):\n        \"\"\" call and check result of handle_query/read/insert/update \"\"\"\n        await async_call(func, *args)\n\n        if self.is_finished:\n            raise FinishQuitException()"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\nasync def after_update(self, raw_post: Dict, values: SQLValuesToWrite,\n                           old_records: List[DataRecord], records: List[DataRecord]):\n        \"\"\"\n        :param old_records:\n        :param raw_post:\n        :param values:\n        :param records:\n        :return:\n        \"\"\"", "response": "After update event handler."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef roles(self):\n        ret = {None}\n        if self.state == POST_STATE.DEL:\n            return ret\n        ret.add('user')\n        return ret", "response": "Return the set of roles available to the user: {None} if the user's state is deleted, otherwise {None, 'user'}."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\ngenerate a password and salt", "response": "def gen_password_and_salt(cls, password_text):\n        \"\"\" Generate the hashed password and salt \"\"\"\n        salt = os.urandom(32)\n        dk = hashlib.pbkdf2_hmac(\n            config.PASSWORD_HASH_FUNC_NAME,\n            password_text.encode('utf-8'),\n            salt,\n            config.PASSWORD_HASH_ITERATIONS,\n        )\n        return {'password': dk, 'salt': salt}"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef gen_token(cls):\n        token = os.urandom(16)\n        token_time = int(time.time())\n        return {'token': token, 'token_time': token_time}", "response": "Generate a new access token"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef set_password(self, new_password):\n        info = self.gen_password_and_salt(new_password)\n        self.password = info['password']\n        self.salt = info['salt']\n        self.save()", "response": "Set the password of the current object"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\ngetting a session instance.", "response": "async def get_session(cls, view):\n        \"\"\"\n        Every request have a session instance\n        :param view:\n        :return:\n        \"\"\"\n        session = cls(view)\n        session.key = await session.get_key()\n        session._data = await session.load() or {}\n        return session"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\nasync def select_page(self, info: SQLQueryInfo, size=1, page=1) -> Tuple[Tuple[DataRecord, ...], int]:\n        raise NotImplementedError()", "response": "Select page of records from database."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\nasync def update(self, records: Iterable[DataRecord], values: SQLValuesToWrite, returning=False) -> Union[int, Iterable[DataRecord]]:\n        raise NotImplementedError()", "response": "Update the records with the given values."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\nasync def insert(self, values_lst: Iterable[SQLValuesToWrite], returning=False) -> Union[int, List[DataRecord]]:\n        raise NotImplementedError()", "response": "Insert the given values as new records; an abstract method that subclasses must implement."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef parse_order(text):\n        orders = []\n        for i in map(str.strip, text.split(',')):\n            items = i.split('.', 2)\n\n            if len(items) == 1: column, order = items[0], 'default'\n            elif len(items) == 2: column, order = items\n            else: raise InvalidParams(\"Invalid order syntax\")\n\n            order = order.lower()\n            if order not in ('asc', 'desc', 'default'):\n                raise InvalidParams('Invalid order mode: %s' % order)\n\n            if order != 'default':\n                orders.append(SQLQueryOrder(column, order))\n        return orders", "response": "Parse order from text."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef parse_select(cls, text: str) -> Set:\n        if text == '*':\n            return ALL_COLUMNS  # None means ALL\n        selected_columns = set(filter(lambda x: x, map(str.strip, text.split(','))))\n        if not selected_columns:\n            raise InvalidParams(\"No column(s) selected\")\n        return selected_columns", "response": "parse select text into set of columns"}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\nparses load fk from the data dictionary.", "response": "def parse_load_fk(cls, data: Dict[str, List[Dict[str, object]]]) -> Dict[str, List[Dict[str, object]]]:\n        \"\"\"\n        :param data:{\n            <column>: role,\n            <column2>: role,\n            <column>: {\n                'role': role,\n                'loadfk': { ... },\n            },\n        :return: {\n            <column>: {\n                'role': role,\n            },\n            ...\n            <column3>: {\n                'role': role,\n                'loadfk': { ... },\n            },\n        }\n        \"\"\"\n        default_value_dict = {'role': None, 'as': None, 'table': None, 'loadfk': None}\n\n        def value_normalize_dict(value):\n            def check(k, v):\n                if k == 'role': return isinstance(v, str)\n                if k == 'as': return isinstance(v, str)\n                if k == 'table': return isinstance(v, str)\n                if k == 'loadfk': return isinstance(v, dict)\n\n            valid = {k: v for k, v in value.items() if check(k, v)}\n            if not valid: return default_value_dict.copy()\n            if 'loadfk' in valid and valid['loadfk']:\n                valid['loadfk'] = cls.parse_load_fk(valid['loadfk'])\n            for k, v in default_value_dict.items():\n                valid.setdefault(k, v)\n            return valid\n\n        def value_normalize(value, no_list=True):\n            if value is None:\n                return default_value_dict.copy()\n            elif not no_list and isinstance(value, List):\n                # <column>: [value1, value2, ...]\n                return list(map(value_normalize, value))\n            elif isinstance(value, str):\n                # <column>: role\n                val = default_value_dict.copy()\n                val['role'] = value\n                return val\n            elif 
isinstance(value, Dict):\n                # {'role': <str>, 'as': <str>, ...}\n                return value_normalize_dict(value)\n            else:\n                raise InvalidParams('Invalid syntax for \"loadfk\": %s' % value)\n\n        # check every item\n        new_data = {}\n        if not isinstance(data, dict):\n            raise InvalidParams('Invalid syntax for \"loadfk\": %s' % data)\n        for k, v in data.items():\n            nv = value_normalize(v, False)\n            new_data[k] = nv if isinstance(nv, List) else [nv]\n        return new_data"}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\nadds a query condition and validate it.", "response": "def add_condition(self, field_name, op, value):\n        \"\"\"\n        Add a query condition and validate it.\n        raise ParamsException if failed.\n        self.view required\n        :param field_name:\n        :param op:\n        :param value:\n        :return: None\n        \"\"\"\n        if not isinstance(op, SQL_OP):\n            if op not in SQL_OP.txt2op:\n                raise SQLOperatorInvalid(op)\n            else:\n                op = SQL_OP.txt2op.get(op)\n        self.conditions.append([field_name, op, value])"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nserialize a Python object into MessagePack bytes.", "response": "def _packb2(obj, **options):\n    \"\"\"\n    Serialize a Python object into MessagePack bytes.\n\n    Args:\n        obj: a Python object\n\n    Kwargs:\n        ext_handlers (dict): dictionary of Ext handlers, mapping a custom type\n                             to a callable that packs an instance of the type\n                             into an Ext object\n        force_float_precision (str): \"single\" to force packing floats as\n                                     IEEE-754 single-precision floats,\n                                     \"double\" to force packing floats as\n                                     IEEE-754 double-precision floats.\n\n    Returns:\n        A 'str' containing serialized MessagePack bytes.\n\n    Raises:\n        UnsupportedType(PackException):\n            Object type not supported for packing.\n\n    Example:\n    >>> umsgpack.packb({u\"compact\": True, u\"schema\": 0})\n    '\\x82\\xa7compact\\xc3\\xa6schema\\x00'\n    >>>\n    \"\"\"\n    fp = io.BytesIO()\n    _pack2(obj, fp, **options)\n    return fp.getvalue()"}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\nserializes a Python object into MessagePack bytes.", "response": "def _packb3(obj, **options):\n    \"\"\"\n    Serialize a Python object into MessagePack bytes.\n\n    Args:\n        obj: a Python object\n\n    Kwargs:\n        ext_handlers (dict): dictionary of Ext handlers, mapping a custom type\n                             to a callable that packs an instance of the type\n                             into an Ext object\n        force_float_precision (str): \"single\" to force packing floats as\n                                     IEEE-754 single-precision floats,\n                                     \"double\" to force packing floats as\n                                     IEEE-754 double-precision floats.\n\n    Returns:\n        A 'bytes' containing serialized MessagePack bytes.\n\n    Raises:\n        UnsupportedType(PackException):\n            Object type not supported for packing.\n\n    Example:\n    >>> umsgpack.packb({u\"compact\": True, u\"schema\": 0})\n    b'\\x82\\xa7compact\\xc3\\xa6schema\\x00'\n    >>>\n    \"\"\"\n    fp = io.BytesIO()\n    _pack3(obj, fp, **options)\n    return fp.getvalue()"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef _unpackb3(s, **options):\n    if not isinstance(s, (bytes, bytearray)):\n        raise TypeError(\"packed data must be type 'bytes' or 'bytearray'\")\n    return _unpack(io.BytesIO(s), options)", "response": "Deserialize a byte array containing serialized MessagePack bytes into a Python object."}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nbind view to app.", "response": "def view_bind(app, cls_url, view_cls: Type['BaseView']):\n    \"\"\"\n    Bind the API to the web service.\n    :param view_cls:\n    :param app:\n    :param cls_url:\n    :return:\n    \"\"\"\n    if view_cls._no_route: return\n    # view_cls is already a class, so use its own name rather than its metaclass's\n    cls_url = cls_url or view_cls.__name__.lower()\n\n    def add_route(name, route_info, beacon_info):\n        for method in route_info['method']:\n            async def beacon(request): pass\n            route_key = route_info['url'] if route_info['url'] else name\n            app._raw_app.router.add_route(method, urljoin('/api', cls_url, route_key), beacon)\n            app.route._beacons[beacon] = beacon_info\n\n    # noinspection PyProtectedMember\n    for name, route_info_lst in view_cls._interface.items():\n        for route_info in route_info_lst:\n            real_handler = getattr(view_cls, name, None)\n            if real_handler is None: continue # TODO: delete\n            assert real_handler is not None, \"handler must exist\"\n\n            handler_name = '%s.%s' % (view_cls.__name__, real_handler.__name__)\n            assert iscoroutinefunction(real_handler), \"Add 'async' before interface function %r\" % handler_name\n\n            beacon_info = {\n                'view': view_cls,\n                'name': name,\n                'handler': real_handler,\n                'route_info': route_info\n            }\n            add_route(name, route_info, beacon_info)"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef add_static(self, prefix, path, **kwargs):\n        self.statics.append((prefix, path, kwargs),)", "response": "Register a static file route by appending the (prefix, path, kwargs) tuple to self.statics."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nparse a query by JSON string.", "response": "def parse_query_by_json(data):\n    \"\"\"\n    ['and',\n        ['==', 't1', 'col1', val1],\n        ['!=', 't1', 'col2', 't2', 'col2'],\n        ['and',\n            ['==', 't1', 'col3', val3],\n            ['!=', 't2', 'col4', val4],\n        ]\n    ]\n    :return:\n    :param data: \n    :return: \n    \"\"\"\n    data = json.loads(data)\n\n    for i in ('tables', 'columns', 'conditions'):\n        if i not in data:\n            raise QueryException(\"query: %s not found\" % i)\n\n    tables = data['tables']\n    columns = data['columns']\n    conditions = data['conditions']\n\n    def parse_stmt(s, expr_cls, all_op, multi_items_op):\n        if len(s) == 0:\n            return []\n\n        if s[0] in all_op:\n            if s[0] in multi_items_op:\n                values = []\n                for i in s[1:]:\n                    values.append(parse_stmt(i, expr_cls, all_op, multi_items_op))\n                return expr_cls(None, s[0], None, values=values)\n            else:\n                if len(s) == 5:\n                    # t1.c1 == t2.c2\n                    lhs = Column(s[2], table=s[1])\n                    rhs = Column(s[4], table=s[3])\n                    if (s[1] not in tables) or (s[3] not in tables):\n                        raise QueryException('Bad query')\n                    return expr_cls(lhs, s[0], rhs)\n                else:\n                    # t1.c1 == val\n                    lhs = Column(s[2], table=s[1])\n                    if s[1] not in tables:\n                        raise QueryException('Bad query')\n                    return expr_cls(lhs, s[0], s[3])\n        else:\n            raise QueryException('Bad query')\n\n    query_op = ('+', '-', '*', '/')\n    query_columns = []\n\n    for i in columns:\n        if len(i) == 2:\n            query_columns.append(Column(i[1], table=i[0]))\n        else:\n         
   query_columns.append(parse_stmt(i, QueryExpression, query_op, query_op))\n    wheres = parse_stmt(conditions, ConditionExpression, _operator_map, ('and', 'or',))\n\n    return {\n        'tables': tables,\n        'columns': query_columns,\n        'wheres': wheres,\n    }"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nrun the current phase.", "response": "def run(self, ctx):\n        \"\"\"\n        Runs the current phase.\n        \"\"\"\n        # Reverse engine assertion if needed\n        if ctx.reverse:\n            self.engine.reverse()\n\n        if self.engine.empty:\n            raise AssertionError('grappa: no assertions to run')\n\n        try:\n            # Run assertion in series and return error, if present\n            return self.run_assertions(ctx)\n        except Exception as _err:\n            # Handle legit grappa internal errors\n            if getattr(_err, '__legit__', False):\n                raise _err\n            # Otherwise render it\n            return self.render_error(ctx, _err)"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef observe(matcher):\n        @functools.wraps(matcher)\n        def observer(self, subject, *expected, **kw):\n            # Trigger before hook, if present\n            if hasattr(self, 'before'):\n                self.before(subject, *expected, **kw)\n\n            # Trigger matcher method\n            result = matcher(self, subject, *expected, **kw)\n\n            # After error hook\n            if result is not True and hasattr(self, 'after_error'):\n                self.after_error(result, subject, *expected, **kw)\n\n            # After success hook\n            if result is True and hasattr(self, 'after_success'):\n                self.after_success(subject, *expected, **kw)\n\n            # Enable diff comparison on error, if needed\n            if not hasattr(self, 'show_diff'):\n                self.show_diff = all([\n                    isinstance(subject, six.string_types),\n                    all([isinstance(x, six.string_types) for x in expected]),\n                ])\n\n            return result\n        return observer", "response": "Decorator to trigger operator hooks before and after matcher execution."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef run_matcher(self, subject, *expected, **kw):\n        # Update assertion expectation\n        self.expected = expected\n\n        _args = (subject,)\n        if self.kind == OperatorTypes.MATCHER:\n            _args += expected\n\n        try:\n            result = self.match(*_args, **kw)\n        except Exception as error:\n            return self._make_error(error=error)\n\n        reasons = []\n        if isinstance(result, tuple):\n            result, reasons = result\n\n        if result is False and self.ctx.negate:\n            return True\n\n        if result is True and not self.ctx.negate:\n            return True\n\n        return self._make_error(reasons=reasons)", "response": "Runs the operator matcher test function."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nrunning the current operator with the subject arguments to test.", "response": "def run(self, *args, **kw):\n        \"\"\"\n        Runs the current operator with the subject arguments to test.\n\n        This method is implemented by matchers only.\n        \"\"\"\n        log.debug('[operator] run \"{}\" with arguments: {}'.format(\n            self.__class__.__name__, args\n        ))\n\n        if self.kind == OperatorTypes.ATTRIBUTE:\n            return self.match(self.ctx)\n        else:\n            return self.run_matcher(*args, **kw)"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef operator(name=None, operators=None, aliases=None, kind=None):\n    def delegator(assertion, subject, expected, *args, **kw):\n        return assertion.test(subject, expected, *args, **kw)\n\n    def decorator(fn):\n        operator = Operator(fn=fn, aliases=aliases, kind=kind)\n        _name = name if isinstance(name, six.string_types) else fn.__name__\n        operator.operators = (_name,)\n\n        _operators = operators\n        if isinstance(_operators, list):\n            _operators = tuple(_operators)\n\n        if isinstance(_operators, tuple):\n            operator.operators += _operators\n\n        # Register operator\n        Engine.register(operator)\n        return functools.partial(delegator, operator)\n\n    return decorator(name) if inspect.isfunction(name) else decorator", "response": "Decorator for creating an operator function in the test engine."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nreturn a new attribute only operator function in the test engine.", "response": "def attribute(*args, **kw):\n    \"\"\"\n    Registers a new attribute only operator function in the test engine.\n\n    Arguments:\n        *args: variadic arguments.\n        **kw: variadic keyword arguments.\n\n    Returns:\n        function\n    \"\"\"\n    return operator(kind=Operator.Type.ATTRIBUTE, *args, **kw)"}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nregister plugin in grappa.", "response": "def use(plugin):\n    \"\"\"\n    Register plugin in grappa.\n\n    `plugin` argument can be a function or a object that implement `register`\n    method, which should accept one argument: `grappa.Engine` instance.\n\n    Arguments:\n        plugin (function|module): grappa plugin object to register.\n\n    Raises:\n        ValueError: if `plugin` is not a valid interface.\n\n    Example::\n\n        import grappa\n\n        class MyOperator(grappa.Operator):\n            pass\n\n        def my_plugin(engine):\n            engine.register(MyOperator)\n\n        grappa.use(my_plugin)\n    \"\"\"\n    log.debug('register new plugin: {}'.format(plugin))\n\n    if inspect.isfunction(plugin):\n        return plugin(Engine)\n\n    if plugin and hasattr(plugin, 'register'):\n        return plugin.register(Engine)\n\n    raise ValueError('invalid plugin: must be a function or '\n                     'implement register() method')"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef load():\n    for operator in operators:\n        module, symbols = operator[0], operator[1:]\n        path = 'grappa.operators.{}'.format(module)\n\n        # Dynamically import modules\n        operator = __import__(path, None, None, symbols)\n\n        # Register operators in the test engine\n        for symbol in symbols:\n            Engine.register(getattr(operator, symbol))", "response": "Loads the built-in operators into the global test engine."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef register_operators(*operators):\n    def validate(operator):\n        if isoperator(operator):\n            return True\n\n        raise NotImplementedError('invalid operator: {}'.format(operator))\n\n    def register(operator):\n        # Register operator by DSL keywords\n        for name in operator.operators:\n            # Check valid operators\n            if name in Engine.operators:\n                raise ValueError('operator name \"{}\" from {} is already '\n                                 'in use by other operator'.format(\n                                    name,\n                                    operator.__name__\n                                 ))\n\n            # Register operator by name\n            Engine.operators[name] = operator\n\n    # Validates and registers operators\n    [register(operator) for operator in operators if validate(operator)]", "response": "Registers one or multiple operators in the test engine."}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\nfinds the OMXPlayer DBus connection and returns the path of the first available address file.", "response": "def find_address_file(self):\n        \"\"\"\n        Finds the OMXPlayer DBus connection\n        Assumes there is an alive OMXPlayer process.\n        :return:\n        \"\"\"\n        possible_address_files = []\n        while not possible_address_files:\n            # filter is used here as glob doesn't support regexp :(\n            isnt_pid_file = lambda path: not path.endswith('.pid')\n            possible_address_files = list(filter(isnt_pid_file,\n                                            glob('/tmp/omxplayerdbus.*')))\n            possible_address_files.sort(key=lambda path: os.path.getmtime(path))\n            time.sleep(0.05)\n\n        self.path = possible_address_files[-1]"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef load(self, source, pause=False):\n        self._source = source\n        self._load_source(source)\n        if pause:\n            time.sleep(0.5)  # Wait for the DBus interface to be initialised\n            self.pause()", "response": "Loads a new source from a file path or URL."}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nsetting the volume of the current object.", "response": "def set_volume(self, volume):\n        \"\"\"\n        Args:\n            float: volume in the interval [0, 10]\n        \"\"\"\n        # 0 isn't handled correctly so we have to set it to a very small value to achieve the same purpose\n        if volume == 0:\n            volume = 1e-10\n        return self._player_interface_property('Volume', dbus.Double(volume))"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef set_rate(self, rate):\n        self._rate = self._player_interface_property('Rate', dbus.Double(rate))\n        return self._rate", "response": "Set the playback rate of the video as a multiple of the default playback speed"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef pause(self):\n        self._player_interface.Pause()\n        self._is_playing = False\n        self.pauseEvent(self)", "response": "Pause playback and fire the pause event."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef play_pause(self):\n        self._player_interface.PlayPause()\n        self._is_playing = not self._is_playing\n        if self._is_playing:\n            self.playEvent(self)\n        else:\n            self.pauseEvent(self)", "response": "Toggle playback: pause if currently playing, otherwise start playing, and fire the matching event."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\nseek the video by relative_position seconds.", "response": "def seek(self, relative_position):\n        \"\"\"\n        Seek the video by `relative_position` seconds\n\n        Args:\n            relative_position (float): The offset in seconds to seek by, relative to the current position.\n        \"\"\"\n        self._player_interface.Seek(Int64(1000.0 * 1000 * relative_position))\n        self.seekEvent(self, relative_position)"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nset the video to playback position to position seconds from the start of the video", "response": "def set_position(self, position):\n        \"\"\"\n        Set the video to playback position to `position` seconds from the start of the video\n\n        Args:\n            position (float): The position in seconds.\n        \"\"\"\n        self._player_interface.SetPosition(ObjectPath(\"/not/used\"), Int64(position * 1000.0 * 1000))\n        self.positionEvent(self, position)"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef set_video_pos(self, x1, y1, x2, y2):\n        position = \"%s %s %s %s\" % (str(x1),str(y1),str(x2),str(y2))\n        self._player_interface.VideoPos(ObjectPath('/not/used'), String(position))", "response": "Set the video position on the screen."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef video_pos(self):\n        position_string = self._player_interface.VideoPos(ObjectPath('/not/used'))\n        return list(map(int, position_string.split(\" \")))", "response": "Returns the video's on-screen position as a list of integers parsed from the player's space-separated position string."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\nsetting the video crop position.", "response": "def set_video_crop(self, x1, y1, x2, y2):\n        \"\"\"\n        Args:\n            x1 (int): Top left x coordinate (px)\n            y1 (int): Top left y coordinate (px)\n            x2 (int): Bottom right x coordinate (px)\n            y2 (int): Bottom right y coordinate (px)\n        \"\"\"\n        crop = \"%s %s %s %s\" % (str(x1),str(y1),str(x2),str(y2))\n        self._player_interface.SetVideoCropPos(ObjectPath('/not/used'), String(crop))"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nreturning True if the player is playing False otherwise.", "response": "def is_playing(self):\n        \"\"\"\n        Returns:\n            bool: Whether the player is playing\n        \"\"\"\n        self._is_playing = (self.playback_status() == \"Playing\")\n        logger.info(\"Playing?: %s\" % self._is_playing)\n        return self._is_playing"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef play_sync(self):\n        self.play()\n        logger.info(\"Playing synchronously\")\n        try:\n            time.sleep(0.05)\n            logger.debug(\"Wait for playing to start\")\n            while self.is_playing():\n                time.sleep(0.05)\n        except DBusException:\n            logger.error(\n                \"Cannot play synchronously any longer as DBus calls timed out.\"\n            )", "response": "Play the video synchronously and block whilst the video is playing."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef play(self):\n        if not self.is_playing():\n            self.play_pause()\n            self._is_playing = True\n            self.playEvent(self)", "response": "Play the video asynchronously returning control immediately to the calling code."}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\nquits the process and wait until it has died", "response": "def quit(self):\n        \"\"\"\n        Quit the player, blocking until the process has died\n        \"\"\"\n        if self._process is None:\n            logger.debug('Quit was called after self._process had already been released')\n            return\n        try:\n            logger.debug('Quitting OMXPlayer')\n            process_group_id = os.getpgid(self._process.pid)\n            os.killpg(process_group_id, signal.SIGTERM)\n            logger.debug('SIGTERM Sent to pid: %s' % process_group_id)\n            self._process_monitor.join()\n        except OSError:\n            logger.error('Could not find the process to kill')\n\n        self._process = None"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef render_to_response(self, context, **response_kwargs):\n        if self.request.is_ajax():\n            template = self.page_template\n        else:\n            template = self.get_template_names()\n        return self.response_class(\n            request=self.request,\n            template=template,\n            context=context,\n            **response_kwargs\n        )", "response": "Returns a response object with the template name and the context."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef translate_value(document_field, form_value):\n    value = form_value\n    if isinstance(document_field, ReferenceField):\n        value = document_field.document_type.objects.get(id=form_value) if form_value else None\n    return value", "response": "Translate a form value into a document value, resolving a ReferenceField id to the referenced document (or None if the form value is empty)."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\nreturning the smallest delimited version of field_key that is an attribute on document.", "response": "def trim_field_key(document, field_key):\n    \"\"\"\n    Returns the smallest delimited version of field_key that\n    is an attribute on document.\n\n    return (key, left_over_array)\n    \"\"\"\n    trimming = True\n    left_over_key_values = []\n    current_key = field_key\n    while trimming and current_key:\n        if hasattr(document, current_key):\n            trimming = False\n        else:\n            key_array = current_key.split(\"_\")\n            left_over_key_values.append(key_array.pop())\n            current_key = u\"_\".join(key_array)\n\n    left_over_key_values.reverse()\n    return current_key, left_over_key_values"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef has_edit_permission(self, request):\n        return request.user.is_authenticated and request.user.is_active and request.user.is_staff", "response": "Check if the user can edit this object"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef has_add_permission(self, request):\n        return request.user.is_authenticated and request.user.is_active and request.user.is_staff", "response": "Returns True if the user can add this object"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef has_delete_permission(self, request):\n        return request.user.is_authenticated and request.user.is_active and request.user.is_superuser", "response": "Returns True if the user has permission to delete this object"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\ntaking a model dictionary representation and creates a dictionary keyed by form field.", "response": "def get_form_field_dict(self, model_dict):\n        \"\"\"\n        Takes a model dictionary representation and creates a dictionary\n        keyed by form field.  Each value is a keyed 4-tuple of:\n        (widget, model_field_instance, model_field_type, field_key)\n        \"\"\"\n        return_dict = OrderedDict()\n        # Workaround: mongoengine doesn't preserve form fields ordering from metaclass __new__\n        if hasattr(self.model, 'Meta') and hasattr(self.model.Meta, 'form_fields_ordering'):\n            # dict.iterkeys() does not exist in Python 3; test membership on the dict directly\n            field_order_list = tuple(form_field for form_field\n                                     in self.model.Meta.form_fields_ordering\n                                     if form_field in model_dict)\n            order_dict = OrderedDict.fromkeys(field_order_list)\n            return_dict = order_dict\n\n        for field_key, field_dict in sorted(model_dict.items()):\n            if not field_key.startswith(\"_\"):\n                widget = field_dict.get('_widget', None)\n                if widget is None:\n                    return_dict[field_key] = self.get_form_field_dict(field_dict)\n                    return_dict[field_key].update({'_field_type': field_dict.get('_field_type', None)})\n                else:\n                    return_dict[field_key] = FieldTuple(widget,\n                                             field_dict.get('_document_field', None),\n                                             field_dict.get('_field_type', None),\n                                             field_dict.get('_key', None))\n        return return_dict"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef set_form_fields(self, form_field_dict, parent_key=None, field_type=None):\n        for form_key, field_value in form_field_dict.items():\n            form_key = make_key(parent_key, form_key) if parent_key is not None else form_key\n            if isinstance(field_value, tuple):\n\n                set_list_class = False\n                base_key = form_key\n\n                # Style list fields\n                if ListField in (field_value.field_type, field_type):\n\n                    # Nested lists/embedded docs need special care to get\n                    # styles to work out nicely.\n                    if parent_key is None or ListField == field_value.field_type:\n                        if field_type != EmbeddedDocumentField:\n                            field_value.widget.attrs['class'] += ' listField {0}'.format(form_key)\n                        set_list_class = True\n                    else:\n                        field_value.widget.attrs['class'] += ' listField'\n\n                    # Compute number value for list key\n                    list_keys = [field_key for field_key in self.form.fields.keys()\n                                           if has_digit(field_key)]\n\n                    key_int = 0\n                    while form_key in list_keys:\n                        key_int += 1\n                    form_key = make_key(form_key, key_int)\n\n                if parent_key is not None:\n\n                    # Get the base key for our embedded field class\n                    valid_base_keys = [model_key for model_key in self.model_map_dict.keys()\n                                                 if not model_key.startswith(\"_\")]\n                    while base_key not in valid_base_keys and base_key:\n                        base_key = make_key(base_key, exclude_last_string=True)\n\n                    # We need to remove the 
trailing number from the key\n                    # so that grouping will occur on the front end when we have a list.\n                    embedded_key_class = None\n                    if set_list_class:\n                        field_value.widget.attrs['class'] += \" listField\".format(base_key)\n                        embedded_key_class = make_key(field_key, exclude_last_string=True)\n\n                    field_value.widget.attrs['class'] += \" embeddedField\"\n\n                    # Setting the embedded key correctly allows to visually nest the\n                    # embedded documents on the front end.\n                    if base_key == parent_key:\n                        field_value.widget.attrs['class'] += ' {0}'.format(base_key)\n                    else:\n                        field_value.widget.attrs['class'] += ' {0} {1}'.format(base_key, parent_key)\n\n                    if embedded_key_class is not None:\n                        field_value.widget.attrs['class'] += ' {0}'.format(embedded_key_class)\n\n                default_value = self.get_field_value(form_key)\n\n                # Style embedded documents\n                if isinstance(default_value, list) and len(default_value) > 0:\n                    key_index = int(form_key.split(\"_\")[-1])\n                    new_base_key = make_key(form_key, exclude_last_string=True)\n\n                    for list_value in default_value:\n                        # Note, this is copied every time so each widget gets a different class\n                        list_widget = deepcopy(field_value.widget)\n                        new_key = make_key(new_base_key, six.text_type(key_index))\n                        list_widget.attrs['class'] += \" {0}\".format(make_key(base_key, key_index))\n                        self.set_form_field(list_widget, field_value.document_field, new_key, list_value)\n                        key_index += 1\n                else:\n                    
self.set_form_field(field_value.widget, field_value.document_field,\n                                        form_key, default_value)\n\n            elif isinstance(field_value, dict):\n                self.set_form_fields(field_value, form_key, field_value.get(\"_field_type\", None))", "response": "Set the form fields for every key in the form_field_dict."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef set_form_field(self, widget, model_field, field_key, default_value):\n        # Empty lists cause issues on form validation\n        if default_value == []:\n            default_value = None\n\n        if widget and isinstance(widget, forms.widgets.Select):\n            self.form.fields[field_key] = forms.ChoiceField(label=model_field.name,\n                                                            required=model_field.required,\n                                                            widget=widget)\n        else:\n            field_class = get_form_field_class(model_field)\n            self.form.fields[field_key] = field_class(label=model_field.name,\n                                                      required=model_field.required,\n                                                      widget=widget)\n\n        if default_value is not None:\n            if isinstance(default_value, Document):\n                # Probably a reference field, therefore, add id\n                self.form.fields[field_key].initial = getattr(default_value, 'id', None)\n            else:\n                self.form.fields[field_key].initial = default_value\n        else:\n            self.form.fields[field_key].initial = getattr(model_field, 'default', None)\n\n        if isinstance(model_field, ReferenceField):\n            self.form.fields[field_key].choices = [(six.text_type(x.id), get_document_unicode(x))\n                                                    for x in model_field.document_type.objects.all()]\n            # Adding in blank choice so a reference field can be deleted by selecting blank\n            self.form.fields[field_key].choices.insert(0, (\"\", \"\"))\n\n        elif model_field.choices:\n            self.form.fields[field_key].choices = model_field.choices\n\n        for key, form_attr in CHECK_ATTRS.items():\n            if hasattr(model_field, 
key):\n                value = getattr(model_field, key)\n                setattr(self.form.fields[field_key], key, value)", "response": "Set the form field for the current object."}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\nreturns the value of a field in the current object.", "response": "def get_field_value(self, field_key):\n        \"\"\"\n        Given field_key will return value held at self.model_instance.  If\n        model_instance has not been provided will return None.\n        \"\"\"\n\n        def get_value(document, field_key):\n            # Short circuit the function if we do not have a document\n            if document is None:\n                return None\n\n            current_key, new_key_array = trim_field_key(document, field_key)\n            key_array_digit = int(new_key_array[-1]) if new_key_array and has_digit(new_key_array) else None\n            new_key = make_key(new_key_array)\n\n            if key_array_digit is not None and len(new_key_array) > 0:\n                # Handleing list fields\n                if len(new_key_array) == 1:\n                    return_data = document._data.get(current_key, [])\n                elif isinstance(document, BaseList):\n                    return_list = []\n                    if len(document) > 0:\n                        return_list = [get_value(doc, new_key) for doc in document]\n                    return_data = return_list\n                else:\n                    return_data = get_value(getattr(document, current_key), new_key)\n\n            elif len(new_key_array) > 0:\n                return_data = get_value(document._data.get(current_key), new_key)\n            else:\n                # Handeling all other fields and id\n                try: # Added try except otherwise we get \"TypeError: getattr(): attribute name must be string\" error from mongoengine/base/datastructures.py \n                    return_data = (document._data.get(None, None) if current_key == \"id\" else\n                              document._data.get(current_key, None))\n                except: \n                    return_data = 
document._data.get(current_key, None)\n\n            return return_data\n\n        if self.is_initialized:\n            return get_value(self.model_instance, field_key)\n        else:\n            return None"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef has_digit(string_or_list, sep=\"_\"):\n    if isinstance(string_or_list, (tuple, list)):\n        list_length = len(string_or_list)\n        if list_length:\n            return six.text_type(string_or_list[-1]).isdigit()\n        else:\n            return False\n    else:\n        return has_digit(string_or_list.split(sep))", "response": "Returns True if the last item of a list or tuple (or of a string split on sep) consists only of digits."}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\ngives any number of lists and strings will join them in order as one string separated by the sep kwarg. sep defaults to u\"_\". Add exclude_last_string=True as a kwarg to exclude the last item in a given string after being split by sep. Note if you only have one word in your string you can end up getting an empty string. Example uses: >>> from mongonaut.forms.form_utils import make_key >>> make_key('hi', 'my', 'firend') >>> u'hi_my_firend' >>> make_key('hi', 'my', 'firend', sep='i') >>> 'hiimyifirend' >>> make_key('hi', 'my', 'firend',['this', 'be', 'what'], sep='i') >>> 'hiimyifirendithisibeiwhat' >>> make_key('hi', 'my', 'firend',['this', 'be', 'what']) >>> u'hi_my_firend_this_be_what'", "response": "def make_key(*args, **kwargs):\n    \"\"\"\n    Given any number of lists and strings will join them in order as one\n    string separated by the sep kwarg.  sep defaults to u\"_\".\n\n    Add exclude_last_string=True as a kwarg to exclude the last item in a\n    given string after being split by sep.  
Note if you only have one word\n    in your string you can end up getting an empty string.\n\n    Example uses:\n\n    >>> from mongonaut.forms.form_utils import make_key\n    >>> make_key('hi', 'my', 'firend')\n    >>> u'hi_my_firend'\n\n    >>> make_key('hi', 'my', 'firend', sep='i')\n    >>> 'hiimyifirend'\n\n    >>> make_key('hi', 'my', 'firend',['this', 'be', 'what'], sep='i')\n    >>> 'hiimyifirendithisibeiwhat'\n\n    >>> make_key('hi', 'my', 'firend',['this', 'be', 'what'])\n    >>> u'hi_my_firend_this_be_what'\n\n    \"\"\"\n    sep = kwargs.get('sep', u\"_\")\n    exclude_last_string = kwargs.get('exclude_last_string', False)\n    string_array = []\n\n    for arg in args:\n        if isinstance(arg, list):\n            string_array.append(six.text_type(sep.join(arg)))\n        else:\n            if exclude_last_string:\n                new_key_array = arg.split(sep)[:-1]\n                if len(new_key_array) > 0:\n                    string_array.append(make_key(new_key_array))\n            else:\n                string_array.append(six.text_type(arg))\n    return sep.join(string_array)"}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\nsets existing data to form fields.", "response": "def set_fields(self):\n        \"\"\"Sets existing data to form fields.\"\"\"\n\n        # Get dictionary map of current model\n        if self.is_initialized:\n            self.model_map_dict = self.create_document_dictionary(self.model_instance)\n        else:\n            self.model_map_dict = self.create_document_dictionary(self.model)\n\n        form_field_dict = self.get_form_field_dict(self.model_map_dict)\n        self.set_form_fields(form_field_dict)"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef set_post_data(self):\n        self.form.data = self.post_data_dict\n\n        # Specifically adding list field keys to the form so they are included\n        # in form.cleaned_data after the call to is_valid\n        for field_key, field in self.form.fields.items():\n            if has_digit(field_key):\n                # We have a list field.\n                base_key = make_key(field_key, exclude_last_string=True)\n\n                # Add new key value with field to form fields so validation\n                # will work correctly\n                for key in self.post_data_dict.keys():\n                    if base_key in key:\n                        self.form.fields.update({key: field})", "response": "Set the form data so that validation occurs on all post data, including list field keys."}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\ngenerate the form for view.", "response": "def get_form(self):\n        \"\"\"\n        Generate the form for view.\n        \"\"\"\n        self.set_fields()\n        if self.post_data_dict is not None:\n            self.set_post_data()\n        return self.form"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\ngenerating a dictionary representation of the document.", "response": "def create_doc_dict(self, document, doc_key=None, owner_document=None):\n        \"\"\"\n        Generate a dictionary representation of the document.  (no recursion)\n\n        DO NOT CALL DIRECTLY\n        \"\"\"\n        # Get doc field for top level documents\n        if owner_document:\n            doc_field = owner_document._fields.get(doc_key, None) if doc_key else None\n        else:\n            doc_field = document._fields.get(doc_key, None) if doc_key else None\n\n        # Generate the base fields for the document\n        doc_dict = {\"_document\": document if owner_document is None else owner_document,\n                    \"_key\": document.__class__.__name__.lower() if doc_key is None else doc_key,\n                    \"_document_field\": doc_field}\n\n        if not isinstance(document, TopLevelDocumentMetaclass) and doc_key:\n            doc_dict.update({\"_field_type\": EmbeddedDocumentField})\n\n        for key, field in document._fields.items():\n            doc_dict[key] = field\n\n        return doc_dict"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef create_list_dict(self, document, list_field, doc_key):\n        list_dict = {\"_document\": document}\n\n        if isinstance(list_field.field, EmbeddedDocumentField):\n            list_dict.update(self.create_document_dictionary(document=list_field.field.document_type_obj,\n                                                             owner_document=document))\n\n        # Set the list_dict after it may have been updated\n        list_dict.update({\"_document_field\": list_field.field,\n                          \"_key\": doc_key,\n                          \"_field_type\": ListField,\n                          \"_widget\": get_widget(list_field.field),\n                          \"_value\": getattr(document, doc_key, None)})\n\n        return list_dict", "response": "Generates a dictionary representation of the list field."}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\ngive a document generates a dictionary representation of the document.", "response": "def create_document_dictionary(self, document, document_key=None,\n                                                        owner_document=None):\n        \"\"\"\n        Given document generates a dictionary representation of the document.\n        Includes the widget for each for each field in the document.\n        \"\"\"\n        doc_dict = self.create_doc_dict(document, document_key, owner_document)\n\n        for doc_key, doc_field in doc_dict.items():\n            # Base fields should not be evaluated\n            if doc_key.startswith(\"_\"):\n                continue\n\n            if isinstance(doc_field, ListField):\n                doc_dict[doc_key] = self.create_list_dict(document, doc_field, doc_key)\n\n            elif isinstance(doc_field, EmbeddedDocumentField):\n                doc_dict[doc_key] = self.create_document_dictionary(doc_dict[doc_key].document_type_obj,\n                                                                    doc_key)\n            else:\n                doc_dict[doc_key] = {\"_document\": document,\n                                     \"_key\": doc_key,\n                                     \"_document_field\": doc_field,\n                                     \"_widget\": get_widget(doc_dict[doc_key], getattr(doc_field, 'disabled', False))}\n\n        return doc_dict"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef get_widget(model_field, disabled=False):\n\n    attrs = get_attrs(model_field, disabled)\n\n    if hasattr(model_field, \"max_length\") and not model_field.max_length:\n        return forms.Textarea(attrs=attrs)\n\n    elif isinstance(model_field, DateTimeField):\n        return forms.DateTimeInput(attrs=attrs)\n\n    elif isinstance(model_field, BooleanField):\n        return forms.CheckboxInput(attrs=attrs)\n\n    elif isinstance(model_field, ReferenceField) or model_field.choices:\n        return forms.Select(attrs=attrs)\n\n    elif (isinstance(model_field, ListField) or\n          isinstance(model_field, EmbeddedDocumentField) or\n          isinstance(model_field, GeoPointField)):\n        return None\n\n    else:\n        return forms.TextInput(attrs=attrs)", "response": "Choose which widget to display for a field."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef get_attrs(model_field, disabled=False):\n    attrs = {}\n    attrs['class'] = 'span6 xlarge'\n    if disabled or isinstance(model_field, ObjectIdField):\n        attrs['class'] += ' disabled'\n        attrs['readonly'] = 'readonly'\n    return attrs", "response": "Set attributes on the display widget."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef get_form_field_class(model_field):\n\n    FIELD_MAPPING = {\n        IntField: forms.IntegerField,\n        StringField: forms.CharField,\n        FloatField: forms.FloatField,\n        BooleanField: forms.BooleanField,\n        DateTimeField: forms.DateTimeField,\n        DecimalField: forms.DecimalField,\n        URLField: forms.URLField,\n        EmailField: forms.EmailField\n    }\n\n    return FIELD_MAPPING.get(model_field.__class__, forms.CharField)", "response": "Gets the default form field class for a mongoengine field."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef get_document_value(document, key):\n    '''\n    Returns the display value of a field for a particular MongoDB document.\n    '''\n    value = getattr(document, key)\n    if isinstance(value, ObjectId):\n        return value\n\n    if isinstance(document._fields.get(key), URLField):\n        return mark_safe(\"\"\"<a href=\"{0}\">{1}</a>\"\"\".format(value, value))\n\n    if isinstance(value, Document):\n        app_label = value.__module__.replace(\".models\", \"\")\n        document_name = value._class_name\n        url = reverse(\n            \"document_detail\",\n            kwargs={'app_label': app_label, 'document_name': document_name,\n                    'id': value.id})\n        return mark_safe(\"\"\"<a href=\"{0}\">{1}</a>\"\"\".format(url, value))\n\n    return value", "response": "Returns the display value of a particular MongoDB document."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef get_qset(self, queryset, q):\n        if self.mongoadmin.search_fields and q:\n            params = {}\n            for field in self.mongoadmin.search_fields:\n                if field == 'id':\n                    # check to make sure this is a valid ID, otherwise we just continue\n                    if is_valid_object_id(q):\n                        return queryset.filter(pk=q)\n                    continue\n                search_key = \"{field}__icontains\".format(field=field)\n                params[search_key] = q\n\n            queryset = queryset.filter(**params)\n        return queryset", "response": "Performs filtering against the default queryset returned by mongoengine."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef get_queryset(self):\n        if hasattr(self, \"queryset\") and self.queryset:\n            return self.queryset\n\n        self.set_mongonaut_base()\n        self.set_mongoadmin()\n        self.document = getattr(self.models, self.document_name)\n        queryset = self.document.objects.all()\n\n        if self.mongoadmin.ordering:\n            queryset = queryset.order_by(*self.mongoadmin.ordering)\n\n        # search. move this to get_queryset\n        # search. move this to get_queryset\n        q = self.request.GET.get('q')\n        queryset = self.get_qset(queryset, q)\n\n        ### Start pagination\n        ### Note:\n        ###    Can't use Paginator in Django because mongoengine querysets are\n        ###    not the same as Django ORM querysets and it broke.\n        # Make sure page request is an int. If not, deliver first page.\n        try:\n            self.page = int(self.request.GET.get('page', '1'))\n        except ValueError:\n            self.page = 1\n\n        obj_count = queryset.count()\n        self.total_pages = math.ceil(obj_count / self.documents_per_page)\n\n        if self.page > self.total_pages:\n            self.page = self.total_pages\n\n        if self.page < 1:\n            self.page = 1\n\n        start = (self.page - 1) * self.documents_per_page\n        end = self.page * self.documents_per_page\n\n        queryset = queryset[start:end] if obj_count else queryset\n        self.queryset = queryset\n        return queryset", "response": "Replicates Django CBV get_queryset method."}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nusing during adding or editing of data.", "response": "def get_initial(self):\n        \"\"\"Used during adding/editing of data.\"\"\"\n        self.query = self.get_queryset()\n        mongo_ids = {'mongo_id': [str(x.id) for x in self.query]}\n        return mongo_ids"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef get_context_data(self, **kwargs):\n        context = super(DocumentListView, self).get_context_data(**kwargs)\n        context = self.set_permissions_in_context(context)\n\n        if not context['has_view_permission']:\n            return HttpResponseForbidden(\"You do not have permissions to view this content.\")\n\n        context['object_list'] = self.get_queryset()\n\n        context['document'] = self.document\n        context['app_label'] = self.app_label\n        context['document_name'] = self.document_name\n        context['request'] = self.request\n\n        # pagination bits\n        context['page'] = self.page\n        context['documents_per_page'] = self.documents_per_page\n\n        if self.page > 1:\n            previous_page_number = self.page - 1\n        else:\n            previous_page_number = None\n\n        if self.page < self.total_pages:\n            next_page_number = self.page + 1\n        else:\n            next_page_number = None\n\n        context['previous_page_number'] = previous_page_number\n        context['has_previous_page'] = previous_page_number is not None\n        context['next_page_number'] = next_page_number\n        context['has_next_page'] = next_page_number is not None\n        context['total_pages'] = self.total_pages\n\n        # Part of upcoming list view form functionality\n        if self.queryset.count():\n            context['keys'] = ['id', ]\n\n            # Show those items for which we've got list_fields on the mongoadmin\n            for key in [x for x in self.mongoadmin.list_fields if x != 'id' and x in self.document._fields.keys()]:\n\n                # TODO - Figure out why this EmbeddedDocumentField and ListField breaks this view\n                # Note - This is the challenge part, right? 
:)\n                if isinstance(self.document._fields[key], EmbeddedDocumentField):\n                    continue\n                if isinstance(self.document._fields[key], ListField):\n                    continue\n                context['keys'].append(key)\n\n        if self.mongoadmin.search_fields:\n            context['search_field'] = True\n\n        return context", "response": "Injects data into the context to replicate CBV ListView."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\ncreate new mongoengine records.", "response": "def post(self, request, *args, **kwargs):\n        \"\"\"Creates new mongoengine records.\"\"\"\n        # TODO - make sure to check the rights of the poster\n        #self.get_queryset() # TODO - write something that grabs the document class better\n        form_class = self.get_form_class()\n        form = self.get_form(form_class)\n        mongo_ids = self.get_initial()['mongo_id']\n        for form_mongo_id in form.data.getlist('mongo_id'):\n            for mongo_id in mongo_ids:\n                if form_mongo_id == mongo_id:\n                    self.document.objects.get(pk=mongo_id).delete()\n\n        return self.form_invalid(form)"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef get_context_data(self, **kwargs):\n        context = super(DocumentAddFormView, self).get_context_data(**kwargs)\n        self.set_mongoadmin()\n        context = self.set_permissions_in_context(context)\n        self.document_type = getattr(self.models, self.document_name)\n\n        context['app_label'] = self.app_label\n        context['document_name'] = self.document_name\n        context['form_action'] = reverse('document_detail_add_form', args=[self.kwargs.get('app_label'),\n                                                                           self.kwargs.get('document_name')])\n\n        return context", "response": "This method is called by the base class to get the context data for the add form."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nreturn a list of all mongoadmin implementations for the site", "response": "def get_mongoadmins(self):\n        \"\"\" Returns a list of all mongoadmin implementations for the site \"\"\"\n        apps = []\n        for app_name in settings.INSTALLED_APPS:\n            mongoadmin = \"{0}.mongoadmin\".format(app_name)\n            try:\n                module = import_module(mongoadmin)\n            except ImportError as e:\n                if str(e).startswith(\"No module named\"):\n                    continue\n                raise e\n\n            app_store = AppStore(module)\n            apps.append(dict(\n                app_name=app_name,\n                obj=app_store\n            ))\n        return apps"}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nsetting the base of the commonly used attributes.", "response": "def set_mongonaut_base(self):\n        \"\"\" Sets a number of commonly used attributes \"\"\"\n        if hasattr(self, \"app_label\"):\n            # prevents us from calling this multiple times\n            return None\n        self.app_label = self.kwargs.get('app_label')\n        self.document_name = self.kwargs.get('document_name')\n\n        # TODO Allow this to be assigned via url variable\n        self.models_name = self.kwargs.get('models_name', 'models')\n\n        # import the models file\n        self.model_name = \"{0}.{1}\".format(self.app_label, self.models_name)\n        self.models = import_module(self.model_name)"}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\nreturns the MongoAdmin object for an app_label document_name style view", "response": "def set_mongoadmin(self):\n        \"\"\" Returns the MongoAdmin object for an app_label/document_name style view\n        \"\"\"\n        if hasattr(self, \"mongoadmin\"):\n            return None\n\n        if not hasattr(self, \"document_name\"):\n            self.set_mongonaut_base()\n\n        for mongoadmin in self.get_mongoadmins():\n            for model in mongoadmin['obj'].models:\n                if model.name == self.document_name:\n                    self.mongoadmin = model.mongoadmin\n                    break\n        # TODO change this to use 'finally' or 'else' or something\n        if not hasattr(self, \"mongoadmin\"):\n            raise NoMongoAdminSpecified(\"No MongoAdmin for {0}.{1}\".format(self.app_label, self.document_name))"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef set_permissions_in_context(self, context={}):\n\n        context['has_view_permission'] = self.mongoadmin.has_view_permission(self.request)\n        context['has_edit_permission'] = self.mongoadmin.has_edit_permission(self.request)\n        context['has_add_permission'] = self.mongoadmin.has_add_permission(self.request)\n        context['has_delete_permission'] = self.mongoadmin.has_delete_permission(self.request)\n        return context", "response": "Provides permissions for use in the context"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef process_document(self, document, form_key, passed_key):\n        if passed_key is not None:\n            current_key, remaining_key_array = trim_field_key(document, passed_key)\n        else:\n            current_key, remaining_key_array = trim_field_key(document, form_key)\n\n        key_array_digit = remaining_key_array[-1] if remaining_key_array and has_digit(remaining_key_array) else None\n        remaining_key = make_key(remaining_key_array)\n\n        if current_key.lower() == 'id':\n            raise KeyError(u\"Mongonaut does not work with models which have fields beginning with id_\")\n\n        # Create boolean checks to make processing document easier\n        is_embedded_doc = (isinstance(document._fields.get(current_key, None), EmbeddedDocumentField)\n                          if hasattr(document, '_fields') else False)\n        is_list = not key_array_digit is None\n        key_in_fields = current_key in document._fields.keys() if hasattr(document, '_fields') else False\n\n        # This ensures you only go through each documents keys once, and do not duplicate data\n        if key_in_fields:\n            if is_embedded_doc:\n                self.set_embedded_doc(document, form_key, current_key, remaining_key)\n            elif is_list:\n                self.set_list_field(document, form_key, current_key, remaining_key, key_array_digit)\n            else:\n                value = translate_value(document._fields[current_key],\n                                        self.form.cleaned_data[form_key])\n                setattr(document, current_key, value)", "response": "This method will evaluate the document and set values correctly for the given form_key."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef set_embedded_doc(self, document, form_key, current_key, remaining_key):\n\n        embedded_doc = getattr(document, current_key, False)\n        if not embedded_doc:\n            embedded_doc = document._fields[current_key].document_type_obj()\n\n        new_key, new_remaining_key_array = trim_field_key(embedded_doc, remaining_key)\n        self.process_document(embedded_doc, form_key, make_key(new_key, new_remaining_key_array))\n        setattr(document, current_key, embedded_doc)", "response": "Set the embedded document for the given form key."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef set_list_field(self, document, form_key, current_key, remaining_key, key_array_digit):\n\n        document_field = document._fields.get(current_key)\n\n        # Figure out what value the list ought to have\n        # None value for ListFields make mongoengine very un-happy\n        list_value = translate_value(document_field.field, self.form.cleaned_data[form_key])\n        if list_value is None or (not list_value and not bool(list_value)):\n            return None\n\n        current_list = getattr(document, current_key, None)\n\n        if isinstance(document_field.field, EmbeddedDocumentField):\n            embedded_list_key = u\"{0}_{1}\".format(current_key, key_array_digit)\n\n            # Get the embedded document if it exists, else create it.\n            embedded_list_document = self.embedded_list_docs.get(embedded_list_key, None)\n            if embedded_list_document is None:\n                embedded_list_document = document_field.field.document_type_obj()\n\n            new_key, new_remaining_key_array = trim_field_key(embedded_list_document, remaining_key)\n            self.process_document(embedded_list_document, form_key, new_key)\n\n            list_value = embedded_list_document\n            self.embedded_list_docs[embedded_list_key] = embedded_list_document\n\n            if isinstance(current_list, list):\n                # Do not add the same document twice\n                if embedded_list_document not in current_list:\n                    current_list.append(embedded_list_document)\n            else:\n                setattr(document, current_key, [embedded_list_document])\n\n        elif isinstance(current_list, list):\n            current_list.append(list_value)\n        else:\n            setattr(document, current_key, [list_value])", "response": "Set the list of items in the document."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef with_tz(request):\n    \n    dt = datetime.now() \n    t = Template('{% load tz %}{% localtime on %}{% get_current_timezone as TIME_ZONE %}{{ TIME_ZONE }}{% endlocaltime %}') \n    c = RequestContext(request)\n    response = t.render(c)\n    return HttpResponse(response)", "response": "Return the current time zone name, rendered with localtime enabled."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef without_tz(request):\n    \n    t = Template('{% load tz %}{% get_current_timezone as TIME_ZONE %}{{ TIME_ZONE }}') \n    c = RequestContext(request)\n    response = t.render(c)\n    return HttpResponse(response)", "response": "Return the current time zone name, rendered without the localtime block."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef is_valid_ip(ip_address):\n\n    try:\n        ip = ipaddress.ip_address(u'' + ip_address)\n        return True\n    except ValueError as e:\n        return False", "response": "Check the validity of an IP address."}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nchecks if IP is local", "response": "def is_local_ip(ip_address):\n    \"\"\" Check if IP is local \"\"\"\n\n    try:\n        ip = ipaddress.ip_address(u'' + ip_address)\n        return ip.is_loopback\n    except ValueError as e:\n        return None"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\nprocessing the request and return the set of related attributes.", "response": "def process_request(self, request):\n        \"\"\"\n        If we can get a valid IP from the request,\n        look up that address in the database to get the appropriate timezone\n        and activate it.\n\n        Else, use the default.\n\n        \"\"\"\n\n        if not request:\n            return\n\n        if not db_loaded:\n            load_db()\n\n        tz = request.session.get('django_timezone')\n\n        if not tz:\n            # use the default timezone (settings.TIME_ZONE) for localhost\n            tz = timezone.get_default_timezone()\n\n            client_ip = get_ip_address_from_request(request)\n            ip_addrs = client_ip.split(',')\n            for ip in ip_addrs:\n                if is_valid_ip(ip) and not is_local_ip(ip):\n                    if ':' in ip:\n                        tz = db_v6.time_zone_by_addr(ip)\n                        break\n                    else:\n                        tz = db.time_zone_by_addr(ip)\n                        break\n\n        if tz:\n            timezone.activate(tz)\n            request.session['django_timezone'] = str(tz)\n            if getattr(settings, 'AUTH_USER_MODEL', None) and getattr(request, 'user', None):\n                detected_timezone.send(sender=get_user_model(), instance=request.user, timezone=tz)\n        else:\n            timezone.deactivate()"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef elastic_query(model, query, session=None, enabled_fields=None):\n    # TODO: make session to optional\n    instance = ElasticQuery(model, query, session, enabled_fields)\n    return instance.search()", "response": "This method is used to make an elastic query"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef parse_field(self, field, field_value):\n        if type(field_value) is dict:\n            # TODO: check operators and emit error\n            operator = list(field_value)[0]\n            if self.verify_operator(operator) is False:\n                return \"Error: operator does not exist\", operator\n            value = field_value[operator]\n        elif type(field_value) is unicode:\n            operator = u'equals'\n            value = field_value\n        return field, operator, value", "response": "Parse the field and value"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\ncreate a query from the attribute list.", "response": "def create_query(self, attr):\n        \"\"\" Mix all values and make the query \"\"\"\n        field = attr[0]\n        operator = attr[1]\n        value = attr[2]\n        model = self.model\n\n        if '.' in field:\n            field_items = field.split('.')\n            field_name = getattr(model, field_items[0], None)\n            class_name = field_name.property.mapper.class_\n            new_model = getattr(class_name, field_items[1])\n            return field_name.has(OPERATORS[operator](new_model, value))\n\n        return OPERATORS[operator](getattr(model, field, None), value)"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef sort(self, sort_list):\n        order = []\n        for sort in sort_list:\n            if sort_list[sort] == \"asc\":\n                order.append(asc(getattr(self.model, sort, None)))\n            elif sort_list[sort] == \"desc\":\n                order.append(desc(getattr(self.model, sort, None)))\n        return order", "response": "Build a list of ascending or descending order clauses for the model's attributes from a mapping of field name to 'asc' or 'desc'."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nsending a message using the default SMTP settings.", "response": "def sendmail(self, msg_from, msg_to, msg):\n        \"\"\"Remember the recipients.\"\"\"\n        SMTP_dummy.msg_from = msg_from\n        SMTP_dummy.msg_to = msg_to\n        SMTP_dummy.msg = msg"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\nparse a raw email message and return a tuple of the message sender and recipients.", "response": "def parsemail(raw_message):\n    \"\"\"Parse message headers, then remove BCC header.\"\"\"\n    message = email.parser.Parser().parsestr(raw_message)\n\n    # Detect encoding\n    detected = chardet.detect(bytearray(raw_message, \"utf-8\"))\n    encoding = detected[\"encoding\"]\n    print(\">>> encoding {}\".format(encoding))\n    for part in message.walk():\n        if part.get_content_maintype() == 'multipart':\n            continue\n        part.set_charset(encoding)\n\n    # Extract recipients\n    addrs = email.utils.getaddresses(message.get_all(\"TO\", [])) + \\\n        email.utils.getaddresses(message.get_all(\"CC\", [])) + \\\n        email.utils.getaddresses(message.get_all(\"BCC\", []))\n    recipients = [x[1] for x in addrs]\n    message.__delitem__(\"bcc\")\n    message.__setitem__('Date', email.utils.formatdate())\n    sender = message[\"from\"]\n\n    return (message, sender, recipients)"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef _create_boundary(message):\n    if not message.is_multipart() or message.get_boundary() is not None:\n        return message\n    # HACK: Python2 lists do not natively have a `copy` method. Unfortunately,\n    # due to a bug in the Backport for the email module, the method\n    # `Message.set_boundary` converts the Message headers into a native list,\n    # so that other methods that rely on \"copying\" the Message headers fail.\n    # `Message.set_boundary` is called from `Generator.handle_multipart` if the\n    # message does not already have a boundary present. (This method itself is\n    # called from `Message.as_string`.)\n    # Hence, to prevent `Message.set_boundary` from being called, add a\n    # boundary header manually.\n    from future.backports.email.generator import Generator\n    # pylint: disable=protected-access\n    boundary = Generator._make_boundary(message.policy.linesep)\n    message.set_param('boundary', boundary)\n    return message", "response": "Add boundary parameter to multipart message if they are not present."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\nconvert a message into a multipart message.", "response": "def make_message_multipart(message):\n    \"\"\"Convert a message into a multipart message.\"\"\"\n    if not message.is_multipart():\n        multipart_message = email.mime.multipart.MIMEMultipart('alternative')\n        for header_key in set(message.keys()):\n            # Preserve duplicate headers\n            values = message.get_all(header_key, failobj=[])\n            for value in values:\n                multipart_message[header_key] = value\n        original_text = message.get_payload()\n        multipart_message.attach(email.mime.text.MIMEText(original_text))\n        message = multipart_message\n    # HACK: For Python2 (see comments in `_create_boundary`)\n    message = _create_boundary(message)\n    return message"}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nconvert markdown in message text to HTML.", "response": "def convert_markdown(message):\n    \"\"\"Convert markdown in message text to HTML.\"\"\"\n    assert message['Content-Type'].startswith(\"text/markdown\")\n    del message['Content-Type']\n    # Convert the text from markdown and then make the message multipart\n    message = make_message_multipart(message)\n    for payload_item in set(message.get_payload()):\n        # Assume the plaintext item is formatted with markdown.\n        # Add corresponding HTML version of the item as the last part of\n        # the multipart message (as per RFC 2046)\n        if payload_item['Content-Type'].startswith('text/plain'):\n            original_text = payload_item.get_payload()\n            html_text = markdown.markdown(original_text)\n            html_payload = future.backports.email.mime.text.MIMEText(\n                \"<html><body>{}</body></html>\".format(html_text),\n                \"html\",\n            )\n            message.attach(html_payload)\n    return message"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef addattachments(message, template_path):\n    if 'attachment' not in message:\n        return message, 0\n\n    message = make_message_multipart(message)\n\n    attachment_filepaths = message.get_all('attachment', failobj=[])\n    template_parent_dir = os.path.dirname(template_path)\n\n    for attachment_filepath in attachment_filepaths:\n        attachment_filepath = os.path.expanduser(attachment_filepath.strip())\n        if not attachment_filepath:\n            continue\n        if not os.path.isabs(attachment_filepath):\n            # Relative paths are relative to the template's parent directory\n            attachment_filepath = os.path.join(template_parent_dir,\n                                               attachment_filepath)\n        normalized_path = os.path.abspath(attachment_filepath)\n        # Check that the attachment exists\n        if not os.path.exists(normalized_path):\n            print(\"Error: can't find attachment \" + normalized_path)\n            sys.exit(1)\n\n        filename = os.path.basename(normalized_path)\n        with open(normalized_path, \"rb\") as attachment:\n            part = email.mime.application.MIMEApplication(attachment.read(),\n                                                          Name=filename)\n        part.add_header('Content-Disposition',\n                        'attachment; filename=\"{}\"'.format(filename))\n        message.attach(part)\n        print(\">>> attached {}\".format(normalized_path))\n\n    del message['attachment']\n    return message, len(attachment_filepaths)", "response": "Attach the files listed in the message's 'attachment' headers, resolving relative paths against the template's directory, and return the multipart message together with the number of attachment headers."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef sendmail(message, sender, recipients, config_filename):\n    # Read config file from disk to get SMTP server host, port, username\n    if not hasattr(sendmail, \"host\"):\n        config = configparser.RawConfigParser()\n        config.read(config_filename)\n        sendmail.host = config.get(\"smtp_server\", \"host\")\n        sendmail.port = config.getint(\"smtp_server\", \"port\")\n        sendmail.username = config.get(\"smtp_server\", \"username\")\n        sendmail.security = config.get(\"smtp_server\", \"security\")\n        print(\">>> Read SMTP server configuration from {}\".format(\n            config_filename))\n        print(\">>>   host = {}\".format(sendmail.host))\n        print(\">>>   port = {}\".format(sendmail.port))\n        print(\">>>   username = {}\".format(sendmail.username))\n        print(\">>>   security = {}\".format(sendmail.security))\n\n    # Prompt for password\n    if not hasattr(sendmail, \"password\"):\n        if sendmail.security == \"Dummy\" or sendmail.username == \"None\":\n            sendmail.password = None\n        else:\n            prompt = \">>> password for {} on {}: \".format(sendmail.username,\n                                                          sendmail.host)\n            sendmail.password = getpass.getpass(prompt)\n\n    # Connect to SMTP server\n    if sendmail.security == \"SSL/TLS\":\n        smtp = smtplib.SMTP_SSL(sendmail.host, sendmail.port)\n    elif sendmail.security == \"STARTTLS\":\n        smtp = smtplib.SMTP(sendmail.host, sendmail.port)\n        smtp.ehlo()\n        smtp.starttls()\n        smtp.ehlo()\n    elif sendmail.security == \"Never\":\n        smtp = smtplib.SMTP(sendmail.host, sendmail.port)\n    elif sendmail.security == \"Dummy\":\n        smtp = smtp_dummy.SMTP_dummy()\n    else:\n        raise configparser.Error(\"Unrecognized security type: {}\".format(\n            
sendmail.security))\n\n    # Send credentials\n    if sendmail.username != \"None\":\n        smtp.login(sendmail.username, sendmail.password)\n\n    # Send message.  Note that we can't use the elegant\n    # \"smtp.send_message(message)\" because that's python3 only\n    smtp.sendmail(sender, recipients, message.as_string())\n    smtp.close()", "response": "Send an email using Python SMTP library."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef create_sample_input_files(template_filename,\n                              database_filename,\n                              config_filename):\n    \"\"\"Create sample template email and database.\"\"\"\n    print(\"Creating sample template email {}\".format(template_filename))\n    if os.path.exists(template_filename):\n        print(\"Error: file exists: \" + template_filename)\n        sys.exit(1)\n    with io.open(template_filename, \"w\") as template_file:\n        template_file.write(\n            u\"TO: {{email}}\\n\"\n            u\"SUBJECT: Testing mailmerge\\n\"\n            u\"FROM: My Self <myself@mydomain.com>\\n\"\n            u\"\\n\"\n            u\"Hi, {{name}},\\n\"\n            u\"\\n\"\n            u\"Your number is {{number}}.\\n\"\n        )\n    print(\"Creating sample database {}\".format(database_filename))\n    if os.path.exists(database_filename):\n        print(\"Error: file exists: \" + database_filename)\n        sys.exit(1)\n    with io.open(database_filename, \"w\") as database_file:\n        database_file.write(\n            u'email,name,number\\n'\n            u'myself@mydomain.com,\"Myself\",17\\n'\n            u'bob@bobdomain.com,\"Bob\",42\\n'\n        )\n    print(\"Creating sample config file {}\".format(config_filename))\n    if os.path.exists(config_filename):\n        print(\"Error: file exists: \" + config_filename)\n        sys.exit(1)\n    with io.open(config_filename, \"w\") as config_file:\n        config_file.write(\n            u\"# Example: GMail\\n\"\n            u\"[smtp_server]\\n\"\n            u\"host = smtp.gmail.com\\n\"\n            u\"port = 465\\n\"\n            u\"security = SSL/TLS\\n\"\n            u\"username = YOUR_USERNAME_HERE\\n\"\n            u\"#\\n\"\n            u\"# Example: Wide open\\n\"\n            u\"# [smtp_server]\\n\"\n            u\"# host = 
open-smtp.example.com\\n\"\n            u\"# port = 25\\n\"\n            u\"# security = Never\\n\"\n            u\"# username = None\\n\"\n            u\"#\\n\"\n            u\"# Example: University of Michigan\\n\"\n            u\"# [smtp_server]\\n\"\n            u\"# host = smtp.mail.umich.edu\\n\"\n            u\"# port = 465\\n\"\n            u\"# security = SSL/TLS\\n\"\n            u\"# username = YOUR_USERNAME_HERE\\n\"\n            u\"#\\n\"\n            u\"# Example: University of Michigan EECS Dept., with STARTTLS security\\n\"  # noqa: E501\n            u\"# [smtp_server]\\n\"\n            u\"# host = newman.eecs.umich.edu\\n\"\n            u\"# port = 25\\n\"\n            u\"# security = STARTTLS\\n\"\n            u\"# username = YOUR_USERNAME_HERE\\n\"\n            u\"#\\n\"\n            u\"# Example: University of Michigan EECS Dept., with no encryption\\n\"  # noqa: E501\n            u\"# [smtp_server]\\n\"\n            u\"# host = newman.eecs.umich.edu\\n\"\n            u\"# port = 25\\n\"\n            u\"# security = Never\\n\"\n            u\"# username = YOUR_USERNAME_HERE\\n\"\n        )\n    print(\"Edit these files, and then run mailmerge again\")", "response": "Create sample template and database files."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef _tailCallback(f, uid):\n    def t(*args):\n        raise _TailCall(f, args, uid)\n    t.C = f\n    return t", "response": "Return a wrapper function that, when called, raises _TailCall(f, args, uid) instead of calling f directly. The original function is kept on the wrapper as attribute C."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef with_continuations(**c):\n    if len(c): keys, k = zip(*c.items())\n    else: keys, k = tuple([]), tuple([])\n    def d(f):\n        return C(\n            lambda kself, *conts:\n                lambda *args:\n                    f(*args, self=kself, **dict(zip(keys, conts)))) (*k)\n    return d", "response": "A decorator for defining tail-call optimized functions."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef parse_int_list(string):\n    integers = []\n    for comma_part in string.split(\",\"):\n        for substring in comma_part.split(\" \"):\n            if len(substring) == 0:\n                continue\n            if \"-\" in substring:\n                left, right = substring.split(\"-\")\n                left_val = int(left.strip())\n                right_val = int(right.strip())\n                integers.extend(range(left_val, right_val + 1))\n            else:\n                integers.append(int(substring.strip()))\n    return integers", "response": "Parses a string of numbers and ranges into a list of integers."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef sanitize_params(method, **kwargs):\n        # items which does not have a key starting with `_`\n        items = [(key, value) for key, value in kwargs.items()\n                 if not key.startswith(\"_\")]\n        params, skip_params = {}, False\n\n        for key, value in items:\n            # binary data\n            if hasattr(value, 'read') or isinstance(value, bytes):\n                params[key] = value\n                # The params won't be used to make the signature\n                skip_params = True\n\n            # booleans conversion\n            elif isinstance(value, bool):\n                params[key] = \"true\" if value else \"false\"\n\n            # iterables conversion\n            elif isinstance(value, iterable):\n                params[key] = \",\".join(map(str, value))\n\n            # skip params with value None\n            elif value is None:\n                pass\n\n            # the rest is converted to str\n            # (make sure you don't send something wrong)\n            else:\n                params[key] = str(value)\n\n        # dict with other items (+ strip \"_\" from keys)\n        kwargs = {key[1:]: value for key, value in kwargs.items()\n                  if key.startswith(\"_\")}\n\n        if method == \"post\" and not kwargs.get('data', None) and params:\n            kwargs['data'] = params  # post requests use the data argument\n\n        elif not kwargs.get('params', None) and params:\n            kwargs['params'] = params\n\n        return kwargs, skip_params", "response": "Sanitize the request parameters for the new requests."}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\ncreate the base url for the api and version", "response": "def _get_base_url(base_url, api, version):\n        \"\"\"\n            create the base url for the api\n\n        Parameters\n        ----------\n        base_url : str\n            format of the base_url using {api} and {version}\n        api : str\n            name of the api to use\n        version : str\n            version of the api\n\n        Returns\n        -------\n        str\n            the base url of the api you want to use\n        \"\"\"\n        format_args = {}\n\n        if \"{api}\" in base_url:\n            if api == \"\":\n                base_url = base_url.replace('{api}.', '')\n            else:\n                format_args['api'] = api\n\n        if \"{version}\" in base_url:\n            if version == \"\":\n                base_url = base_url.replace('/{version}', '')\n            else:\n                format_args['version'] = version\n\n        return base_url.format(api=api, version=version)"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\nasync def request(self, method, url, future,\n                      headers=None,\n                      session=None,\n                      encoding=None,\n                      **kwargs):\n        \"\"\"\n            Make requests to the REST API\n\n        Parameters\n        ----------\n        future : asyncio.Future\n            Future used to return the response\n        method : str\n            Method to be used by the request\n        url : str\n            URL of the resource\n        headers : .oauth.PeonyHeaders\n            Custom headers (doesn't overwrite `Authorization` headers)\n        session : aiohttp.ClientSession, optional\n            Client session used to make the request\n\n        Returns\n        -------\n        data.PeonyResponse\n            Response to the request\n        \"\"\"\n        await self.setup\n\n        # prepare request arguments, particularly the headers\n        req_kwargs = await self.headers.prepare_request(\n            method=method,\n            url=url,\n            headers=headers,\n            proxy=self.proxy,\n            **kwargs\n        )\n\n        if encoding is None:\n            encoding = self.encoding\n\n        session = session if (session is not None) else self._session\n\n        logger.debug(\"making request with parameters: %s\" % req_kwargs)\n\n        async with session.request(**req_kwargs) as response:\n            if response.status < 400:\n                data = await data_processing.read(response, self._loads,\n                                                  encoding=encoding)\n\n                future.set_result(data_processing.PeonyResponse(\n                    data=data,\n                    headers=response.headers,\n                    url=response.url,\n                    request=req_kwargs\n                ))\n            else:  # throw exception if status is 
not 2xx\n                await exceptions.throw(response, loads=self._loads,\n                                       encoding=encoding, url=url)", "response": "Makes a request to the REST API and returns the response."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef stream_request(self, method, url, headers=None, _session=None,\n                       *args, **kwargs):\n        \"\"\"\n            Make requests to the Streaming API\n\n        Parameters\n        ----------\n        method : str\n            Method to be used by the request\n        url : str\n            URL of the resource\n        headers : dict\n            Custom headers (doesn't overwrite `Authorization` headers)\n        _session : aiohttp.ClientSession, optional\n            The session to use for this specific request, the session\n            given as argument of :meth:`__init__` is used by default\n\n        Returns\n        -------\n        .stream.StreamResponse\n            Stream context for the request\n        \"\"\"\n        return StreamResponse(\n            method=method,\n            url=url,\n            client=self,\n            headers=headers,\n            session=_session,\n            proxy=self.proxy,\n            **kwargs\n        )", "response": "Make a request to the Streaming API and return a StreamResponse object."}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nreturning the list of tasks attached to the instance", "response": "def get_tasks(self):\n        \"\"\"\n            Get the tasks attached to the instance\n\n        Returns\n        -------\n        list\n            List of tasks (:class:`asyncio.Task`)\n        \"\"\"\n        tasks = self._get_tasks()\n        tasks.extend(self._streams.get_tasks(self))\n\n        return tasks"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\nasync def run_tasks(self):\n        tasks = self.get_tasks()\n        self._gathered_tasks = asyncio.gather(*tasks, loop=self.loop)\n        try:\n            await self._gathered_tasks\n        except CancelledError:\n            pass", "response": "Run the tasks attached to the instance"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\nasync def _get_twitter_configuration(self):\n        api = self['api', general.twitter_api_version,\n                   \".json\", general.twitter_base_api_url]\n\n        return await api.help.configuration.get()", "response": "get the twitter configuration"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\nasync def _get_user(self):\n        api = self['api', general.twitter_api_version,\n                   \".json\", general.twitter_base_api_url]\n\n        return await api.account.verify_credentials.get()", "response": "get the user from twitter"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\nasync def _chunked_upload(self, media, media_size,\n                              path=None,\n                              media_type=None,\n                              media_category=None,\n                              chunk_size=2**20,\n                              **params):\n        \"\"\"\n            upload media in chunks\n\n        Parameters\n        ----------\n        media : file object\n            a file object of the media\n        media_size : int\n            size of the media\n        path : str, optional\n            filename of the media\n        media_type : str, optional\n            mime type of the media\n        media_category : str, optional\n            twitter media category, must be used with ``media_type``\n        chunk_size : int, optional\n            size of a chunk in bytes\n        params : dict, optional\n            additional parameters of the request\n\n        Returns\n        -------\n        .data_processing.PeonyResponse\n            Response of the request\n        \"\"\"\n        if isinstance(media, bytes):\n            media = io.BytesIO(media)\n\n        chunk = media.read(chunk_size)\n        is_coro = asyncio.iscoroutine(chunk)\n\n        if is_coro:\n            chunk = await chunk\n\n        if media_type is None:\n            media_metadata = await utils.get_media_metadata(chunk, path)\n            media_type, media_category = media_metadata\n        elif media_category is None:\n            media_category = utils.get_category(media_type)\n\n        response = await self.upload.media.upload.post(\n            command=\"INIT\",\n            total_bytes=media_size,\n            media_type=media_type,\n            media_category=media_category,\n            **params\n        )\n\n        media_id = response['media_id']\n        i = 0\n\n        while chunk:\n            if is_coro:\n                req = 
self.upload.media.upload.post(command=\"APPEND\",\n                                                    media_id=media_id,\n                                                    media=chunk,\n                                                    segment_index=i)\n                chunk, _ = await asyncio.gather(media.read(chunk_size), req)\n            else:\n                await self.upload.media.upload.post(command=\"APPEND\",\n                                                    media_id=media_id,\n                                                    media=chunk,\n                                                    segment_index=i)\n\n                chunk = media.read(chunk_size)\n\n            i += 1\n\n        status = await self.upload.media.upload.post(command=\"FINALIZE\",\n                                                     media_id=media_id)\n\n        if 'processing_info' in status:\n            while status['processing_info'].get('state') != \"succeeded\":\n                processing_info = status['processing_info']\n                if processing_info.get('state') == \"failed\":\n                    error = processing_info.get('error', {})\n\n                    message = error.get('message', str(status))\n\n                    raise exceptions.MediaProcessingError(data=status,\n                                                          message=message,\n                                                          **params)\n\n                delay = processing_info['check_after_secs']\n                await asyncio.sleep(delay)\n\n                status = await self.upload.media.upload.get(\n                    command=\"STATUS\",\n                    media_id=media_id,\n                    **params\n                )\n\n        return response", "response": "This method is used to upload a media in chunks and return a dict containing the response from the server."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\nasync def upload_media(self, file_,\n                           media_type=None,\n                           media_category=None,\n                           chunked=None,\n                           size_limit=None,\n                           **params):\n        \"\"\"\n            upload a media on twitter\n\n        Parameters\n        ----------\n        file_ : str or pathlib.Path or file\n            Path to the file or file object\n        media_type : str, optional\n            mime type of the media\n        media_category : str, optional\n            Twitter's media category of the media, must be used with\n            ``media_type``\n        chunked : bool, optional\n            If True, force the use of the chunked upload for the media\n        size_limit : int, optional\n            If set, the media will be sent using a multipart upload if\n            its size is over ``size_limit`` bytes\n        params : dict\n            parameters used when making the request\n\n        Returns\n        -------\n        .data_processing.PeonyResponse\n            Response of the request\n        \"\"\"\n        if isinstance(file_, str):\n            url = urlparse(file_)\n            if url.scheme.startswith('http'):\n                media = await self._session.get(file_)\n            else:\n                path = urlparse(file_).path.strip(\" \\\"'\")\n                media = await utils.execute(open(path, 'rb'))\n        elif hasattr(file_, 'read') or isinstance(file_, bytes):\n            media = file_\n        else:\n            raise TypeError(\"upload_media input must be a file object or a \"\n                            \"filename or binary data or an aiohttp request\")\n\n        media_size = await utils.get_size(media)\n        if chunked is not None:\n            size_test = False\n        else:\n            size_test = await self._size_test(media_size, 
size_limit)\n\n        if isinstance(media, aiohttp.ClientResponse):\n            # send the content of the response\n            media = media.content\n\n        if chunked or (size_test and chunked is None):\n            args = media, media_size, file_, media_type, media_category\n            response = await self._chunked_upload(*args, **params)\n        else:\n            response = await self.upload.media.upload.post(media=media,\n                                                           **params)\n\n        if not hasattr(file_, 'read') and not getattr(media, 'closed', True):\n            media.close()\n\n        return response", "response": "Uploads a media to twitter and returns a response."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef split_stdout_lines(stdout):\n    # all the NetMHC formats use lines full of dashes before any actual\n    # binding results\n    seen_dash = False\n    for l in stdout.split(\"\\n\"):\n        l = l.strip()\n        # wait for a line like '----------' before trying to parse entries\n        # have to include multiple dashes here since NetMHC 4.0 sometimes\n        # gives negative positions in its \"peptide\" input mode\n        if l.startswith(\"---\"):\n            seen_dash = True\n            continue\n        if not seen_dash:\n            continue\n        # ignore empty lines and comments\n        if not l or l.startswith(\"#\"):\n            continue\n        # beginning of headers in NetMHC\n        if any(l.startswith(word) for word in NETMHC_TOKENS):\n            continue\n        yield l.split()", "response": "Generator that skips everything before the first dashed line of NetMHC or NetMHCcons standard output, then yields each remaining data line as a list of whitespace-separated fields, ignoring blank lines, comments, dashed separators and header lines."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef clean_fields(fields, ignored_value_indices, transforms):\n    cleaned_fields = []\n    for i, field in enumerate(fields):\n        if field in ignored_value_indices:\n            ignored_index = ignored_value_indices[field]\n\n            # Is the value we want to ignore at the index where we'd ignore it?\n            if ignored_index == i:\n                continue\n\n        # transform this field if the index is in transforms, otherwise leave alone\n        cleaned_field = transforms[i](field) if i in transforms else field\n        cleaned_fields.append(cleaned_field)\n    return cleaned_fields", "response": "Drop any field whose value is listed in ignored_value_indices at exactly the index given there, and apply the transform registered for each remaining field's index, if any."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef parse_stdout(\n        stdout,\n        prediction_method_name,\n        sequence_key_mapping,\n        key_index,\n        offset_index,\n        peptide_index,\n        allele_index,\n        ic50_index,\n        rank_index,\n        log_ic50_index,\n        ignored_value_indices={},\n        transforms={}):\n    \"\"\"\n    Generic function for parsing any NetMHC* output, given expected indices\n    of values of interest.\n\n    Parameters\n    ----------\n    ignored_value_indices : dict\n        Map from values to the positions we'll ignore them at. See clean_fields.\n\n    transforms  : dict\n        Map from field index to a transform function to be applied to values in\n        that field. See clean_fields.\n\n    Returns BindingPredictionCollection\n    \"\"\"\n\n    binding_predictions = []\n    for fields in split_stdout_lines(stdout):\n        fields = clean_fields(fields, ignored_value_indices, transforms)\n\n        offset = int(fields[offset_index])\n        peptide = str(fields[peptide_index])\n        allele = str(fields[allele_index])\n        ic50 = float(fields[ic50_index])\n        rank = float(fields[rank_index]) if rank_index else 0.0\n        log_ic50 = float(fields[log_ic50_index])\n\n        key = str(fields[key_index])\n        if sequence_key_mapping:\n            original_key = sequence_key_mapping[key]\n        else:\n            # if sequence_key_mapping isn't provided then let's assume it's the\n            # identity function\n            original_key = key\n\n        binding_predictions.append(BindingPrediction(\n            source_sequence_name=original_key,\n            offset=offset,\n            peptide=peptide,\n            allele=normalize_allele_name(allele),\n            affinity=ic50,\n            percentile_rank=rank,\n            log_affinity=log_ic50,\n            prediction_method_name=prediction_method_name))\n    return 
binding_predictions", "response": "Parses the NetMHC output of a NetMHC binding prediction command."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef parse_netmhc3_stdout(\n        stdout,\n        prediction_method_name=\"netmhc3\",\n        sequence_key_mapping=None):\n    \"\"\"\n    Parse the output format for NetMHC 3.x, which looks like:\n\n    ----------------------------------------------------------------------------------------------------\n    pos    peptide      logscore affinity(nM) Bind Level    Protein Name     Allele\n    ----------------------------------------------------------------------------------------------------\n    0  SIINKFELL         0.437          441         WB              A1 HLA-A02:01\n    --------------------------------------------------------------------------------------------------\n    0  SIINKFFFQ         0.206         5411                         A2 HLA-A02:01\n    1  IINKFFFQQ         0.128        12544                         A2 HLA-A02:01\n    2  INKFFFQQQ         0.046        30406                         A2 HLA-A02:01\n    3  NKFFFQQQQ         0.050        29197                         A2 HLA-A02:01\n    --------------------------------------------------------------------------------------------------\n    \"\"\"\n    return parse_stdout(\n        stdout=stdout,\n        prediction_method_name=prediction_method_name,\n        sequence_key_mapping=sequence_key_mapping,\n        key_index=4,\n        offset_index=0,\n        peptide_index=1,\n        allele_index=5,\n        ic50_index=3,\n        rank_index=None,\n        log_ic50_index=2,\n        ignored_value_indices={\"WB\": 4, \"SB\": 4})", "response": "Parse the NetMHC 3. x output and return a list of the result."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nparsing the netmhc4 output of the next log entry.", "response": "def parse_netmhc4_stdout(\n        stdout,\n        prediction_method_name=\"netmhc4\",\n        sequence_key_mapping=None):\n    \"\"\"\n    # Peptide length 9\n    # Rank Threshold for Strong binding peptides   0.500\n    # Rank Threshold for Weak binding peptides   2.000\n    -----------------------------------------------------------------------------------\n      pos          HLA      peptide         Core Offset  I_pos  I_len  D_pos  D_len        iCore        Identity 1-log50k(aff) Affinity(nM)    %Rank  BindLevel\n    -----------------------------------------------------------------------------------\n        0    HLA-A0201    TMDKSELVQ    TMDKSELVQ      0      0      0      0      0    TMDKSELVQ 143B_BOVIN_P293         0.051     28676.59    43.00\n        1    HLA-A0201    MDKSELVQK    MDKSELVQK      0      0      0      0      0    MDKSELVQK 143B_BOVIN_P293         0.030     36155.15    70.00\n        2    HLA-A0201    DKSELVQKA    DKSELVQKA      0      0      0      0      0    DKSELVQKA 143B_BOVIN_P293         0.030     36188.42    70.00\n        3    HLA-A0201    KSELVQKAK    KSELVQKAK      0      0      0      0      0    KSELVQKAK 143B_BOVIN_P293         0.032     35203.22    65.00\n        4    HLA-A0201    SELVQKAKL    SELVQKAKL      0      0      0      0      0    SELVQKAKL 143B_BOVIN_P293         0.031     35670.99    65.00\n        5    HLA-A0201    ELVQKAKLA    ELVQKAKLA      0      0      0      0      0    ELVQKAKLA 143B_BOVIN_P293         0.080     21113.07    29.00\n        6    HLA-A0201    LVQKAKLAE    LVQKAKLAE      0      0      0      0      0    LVQKAKLAE 143B_BOVIN_P293         0.027     37257.56    75.00\n        7    HLA-A0201    VQKAKLAEQ    VQKAKLAEQ      0      0      0      0      0    VQKAKLAEQ 143B_BOVIN_P293         0.040     32404.62    55.00\n      219    HLA-A0201    QLLRDNLTL    
QLLRDNLTL      0      0      0      0      0    QLLRDNLTL 143B_BOVIN_P293         0.527       167.10     1.50 <= WB\n    -----------------------------------------------------------------------------------\n    \"\"\"\n    return parse_stdout(\n        stdout=stdout,\n        prediction_method_name=prediction_method_name,\n        sequence_key_mapping=sequence_key_mapping,\n        key_index=10,\n        offset_index=0,\n        peptide_index=2,\n        allele_index=1,\n        ic50_index=12,\n        rank_index=13,\n        log_ic50_index=11)"}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nparse the output of netmhcpan - 2. 0. 0 to 2001.", "response": "def parse_netmhcpan28_stdout(\n        stdout,\n        prediction_method_name=\"netmhcpan\",\n        sequence_key_mapping=None):\n    \"\"\"\n    # Affinity Threshold for Strong binding peptides  50.000',\n    # Affinity Threshold for Weak binding peptides 500.000',\n    # Rank Threshold for Strong binding peptides   0.500',\n    # Rank Threshold for Weak binding peptides   2.000',\n    ----------------------------------------------------------------------------\n    pos  HLA  peptide  Identity 1-log50k(aff) Affinity(nM)    %Rank  BindLevel\n    ----------------------------------------------------------------------------\n      0  HLA-A*02:03    QQQQQYFPE   id0         0.024     38534.25   50.00\n      1  HLA-A*02:03    QQQQYFPEI   id0         0.278      2461.53   15.00\n      2  HLA-A*02:03    QQQYFPEIT   id0         0.078     21511.53   50.00\n      3  HLA-A*02:03    QQYFPEITH   id0         0.041     32176.84   50.00\n      4  HLA-A*02:03    QYFPEITHI   id0         0.085     19847.09   32.00\n      5  HLA-A*02:03    YFPEITHII   id0         0.231      4123.85   15.00\n      6  HLA-A*02:03    FPEITHIII   id0         0.060     26134.28   50.00\n      7  HLA-A*02:03    PEITHIIIA   id0         0.034     34524.63   50.00\n      8  HLA-A*02:03    EITHIIIAS   id0         0.076     21974.48   50.00\n      9  HLA-A*02:03    ITHIIIASS   id0         0.170      7934.26   32.00\n     10  HLA-A*02:03    THIIIASSS   id0         0.040     32361.18   50.00\n     11  HLA-A*02:03    HIIIASSSL   id0         0.515       189.74    4.00 <= WB\n    \"\"\"\n    check_stdout_error(stdout, \"NetMHCpan-2.8\")\n    return parse_stdout(\n        stdout=stdout,\n        prediction_method_name=prediction_method_name,\n        sequence_key_mapping=sequence_key_mapping,\n        key_index=3,\n        offset_index=0,\n        peptide_index=2,\n        
allele_index=1,\n        ic50_index=5,\n        rank_index=6,\n        log_ic50_index=4)"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef parse_netmhcpan3_stdout(\n        stdout,\n        prediction_method_name=\"netmhcpan\",\n        sequence_key_mapping=None):\n    \"\"\"\n    # Rank Threshold for Strong binding peptides   0.500\n    # Rank Threshold for Weak binding peptides   2.000\n    -----------------------------------------------------------------------------------\n    Pos          HLA         Peptide       Core Of Gp Gl Ip Il        Icore        Identity   Score Aff(nM)   %Rank  BindLevel\n    -----------------------------------------------------------------------------------\n    1  HLA-B*18:01        MFCQLAKT  MFCQLAKT-  0  0  0  8  1     MFCQLAKT     sequence0_0 0.02864 36676.0   45.00\n    2  HLA-B*18:01        FCQLAKTY  F-CQLAKTY  0  0  0  1  1     FCQLAKTY     sequence0_0 0.07993 21056.5   13.00\n    \"\"\"\n\n    # the offset specified in \"pos\" (at index 0) is 1-based instead of 0-based. we adjust it to be\n    # 0-based, as in all the other netmhc predictors supported by this library.\n    transforms = {\n        0: lambda x: int(x) - 1,\n    }\n    return parse_stdout(\n        stdout=stdout,\n        prediction_method_name=prediction_method_name,\n        sequence_key_mapping=sequence_key_mapping,\n        key_index=10,\n        offset_index=0,\n        peptide_index=2,\n        allele_index=1,\n        ic50_index=12,\n        rank_index=13,\n        log_ic50_index=11,\n        transforms=transforms)", "response": "Parse the netmhcpan3 output of the netmhc predictor and return a list of the netmhc predictor objects."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef parse_netmhcpan4_stdout(\n        stdout,\n        prediction_method_name=\"netmhcpan\",\n        sequence_key_mapping=None):\n    \"\"\"\n    # NetMHCpan version 4.0\n\n    # Tmpdir made /var/folders/jc/fyrvcrcs3sb8g4mkdg6nl_t80000gp/T//netMHCpanuH3SvY\n    # Input is in PEPTIDE format\n\n    # Make binding affinity predictions\n\n    HLA-A02:01 : Distance to training data  0.000 (using nearest neighbor HLA-A02:01)\n\n    # Rank Threshold for Strong binding peptides   0.500\n    # Rank Threshold for Weak binding peptides   2.000\n    -----------------------------------------------------------------------------------\n      Pos          HLA         Peptide       Core Of Gp Gl Ip Il        Icore        Identity     Score Aff(nM)   %Rank  BindLevel\n    -----------------------------------------------------------------------------------\n        1  HLA-A*02:01        SIINFEKL  SIINF-EKL  0  0  0  5  1     SIINFEKL         PEPLIST 0.1141340 14543.1 18.9860\n    -----------------------------------------------------------------------------------\n\n    Protein PEPLIST. Allele HLA-A*02:01. Number of high binders 0. Number of weak binders 0. Number of peptides 1\n    \"\"\"\n\n    # Output format is compatible with netmhcpan3, but netmhcpan 4.0 must be\n    # called with the -BA flag, so it gives affinity predictions, not mass-spec\n    # elution likelihoods.\n    return parse_netmhcpan3_stdout(\n        stdout=stdout,\n        prediction_method_name=prediction_method_name,\n        sequence_key_mapping=sequence_key_mapping)", "response": "Parse the output of netmhcpan4. 0 and netmhcpan3."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\ntake the binding predictions returned by IEDB s web API and parse them into a DataFrame", "response": "def _parse_iedb_response(response):\n    \"\"\"Take the binding predictions returned by IEDB's web API\n    and parse them into a DataFrame\n\n    Expect response to look like:\n    allele  seq_num start   end length  peptide ic50   percentile_rank\n    HLA-A*01:01 1   2   10  9   LYNTVATLY   2145.70 3.7\n    HLA-A*01:01 1   5   13  9   TVATLYCVH   2216.49 3.9\n    HLA-A*01:01 1   7   15  9   ATLYCVHQR   2635.42 5.1\n    HLA-A*01:01 1   4   12  9   NTVATLYCV   6829.04 20\n    HLA-A*01:01 1   1   9   9   SLYNTVATL   8032.38 24\n    HLA-A*01:01 1   8   16  9   TLYCVHQRI   8853.90 26\n    HLA-A*01:01 1   3   11  9   YNTVATLYC   9865.62 29\n    HLA-A*01:01 1   6   14  9   VATLYCVHQ   27575.71    58\n    HLA-A*01:01 1   10  18  9   YCVHQRIDV   48929.64    74\n    HLA-A*01:01 1   9   17  9   LYCVHQRID   50000.00    75\n    \"\"\"\n    if len(response) == 0:\n        raise ValueError(\"Empty response from IEDB!\")\n    df = pd.read_csv(io.BytesIO(response), delim_whitespace=True, header=0)\n\n    # pylint doesn't realize that df is a DataFrame, so tell is\n    assert type(df) == pd.DataFrame\n    df = pd.DataFrame(df)\n\n    if len(df) == 0:\n        raise ValueError(\n            \"No binding predictions in response from IEDB: %s\" % (response,))\n    required_columns = [\n        \"allele\",\n        \"peptide\",\n        \"ic50\",\n        \"start\",\n        \"end\",\n    ]\n    for column in required_columns:\n        if column not in df.columns:\n            raise ValueError(\n                \"Response from IEDB is missing '%s' column: %s. 
Full \"\n                \"response:\\n%s\" % (\n                    column,\n                    df.ix[0],\n                    response))\n    # since IEDB has allowed multiple column names for percentile rank,\n    # we're defensively normalizing all of them to just 'rank'\n    df = df.rename(columns={\n        \"percentile_rank\": \"rank\",\n        \"percentile rank\": \"rank\"})\n    return df"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef _query_iedb(request_values, url):\n    data = urlencode(request_values)\n    req = Request(url, data.encode(\"ascii\"))\n    response = urlopen(req).read()\n    return _parse_iedb_response(response)", "response": "Query IEDB's web API for MHC binding predictions using a dictionary of request values and return the parsed response as a DataFrame."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef predict_subsequences(self, sequence_dict, peptide_lengths=None):\n        sequence_dict = check_sequence_dictionary(sequence_dict)\n        peptide_lengths = self._check_peptide_lengths(peptide_lengths)\n\n        # take each mutated sequence in the dataframe\n        # and general MHC binding scores for all k-mer substrings\n        binding_predictions = []\n        expected_peptides = set([])\n\n        normalized_alleles = []\n        for key, amino_acid_sequence in sequence_dict.items():\n            for l in peptide_lengths:\n                for i in range(len(amino_acid_sequence) - l + 1):\n                    expected_peptides.add(amino_acid_sequence[i:i + l])\n            self._check_peptide_inputs(expected_peptides)\n            for allele in self.alleles:\n                # IEDB MHCII predictor expects DRA1 to be omitted.\n                allele = normalize_allele_name(allele, omit_dra1=True)\n                normalized_alleles.append(allele)\n                request = self._get_iedb_request_params(\n                    amino_acid_sequence, allele)\n                logger.info(\n                    \"Calling IEDB (%s) with request %s\",\n                    self.url,\n                    request)\n                response_df = _query_iedb(request, self.url)\n                for _, row in response_df.iterrows():\n                    binding_predictions.append(\n                        BindingPrediction(\n                            source_sequence_name=key,\n                            offset=row['start'] - 1,\n                            allele=row['allele'],\n                            peptide=row['peptide'],\n                            affinity=row['ic50'],\n                            percentile_rank=row['rank'],\n                            prediction_method_name=\"iedb-\" + self.prediction_method))\n        self._check_results(\n          
  binding_predictions,\n            alleles=normalized_alleles,\n            peptides=expected_peptides)\n        return BindingPredictionCollection(binding_predictions)", "response": "Run MHC binding predictions on all substrings in a dictionary of amino acid sequences and return a BindingPredictionCollection."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef log_error(msg=None, exc_info=None, logger=None, **kwargs):\n    if logger is None:\n        logger = _logger\n\n    if not exc_info:\n        exc_info = sys.exc_info()\n\n    if msg is None:\n        msg = \"\"\n\n    exc_class, exc_msg, _ = exc_info\n\n    if all(info is not None for info in exc_info):\n        logger.error(msg, exc_info=exc_info)", "response": "Log an error message and its traceback on the logger defined by logger."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\ngetting all the media metadata and read any kind of file object", "response": "async def get_media_metadata(data, path=None):\n    \"\"\"\n        Get all the file's metadata and read any kind of file object\n\n    Parameters\n    ----------\n    data : bytes\n        first bytes of the file (the mimetype should be guessed from the\n        file headers)\n    path : str, optional\n        path to the file\n\n    Returns\n    -------\n    str\n        The mimetype of the media\n    str\n        The category of the media on Twitter\n    \"\"\"\n    if isinstance(data, bytes):\n        media_type = await get_type(data, path)\n\n    else:\n        raise TypeError(\"get_metadata input must be a bytes\")\n\n    media_category = get_category(media_type)\n\n    _logger.info(\"media_type: %s, media_category: %s\" % (media_type,\n                                                         media_category))\n\n    return media_type, media_category"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\ngetting the size of a file object.", "response": "async def get_size(media):\n    \"\"\"\n        Get the size of a file\n\n    Parameters\n    ----------\n    media : file object\n        The file object of the media\n\n    Returns\n    -------\n    int\n        The size of the file\n    \"\"\"\n    if hasattr(media, 'seek'):\n        await execute(media.seek(0, os.SEEK_END))\n        size = await execute(media.tell())\n        await execute(media.seek(0))\n    elif hasattr(media, 'headers'):\n        size = int(media.headers['Content-Length'])\n    elif isinstance(media, bytes):\n        size = len(media)\n    else:\n        raise TypeError(\"Can't get size of media of type:\",\n                        type(media).__name__)\n\n    _logger.info(\"media size: %dB\" % size)\n    return size"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nreturning the mimetype of the media.", "response": "async def get_type(media, path=None):\n    \"\"\"\n    Parameters\n    ----------\n    media : file object\n        A file object of the image\n    path : str, optional\n        The path to the file\n\n    Returns\n    -------\n    str\n        The mimetype of the media\n    str\n        The category of the media on Twitter\n    \"\"\"\n    if magic:\n        if not media:\n            raise TypeError(\"Media data is empty\")\n\n        _logger.debug(\"guessing mimetype using magic\")\n        media_type = mime.from_buffer(media[:1024])\n    else:\n        media_type = None\n        if path:\n            _logger.debug(\"guessing mimetype using built-in module\")\n            media_type = mime.guess_type(path)[0]\n\n        if media_type is None:\n            msg = (\"Could not guess the mimetype of the media.\\n\"\n                   \"Please consider installing python-magic\\n\"\n                   \"(pip3 install peony-twitter[magic])\")\n            raise RuntimeError(msg)\n\n    return media_type"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nactivate error messages useful during development", "response": "def set_debug():\n    \"\"\" activates error messages, useful during development \"\"\"\n    logging.basicConfig(level=logging.WARNING)\n    peony.logger.setLevel(logging.DEBUG)"}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nreturn a new BindingPrediction with updated fields", "response": "def clone_with_updates(self, **kwargs):\n        \"\"\"Returns new BindingPrediction with updated fields\"\"\"\n        fields_dict = self.to_dict()\n        fields_dict.update(kwargs)\n        return BindingPrediction(**fields_dict)"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef get_data(self, response):\n        if self._response_list:\n            return response\n        elif self._response_key is None:\n            if hasattr(response, \"items\"):\n                for key, data in response.items():\n                    if (hasattr(data, \"__getitem__\")\n                            and not hasattr(data, \"items\")\n                            and len(data) > 0\n                            and 'id' in data[0]):\n                        self._response_key = key\n                        return data\n            else:\n                self._response_list = True\n                return response\n        else:\n            return response[self._response_key]\n\n        raise NoDataFound(response=response, url=self.request.get_url())", "response": "Extract the list of items from a response. If the response is a dict, find and cache the key that holds a non-empty list of objects with an 'id'; if it is already a list, return it as-is. Raise NoDataFound when no such list is found."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\ntry to fill the gaps and strip the last tweet from the response if its id is that of the first tweet of the last response", "response": "async def call_on_response(self, data):\n        \"\"\"\n        Try to fill the gaps and strip last tweet from the response\n        if its id is that of the first tweet of the last response\n\n        Parameters\n        ----------\n        data : list\n            The response data\n        \"\"\"\n        since_id = self.kwargs.get(self.param, 0) + 1\n\n        if self.fill_gaps:\n            if data[-1]['id'] != since_id:\n                max_id = data[-1]['id'] - 1\n                responses = with_max_id(self.request(**self.kwargs,\n                                                     max_id=max_id))\n\n                async for tweets in responses:\n                    data.extend(tweets)\n\n            if data[-1]['id'] == self.last_id:\n                data = data[:-1]\n\n        if not data and not self.force:\n            raise StopAsyncIteration\n\n        await self.set_param(data)"}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nget a temporary oauth token", "response": "async def get_oauth_token(consumer_key, consumer_secret, callback_uri=\"oob\"):\n    \"\"\"\n    Get a temporary oauth token\n\n    Parameters\n    ----------\n    consumer_key : str\n        Your consumer key\n    consumer_secret : str\n        Your consumer secret\n    callback_uri : str, optional\n        Callback uri, defaults to 'oob'\n\n    Returns\n    -------\n    dict\n        Temporary tokens\n    \"\"\"\n\n    client = BasePeonyClient(consumer_key=consumer_key,\n                             consumer_secret=consumer_secret,\n                             api_version=\"\",\n                             suffix=\"\")\n\n    response = await client.api.oauth.request_token.post(\n        _suffix=\"\",\n        oauth_callback=callback_uri\n    )\n\n    return parse_token(response)"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\nasync def get_oauth_verifier(oauth_token):\n    url = \"https://api.twitter.com/oauth/authorize?oauth_token=\" + oauth_token\n\n    try:\n        browser = webbrowser.open(url)\n        await asyncio.sleep(2)\n\n        if not browser:\n            raise RuntimeError\n    except RuntimeError:\n        print(\"could not open a browser\\ngo here to enter your PIN: \" + url)\n\n    verifier = input(\"\\nEnter your PIN: \")\n    return verifier", "response": "Get the oauth verifier from the user"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\nasync def get_access_token(consumer_key, consumer_secret,\n                           oauth_token, oauth_token_secret,\n                           oauth_verifier, **kwargs):\n    \"\"\"\n        get the access token of the user\n\n    Parameters\n    ----------\n    consumer_key : str\n        Your consumer key\n    consumer_secret : str\n        Your consumer secret\n    oauth_token : str\n        OAuth token from :func:`get_oauth_token`\n    oauth_token_secret : str\n        OAuth token secret from :func:`get_oauth_token`\n    oauth_verifier : str\n        OAuth verifier from :func:`get_oauth_verifier`\n\n    Returns\n    -------\n    dict\n        Access tokens\n    \"\"\"\n\n    client = BasePeonyClient(consumer_key=consumer_key,\n                             consumer_secret=consumer_secret,\n                             access_token=oauth_token,\n                             access_token_secret=oauth_token_secret,\n                             api_version=\"\",\n                             suffix=\"\")\n\n    response = await client.api.oauth.access_token.get(\n        _suffix=\"\",\n        oauth_verifier=oauth_verifier\n    )\n\n    return parse_token(response)", "response": "Get the access token of the user"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\nasync def async_oauth_dance(consumer_key, consumer_secret, callback_uri=\"oob\"):\n\n    token = await get_oauth_token(consumer_key, consumer_secret, callback_uri)\n\n    oauth_verifier = await get_oauth_verifier(token['oauth_token'])\n\n    token = await get_access_token(\n        consumer_key,\n        consumer_secret,\n        oauth_verifier=oauth_verifier,\n        **token\n    )\n\n    token = dict(\n        consumer_key=consumer_key,\n        consumer_secret=consumer_secret,\n        access_token=token['oauth_token'],\n        access_token_secret=token['oauth_token_secret']\n    )\n\n    return token", "response": "Run the full OAuth dance and return a dict with the consumer key, consumer secret, access token, and access token secret."}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nparsing the response containing the tokens", "response": "def parse_token(response):\n    \"\"\"\n    parse the responses containing the tokens\n\n    Parameters\n    ----------\n    response : str\n        The response containing the tokens\n\n    Returns\n    -------\n    dict\n        The parsed tokens\n    \"\"\"\n    items = response.split(\"&\")\n    items = [item.split(\"=\") for item in items]\n\n    return {key: value for key, value in items}"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef oauth_dance(consumer_key, consumer_secret,\n                oauth_callback=\"oob\", loop=None):\n    \"\"\"\n        OAuth dance to get the user's access token\n\n    It calls async_oauth_dance and gets an event loop if none is given\n\n    Parameters\n    ----------\n    consumer_key : str\n        Your consumer key\n    consumer_secret : str\n        Your consumer secret\n    oauth_callback : str\n        Callback uri, defaults to 'oob'\n    loop : event loop\n        asyncio event loop\n\n    Returns\n    -------\n    dict\n        Access tokens\n    \"\"\"\n    loop = asyncio.get_event_loop() if loop is None else loop\n\n    coro = async_oauth_dance(consumer_key, consumer_secret, oauth_callback)\n    return loop.run_until_complete(coro)", "response": "Perform the OAuth dance to get the user's access tokens. It runs async_oauth_dance to completion on the given event loop, or on the current event loop if none is given."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\nreturn the token for the current user using OAuth2.", "response": "def oauth2_dance(consumer_key, consumer_secret, loop=None):\n    \"\"\"\n        oauth2 dance\n\n    Parameters\n    ----------\n    consumer_key : str\n        Your consumer key\n    consumer_secret : str\n        Your consumer secret\n    loop : event loop, optional\n        event loop to use\n\n    Returns\n    -------\n    str\n        Bearer token\n    \"\"\"\n    loop = asyncio.get_event_loop() if loop is None else loop\n    client = BasePeonyClient(consumer_key=consumer_key,\n                             consumer_secret=consumer_secret,\n                             auth=oauth.OAuth2Headers)\n\n    loop.run_until_complete(client.headers.sign())\n    return client.headers.token"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef predict(self, sequences):\n        with tempfile.NamedTemporaryFile(suffix=\".fsa\", mode=\"w\") as input_fd:\n            for (i, sequence) in enumerate(sequences):\n                input_fd.write(\"> %d\\n\" % i)\n                input_fd.write(sequence)\n                input_fd.write(\"\\n\")\n            input_fd.flush()\n            try:\n                output = subprocess.check_output([\"netChop\", input_fd.name])\n            except subprocess.CalledProcessError as e:\n                logging.error(\"Error calling netChop: %s:\\n%s\" % (e, e.output))\n                raise\n\n        parsed = self.parse_netchop(output)\n        assert len(parsed) == len(sequences), \\\n            \"Expected %d results but got %d\" % (\n                len(sequences), len(parsed))\n        assert [len(x) for x in parsed] == [len(x) for x in sequences]\n        return parsed", "response": "Predict cleavage for each position in each sequence."}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nconvert a collection of BindingPrediction objects to a Pandas DataFrame.", "response": "def to_dataframe(\n            self,\n            columns=BindingPrediction.fields + (\"length\",)):\n        \"\"\"\n        Converts collection of BindingPrediction objects to DataFrame\n        \"\"\"\n        return pd.DataFrame.from_records(\n            [tuple([getattr(x, name) for name in columns]) for x in self],\n            columns=columns)"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef NetMHC(alleles,\n           default_peptide_lengths=[9],\n           program_name=\"netMHC\"):\n    \"\"\"\n    This function wraps NetMHC3 and NetMHC4 to automatically detect which class\n    to use. Currently based on running the '-h' command and looking for\n    discriminating substrings between the versions.\n    \"\"\"\n    # run NetMHC's help command and parse discriminating substrings out of\n    # the resulting str output\n    with open(os.devnull, 'w') as devnull:\n        help_output = check_output([program_name, \"-h\"], stderr=devnull)\n    help_output_str = help_output.decode(\"ascii\", \"ignore\")\n\n    substring_to_netmhc_class = {\n        \"-listMHC\": NetMHC4,\n        \"--Alleles\": NetMHC3,\n    }\n\n    successes = []\n\n    for substring, netmhc_class in substring_to_netmhc_class.items():\n        if substring in help_output_str:\n            successes.append(netmhc_class)\n\n    if len(successes) > 1:\n        raise SystemError(\"Command %s is valid for multiple NetMHC versions. \"\n                          \"This is likely an mhctools bug.\" % program_name)\n    if len(successes) == 0:\n        raise SystemError(\"Command %s is not a valid way of calling any NetMHC software.\"\n                          % program_name)\n\n    netmhc_class = successes[0]\n    return netmhc_class(\n        alleles=alleles,\n        default_peptide_lengths=default_peptide_lengths,\n        program_name=program_name)", "response": "This function will automatically detect the class of the NetMHC version."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef predict_peptides(self, peptides):\n\n        # importing locally to avoid slowing down CLI applications which\n        # don't use MHCflurry\n        from mhcflurry.encodable_sequences import EncodableSequences\n\n        binding_predictions = []\n        encodable_sequences = EncodableSequences.create(peptides)\n        for allele in self.alleles:\n            predictions_df = self.predictor.predict_to_dataframe(\n                encodable_sequences, allele=allele)\n            for (_, row) in predictions_df.iterrows():\n                binding_prediction = BindingPrediction(\n                    allele=allele,\n                    peptide=row.peptide,\n                    affinity=row.prediction,\n                    percentile_rank=(\n                        row.prediction_percentile\n                        if 'prediction_percentile' in row else nan),\n                    prediction_method_name=\"mhcflurry\"\n                )\n                binding_predictions.append(binding_prediction)\n        return BindingPredictionCollection(binding_predictions)", "response": "Predict the binding affinity of every peptide for each of the predictor's alleles using MHCflurry and return a BindingPredictionCollection."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef seq_to_str(obj, sep=\",\"):\n    if isinstance(obj, string_classes):\n        return obj\n    elif isinstance(obj, (list, tuple)):\n        return sep.join([str(x) for x in obj])\n    else:\n        return str(obj)", "response": "Convert a list or tuple to a string joined by sep (a comma by default). Strings are returned unchanged and any other object is passed through str()."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef convert(img, formats):\n    media = None\n    min_size = 0\n\n    for kwargs in formats:\n        f = io.BytesIO()\n        if img.mode == \"RGBA\" and kwargs['format'] != \"PNG\":\n            # convert to RGB if picture is too large as a png\n            # this implies that the png format is the first in `formats`\n            if min_size < 5 * 1024**2:\n                continue\n            else:\n                # Image.convert returns a new image, so keep the result\n                img = img.convert('RGB')\n\n        img.save(f, **kwargs)\n        size = f.tell()\n\n        if media is None or size < min_size:\n            if media is not None:\n                media.close()\n\n            media = f\n            min_size = size\n        else:\n            f.close()\n\n    return media", "response": "Save the image in each of the given formats and return the in-memory buffer holding the smallest result. RGBA images are only converted to RGB for non-PNG formats when the PNG version is 5 MiB or larger."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef optimize_media(file_, max_size, formats):\n    if not PIL:\n        msg = (\"Pillow must be installed to optimize a media\\n\"\n               \"$ pip3 install Pillow\")\n        raise RuntimeError(msg)\n\n    img = PIL.Image.open(file_)\n\n    # resize the picture (defaults to the 'large' photo size of Twitter\n    # in peony.PeonyClient.upload_media)\n    ratio = max(hw / max_hw for hw, max_hw in zip(img.size, max_size))\n\n    if ratio > 1:\n        size = tuple(int(hw // ratio) for hw in img.size)\n        img = img.resize(size, PIL.Image.ANTIALIAS)\n\n    media = convert(img, formats)\n\n    # do not close a file opened by the user\n    # only close if a filename was given\n    if not hasattr(file_, 'read'):\n        img.close()\n\n    return media", "response": "Resize an image to fit within max_size if needed, then return the smallest encoding produced from the given formats."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\ncreating one or more files containing one peptide per line and returns names of files.", "response": "def create_input_peptides_files(\n        peptides,\n        max_peptides_per_file=None,\n        group_by_length=False):\n    \"\"\"\n    Creates one or more files containing one peptide per line,\n    returns names of files.\n    \"\"\"\n    if group_by_length:\n        peptide_lengths = {len(p) for p in peptides}\n        peptide_groups = {l: [] for l in peptide_lengths}\n        for p in peptides:\n            peptide_groups[len(p)].append(p)\n    else:\n        peptide_groups = {\"\": peptides}\n\n    file_names = []\n    for key, group in peptide_groups.items():\n        n_peptides = len(group)\n        if not max_peptides_per_file:\n            max_peptides_per_file = n_peptides\n        input_file = None\n        for i, p in enumerate(group):\n            if i % max_peptides_per_file == 0:\n                if input_file is not None:\n                    file_names.append(input_file.name)\n                    input_file.close()\n                input_file = make_writable_tempfile(\n                    prefix_number=i // max_peptides_per_file,\n                    prefix_name=key,\n                    suffix=\".txt\")\n            input_file.write(\"%s\\n\" % p)\n        if input_file is not None:\n            file_names.append(input_file.name)\n            input_file.close()\n    return file_names"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef _check_peptide_lengths(self, peptide_lengths=None):\n        if not peptide_lengths:\n            peptide_lengths = self.default_peptide_lengths\n\n        if not peptide_lengths:\n            raise ValueError(\n                (\"Must either provide 'peptide_lengths' argument \"\n                \"or set 'default_peptide_lengths\"))\n        if isinstance(peptide_lengths, int):\n            peptide_lengths = [peptide_lengths]\n        require_iterable_of(peptide_lengths, int)\n        for peptide_length in peptide_lengths:\n            if (self.min_peptide_length is not None and\n                    peptide_length < self.min_peptide_length):\n                raise ValueError(\n                    \"Invalid peptide length %d, shorter than min %d\" % (\n                        peptide_length,\n                        self.min_peptide_length))\n            elif (self.max_peptide_length is not None and\n                    peptide_length > self.max_peptide_length):\n                raise ValueError(\n                    \"Invalid peptide length %d, longer than max %d\" % (\n                        peptide_length,\n                        self.max_peptide_length))\n        return peptide_lengths", "response": "Checks that the peptide lengths are valid and returns the peptide lengths. If they are not valid raises an exception."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef _check_peptide_inputs(self, peptides):\n        require_iterable_of(peptides, string_types)\n        check_X = not self.allow_X_in_peptides\n        check_lower = not self.allow_lowercase_in_peptides\n        check_min_length = self.min_peptide_length is not None\n        min_length = self.min_peptide_length\n        check_max_length = self.max_peptide_length is not None\n        max_length = self.max_peptide_length\n        for p in peptides:\n            if not p.isalpha():\n                raise ValueError(\"Invalid characters in peptide '%s'\" % p)\n            elif check_X and \"X\" in p:\n                raise ValueError(\"Invalid character 'X' in peptide '%s'\" % p)\n            elif check_lower and not p.isupper():\n                raise ValueError(\"Invalid lowercase letters in peptide '%s'\" % p)\n            elif check_min_length and len(p) < min_length:\n                raise ValueError(\n                    \"Peptide '%s' too short (%d chars), must be at least %d\" % (\n                        p, len(p), min_length))\n            elif check_max_length and len(p) > max_length:\n                raise ValueError(\n                    \"Peptide '%s' too long (%d chars), must be at most %d\" % (\n                        p, len(p), max_length))", "response": "Checks that the peptide sequences are valid for this predictor."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef predict_subsequences(\n            self,\n            sequence_dict,\n            peptide_lengths=None):\n        \"\"\"\n        Given a dictionary mapping sequence names to amino acid strings,\n        and an optional list of peptide lengths, returns a\n        BindingPredictionCollection.\n        \"\"\"\n        if isinstance(sequence_dict, string_types):\n            sequence_dict = {\"seq\": sequence_dict}\n        elif isinstance(sequence_dict, (list, tuple)):\n            sequence_dict = {seq: seq for seq in sequence_dict}\n\n        peptide_lengths = self._check_peptide_lengths(peptide_lengths)\n\n        # convert long protein sequences to set of peptides and\n        # associated sequence name / offsets that each peptide may have come\n        # from\n        peptide_set = set([])\n        peptide_to_name_offset_pairs = defaultdict(list)\n\n        for name, sequence in sequence_dict.items():\n            for peptide_length in peptide_lengths:\n                for i in range(len(sequence) - peptide_length + 1):\n                    peptide = sequence[i:i + peptide_length]\n                    peptide_set.add(peptide)\n                    peptide_to_name_offset_pairs[peptide].append((name, i))\n        peptide_list = sorted(peptide_set)\n\n        binding_predictions = self.predict_peptides(peptide_list)\n\n        # create BindingPrediction objects with sequence name and offset\n        results = []\n        for binding_prediction in binding_predictions:\n            for name, offset in peptide_to_name_offset_pairs[\n                    binding_prediction.peptide]:\n                results.append(binding_prediction.clone_with_updates(\n                    source_sequence_name=name,\n                    offset=offset))\n        self._check_results(\n            results,\n            peptides=peptide_set,\n            alleles=self.alleles)\n        return BindingPredictionCollection(results)", "response": "Predict binding for every subsequence of the given lengths in the input sequences and return a BindingPredictionCollection annotated with each peptide's source sequence name and offset."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\ngive a list of HLA alleles and an optional list of valid_alleles HLA alleles return a set of alleles that we will pass into the MHC binding predictor.", "response": "def _check_hla_alleles(\n            alleles,\n            valid_alleles=None):\n        \"\"\"\n        Given a list of HLA alleles and an optional list of valid\n        HLA alleles, return a set of alleles that we will pass into\n        the MHC binding predictor.\n        \"\"\"\n        require_iterable_of(alleles, string_types, \"HLA alleles\")\n\n        # Don't run the MHC predictor twice for homozygous alleles,\n        # only run it for unique alleles\n        alleles = {\n            normalize_allele_name(allele.strip().upper())\n            for allele in alleles\n        }\n        if valid_alleles:\n            # For some reason netMHCpan drops the '*' in names, so\n            # 'HLA-A*03:01' becomes 'HLA-A03:01'\n            missing_alleles = [\n                allele\n                for allele in alleles\n                if allele not in valid_alleles\n            ]\n            if len(missing_alleles) > 0:\n                raise UnsupportedAllele(\n                    \"Unsupported HLA alleles: %s\" % missing_alleles)\n\n        return list(alleles)"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nconnecting to the stream returning a coroutine that will receive the response from the server.", "response": "async def _connect(self):\n        \"\"\"\n            Connect to the stream\n\n        Returns\n        -------\n        asyncio.coroutine\n            The streaming response\n        \"\"\"\n        logger.debug(\"connecting to the stream\")\n        await self.client.setup\n        if self.session is None:\n            self.session = self.client._session\n        kwargs = await self.client.headers.prepare_request(**self.kwargs)\n        request = self.client.error_handler(self.session.request)\n\n        return await request(timeout=0, **kwargs)"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\nasync def connect(self):\n        with async_timeout.timeout(self.timeout):\n            self.response = await self._connect()\n\n        if self.response.status in range(200, 300):\n            self._error_timeout = 0\n            self.state = NORMAL\n        elif self.response.status == 500:\n            self.state = DISCONNECTION\n        elif self.response.status in range(501, 600):\n            self.state = RECONNECTION\n        elif self.response.status in (420, 429):\n            self.state = ENHANCE_YOUR_CALM\n        else:\n            logger.debug(\"raising error during stream connection\")\n            raise await exceptions.throw(self.response,\n                                         loads=self.client._loads,\n                                         url=self.kwargs['url'])\n\n        logger.debug(\"stream state: %d\" % self.state)", "response": "Connect to the stream and set the stream state from the response status, raising an exception for any status that is not handled."}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\nrestarts the stream on error", "response": "async def init_restart(self, error=None):\n        \"\"\"\n            Restart the stream on error\n\n        Parameters\n        ----------\n        error : bool, optional\n            Whether to print the error or not\n        \"\"\"\n        if error:\n            utils.log_error(logger=logger)\n\n        if self.state == DISCONNECTION:\n            if self._error_timeout < MAX_DISCONNECTION_TIMEOUT:\n                self._error_timeout += DISCONNECTION_TIMEOUT\n\n            logger.info(\"The stream was disconnected, will reconnect in %ss\"\n                        % self._error_timeout)\n\n        elif self.state == RECONNECTION:\n            if self._error_timeout < RECONNECTION_TIMEOUT:\n                self._error_timeout = RECONNECTION_TIMEOUT\n            elif self._error_timeout < MAX_RECONNECTION_TIMEOUT:\n                self._error_timeout *= 2\n\n            logger.info(\"Could not connect to the stream, reconnection in %ss\"\n                        % self._error_timeout)\n\n        elif self.state == ENHANCE_YOUR_CALM:\n            if self._error_timeout < ENHANCE_YOUR_CALM_TIMEOUT:\n                self._error_timeout = ENHANCE_YOUR_CALM_TIMEOUT\n            else:\n                self._error_timeout *= 2\n\n            logger.warning(\"Enhance Your Calm response received from Twitter. \"\n                           \"If you didn't restart your program frenetically \"\n                           \"then there is probably something wrong with it. \"\n                           \"Make sure you are not opening too many connections\"\n                           \" to the endpoint you are currently using by \"\n                           \"checking out Twitter's Streaming API \"\n                           \"documentation: \"\n                           \"https://dev.twitter.com/streaming/overview\\n\"\n                           \"The stream will restart in %ss.\"\n                           % self._error_timeout)\n        elif self.state == EOF:\n            pass  # no timeout\n        else:\n            raise RuntimeError(\"Incorrect state: %d\" % self.state)\n\n        self._reconnecting = True\n        return {'reconnecting_in': self._error_timeout, 'error': error}"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\nasync def restart_stream(self):\n        await self.response.release()\n        await asyncio.sleep(self._error_timeout)\n        await self.connect()\n\n        logger.info(\"Reconnected to the stream\")\n        self._reconnecting = False\n        return {'stream_restart': True}", "response": "Release the current response, wait for the error timeout, then reconnect to the stream."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\nreturn a decorator that can be used to handle commands with prefixes", "response": "def with_prefix(self, prefix, strict=False):\n        \"\"\"\n            decorator to handle commands with prefixes\n\n        Parameters\n        ----------\n        prefix : str\n            the prefix of the command\n        strict : bool, optional\n            If set to True the command must be at the beginning\n            of the message. Defaults to False.\n\n        Returns\n        -------\n        function\n            a decorator that returns an :class:`EventHandler` instance\n        \"\"\"\n\n        def decorated(func):\n            return EventHandler(func=func, event=self.event,\n                                prefix=prefix, strict=strict)\n\n        return decorated"}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nreturning an Event that can be used for site streams", "response": "def envelope(self):\n        \"\"\" returns an :class:`Event` that can be used for site streams \"\"\"\n\n        def enveloped_event(data):\n            return 'for_user' in data and self._func(data.get('message'))\n\n        return self.__class__(enveloped_event, self.__name__)"}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nset the environment timezone to the timezone set in your twitter settings", "response": "async def set_tz(self):\n        \"\"\"\n            set the environment timezone to the timezone\n            set in your twitter settings\n        \"\"\"\n        settings = await self.api.account.settings.get()\n\n        tz = settings.time_zone.tzinfo_name\n\n        os.environ['TZ'] = tz\n        time.tzset()"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\nexecuting a command in a node and show timing info.", "response": "def run_command(args, **kwargs):\n    \"\"\"\n    Given a list whose first element is a command name, followed by arguments,\n    execute it and show timing info.\n    \"\"\"\n    assert len(args) > 0\n    start_time = time.time()\n    process = AsyncProcess(args, **kwargs)\n    process.wait()\n    elapsed_time = time.time() - start_time\n    logger.info(\"%s took %0.4f seconds\", args[0], elapsed_time)"}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nrun multiple shell commands and writes each of their output to files associated with each command.", "response": "def run_multiple_commands_redirect_stdout(\n        multiple_args_dict,\n        print_commands=True,\n        process_limit=-1,\n        polling_freq=0.5,\n        **kwargs):\n    \"\"\"\n    Run multiple shell commands in parallel, write each of their\n    stdout output to files associated with each command.\n\n    Parameters\n    ----------\n    multiple_args_dict : dict\n        A dictionary whose keys are files and values are args list.\n        Run each args list as a subprocess and write stdout to the\n        corresponding file.\n\n    print_commands : bool\n        Print shell commands before running them.\n\n    process_limit : int\n        Limit the number of concurrent processes to this number. 0\n        if there is no limit, -1 to use max number of processors\n\n    polling_freq : int\n        Number of seconds between checking for done processes, if\n        we have a process limit\n    \"\"\"\n    assert len(multiple_args_dict) > 0\n    assert all(len(args) > 0 for args in multiple_args_dict.values())\n    assert all(hasattr(f, 'name') for f in multiple_args_dict.keys())\n    if process_limit < 0:\n        logger.debug(\"Using %d processes\" % cpu_count())\n        process_limit = cpu_count()\n\n    start_time = time.time()\n    processes = Queue(maxsize=process_limit)\n\n    def add_to_queue(process):\n        process.start()\n        if print_commands:\n            handler = logging.FileHandler(process.redirect_stdout_file.name)\n            handler.setLevel(logging.DEBUG)\n            logger.addHandler(handler)\n            logger.debug(\" \".join(process.args))\n            logger.removeHandler(handler)\n        processes.put(process)\n\n    for f, args in multiple_args_dict.items():\n        p = AsyncProcess(\n                args,\n                redirect_stdout_file=f,\n                **kwargs)\n        if not processes.full():\n            add_to_queue(p)\n        else:\n            while processes.full():\n                # Are there any done processes?\n                to_remove = []\n                for possibly_done in processes.queue:\n                    if possibly_done.poll() is not None:\n                        possibly_done.wait()\n                        to_remove.append(possibly_done)\n                # Remove them from the queue and stop checking\n                if to_remove:\n                    for process_to_remove in to_remove:\n                        processes.queue.remove(process_to_remove)\n                    break\n                # Check again in a second if there weren't\n                time.sleep(polling_freq)\n            add_to_queue(p)\n\n    # Wait for all the rest of the processes\n    while not processes.empty():\n        processes.get().wait()\n\n    elapsed_time = time.time() - start_time\n    logger.info(\n        \"Ran %d commands in %0.4f seconds\",\n        len(multiple_args_dict),\n        elapsed_time)"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef _determine_supported_alleles(command, supported_allele_flag):\n        try:\n            # convert to str since Python3 returns a `bytes` object\n            supported_alleles_output = check_output([\n                command, supported_allele_flag\n            ])\n            supported_alleles_str = supported_alleles_output.decode(\"ascii\", \"ignore\")\n            assert len(supported_alleles_str) > 0, \\\n                '%s returned empty allele list' % command\n            supported_alleles = set([])\n            for line in supported_alleles_str.split(\"\\n\"):\n                line = line.strip()\n                if not line.startswith('#') and len(line) > 0:\n                    try:\n                        # We need to normalize these alleles (the output of the predictor\n                        # when it lists its supported alleles) so that they are comparable with\n                        # our own alleles.\n                        supported_alleles.add(normalize_allele_name(line))\n                    except AlleleParseError as error:\n                        logger.info(\"Skipping allele %s: %s\", line, error)\n                        continue\n            if len(supported_alleles) == 0:\n                raise ValueError(\"Unable to determine supported alleles\")\n            return supported_alleles\n        except Exception as e:\n            logger.exception(e)\n            raise SystemError(\"Failed to run %s %s. Possibly an incorrect executable version?\" % (\n                command,\n                supported_allele_flag))", "response": "Try asking the commandline predictor to determine which alleles it supports."}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\nreads the data of the response.", "response": "async def read(response, loads=loads, encoding=None):\n    \"\"\"\n        read the data of the response\n\n    Parameters\n    ----------\n    response : aiohttp.ClientResponse\n        response\n    loads : callable\n        json loads function\n    encoding : :obj:`str`, optional\n        character encoding of the response, if set to None\n        aiohttp should guess the right encoding\n\n    Returns\n    -------\n    :obj:`bytes`, :obj:`str`, :obj:`dict` or :obj:`list`\n        the data returned depends on the response\n    \"\"\"\n    ctype = response.headers.get('Content-Type', \"\").lower()\n\n    try:\n        if \"application/json\" in ctype:\n            logger.info(\"decoding data as json\")\n            return await response.json(encoding=encoding, loads=loads)\n\n        if \"text\" in ctype:\n            logger.info(\"decoding data as text\")\n            return await response.text(encoding=encoding)\n\n    except (UnicodeDecodeError, json.JSONDecodeError) as exc:\n        data = await response.read()\n        raise exceptions.PeonyDecodeError(response=response,\n                                          data=data,\n                                          exception=exc)\n\n    return await response.read()"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef doc(func):\n    stripped_chars = \" \\t\"\n\n    if hasattr(func, '__doc__'):\n        docstring = func.__doc__.lstrip(\" \\n\\t\")\n        if \"\\n\" in docstring:\n            i = docstring.index(\"\\n\")\n            return docstring[:i].rstrip(stripped_chars)\n        elif docstring:\n            return docstring.rstrip(stripped_chars)\n\n    return \"\"", "response": "Return the first line of the function's docstring, stripped of surrounding whitespace, or an empty string if the docstring is empty."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef permission_check(data, command_permissions,\n                     command=None, permissions=None):\n    \"\"\"\n        Check the permissions of the user requesting a command\n\n    Parameters\n    ----------\n    data : dict\n        message data\n    command_permissions : dict\n        permissions of the command, contains all the roles as key and users\n        with these permissions as values\n    command : function\n        the command that is run\n    permissions : tuple or list\n        a list of permissions for the command\n\n    Returns\n    -------\n    bool\n        True if the user has the right permissions, False otherwise\n    \"\"\"\n    if permissions:\n        pass\n    elif command:\n        if hasattr(command, 'permissions'):\n            permissions = command.permissions\n        else:\n            return True  # true if no permission is required\n    else:\n        msg = \"{name} must be called with command or permissions argument\"\n        raise RuntimeError(msg.format(name=\"_permission_check\"))\n\n    return any(data['sender']['id'] in command_permissions[permission]\n               for permission in permissions\n               if permission in command_permissions)", "response": "Checks if the user has the right permissions for a command"}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nscripts to make pMHC binding predictions from amino acid sequences.", "response": "def main(args_list=None):\n    \"\"\"\n    Script to make pMHC binding predictions from amino acid sequences.\n\n    Usage example:\n        mhctools\n            --sequence SFFPIQQQQQAAALLLI \\\n            --sequence SILQQQAQAQQAQAASSSC \\\n            --extract-subsequences \\\n            --mhc-predictor netmhc \\\n            --mhc-alleles HLA-A0201 H2-Db \\\n            --mhc-predictor netmhc \\\n            --output-csv epitope.csv\n    \"\"\"\n    args = parse_args(args_list)\n    binding_predictions = run_predictor(args)\n    df = binding_predictions.to_dataframe()\n    logger.info('\\n%s', df)\n    if args.output_csv:\n        df.to_csv(args.output_csv, index=False)\n        print(\"Wrote: %s\" % args.output_csv)"}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nprepares the DRB allele name.", "response": "def _prepare_drb_allele_name(self, parsed_beta_allele):\n        \"\"\"\n        Assume that we're dealing with a human DRB allele\n        which NetMHCIIpan treats differently because there is\n        little population diversity in the DR-alpha gene\n        \"\"\"\n        if \"DRB\" not in parsed_beta_allele.gene:\n            raise ValueError(\"Unexpected allele %s\" % parsed_beta_allele)\n        return \"%s_%s%s\" % (\n            parsed_beta_allele.gene,\n            parsed_beta_allele.allele_family,\n            parsed_beta_allele.allele_code)"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef prepare_allele_name(self, allele_name):\n        parsed_alleles = parse_classi_or_classii_allele_name(allele_name)\n        if len(parsed_alleles) == 1:\n            allele = parsed_alleles[0]\n            if allele.species == \"H-2\":\n                return \"%s-%s%s\" % (\n                    allele.species,\n                    allele.gene,\n                    allele.allele_code)\n            return self._prepare_drb_allele_name(allele)\n\n        else:\n            alpha, beta = parsed_alleles\n            if \"DRA\" in alpha.gene:\n                return self._prepare_drb_allele_name(beta)\n            return \"HLA-%s%s%s-%s%s%s\" % (\n                alpha.gene,\n                alpha.allele_family,\n                alpha.allele_code,\n                beta.gene,\n                beta.allele_family,\n                beta.allele_code)", "response": "Convert a class I or class II allele name into the string format expected by the predictor, handling mouse H-2 alleles, single DRB alleles and alpha/beta pairs."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef get_error(data):\n    if isinstance(data, dict):\n        if 'errors' in data:\n            error = data['errors'][0]\n        else:\n            error = data.get('error', None)\n\n        if isinstance(error, dict):\n            if error.get('code') in errors:\n                return error", "response": "Return the error dict found in the response data if its code is a known error code, otherwise None."}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nget the response data if possible and raise an exception", "response": "async def throw(response, loads=None, encoding=None, **kwargs):\n    \"\"\" Get the response data if possible and raise an exception \"\"\"\n    if loads is None:\n        loads = data_processing.loads\n\n    data = await data_processing.read(response, loads=loads,\n                                      encoding=encoding)\n\n    error = get_error(data)\n    if error is not None:\n        exception = errors[error['code']]\n        raise exception(response=response, error=error, data=data, **kwargs)\n\n    if response.status in statuses:\n        exception = statuses[response.status]\n        raise exception(response=response, data=data, **kwargs)\n\n    # raise PeonyException if no specific exception was found\n    raise PeonyException(response=response, data=data, **kwargs)"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef code(self, code):\n        def decorator(exception):\n            self[code] = exception\n            return exception\n\n        return decorator", "response": "Decorator to associate a code to an exception"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\nasync def prepare_request(self, method, url,\n                              headers=None,\n                              skip_params=False,\n                              proxy=None,\n                              **kwargs):\n        \"\"\"\n        prepare all the arguments for the request\n\n        Parameters\n        ----------\n        method : str\n            HTTP method used by the request\n        url : str\n            The url to request\n        headers : dict, optional\n            Additional headers\n        proxy : str\n            proxy of the request\n        skip_params : bool\n            Don't use the parameters to sign the request\n\n        Returns\n        -------\n        dict\n            Parameters of the request correctly formatted\n        \"\"\"\n\n        if method.lower() == \"post\":\n            key = 'data'\n        else:\n            key = 'params'\n\n        if key in kwargs and not skip_params:\n            request_params = {key: kwargs.pop(key)}\n        else:\n            request_params = {}\n\n        request_params.update(dict(method=method.upper(), url=url))\n\n        coro = self.sign(**request_params, skip_params=skip_params,\n                         headers=headers)\n        request_params['headers'] = await utils.execute(coro)\n        request_params['proxy'] = proxy\n\n        kwargs.update(request_params)\n\n        return kwargs", "response": "Build the keyword arguments for a request: the upper-cased method, the URL, signed headers and the proxy, merged with any extra keyword arguments."}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\nmakes sure the user doesn t override the Authorization header", "response": "def _user_headers(self, headers=None):\n        \"\"\" Make sure the user doesn't override the Authorization header \"\"\"\n        h = self.copy()\n\n        if headers is not None:\n            keys = set(headers.keys())\n            if h.get('Authorization', False):\n                keys -= {'Authorization'}\n\n            for key in keys:\n                h[key] = headers[key]\n\n        return h"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef process_keys(func):\n\n    @wraps(func)\n    def decorated(self, k, *args):\n        if not isinstance(k, str):\n            msg = \"%s: key must be a string\" % self.__class__.__name__\n            raise ValueError(msg)\n\n        if not k.startswith(self.prefix):\n            k = self.prefix + k\n\n        return func(self, k, *args)\n\n    return decorated", "response": "Decorator that raises ValueError for keys that are not strings and prepends the prefix to keys that do not already start with it."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nget the right function from the text", "response": "def _get(self, text):\n        \"\"\"\n            Analyze the text to get the right function\n\n        Parameters\n        ----------\n        text : str\n            The text that could call a function\n        \"\"\"\n        if self.strict:\n            match = self.prog.match(text)\n            if match:\n                cmd = match.group()\n                if cmd in self:\n                    return cmd\n        else:\n            words = self.prog.findall(text)\n            for word in words:\n                if word in self:\n                    return word"}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nruns the function you want", "response": "async def run(self, *args, data):\n        \"\"\" run the function you want \"\"\"\n        cmd = self._get(data.text)\n\n        try:\n            if cmd is not None:\n                command = self[cmd](*args, data=data)\n                return await peony.utils.execute(command)\n\n        except:\n            fmt = \"Error occurred while running function {cmd}:\"\n            peony.utils.log_error(fmt.format(cmd=cmd))"}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\ngive a radius and theta return the cartesian ( x y ) coordinates.", "response": "def get_cartesian(r, theta):\n    \"\"\"\n    Given a radius and theta, return the cartesian (x, y) coordinates.\n    \"\"\"\n    x = r*np.sin(theta)\n    y = r*np.cos(theta)\n\n    return x, y"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef initialize_major_angle(self):\n        num_groups = len(self.nodes.keys())\n        self.major_angle = 2 * np.pi / num_groups", "response": "Initializes the major angle of the cluster."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef initialize_minor_angle(self):\n        num_groups = len(self.nodes.keys())\n\n        self.minor_angle = 2 * np.pi / (6 * num_groups)", "response": "Initializes the minor angle of the cluster."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef plot_radius(self):\n        plot_rad = 0\n        for group, nodelist in self.nodes.items():\n            proposed_radius = len(nodelist) * self.scale\n            if proposed_radius > plot_rad:\n                plot_rad = proposed_radius\n        return plot_rad + self.internal_radius", "response": "Computes the plot radius: the node count of the largest group times the scale, plus the internal radius."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\ncheck whether there are any edges within a group.", "response": "def has_edge_within_group(self, group):\n        \"\"\"\n        Checks whether there are within-group edges or not.\n        \"\"\"\n        assert group in self.nodes.keys(),\\\n            \"{0} not one of the group of nodes\".format(group)\n        nodelist = self.nodes[group]\n        for n1, n2 in self.simplified_edges():\n            if n1 in nodelist and n2 in nodelist:\n                return True"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef plot_nodes(self, nodelist, theta, group):\n        for i, node in enumerate(nodelist):\n            r = self.internal_radius + i * self.scale\n            x, y = get_cartesian(r, theta)\n            circle = plt.Circle(xy=(x, y), radius=self.dot_radius,\n                                color=self.node_colormap[group], linewidth=0)\n            self.ax.add_patch(circle)", "response": "Plots nodes to screen."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\ncompute the theta along which a group s nodes are aligned.", "response": "def group_theta(self, group):\n        \"\"\"\n        Computes the theta along which a group's nodes are aligned.\n        \"\"\"\n        for i, g in enumerate(self.nodes.keys()):\n            if g == group:\n                break\n\n        return i * self.major_angle"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef add_axes_and_nodes(self):\n        for i, (group, nodelist) in enumerate(self.nodes.items()):\n            theta = self.group_theta(group)\n\n            if self.has_edge_within_group(group):\n                    theta = theta - self.minor_angle\n                    self.plot_nodes(nodelist, theta, group)\n\n                    theta = theta + 2 * self.minor_angle\n                    self.plot_nodes(nodelist, theta, group)\n\n            else:\n                self.plot_nodes(nodelist, theta, group)", "response": "Adds the axes and nodes that belong to each group."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef find_node_group_membership(self, node):\n        for group, nodelist in self.nodes.items():\n            if node in nodelist:\n                return group", "response": "Finds the group to which a node belongs."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\nreturn the index of the node in the sorted list.", "response": "def get_idx(self, node):\n        \"\"\"\n        Finds the index of the node in the sorted list.\n        \"\"\"\n        group = self.find_node_group_membership(node)\n        return self.nodes[group].index(node)"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\ncompute the radial position of the node.", "response": "def node_radius(self, node):\n        \"\"\"\n        Computes the radial position of the node.\n        \"\"\"\n        return self.get_idx(node) * self.scale + self.internal_radius"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nreturning the theta angle of the node.", "response": "def node_theta(self, node):\n        \"\"\"\n        Convenience function to find the node's theta angle.\n        \"\"\"\n        group = self.find_node_group_membership(node)\n        return self.group_theta(group)"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef draw_edge(self, n1, n2, d, group):\n        start_radius = self.node_radius(n1)\n        start_theta = self.node_theta(n1)\n\n        end_radius = self.node_radius(n2)\n        end_theta = self.node_theta(n2)\n\n        start_theta, end_theta = self.correct_angles(start_theta, end_theta)\n        start_theta, end_theta = self.adjust_angles(n1, start_theta, n2,\n                                                    end_theta)\n\n        middle1_radius = np.min([start_radius, end_radius])\n        middle2_radius = np.max([start_radius, end_radius])\n\n        if start_radius > end_radius:\n            middle1_radius, middle2_radius = middle2_radius, middle1_radius\n\n        middle1_theta = np.mean([start_theta, end_theta])\n        middle2_theta = np.mean([start_theta, end_theta])\n\n        startx, starty = get_cartesian(start_radius, start_theta)\n        middle1x, middle1y = get_cartesian(middle1_radius, middle1_theta)\n        middle2x, middle2y = get_cartesian(middle2_radius, middle2_theta)\n        # middlex, middley = get_cartesian(middle_radius, middle_theta)\n        endx, endy = get_cartesian(end_radius, end_theta)\n\n        verts = [(startx, starty),\n                 (middle1x, middle1y),\n                 (middle2x, middle2y),\n                 (endx, endy)]\n        codes = [Path.MOVETO, Path.CURVE4, Path.CURVE4, Path.CURVE4]\n\n        path = Path(verts, codes)\n        if self.edge_colormap is None:\n            edgecolor = 'black'\n        else:\n            edgecolor = self.edge_colormap[group]\n        patch = patches.PathPatch(path, lw=self.linewidth, facecolor='none',\n                                  edgecolor=edgecolor, alpha=0.3)\n        self.ax.add_patch(patch)", "response": "Draws the edge between two nodes."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\ndraw all of the edges in the graph.", "response": "def add_edges(self):\n        \"\"\"\n        Draws all of the edges in the graph.\n        \"\"\"\n        for group, edgelist in self.edges.items():\n            for (u, v, d) in edgelist:\n                self.draw_edge(u, v, d, group)"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef draw(self):\n        self.ax.set_xlim(-self.plot_radius(), self.plot_radius())\n        self.ax.set_ylim(-self.plot_radius(), self.plot_radius())\n\n        self.add_axes_and_nodes()\n        self.add_edges()\n\n        self.ax.axis('off')", "response": "This is the master function that draws everything."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef adjust_angles(self, start_node, start_angle, end_node, end_angle):\n        start_group = self.find_node_group_membership(start_node)\n        end_group = self.find_node_group_membership(end_node)\n\n        if start_group == 0 and end_group == len(self.nodes.keys())-1:\n            if self.has_edge_within_group(start_group):\n                start_angle = correct_negative_angle(start_angle -\n                                                     self.minor_angle)\n            if self.has_edge_within_group(end_group):\n                end_angle = correct_negative_angle(end_angle +\n                                                   self.minor_angle)\n\n        elif start_group == len(self.nodes.keys())-1 and end_group == 0:\n            if self.has_edge_within_group(start_group):\n                start_angle = correct_negative_angle(start_angle +\n                                                     self.minor_angle)\n            if self.has_edge_within_group(end_group):\n                end_angle = correct_negative_angle(end_angle -\n                                                   self.minor_angle)\n\n        elif start_group < end_group:\n            if self.has_edge_within_group(end_group):\n                end_angle = correct_negative_angle(end_angle -\n                                                   self.minor_angle)\n            if self.has_edge_within_group(start_group):\n                start_angle = correct_negative_angle(start_angle +\n                                                     self.minor_angle)\n\n        elif end_group < start_group:\n            if self.has_edge_within_group(start_group):\n                start_angle = correct_negative_angle(start_angle -\n                                                     self.minor_angle)\n            if self.has_edge_within_group(end_group):\n                end_angle = 
correct_negative_angle(end_angle +\n                                                   self.minor_angle)\n\n        return start_angle, end_angle", "response": "This function adjusts the start and end angles to correct for duplicated axes."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef correct_angles(self, start_angle, end_angle):\n        # Edges going the anti-clockwise direction involves angle = 0.\n        if start_angle == 0 and (end_angle - start_angle > np.pi):\n            start_angle = np.pi * 2\n        if end_angle == 0 and (end_angle - start_angle < -np.pi):\n            end_angle = np.pi * 2\n\n        # Case when start_angle == end_angle:\n        if start_angle == end_angle:\n            start_angle = start_angle - self.minor_angle\n            end_angle = end_angle + self.minor_angle\n\n        return start_angle, end_angle", "response": "Corrects the angles of anti-clockwise edges that involve angle 0, and separates the start and end angles by the minor angle when they are equal."}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nguessing an appropriate MODS XML genre type.", "response": "def mods_genre(self):\n\t\t\"\"\"\n\t\tGuesses an appropriate MODS XML genre type.\n\t\t\"\"\"\n\n\t\ttype2genre = {\n\t\t\t\t'conference': 'conference publication',\n\t\t\t\t'book chapter': 'bibliography',\n\t\t\t\t'unpublished': 'article'\n\t\t\t}\n\t\ttp = str(self.type).lower()\n\t\treturn type2genre.get(tp, tp)"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef _produce_author_lists(self):\n\n\t\t# post-process author names\n\t\tself.authors = self.authors.replace(', and ', ', ')\n\t\tself.authors = self.authors.replace(',and ', ', ')\n\t\tself.authors = self.authors.replace(' and ', ', ')\n\t\tself.authors = self.authors.replace(';', ',')\n\n\t\t# list of authors\n\t\tself.authors_list = [author.strip() for author in self.authors.split(',')]\n\n\t\t# simplified representation of author names\n\t\tself.authors_list_simple = []\n\n\t\t# author names represented as a tuple of given and family name\n\t\tself.authors_list_split = []\n\n\t\t# tests if title already ends with a punctuation mark\n\t\tself.title_ends_with_punct = self.title[-1] in ['.', '!', '?'] \\\n\t\t\tif len(self.title) > 0 else False\n\n\t\tsuffixes = ['I', 'II', 'III', 'IV', 'V', 'VI', 'VII', 'VIII', \"Jr.\", \"Sr.\"]\n\t\tprefixes = ['Dr.']\n\t\tprepositions = ['van', 'von', 'der', 'de', 'den']\n\n\t\t# further post-process author names\n\t\tfor i, author in enumerate(self.authors_list):\n\t\t\tif author == '':\n\t\t\t\tcontinue\n\n\t\t\tnames = author.split(' ')\n\n\t\t\t# check if last string contains initials\n\t\t\tif (len(names[-1]) <= 3) \\\n\t\t\t\tand names[-1] not in suffixes \\\n\t\t\t\tand all(c in ascii_uppercase for c in names[-1]):\n\t\t\t\t# turn \"Gauss CF\" into \"C. F. Gauss\"\n\t\t\t\tnames = [c + '.' 
for c in names[-1]] + names[:-1]\n\n\t\t\t# number of suffixes\n\t\t\tnum_suffixes = 0\n\t\t\tfor name in names[::-1]:\n\t\t\t\tif name in suffixes:\n\t\t\t\t\tnum_suffixes += 1\n\t\t\t\telse:\n\t\t\t\t\tbreak\n\n\t\t\t# abbreviate names\n\t\t\tfor j, name in enumerate(names[:-1 - num_suffixes]):\n\t\t\t\t# don't try to abbreviate these\n\t\t\t\tif j == 0 and name in prefixes:\n\t\t\t\t\tcontinue\n\t\t\t\tif j > 0 and name in prepositions:\n\t\t\t\t\tcontinue\n\n\t\t\t\tif (len(name) > 2) or (len(name) and (name[-1] != '.')):\n\t\t\t\t\tk = name.find('-')\n\t\t\t\t\tif 0 < k + 1 < len(name):\n\t\t\t\t\t\t# take care of dash\n\t\t\t\t\t\tnames[j] = name[0] + '.-' + name[k + 1] + '.'\n\t\t\t\t\telse:\n\t\t\t\t\t\tnames[j] = name[0] + '.'\n\n\t\t\tif len(names):\n\t\t\t\tself.authors_list[i] = ' '.join(names)\n\n\t\t\t\t# create simplified/normalized representation of author name\n\t\t\t\tif len(names) > 1:\n\t\t\t\t\tfor name in names[0].split('-'):\n\t\t\t\t\t\tname_simple = self.simplify_name(' '.join([name, names[-1]]))\n\t\t\t\t\t\tself.authors_list_simple.append(name_simple)\n\t\t\t\telse:\n\t\t\t\t\tself.authors_list_simple.append(self.simplify_name(names[0]))\n\n\t\t\t\t# number of prepositions\n\t\t\t\tnum_prepositions = 0\n\t\t\t\tfor name in names:\n\t\t\t\t\tif name in prepositions:\n\t\t\t\t\t\tnum_prepositions += 1\n\n\t\t\t\t# splitting point\n\t\t\t\tsp = 1 + num_suffixes + num_prepositions\n\t\t\t\tself.authors_list_split.append(\n\t\t\t\t\t(' '.join(names[:-sp]), ' '.join(names[-sp:])))\n\n\t\t# list of authors in BibTex format\n\t\tself.authors_bibtex = ' and '.join(self.authors_list)\n\n\t\t# overwrite authors string\n\t\tif len(self.authors_list) > 2:\n\t\t\tself.authors = ', and '.join([\n\t\t\t\t', '.join(self.authors_list[:-1]),\n\t\t\t\tself.authors_list[-1]])\n\t\telif len(self.authors_list) > 1:\n\t\t\tself.authors = ' and '.join(self.authors_list)\n\t\telse:\n\t\t\tself.authors = self.authors_list[0]", "response": "Parse the authors string 
to create lists of authors."}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\ngets a single publication.", "response": "def get_publication(context, id):\n\t\"\"\"\n\tGet a single publication.\n\t\"\"\"\n\n\tpbl = Publication.objects.filter(pk=int(id))\n\n\tif len(pbl) < 1:\n\t\treturn ''\n\n\tpbl[0].links = pbl[0].customlink_set.all()\n\tpbl[0].files = pbl[0].customfile_set.all()\n\n\treturn render_template(\n\t\t'publications/publication.html', context['request'], {'publication': pbl[0]})"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\ngetting a publication list.", "response": "def get_publication_list(context, list, template='publications/publications.html'):\n\t\"\"\"\n\tGet a publication list.\n\t\"\"\"\n\n\tlist = List.objects.filter(list__iexact=list)\n\n\tif not list:\n\t\treturn ''\n\n\tlist = list[0]\n\tpublications = list.publication_set.all()\n\tpublications = publications.order_by('-year', '-month', '-id')\n\n\tif not publications:\n\t\treturn ''\n\n\t# load custom links and files\n\tpopulate(publications)\n\n\treturn render_template(\n\t\ttemplate, context['request'], {'list': list, 'publications': publications})"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef tex_parse(string):\n\n\tstring = string.replace('{', '').replace('}', '')\n\tdef tex_replace(match):\n\t\treturn \\\n\t\t\tsub(r'\\^(\\w)', r'<sup>\\1</sup>',\n\t\t\tsub(r'\\^\\{(.*?)\\}', r'<sup>\\1</sup>',\n\t\t\tsub(r'\\_(\\w)', r'<sub>\\1</sub>',\n\t\t\tsub(r'\\_\\{(.*?)\\}', r'<sub>\\1</sub>',\n\t\t\tsub(r'\\\\(' + GREEK_LETTERS + ')', r'&\\1;', match.group(1))))))\n\treturn mark_safe(sub(r'\\$([^\\$]*)\\$', tex_replace, escape(string)))", "response": "Renders some basic TeX math to HTML."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef parse(string):\n\n\t# bibliography\n\tbib = []\n\n\t# make sure we are dealing with unicode strings\n\tif not isinstance(string, six.text_type):\n\t\tstring = string.decode('utf-8')\n\n\t# replace special characters\n\tfor key, value in special_chars:\n\t\tstring = string.replace(key, value)\n\tstring = re.sub(r'\\\\[cuHvs]{?([a-zA-Z])}?', r'\\1', string)\n\n\t# split into BibTex entries\n\tentries = re.findall(\n\t\tr'(?u)@(\\w+)[ \\t]?{[ \\t]*([^,\\s]*)[ \\t]*,?\\s*((?:[^=,\\s]+\\s*\\=\\s*(?:\"[^\"]*\"|{(?:[^{}]*|{[^{}]*})*}|[^,}]*),?\\s*?)+)\\s*}',\n\t\tstring)\n\n\tfor entry in entries:\n\t\t# parse entry\n\t\tpairs = re.findall(r'(?u)([^=,\\s]+)\\s*\\=\\s*(\"[^\"]*\"|{(?:[^{}]*|{[^{}]*})*}|[^,]*)', entry[2])\n\n\t\t# add to bibliography\n\t\tbib.append({'type': entry[0].lower(), 'key': entry[1]})\n\n\t\tfor key, value in pairs:\n\t\t\t# post-process key and value\n\t\t\tkey = key.lower()\n\t\t\tif value and value[0] == '\"' and value[-1] == '\"':\n\t\t\t\tvalue = value[1:-1]\n\t\t\tif value and value[0] == '{' and value[-1] == '}':\n\t\t\t\tvalue = value[1:-1]\n\t\t\tif key not in ['booktitle', 'title']:\n\t\t\t\tvalue = value.replace('}', '').replace('{', '')\n\t\t\telse:\n\t\t\t\tif value.startswith('{') and value.endswith('}'):\n\t\t\t\t\tvalue = value[1:]\n\t\t\t\t\tvalue = value[:-1]\n\t\t\tvalue = value.strip()\n\t\t\tvalue = re.sub(r'\\s+', ' ', value)\n\n\t\t\t# store pair in bibliography\n\t\t\tbib[-1][key] = value\n\n\treturn bib", "response": "Takes a string in BibTex format and returns a list of dictionaries representing the entries in the BibTex format where each entry is a dictionary containing the keys and values."}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nswap the positions of this object with a reference object.", "response": "def swap(self, qs):\n        \"\"\"\n        Swap the positions of this object with a reference object.\n        \"\"\"\n        try:\n            replacement = qs[0]\n        except IndexError:\n            # already first/last\n            return\n        if not self._valid_ordering_reference(replacement):\n            raise ValueError(\n                \"%r can only be swapped with instances of %r which %s equals %r.\" % (\n                    self, self.__class__, self.order_with_respect_to,\n                    self._get_order_with_respect_to()\n                )\n            )\n        self.order, replacement.order = replacement.order, self.order\n        self.save()\n        replacement.save()"}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nmoving this object up one position.", "response": "def up(self):\n        \"\"\"\n        Move this object up one position.\n        \"\"\"\n        self.swap(self.get_ordering_queryset().filter(order__lt=self.order).order_by('-order'))"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\nmove this object down one position.", "response": "def down(self):\n        \"\"\"\n        Move this object down one position.\n        \"\"\"\n        self.swap(self.get_ordering_queryset().filter(order__gt=self.order))"}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nmoves object to a certain position updating all affected objects to move accordingly up or down.", "response": "def to(self, order):\n        \"\"\"\n        Move object to a certain position, updating all affected objects to move accordingly up or down.\n        \"\"\"\n        if order is None or self.order == order:\n            # object is already at desired position\n            return\n        qs = self.get_ordering_queryset()\n        if self.order > order:\n            qs.filter(order__lt=self.order, order__gte=order).update(order=F('order') + 1)\n        else:\n            qs.filter(order__gt=self.order, order__lte=order).update(order=F('order') - 1)\n        self.order = order\n        self.save()"}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\nmoves this object above the referenced object.", "response": "def above(self, ref):\n        \"\"\"\n        Move this object above the referenced object.\n        \"\"\"\n        if not self._valid_ordering_reference(ref):\n            raise ValueError(\n                \"%r can only be moved above instances of %r which %s equals %r.\" % (\n                    self, self.__class__, self.order_with_respect_to,\n                    self._get_order_with_respect_to()\n                )\n            )\n        if self.order == ref.order:\n            return\n        if self.order > ref.order:\n            o = ref.order\n        else:\n            o = self.get_ordering_queryset().filter(order__lt=ref.order).aggregate(Max('order')).get('order__max') or 0\n        self.to(o)"}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nmoves this object below the referenced object.", "response": "def below(self, ref):\n        \"\"\"\n        Move this object below the referenced object.\n        \"\"\"\n        if not self._valid_ordering_reference(ref):\n            raise ValueError(\n                \"%r can only be moved below instances of %r which %s equals %r.\" % (\n                    self, self.__class__, self.order_with_respect_to,\n                    self._get_order_with_respect_to()\n                )\n            )\n        if self.order == ref.order:\n            return\n        if self.order > ref.order:\n            o = self.get_ordering_queryset().filter(order__gt=ref.order).aggregate(Min('order')).get('order__min') or 0\n        else:\n            o = ref.order\n        self.to(o)"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef top(self):\n        o = self.get_ordering_queryset().aggregate(Min('order')).get('order__min')\n        self.to(o)", "response": "Move this object to the top of the ordered stack."}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nmoving this object to the bottom of the ordered stack.", "response": "def bottom(self):\n        \"\"\"\n        Move this object to the bottom of the ordered stack.\n        \"\"\"\n        o = self.get_ordering_queryset().aggregate(Max('order')).get('order__max')\n        self.to(o)"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef unapi(request):\n\n\tid = request.GET.get('id')\n\tformat = request.GET.get('format')\n\n\tif format is not None:\n\t\ttry:\n\t\t\tpublications = Publication.objects.filter(pk=int(id))\n\t\t\t\n\t\t\tif not publications:\n\t\t\t\traise ValueError\n\n\t\texcept ValueError:\n\t\t\t# invalid id\n\t\t\treturn HttpResponse('\\n'.join([\n\t\t\t\t'<?xml version=\"1.0\" encoding=\"UTF-8\"?>',\n\t\t\t\t'<error>Invalid ID.</error>']),\n\t\t\t\tcontent_type=\"application/xml\",\n\t\t\t\tstatus=404)\n\n\t\tif format == 'bibtex':\n\t\t\t# return BibTex encoded publication\n\t\t\treturn render(request, 'publications/publication.bib', {\n\t\t\t\t\t'publication': publications[0]\n\t\t\t\t},\n\t\t\t\tcontent_type='text/x-bibtex; charset=UTF-8')\n\n\t\tif format == 'mods':\n\t\t\t# return MODS encoded publication\n\t\t\treturn render(request, 'publications/publications.mods', {\n\t\t\t\t\t'publications': publications\n\t\t\t\t},\n\t\t\t\tcontent_type='application/xml; charset=UTF-8')\n\n\t\tif format == 'ris':\n\t\t\t# return MODS encoded publication\n\t\t\treturn render(request, 'publications/publications.ris', {\n\t\t\t\t\t'publications': publications\n\t\t\t\t},\n\t\t\t\tcontent_type='application/x-research-info-systems; charset=UTF-8')\n\n\t\t# invalid format\n\t\treturn HttpResponse('\\n'.join([\n\t\t\t'<?xml version=\"1.0\" encoding=\"UTF-8\"?>',\n\t\t\t'<error>Invalid format.</error>']),\n\t\t\tcontent_type=\"application/xml\",\n\t\t\tstatus=406)\n\n\tif id is not None:\n\t\treturn HttpResponse('\\n'.join([\n\t\t\t'<?xml version=\"1.0\" encoding=\"UTF-8\"?>',\n\t\t\t'<formats id=\"{0}\">'.format(id),\n\t\t\t'<format name=\"bibtex\" type=\"text/x-bibtex\" />',\n\t\t\t'<format name=\"ris\" type=\"application/x-research-info-systems\" />',\n\t\t\t'<format name=\"mods\" type=\"application/xml\" />',\n\t\t\t'</formats>']), content_type=\"application/xml\")\n\n\treturn 
HttpResponse('\\n'.join([\n\t\t'<?xml version=\"1.0\" encoding=\"UTF-8\"?>',\n\t\t'<formats>',\n\t\t'<format name=\"bibtex\" type=\"text/x-bibtex\" />',\n\t\t'<format name=\"ris\" type=\"application/x-research-info-systems\" />',\n\t\t'<format name=\"mods\" type=\"application/xml\" />',\n\t\t'</formats>']), content_type=\"application/xml\")", "response": "unAPI 1.0 view: returns a publication as BibTeX, MODS, or RIS, or lists the available formats."}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\nloads custom links and files from the database and attaches them to publications.", "response": "def populate(publications):\n\t\"\"\"\n\tLoad custom links and files from database and attach to publications.\n\t\"\"\"\n\n\tcustomlinks = CustomLink.objects.filter(publication__in=publications)\n\tcustomfiles = CustomFile.objects.filter(publication__in=publications)\n\n\tpublications_ = {}\n\tfor publication in publications:\n\t\tpublication.links = []\n\t\tpublication.files = []\n\t\tpublications_[publication.id] = publication\n\n\tfor link in customlinks:\n\t\tpublications_[link.publication_id].links.append(link)\n\tfor file in customfiles:\n\t\tpublications_[file.publication_id].files.append(file)"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nbuilding a vcf file from the supercatg array and the cat. clust. gz output", "response": "def make(data, samples):\n    \"\"\" build a vcf file from the supercatg array and the cat.clust.gz output\"\"\"\n    \n    outfile = open(os.path.join(data.dirs.outfiles, data.name+\".vcf\"), 'w')\n    inloci = os.path.join(data.dirs.outfiles, data.name+\".loci\")\n    names = [i.name for i in samples]\n    names.sort()\n\n    ## TODO: Get a real version number for the current sw stack\n    version = \"0.1\"\n    ## TODO: This is just reporting minimum depth per base. Would it be useful to\n    ## report real depth of reads per base? YEAH, that's what supercatg is for.\n    mindepth = data.paramsdict[\"mindepth_statistical\"] \n\n    print >>outfile, \"##fileformat=VCFv4.1\"\n    print >>outfile, \"##fileDate=\"+time.strftime(\"%Y%m%d\")\n    print >>outfile, \"##source=ipyRAD.v.\"+version\n    print >>outfile, \"##reference=common_allele_at_each_locus\"\n    print >>outfile, \"##INFO=<ID=NS,Number=1,Type=Integer,Description=\\\"Number of Samples With Data\\\">\"\n    print >>outfile, \"##INFO=<ID=DP,Number=1,Type=Integer,Description=\\\"Total Depth\\\">\"\n    print >>outfile, \"##INFO=<ID=AF,Number=A,Type=Float,Description=\\\"Allele Frequency\\\">\"\n    print >>outfile, \"##INFO=<ID=AA,Number=1,Type=String,Description=\\\"Ancestral Allele\\\">\"\n    print >>outfile, \"##FORMAT=<ID=GT,Number=1,Type=String,Description=\\\"Genotype\\\">\"\n    print >>outfile, \"##FORMAT=<ID=GQ,Number=1,Type=Integer,Description=\\\"Genotype Quality\\\">\"\n    print >>outfile, \"##FORMAT=<ID=DP,Number=1,Type=Integer,Description=\\\"Read Depth\\\">\"\n    print >>outfile, \"\\t\".join([\"#CHROM\",\"POS\",\"ID\",\"REF\",\"ALT\",\"QUAL\",\"FILTER\",\"INFO    \",\"FORMAT\"]+list(names))\n\n    loci = open(inloci).read().split(\"|\")[:-1]\n    snps = 0\n    vcflist = []\n    for locusnumber in range(len(loci)):\n        
samps = [i.split()[0][1:] for i in loci[locusnumber].strip().split(\"\\n\") if \">\" in i]\n        loc = np.array([tuple(i.split()[-1]) for i in loci[locusnumber].strip().split(\"\\n\") if \">\" in i])\n        NS = str(len(loc))\n        DP = str(mindepth)\n        for base in range(len(loc.T)):\n            col = []\n            site = list(loc.T[base])\n            site = list(\"\".join(site).replace(\"-\",\"\").replace(\"N\",\"\"))\n            if site:\n                for bb in site:\n                    if bb in list(\"RKYSWM\"):\n                        col += unstruct(bb)[0]\n                        col += unstruct(bb)[1]\n                    else:\n                        col += bb\n                REF = most_common([i for i in col if i not in list(\"-RKYSWMN\")])\n                ALT = set([i for i in col if (i in list(\"ATGC-N\")) and (i!=REF)])\n                if ALT:\n                    snps += 1\n                    GENO = [REF]+list(ALT)\n                    GENOS = []\n                    for samp in names:\n                        if samp in samps:\n                            idx = samps.index(samp)\n                            f = unstruct(loc.T[base][idx])\n                            if ('-' in f) or ('N' in f):\n                                GENOS.append(\"./.\")\n                            else:\n                                GENOS.append(str(GENO.index(f[0]))+\"|\"+str(GENO.index(f[1])))\n                        else:\n                            GENOS.append(\"./.\")\n                    vcflist.append(\"\\t\".join([`locusnumber+1`, `base+1`, '.', REF, \",\".join(ALT), \"20\", \"PASS\",\n                                              \";\".join([\"NS=\"+NS, \"DP=\"+DP]), \"GT\"]+GENOS))\n        if not locusnumber % 1000:\n            outfile.write( \"\\n\".join(vcflist)+\"\\n\" )\n            vcflist = []\n                                              \n                    #print >>outfile, \"\\t\".join([`locusnumber+1`, `base+1`, 
'.', REF, \",\".join(ALT), \"20\", \"PASS\",\n                    #                            \";\".join([\"NS=\"+NS, \"DP=\"+DP]), \"GT\"]+GENOS)\n    \n\n    outfile.write( \"\\n\".join(vcflist) )\n    outfile.close()"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\ncalculate the quartet weights for the test at a random subsampled chunk of loci.", "response": "def worker(self):\n    \"\"\" \n    Calculates the quartet weights for the test at a random\n    subsampled chunk of loci.\n    \"\"\"\n\n    ## subsample loci \n    fullseqs = self.sample_loci()\n\n    ## find all iterations of samples for this quartet\n    liters = itertools.product(*self.imap.values())\n\n    ## run tree inference for each iteration of sampledict\n    hashval = uuid.uuid4().hex\n    weights = []\n    for ridx, lidx in enumerate(liters):\n        \n        ## get subalignment for this iteration and make to nex\n        a,b,c,d = lidx\n        sub = {}\n        for i in lidx:\n            if self.rmap[i] == \"p1\":\n                sub[\"A\"] = fullseqs[i]\n            elif self.rmap[i] == \"p2\":\n                sub[\"B\"] = fullseqs[i]\n            elif self.rmap[i] == \"p3\":\n                sub[\"C\"] = fullseqs[i]\n            else:\n                sub[\"D\"] = fullseqs[i]\n                \n        ## write as nexus file\n        nex = []\n        for tax in list(\"ABCD\"):\n            nex.append(\">{}         {}\".format(tax, sub[tax]))\n            \n        ## check for too much missing or lack of variants\n        nsites, nvar = count_var(nex)\n\n        ## only run test if there's variation present\n        if nvar > self.minsnps:\n               \n            ## format as nexus file\n            nexus = \"{} {}\\n\".format(4, len(fullseqs[a])) + \"\\n\".join(nex)    \n\n            ## infer ML tree\n            treeorder = self.run_tree_inference(nexus, \"{}.{}\".format(hashval, ridx))\n\n            ## add to list\n            weights.append(treeorder)\n\n    ## cleanup - remove all files with the hash val\n    rfiles = glob.glob(os.path.join(tempfile.tempdir, \"*{}*\".format(hashval)))\n    for rfile in rfiles:\n        if os.path.exists(rfile):\n           
 os.remove(rfile)\n\n    ## return result as weights for the set topologies.\n    trees = [\"ABCD\", \"ACBD\", \"ADBC\"]\n    wdict = {i:float(weights.count(i))/len(weights) for i in trees}\n    return wdict"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nget order of tree", "response": "def get_order(tre):\n    \"\"\"\n    return tree order\n    \"\"\"\n    anode = tre.tree&\">A\"\n    sister = anode.get_sisters()[0]\n    sisters = (anode.name[1:], sister.name[1:])\n    others = [i for i in list(\"ABCD\") if i not in sisters]\n    return sorted(sisters) + sorted(others)"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef count_var(nex):\n    arr = np.array([list(i.split()[-1]) for i in nex])\n    miss = np.any(arr==\"N\", axis=0)\n    nomiss = arr[:, ~miss]\n    nsnps = np.invert(np.all(nomiss==nomiss[0, :], axis=0)).sum()\n    return nomiss.shape[1], nsnps", "response": "Count the sites with no missing data (N) across a set of aligned sequences, and the number of those sites that are variable."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef sample_loci(self):\n\n        ## store idx of passing loci\n        idxs = np.random.choice(self.idxs, self.ntests)\n\n        ## open handle, make a proper generator to reduce mem\n        with open(self.data) as indata:\n            liter = (indata.read().strip().split(\"|\\n\"))\n\n        ## store data as dict\n        seqdata = {i:\"\" for i in self.samples}\n\n        ## put chunks into a list\n        for idx, loc in enumerate(liter):\n            if idx in idxs:\n                ## parse chunk\n                lines = loc.split(\"\\n\")[:-1]\n                names = [i.split()[0] for i in lines]\n                seqs = [i.split()[1] for i in lines]\n                dd = {i:j for i,j in zip(names, seqs)}\n\n                ## add data to concatenated seqdict\n                for name in seqdata:\n                    if name in names:\n                        seqdata[name] += dd[name]\n                    else:\n                        seqdata[name] += \"N\"*len(seqs[0])\n                        \n        ## concatenate into a phylip file\n        return seqdata", "response": "Randomly samples loci from the passing set and returns a dict mapping each sample name to its concatenated sequence, padding with Ns where a sample is missing from a locus."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\nwrite nexus to tmpfile runs phyml tree inference and parses and returns the resulting tree.", "response": "def run_tree_inference(self, nexus, idx):\n        \"\"\"\n        Write nexus to tmpfile, runs phyml tree inference, and parses\n        and returns the resulting tree. \n        \"\"\"\n        ## create a tmpfile for this test (text mode so a str can be\n        ## written under Python 3)\n        tmpdir = tempfile.tempdir\n        tmpfile = tempfile.NamedTemporaryFile(\n            mode=\"w\",\n            delete=False,\n            prefix=str(idx),\n            dir=tmpdir,\n        )\n\n        ## write nexus to tmpfile\n        tmpfile.write(nexus)\n        tmpfile.flush()\n\n        ## infer the tree\n        rax = raxml(name=str(idx), data=tmpfile.name, workdir=tmpdir, N=1, T=2)\n        rax.run(force=True, block=True, quiet=True)\n\n        ## clean up\n        tmpfile.close()\n\n        ## return tree order\n        order = get_order(toytree.tree(rax.trees.bestTree))\n        return \"\".join(order)"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nrun all the jobs in parallel", "response": "def run(self, ipyclient):\n        \"\"\"\n        parallelize calls to worker function.\n        \"\"\"\n        \n        ## connect to parallel client\n        lbview = ipyclient.load_balanced_view()\n        \n        ## iterate over tests\n        asyncs = []\n        for test in range(self.ntests): \n            \n            ## submit jobs to run ('async' is a reserved word in Python 3.7+)\n            job = lbview.apply(worker, self)\n            asyncs.append(job)\n            \n        ## wait for jobs to finish\n        ipyclient.wait()\n\n        ## check for errors\n        for job in asyncs:\n            if not job.successful():\n                raise Exception(\"Error: {}\".format(job.result()))\n\n        ## return results as df\n        results = [i.result() for i in asyncs]\n        self.results_table = pd.DataFrame(results)"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef plot(self):\n        if self.results_table is None:\n            return \"no results found\"\n        else:\n            bb = self.results_table.sort_values(\n                by=[\"ABCD\", \"ACBD\"], \n                ascending=[False, True],\n                )\n\n            ## make a barplot\n            import toyplot\n            c = toyplot.Canvas(width=600, height=200)\n            a = c.cartesian()\n            m = a.bars(bb)\n            return c, a, m", "response": "Return a toyplot bar plot of the results table (canvas, axes, and marks), or a message if no results exist."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef plot(self, pcs=[1, 2], ax=None, cmap=None, cdict=None, legend=True, title=None, outfile=None):\n        ## Specify which 2 pcs to plot, default is pc1 and pc2\n        pc1 = pcs[0] - 1\n        pc2 = pcs[1] - 1\n        if pc1 < 0 or pc2 > self.ncomponents - 1:\n            raise IPyradError(\"PCs are 1-indexed. 1 is min & {} is max\".format(self.ncomponents))\n\n        ## Convert genotype data to allele count data\n        ## We do this here because we might want to try different ways\n        ## of accounting for missing data and \"alt\" allele counts treat\n        ## missing data as \"ref\"\n        allele_counts = self.genotypes.to_n_alt()\n\n        ## Actually do the pca\n        if self.ncomponents > len(self.samples_vcforder):\n            self.ncomponents = len(self.samples_vcforder)\n            print(\"  INFO: # PCs < # samples. Forcing # PCs = {}\".format(self.ncomponents))\n        coords, model = allel.pca(allele_counts, n_components=self.ncomponents, scaler='patterson')\n\n        self.pcs = pd.DataFrame(coords,\n                                index=self.samples_vcforder,\n                                columns=[\"PC{}\".format(x) for x in range(1,self.ncomponents+1)])\n\n        ## Just allow folks to pass in the name of the cmap they want to use\n        if isinstance(cmap, str):\n            try:\n                cmap = cm.get_cmap(cmap)\n            except:\n                raise IPyradError(\"  Bad cmap value: {}\".format(cmap))\n\n\n        if not cmap and not cdict:\n            if not self.quiet:\n                print(\"  Using default cmap: Spectral\")\n            cmap = cm.get_cmap('Spectral')\n\n        if cmap:\n            if cdict:\n                print(\"  Passing in both cmap and cdict defaults to using the cmap value.\")\n            popcolors = cmap(np.arange(len(self.pops))/len(self.pops))\n            cdict = {i:j for 
i, j in zip(self.pops.keys(), popcolors)}\n\n        fig = \"\"\n        if not ax:\n            fig = plt.figure(figsize=(6, 5))\n            ax = fig.add_subplot(1, 1, 1)\n\n        x = coords[:, pc1]\n        y = coords[:, pc2]\n        for pop in self.pops:\n            ## Don't include pops with no samples, it makes the legend look stupid\n            ## TODO: This doesn't prevent empty pops from showing up in the legend for some reason.\n            if len(self.pops[pop]) > 0:\n                mask = np.isin(self.samples_vcforder, self.pops[pop])\n                ax.plot(x[mask], y[mask], marker='o', linestyle=' ', color=cdict[pop], label=pop, markersize=6, mec='k', mew=.5)\n\n        ax.set_xlabel('PC%s (%.1f%%)' % (pc1+1, model.explained_variance_ratio_[pc1]*100))\n        ax.set_ylabel('PC%s (%.1f%%)' % (pc2+1, model.explained_variance_ratio_[pc2]*100))\n\n        if legend:\n            ax.legend(bbox_to_anchor=(1, 1), loc='upper left')\n\n        if fig:\n            fig.tight_layout()\n\n        if title:\n            ax.set_title(title)\n\n        if outfile:\n            try:\n                plt.savefig(outfile, format=\"png\", bbox_inches=\"tight\")\n            except:\n                print(\"  Saving pca.plot() failed to save figure to {}\".format(outfile))\n\n        return ax", "response": "Run a PCA on the allele counts and plot the two selected principal components, colored by population."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef plot_pairwise_dist(self, labels=None, ax=None, cmap=None, cdict=None, metric=\"euclidean\"):\n        allele_counts = self.genotypes.to_n_alt()\n        dist = allel.pairwise_distance(allele_counts, metric=metric)\n        if not ax:\n            fig = plt.figure(figsize=(5, 5))\n            ax = fig.add_subplot(1, 1, 1)\n\n        if isinstance(labels, bool):\n            if labels:\n                labels = list(self.samples_vcforder)\n        elif isinstance(labels, type(None)):\n            pass\n        else:\n            ## If not bool or None (default), then check to make sure the list passed in\n            ## is the right length\n            if not len(labels) == len(self.samples_vcforder):\n                raise IPyradError(LABELS_LENGTH_ERROR.format(len(labels), len(self.samples_vcforder)))\n\n        allel.plot.pairwise_distance(dist, labels=labels, ax=ax, colorbar=False)", "response": "Plot the pairwise distance matrix between all samples."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\nreturning a copy of the pca analysis object", "response": "def copy(self):\n        \"\"\" returns a copy of the pca analysis object \"\"\"\n        cp = copy.deepcopy(self)\n        cp.genotypes = allel.GenotypeArray(self.genotypes, copy=True)\n        return cp"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\nconverting ipyrad. loci file to iqtree - pomo counts file.", "response": "def loci2cf(name, locifile, popdict, wdir=None, ipyclient=None):\n    \"\"\" \n    Convert ipyrad .loci file to an iqtree-pomo 'counts' file\n\n    Parameters:\n    -----------\n    name:\n        A prefix name for output files that will be produced\n    locifile:\n        A .loci file produced by ipyrad.\n    popdict: \n        A python dictionary grouping Clade names to Sample names. \n        Example: {\"A\": ['a', 'b', 'c'], \"B\": ['d', 'e', 'f']}\n    ipyclient:\n        If you pass it an ipyclient it will distribute work over\n        remote engines, otherwise we use multiprocessing (todo).\n    \"\"\"\n\n    ## working directory, make sure it exists\n    if wdir:\n        wdir = os.path.abspath(wdir)\n        if not os.path.exists(wdir):\n            raise IPyradWarningExit(\" working directory (wdir) does not exist\")\n    else:\n        wdir = os.path.curdir\n\n    ## output file path\n    name = name.rsplit(\".cf\")[0]\n    outfile = os.path.join(wdir, \"{}.cf\".format(name))\n    out = open(outfile, 'w')\n\n    ## parse loci file\n    with open(locifile) as inloc:\n        loci = inloc.read().strip().split(\"|\\n\")    \n\n    ## get all names\n    names = list(itertools.chain(*popdict.values()))\n    popkeys = sorted(popdict.keys())\n\n    ## count nsites\n    nsites = sum(len(loc.split(\"\\n\")[0].split()[1]) for loc in loci[:])\n\n    ## print the header \n    out.write(HEADER.format(**{\"NPOP\": len(popdict),\n                               \"NSITES\": nsites,\n                               \"VTAXA\": \"\\t\".join(popkeys)}))\n    \n    ## build print string\n    outstr = \"chr{:<8}  {:<4}  \"\n    for cidx in xrange(len(popkeys)):\n        outstr += \"{:<8}  \"\n    \n    toprint = []\n    for idx in xrange(len(loci)):\n        dat = loci[idx].split(\"\\n\")\n        seqs = 
np.array([list(i.split()[1]) for i in dat[:-1]])\n        names = [i.split()[0] for i in dat[:-1]]\n        data = np.zeros((seqs.shape[1], len(popkeys), 4), dtype=np.uint16)\n\n        for sidx in xrange(seqs.shape[1]):\n            for cidx in xrange(len(popkeys)):               \n                for name in popdict[popkeys[cidx]]:\n                    if name in names:\n                        base = seqs[names.index(name), sidx]\n                        if base in list(\"ACGT\"):\n                            data[sidx, cidx, BASE2IDX[base]] += 2\n                        elif base in list(\"RSYMKW\"):\n                            base1, base2 = AMBIGS[base]\n                            data[sidx, cidx, BASE2IDX[base1]] += 1\n                            data[sidx, cidx, BASE2IDX[base2]] += 1\n                        \n            ## print string for one locus\n            sdat = [\",\".join([str(i) for i in i.tolist()]) for i in data[sidx]]\n            #print outstr.format(idx+1, sidx+1, *sdat)\n            toprint.append(outstr.format(idx+1, sidx+1, *sdat))\n\n        ## if 10K loci, then print and clear\n        if not idx % 10000:\n            out.write(\"\\n\".join(toprint)+\"\\n\")\n            toprint = []\n\n    ## close handle\n    out.write(\"\\n\".join(toprint)+\"\\n\")\n    out.close()"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef update(assembly, idict, count):\n\n    data = iter(open(os.path.join(assembly.dirs.outfiles,\n                     assembly.name+\".phy\"), 'r'))\n\n    ntax, nchar = next(data).strip().split()\n\n    ## read in max N bp at a time\n    for line in data:\n        tax, seq = line.strip().split()\n        idict[tax] = idict[tax][100000:]\n        idict[tax] += seq[count:count+100000]\n    del line\n\n    return idict", "response": "Updates the dictionary with the next chunk of sequence (100,000 bases per taxon in this code) from the very long phylip file, so the file is read a slice at a time, which makes reading faster."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nbuild a phy file from a list of samples.", "response": "def makephy(data, samples, longname):\n    \"\"\" builds phy output. If large files writes 50000 loci at a time to tmp\n    files and rebuilds at the end\"\"\"\n\n    ## order names\n    names = [i.name for i in samples]\n    names.sort()\n    \n    ## read in loci file\n    locifile = os.path.join(data.dirs.outfiles, data.name+\".loci\")\n    locus = iter(open(locifile, 'rb'))\n\n    ## dict for saving the full matrix\n    fdict = {name:[] for name in names}\n\n    ## list for saving locus number and locus range for partitions\n    partitions = []\n    initial_pos = 1\n\n    ## remove empty column sites and append edited seqs to dict F\n    done = 0\n    nloci = 0\n    nbases = 0\n\n    ## TODO: This should be fixed. it cycles through reading each locus\n    ## until nloci is less than this large number. It should really just\n    ## read to the end of the file, so it'll do all loci no matter how\n    ## many there are.\n    while nloci < 5000000: \n        seqs = []\n        #arrayed = np.array([])\n        anames = []\n        while 1:\n            ## get next locus\n            try:\n                samp = locus.next()\n            except StopIteration:\n                done = 1\n                break\n            if \"//\" in samp:\n                nloci += 1\n                break\n            else:\n                try:\n                    name, seq = samp.split()\n                except ValueError:\n                    print samp\n                anames.append(name[1:])\n                seqs.append(seq.strip())\n        ## reset\n        arrayed = np.array([list(i) for i in seqs])\n        if done:\n            break\n        ## create mask for columns that are empty or \n        ## that are paired-end separators (compatible w/ pyrad v2 and v3)\n        #mask = [i for i in range(len(arrayed.T)) if np.any([\n        ## still 
surely a better way to vectorize this...\n        mask = [i for i in arrayed.T if any([j not in list(\"-Nn\") for j in i])]\n        masked = np.dstack(mask)[0]\n\n        ## partition information\n        loc_name = \"p\"+str(nloci)\n        loc_range = str(initial_pos) + \"-\" +\\\n                    str(len(masked[0]) + initial_pos -1)\n        initial_pos += len(masked[0])\n        partitions.append(loc_name+\"=\"+loc_range)\n\n        ## uncomment to print block info (used to partition by locus)\n        #blockend += minray\n        #print blockend,\n        #print loc\n        #print arrayed\n\n        ## append data to dict\n        for name in names:\n            if name in anames:\n                #fdict[name].append(arrayed[anames.index(name), mask].tostring())\n                fdict[name].append(masked[anames.index(name),:].tostring())\n            else:\n                fdict[name].append(\"N\"*masked.shape[1])\n                #fdict[name].append(\"N\"*len(arrayed[0, mask]))\n        ## add len to total length\n        nbases += len(fdict[name][-1])\n\n        ## after x iterations tmp pickle fdict?\n        if not nloci % 1e4:\n            ## concat strings\n            for name in fdict:\n                with open(os.path.join(assembly.dirs.outfiles , \"tmp\", \n                    \"{}_{}.phy.tmp\".format(name, nloci)), 'wb') as wout:\n                    wout.write(\"\".join(fdict[name]))\n            del fdict\n            fdict = {name:[] for name in names}\n\n    ## print out .PHY file, if really big, pull form multiple tmp pickle\n    superout = open(os.path.join( assembly.dirs.outfiles, assembly.name+\".phy\" ), 'wb')\n    print >>superout, len(names), nbases\n    if nloci < 1e4:\n        for name in names:\n            print >>superout, name+(\" \"*((longname+3)-\\\n                              len(name)))+\"\".join(fdict[name])\n    else:\n        for name in names:\n            superout.write(\"{}{}{}\".format(\n                           
 name,\n                            \" \"*((longname+3)-len(name)),\n                            \"\".join(fdict[name])))\n            tmpfiles = glob.glob(os.path.join(assembly.dirs.outfiles, \"tmp\", name+\"*.phy.tmp\"))\n            tmpfiles.sort()\n            for tmpf in tmpfiles:\n                with open(tmpf, 'rb') as tmpin:\n                    superout.write(tmpin.read())\n                os.remove(tmpf)\n            superout.write(\"\\n\")\n    superout.close()\n    raxml_part_out = open(os.path.join(assembly.dirs.outfiles, assembly.name+\".phy.partitions\"), 'w')\n    for partition in partitions:\n        print >>raxml_part_out, \"DNA, %s\" % (partition)\n    raxml_part_out.close()\n\n    return partitions"}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\nmakes a phylip and nexus format for a list of samples.", "response": "def make(assembly, samples):\n    \"\"\" Make phylip and nexus formats. This is hackish since I'm recycling the \n    code whole-hog from pyrad V3. Probably could be good to go back through \n    and clean up the conversion code some time.\n    \"\"\"\n\n    ## get the longest name\n    longname = max([len(i) for i in assembly.samples.keys()])\n    names = [i.name for i in samples]\n\n    partitions = makephy(assembly, samples, longname)\n    makenex(assembly, names, longname, partitions)"}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\ncleans up a bunch of loose files.", "response": "def sample_cleanup(data, sample):\n    \"\"\"\n    Clean up a bunch of loose files.\n    \"\"\"\n    umap1file = os.path.join(data.dirs.edits, sample.name+\"-tmp-umap1.fastq\")\n    umap2file = os.path.join(data.dirs.edits, sample.name+\"-tmp-umap2.fastq\")\n    unmapped = os.path.join(data.dirs.refmapping, sample.name+\"-unmapped.bam\")\n    samplesam = os.path.join(data.dirs.refmapping, sample.name+\".sam\")\n    split1 = os.path.join(data.dirs.edits, sample.name+\"-split1.fastq\")\n    split2 = os.path.join(data.dirs.edits, sample.name+\"-split2.fastq\")\n    refmap_derep = os.path.join(data.dirs.edits, sample.name+\"-refmap_derep.fastq\")\n    for f in [umap1file, umap2file, unmapped, samplesam, split1, split2, refmap_derep]:\n        try:\n            os.remove(f)\n        except:\n            pass"}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\nindexes the reference sequence.", "response": "def index_reference_sequence(data, force=False):\n    \"\"\" \n    Index the reference sequence, unless it already exists. Also make a mapping\n    of scaffolds to index numbers for later use in steps 5-6. \n    \"\"\"\n\n    ## get ref file from params\n    refseq_file = data.paramsdict['reference_sequence']\n    index_files = []\n\n    ## Check for existence of index files. Default to bwa unless you specify smalt\n    if \"smalt\" in data._hackersonly[\"aligner\"]:\n        # These are smalt index files. Only referenced here to ensure they exist\n        index_files.extend([\".sma\", \".smi\"])\n    else:\n        index_files.extend([\".amb\", \".ann\", \".bwt\", \".pac\", \".sa\"])\n\n    ## samtools specific index\n    index_files.extend([\".fai\"])\n\n    ## If all index files already exist then bail out of this func\n    if not force:\n        if all([os.path.isfile(refseq_file+i) for i in index_files]):\n            return\n    #if data._headers:\n    #    print(INDEX_MSG.format(data._hackersonly[\"aligner\"]))\n\n    if \"smalt\" in data._hackersonly[\"aligner\"]:\n        ## Create smalt index for mapping\n        ## smalt index [-k <wordlen>] [-s <stepsiz>]  <index_name> <reference_file>\n        cmd1 = [ipyrad.bins.smalt, \"index\", \n                \"-k\", str(data._hackersonly[\"smalt_index_wordlen\"]), \n                refseq_file, \n                refseq_file]\n    else:\n        ## bwa index <reference_file>\n        cmd1 = [ipyrad.bins.bwa, \"index\", refseq_file]\n\n    ## call the command\n    LOGGER.info(\" \".join(cmd1))\n    proc1 = sps.Popen(cmd1, stderr=sps.STDOUT, stdout=sps.PIPE)\n    error1 = proc1.communicate()[0]\n\n    ## simple samtools index for grabbing ref seqs\n    cmd2 = [ipyrad.bins.samtools, \"faidx\", refseq_file]\n    LOGGER.info(\" \".join(cmd2))\n    proc2 = sps.Popen(cmd2, 
stderr=sps.STDOUT, stdout=sps.PIPE)\n    error2 = proc2.communicate()[0]\n\n    ## error handling\n    if proc1.returncode:\n        raise IPyradWarningExit(error1)\n    if error2:\n        if \"please use bgzip\" in error2:\n            raise IPyradWarningExit(NO_ZIP_BINS.format(refseq_file))\n        else:\n            raise IPyradWarningExit(error2)"}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nmapping reads to reference sequence.", "response": "def mapreads(data, sample, nthreads, force):\n    \"\"\" \n    Attempt to map reads to reference sequence. This reads in the fasta files\n    (samples.files.edits), and maps each read to the reference. Unmapped reads \n    are dropped right back in the de novo pipeline. Reads that map successfully\n    are processed and pushed downstream and joined with the rest of the data \n    post muscle_align. \n\n    Mapped reads end up in a sam file.\n    \"\"\"\n\n    LOGGER.info(\"Entering mapreads(): %s %s\", sample.name, nthreads)\n\n    ## This is the input derep file, for paired data we need to split the data, \n    ## and so we will make sample.files.dereps == [derep1, derep2], but for \n    ## SE data we can simply use sample.files.derep == [derepfile].\n    derepfile = os.path.join(data.dirs.edits, sample.name+\"_derep.fastq\")\n    sample.files.dereps = [derepfile]\n\n    ## This is the final output files containing merged/concat derep'd refmap'd \n    ## reads that did not match to the reference. They will be back in \n    ## merge/concat (--nnnnn--) format ready to be input to vsearch, if needed. 
\n    mumapfile = sample.files.unmapped_reads\n    umap1file = os.path.join(data.dirs.edits, sample.name+\"-tmp-umap1.fastq\")\n    umap2file = os.path.join(data.dirs.edits, sample.name+\"-tmp-umap2.fastq\")        \n\n    ## split the derepfile into the two handles we designate\n    if \"pair\" in data.paramsdict[\"datatype\"]:\n        sample.files.split1 = os.path.join(data.dirs.edits, sample.name+\"-split1.fastq\")\n        sample.files.split2 = os.path.join(data.dirs.edits, sample.name+\"-split2.fastq\")\n        sample.files.dereps = [sample.files.split1, sample.files.split2]\n        split_merged_reads(sample.files.dereps, derepfile)\n\n    ## (cmd1) smalt <task> [TASK_OPTIONS] [<index_name> <file_name_A> [<file_name_B>]]\n    ##  -f sam       : Output as sam format, tried :clip: to hard mask output \n    ##                 but it shreds the unmapped reads (outputs empty fq)\n    ##  -l [pe,mp,pp]: If paired end select the orientation of each read\n    ##  -n #         : Number of threads to use\n    ##  -x           : Perform a more exhaustive search\n    ##  -y #         : proportion matched to reference (sequence similarity)\n    ##  -o           : output file\n    ##               : Reference sequence\n    ##               : Input file(s), in a list. One for R1 and one for R2\n    ##  -c #         : proportion of the query read length that must be covered\n\n    ## (cmd1) bwa mem [OPTIONS] <index_name> <file_name_A> [<file_name_B>] > <output_file>\n    ##  -t #         : Number of threads\n    ##  -M           : Mark split alignments as secondary.\n\n    ## (cmd2) samtools view [options] <in.bam>|<in.sam>|<in.cram> [region ...] \n    ##   -b = write to .bam\n    ##   -q = Only keep reads with mapq score >= 30 (seems to be pretty standard)\n    ##   -F = Select all reads that DON'T have these flags. 
\n    ##         0x4 (segment unmapped)\n    ##         0x100 (Secondary alignment)\n    ##         0x800 (supplementary alignment)\n    ##   -U = Write out all reads that don't pass the -F filter \n    ##        (all unmapped reads go to this file).\n\n    ## TODO: Should eventually add `-q 13` to filter low confidence mapping.\n    ## If you do this it will throw away some fraction of reads. Ideally you'd\n    ## catch these and throw them in with the rest of the unmapped reads, but\n    ## I can't think of a straightforward way of doing that. There should be \n    ## a `-Q` flag to only keep reads below the threshold, but i realize that\n    ## would be of limited use besides for me.\n\n    ## (cmd3) samtools sort [options...] [in.bam]\n    ##   -T = Temporary file name, this is required by samtools, ignore it\n    ##        Here we hack it to be samhandle.tmp cuz samtools cleans it up\n    ##   -O = Output file format, in this case bam\n    ##   -o = Output file name\n\n    if \"smalt\" in data._hackersonly[\"aligner\"]:\n        ## The output SAM data is written to file (-o)\n        ## input is either (derep) or (derep-split1, derep-split2)\n        cmd1 = [ipyrad.bins.smalt, \"map\", \n                \"-f\", \"sam\", \n                \"-n\", str(max(1, nthreads)),\n                \"-y\", str(data.paramsdict['clust_threshold']), \n                \"-o\", os.path.join(data.dirs.refmapping, sample.name+\".sam\"),\n                \"-x\",\n                data.paramsdict['reference_sequence']\n                ] + sample.files.dereps\n        cmd1_stdout = sps.PIPE\n        cmd1_stderr = sps.STDOUT\n    else:\n        cmd1 = [ipyrad.bins.bwa, \"mem\",\n                \"-t\", str(max(1, nthreads)),\n                \"-M\",\n                data.paramsdict['reference_sequence']\n                ] + sample.files.dereps\n        ## Insert optional flags for bwa\n        try:\n            bwa_args = data._hackersonly[\"bwa_args\"].split()\n            
bwa_args.reverse()\n            for arg in bwa_args:\n                cmd1.insert(2, arg)\n        except KeyError:\n            ## Do nothing\n            pass\n        cmd1_stdout = open(os.path.join(data.dirs.refmapping, sample.name+\".sam\"), 'w')\n        cmd1_stderr = None\n\n    ## Reads in the SAM file from cmd1. It writes the unmapped data to file\n    ## and it pipes the mapped data to be used in cmd3\n    cmd2 = [ipyrad.bins.samtools, \"view\", \n           \"-b\", \n           ## TODO: This introduces a bug with PE right now. Think about the case where\n           ## R1 has low qual mapping and R2 has high. You get different numbers\n           ## of reads in the unmapped tmp files. FML.\n           #\"-q\", \"30\",\n           \"-F\", \"0x904\",\n           \"-U\", os.path.join(data.dirs.refmapping, sample.name+\"-unmapped.bam\"), \n           os.path.join(data.dirs.refmapping, sample.name+\".sam\")]\n\n    ## this is gonna catch mapped bam output from cmd2 and write to file\n    cmd3 = [ipyrad.bins.samtools, \"sort\", \n            \"-T\", os.path.join(data.dirs.refmapping, sample.name+\".sam.tmp\"),\n            \"-O\", \"bam\", \n            \"-o\", sample.files.mapped_reads]\n\n    ## TODO: Unnecessary?\n    ## this is gonna read the sorted BAM file and index it. only for pileup?\n    cmd4 = [ipyrad.bins.samtools, \"index\", sample.files.mapped_reads]\n\n    ## this is gonna read in the unmapped files, args are added below, \n    ## and it will output fastq formatted unmapped reads for merging.\n    ## -v 45 sets the default qscore arbitrarily high\n    cmd5 = [ipyrad.bins.samtools, \"bam2fq\", \"-v 45\",\n            os.path.join(data.dirs.refmapping, sample.name+\"-unmapped.bam\")]\n\n    ## Insert additional arguments for paired data to the commands.\n    ## We assume Illumina paired end reads for the orientation \n    ## of mate pairs (orientation: ---> <----). 
\n    if 'pair' in data.paramsdict[\"datatype\"]:\n        if \"smalt\" in data._hackersonly[\"aligner\"]:\n            ## add paired flag (-l pe) to cmd1 right after (smalt map ...)\n            cmd1.insert(2, \"pe\")\n            cmd1.insert(2, \"-l\")\n        else:\n            ## No special PE flags for bwa\n            pass\n        ## add samtools filter for only keep if both pairs hit\n        ## 0x1 - Read is paired\n        ## 0x2 - Each read properly aligned\n        cmd2.insert(2, \"0x3\")\n        cmd2.insert(2, \"-f\")\n\n        ## tell bam2fq that there are output files for each read pair\n        cmd5.insert(2, umap1file)\n        cmd5.insert(2, \"-1\")\n        cmd5.insert(2, umap2file)\n        cmd5.insert(2, \"-2\")\n    else:\n        cmd5.insert(2, mumapfile)\n        cmd5.insert(2, \"-0\")\n\n    ## Running cmd1 creates ref_mapping/sname.sam, \n    LOGGER.debug(\" \".join(cmd1))\n    proc1 = sps.Popen(cmd1, stderr=cmd1_stderr, stdout=cmd1_stdout)\n\n    ## This is really long running job so we wrap it to ensure it dies. \n    try:\n        error1 = proc1.communicate()[0]\n    except KeyboardInterrupt:\n        proc1.kill()\n\n    ## raise error if one occurred in smalt\n    if proc1.returncode:\n        raise IPyradWarningExit(error1)\n\n    ## Running cmd2 writes to ref_mapping/sname.unmapped.bam, and \n    ## fills the pipe with mapped BAM data\n    LOGGER.debug(\" \".join(cmd2))\n    proc2 = sps.Popen(cmd2, stderr=sps.STDOUT, stdout=sps.PIPE)\n\n    ## Running cmd3 pulls mapped BAM from pipe and writes to \n    ## ref_mapping/sname.mapped-sorted.bam. 
\n    ## Because proc2 pipes to proc3 we just communicate this to run both.\n    LOGGER.debug(\" \".join(cmd3))\n    proc3 = sps.Popen(cmd3, stderr=sps.STDOUT, stdout=sps.PIPE, stdin=proc2.stdout)\n    error3 = proc3.communicate()[0]\n    if proc3.returncode:\n        raise IPyradWarningExit(error3)\n    proc2.stdout.close()\n\n    ## Later we're gonna use samtools to grab out regions using 'view', and to\n    ## do that we need it to be indexed. Let's index it now. \n    LOGGER.debug(\" \".join(cmd4))\n    proc4 = sps.Popen(cmd4, stderr=sps.STDOUT, stdout=sps.PIPE)\n    error4 = proc4.communicate()[0]\n    if proc4.returncode:\n        raise IPyradWarningExit(error4)\n    \n    ## Running cmd5 writes to either edits/sname-refmap_derep.fastq for SE\n    ## or it makes edits/sname-tmp-umap{12}.fastq for paired data, which \n    ## will then need to be merged.\n    LOGGER.debug(\" \".join(cmd5))\n    proc5 = sps.Popen(cmd5, stderr=sps.STDOUT, stdout=sps.PIPE)\n    error5 = proc5.communicate()[0]\n    if proc5.returncode:\n        raise IPyradWarningExit(error5)\n\n    ## Finally, merge the unmapped reads, which is what cluster()\n    ## expects. If SE, just rename the outfile. In the end\n    ## <sample>-refmap_derep.fq will be the final output\n    if 'pair' in data.paramsdict[\"datatype\"]:\n        LOGGER.info(\"Merging unmapped reads {} {}\".format(umap1file, umap2file))\n        merge_pairs_after_refmapping(data, [(umap1file, umap2file)], mumapfile)"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\nbuild a single end cluster from the refmapped data.", "response": "def fetch_cluster_se(data, samfile, chrom, rstart, rend):\n    \"\"\"\n    Builds a single end cluster from the refmapped data.\n    \"\"\"\n\n    ## If SE then we enforce the minimum overlap distance to avoid the\n    ## staircase syndrome of multiple reads overlapping just a little.\n    overlap_buffer = data._hackersonly[\"min_SE_refmap_overlap\"]\n\n    ## the *_buff variables here are because we have to play patty\n    ## cake here with the rstart/rend vals because we want pysam to\n    ## enforce the buffer for SE, but we want the reference sequence\n    ## start and end positions to print correctly for downstream.\n    rstart_buff = rstart + overlap_buffer\n    rend_buff = rend - overlap_buffer\n\n    ## Reads that map to only very short segments of the reference\n    ## sequence will return buffer end values that are before the\n    ## start values causing pysam to complain. 
Very short mappings.\n    if rstart_buff > rend_buff:\n        tmp = rstart_buff\n        rstart_buff = rend_buff\n        rend_buff = tmp\n    ## Buffering can't make start and end equal or pysam returns nothing.\n    if rstart_buff == rend_buff:\n        rend_buff += 1\n\n    ## store pairs\n    rdict = {}\n    clust = []\n    iterreg = []\n\n    iterreg = samfile.fetch(chrom, rstart_buff, rend_buff)\n\n    ## use dict to match up read pairs\n    for read in iterreg:\n        if read.qname not in rdict:\n            rdict[read.qname] = read\n\n    ## sort dict keys so highest derep is first ('seed')\n    sfunc = lambda x: int(x.split(\";size=\")[1].split(\";\")[0])\n    rkeys = sorted(rdict.keys(), key=sfunc, reverse=True)\n\n    ## get blocks from the seed for filtering, bail out if seed is not paired\n    try:\n        read1 = rdict[rkeys[0]]\n    except ValueError:\n        LOGGER.error(\"Found bad cluster, skipping - key:{} rdict:{}\".format(rkeys[0], rdict))\n        return \"\"\n\n    ## the starting blocks for the seed\n    poss = read1.get_reference_positions(full_length=True)\n    seed_r1start = min(poss)\n    seed_r1end = max(poss)\n\n    ## store the seed -------------------------------------------\n    if read1.is_reverse:\n        seq = revcomp(read1.seq)\n    else:\n        seq = read1.seq\n\n    ## store, could write orient but just + for now.\n    size = sfunc(rkeys[0])\n    clust.append(\">{}:{}:{};size={};*\\n{}\"\\\n        .format(chrom, seed_r1start, seed_r1end, size, seq))\n\n    ## If there's only one hit in this region then rkeys will only have\n    ## one element and the call to `rkeys[1:]` will raise. 
Test for this.\n    if len(rkeys) > 1:\n        ## store the hits to the seed -------------------------------\n        for key in rkeys[1:]:\n            skip = False\n            try:\n                read1 = rdict[key]\n            except ValueError:\n                ## enter values that will make this read get skipped\n                read1 = rdict[key][0]\n                skip = True\n\n            ## orient reads only if not skipping\n            if not skip:\n                poss = read1.get_reference_positions(full_length=True)\n                minpos = min(poss)\n                maxpos = max(poss)\n                ## store the seq\n                if read1.is_reverse:\n                    seq = revcomp(read1.seq)\n                else:\n                    seq = read1.seq\n                ## store, could write orient but just + for now.\n                size = sfunc(key)\n                clust.append(\">{}:{}:{};size={};+\\n{}\"\\\n                    .format(chrom, minpos, maxpos, size, seq))\n            else:\n                ## seq is excluded, though, we could save it and return\n                ## it as a separate cluster that will be aligned separately.\n                pass\n\n    return clust"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef fetch_cluster_pairs(data, samfile, chrom, rstart, rend):\n    ## store pairs\n    rdict = {}\n    clust = []\n\n    ## grab the region and make tuples of info\n    iterreg = samfile.fetch(chrom, rstart, rend)\n\n    ## use dict to match up read pairs\n    for read in iterreg:\n        if read.qname not in rdict:\n            rdict[read.qname] = [read]\n        else:\n            rdict[read.qname].append(read)\n\n    ## sort dict keys so highest derep is first ('seed')\n    sfunc = lambda x: int(x.split(\";size=\")[1].split(\";\")[0])\n    rkeys = sorted(rdict.keys(), key=sfunc, reverse=True)\n    \n    ## get blocks from the seed for filtering, bail out if seed is not paired\n    try:\n        read1, read2 = rdict[rkeys[0]]\n    except ValueError:\n        return 0\n\n    ## the starting blocks for the seed\n    poss = read1.get_reference_positions() + read2.get_reference_positions()\n    seed_r1start = min(poss)\n    seed_r2end = max(poss)\n\n    ## store the seed -------------------------------------------\n    ## Simplify. R1 and R2 are always on opposite strands, but the\n    ## orientation is variable. 
We revcomp and order the reads to\n    ## preserve genomic order.\n    reads_overlap = False\n    if read1.is_reverse:\n        if read2.aend > read1.get_blocks()[0][0]:\n            reads_overlap = True\n            seq = read2.seq + \"nnnn\" + revcomp(read1.seq)\n        else:\n            seq = read2.seq + \"nnnn\" + read1.seq\n    else:\n        if read1.aend > read2.get_blocks()[0][0]:\n            reads_overlap = True\n            seq = read1.seq + \"nnnn\" + revcomp(read2.seq)\n        else:\n            seq = read1.seq + \"nnnn\" + read2.seq\n\n    ## store, could write orient but just + for now.\n    size = sfunc(rkeys[0])\n    clust.append(\">{}:{}:{};size={};*\\n{}\"\\\n        .format(chrom, seed_r1start, seed_r2end, size, seq))\n\n    ## If there's only one hit in this region then rkeys will only have\n    ## one element and the call to `rkeys[1:]` will raise. Test for this.\n    if len(rkeys) > 1:\n        ## store the hits to the seed -------------------------------\n        for key in rkeys[1:]:\n            skip = False\n            try:\n                read1, read2 = rdict[key]\n            except ValueError:\n                ## enter values that will make this read get skipped\n                read1 = rdict[key][0]\n                read2 = read1\n                skip = True\n\n            ## orient reads and filter out ones that will not align well b/c\n            ## they do not overlap enough with the seed\n            poss = read1.get_reference_positions() + read2.get_reference_positions()\n            minpos = min(poss)\n            maxpos = max(poss)\n\n            ## skip if more than one hit location\n            if read1.has_tag(\"SA\") or read2.has_tag(\"SA\"):\n                skip = True\n\n            ## store if read passes \n            if (abs(minpos - seed_r1start) < 50) and \\\n               (abs(maxpos - seed_r2end) < 50) and \\\n               (not skip):\n                ## store the seq\n                if 
read1.is_reverse:\n                    ## do reads overlap\n                    if read2.aend > read1.get_blocks()[0][0]:\n                        reads_overlap = True\n                        seq = read2.seq + \"nnnn\" + revcomp(read1.seq)\n                    else:\n                        seq = read2.seq + \"nnnn\" + read1.seq\n                else:\n                    if read1.aend > read2.get_blocks()[0][0]:\n                        reads_overlap = True\n                        seq = read1.seq + \"nnnn\" + revcomp(read2.seq)\n                    else:\n                        seq = read1.seq + \"nnnn\" + read2.seq\n\n                ## store, could write orient but just + for now.\n                size = sfunc(key)\n                clust.append(\">{}:{}:{};size={};+\\n{}\"\\\n                    .format(chrom, minpos, maxpos, size, seq))\n            else:\n                ## seq is excluded, though, we could save it and return\n                ## it as a separate cluster that will be aligned separately.\n                pass\n\n    ## merge the pairs prior to returning them\n    ## Remember, we already tested for quality scores, so\n    ## merge_after_pysam will generate arbitrarily high scores\n    ## It would be nice to do something here like test if\n    ## the average insert length + 2 stdv is > 2*read len\n    ## so you can switch off merging for mostly non-overlapping data\n    if reads_overlap:\n        if data._hackersonly[\"refmap_merge_PE\"]:\n            clust = merge_after_pysam(data, clust)\n            #clust = merge_pair_pipes(data, clust)\n\n    return clust", "response": "Builds a paired cluster from the refmapped data."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nbuilding a chunk for each region of a sample and add it to the list of clusters.", "response": "def ref_build_and_muscle_chunk(data, sample):\n    \"\"\" \n    1. Run bedtools to get all overlapping regions\n    2. Parse out reads from regions using pysam and dump into chunk files. \n       We measure it out to create 10 chunk files per sample. \n    3. If we really wanted to speed this up, though it is pretty fast already, \n       we could parallelize it since we can easily break the regions into \n       a list of chunks. \n    \"\"\"\n\n    ## get regions using bedtools\n    regions = bedtools_merge(data, sample).strip().split(\"\\n\")\n    nregions = len(regions)\n    ## integer division so chunksize can be compared to the nclusts counter\n    chunksize = (nregions // 10) + (nregions % 10)\n\n    LOGGER.debug(\"nregions {} chunksize {}\".format(nregions, chunksize))\n    ## create an output file to write clusters to\n    idx = 0\n    tmpfile = os.path.join(data.tmpdir, sample.name+\"_chunk_{}.ali\")\n    ## remove old files if they exist to avoid append errors\n    for i in range(11):\n        if os.path.exists(tmpfile.format(i)):\n            os.remove(tmpfile.format(i))\n    fopen = open\n\n    ## If reference+denovo we drop the reads back into clust.gz\n    ## and let the muscle_chunker do its thing back in cluster_within\n    if data.paramsdict[\"assembly_method\"] == \"denovo+reference\":\n        tmpfile = os.path.join(data.dirs.clusts, sample.name+\".clust.gz\")\n        fopen = gzip.open\n    \n    ## build clusters for aligning with muscle from the sorted bam file\n    samfile = pysam.AlignmentFile(sample.files.mapped_reads, 'rb')\n    #\"./tortas_refmapping/PZ70-mapped-sorted.bam\", \"rb\")\n\n    ## fill clusts list and dump periodically\n    clusts = []\n    nclusts = 0\n    for region in regions:\n        chrom, pos1, pos2 = region.split()\n        try:\n            ## fetches pairs quickly but then goes slow to merge them.\n            if \"pair\" 
in data.paramsdict[\"datatype\"]:\n                clust = fetch_cluster_pairs(data, samfile, chrom, int(pos1), int(pos2))\n\n            ## fetch but no need to merge\n            else:\n                clust = fetch_cluster_se(data, samfile, chrom, int(pos1), int(pos2))\n        except IndexError as inst:\n            LOGGER.error(\"Bad region chrom:start-end {}:{}-{}\".format(chrom, pos1, pos2))\n            continue\n        if clust:\n            clusts.append(\"\\n\".join(clust))\n            nclusts += 1\n\n            if nclusts == chunksize:\n                ## write to file\n                tmphandle = tmpfile.format(idx)\n                with fopen(tmphandle, 'a') as tmp:\n                    #LOGGER.debug(\"Writing tmpfile - {}\".format(tmpfile.format(idx)))\n                    #if data.paramsdict[\"assembly_method\"] == \"denovo+reference\":\n                    #    ## This is dumb, but for this method you need to prepend the\n                    #    ## separator to maintain proper formatting of clust.gz\n                    tmp.write(\"\\n//\\n//\\n\".join(clusts)+\"\\n//\\n//\\n\")\n                idx += 1\n                nclusts = 0\n                clusts = []\n    if clusts:\n        ## write remaining to file\n        with fopen(tmpfile.format(idx), 'a') as tmp:\n            #tmp.write(\"\\n//\\n//\\n\" + (\"\\n//\\n//\\n\".join(clusts)))\n            tmp.write(\"\\n//\\n//\\n\".join(clusts)+\"\\n//\\n//\\n\")\n        clusts = []\n    \n    if not data.paramsdict[\"assembly_method\"] == \"denovo+reference\":\n        chunkfiles = glob.glob(os.path.join(data.tmpdir, sample.name+\"_chunk_*.ali\"))\n        LOGGER.info(\"created chunks %s\", chunkfiles)\n\n    ## cleanup\n    samfile.close()"}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\nruns bedtools to get all overlapping regions. Pass this list into the func 'get_overlapping_reads' which will write fastq chunks to the clust.gz file. 1) Run bedtools merge to get a list of all contiguous blocks of bases in the reference sequence where one or more of our reads overlap. The output will look like this: 1 45230754 45230783 1 74956568 74956596 ... 1 116202035 116202060", "response": "def ref_muscle_chunker(data, sample):\n    \"\"\" \n    Run bedtools to get all overlapping regions. Pass this list into the func\n    'get_overlapping_reads' which will write fastq chunks to the clust.gz file.\n\n    1) Run bedtools merge to get a list of all contiguous blocks of bases\n    in the reference sequence where one or more of our reads overlap.\n    The output will look like this:\n            1       45230754        45230783\n            1       74956568        74956596\n            ...\n            1       116202035       116202060\n    \"\"\"\n    LOGGER.info('entering ref_muscle_chunker')\n\n    ## Get regions, which will be a giant list of 5-tuples, of which we're \n    ## only really interested in the first three: (chrom, start, end) position.\n    regions = bedtools_merge(data, sample)\n\n    if len(regions) > 0:\n        ## this calls bam_region_to_fasta a billion times\n        get_overlapping_reads(data, sample, regions)\n    else:\n        msg = \"No reads mapped to reference sequence - {}\".format(sample.name)\n        LOGGER.warning(msg)"}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\ntake merged file from vsearch derep and splits it into R1 and R2 parts.", "response": "def split_merged_reads(outhandles, input_derep):\n    \"\"\"\n    Takes merged/concat derep file from vsearch derep and splits it back into \n    separate R1 and R2 parts. \n    - outhandles: a list of the two file paths to write out to.\n    - input_derep: the path to the input merged reads\n    \"\"\"\n\n    handle1, handle2 = outhandles\n    splitderep1 = open(handle1, 'w')\n    splitderep2 = open(handle2, 'w')\n\n    with open(input_derep, 'r') as infile: \n        ## Read in the infile two lines at a time: (seqname, sequence)\n        ## (Python 3: the builtin zip is lazy, so itertools.izip is not needed)\n        duo = zip(*[iter(infile)]*2)\n\n        ## lists for storing results until ready to write\n        split1s = []\n        split2s = []\n\n        ## iterate over input splitting, saving, and writing.\n        idx = 0\n        while 1:\n            try:\n                itera = next(duo)\n            except StopIteration:\n                break\n            ## split the duo into separate parts and inc counter\n            part1, part2 = itera[1].split(\"nnnn\")\n            idx += 1\n\n            ## R1 needs a newline, but R2 inherits it from the original file\n            ## store parts in lists until ready to write\n            split1s.append(\"{}{}\\n\".format(itera[0], part1))\n            split2s.append(\"{}{}\".format(itera[0], part2))\n\n            ## if large enough then write to file\n            if not idx % 10000:\n                splitderep1.write(\"\".join(split1s))\n                splitderep2.write(\"\".join(split2s))\n                split1s = []\n                split2s = [] \n\n    ## write final chunk if there is any\n    if any(split1s):\n        splitderep1.write(\"\".join(split1s))\n        splitderep2.write(\"\".join(split2s))\n\n    ## close handles\n    splitderep1.close()\n    splitderep2.close()"}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\ncheck mean insert size for this sample and update hackersonly. max_inner_mate_distance controls how far apart mate pairs can be considered for bedtools merging downstream.", "response": "def check_insert_size(data, sample):\n    \"\"\"\n    check mean insert size for this sample and update \n    hackersonly.max_inner_mate_distance if need be. This value controls how \n    far apart mate pairs can be to still be considered for bedtools merging \n    downstream.\n    \"\"\"\n\n    ## pipe stats output to grep\n    cmd1 = [ipyrad.bins.samtools, \"stats\", sample.files.mapped_reads]\n    cmd2 = [\"grep\", \"SN\"]\n    proc1 = sps.Popen(cmd1, stderr=sps.STDOUT, stdout=sps.PIPE)\n    proc2 = sps.Popen(cmd2, stderr=sps.STDOUT, stdout=sps.PIPE, stdin=proc1.stdout)\n\n    ## get piped result\n    res = proc2.communicate()[0]\n\n    ## raise exception on failure and do cleanup\n    if proc2.returncode:\n        raise IPyradWarningExit(\"error in %s: %s\", cmd2, res)\n        \n    ## starting vals\n    avg_insert = 0\n    stdv_insert = 0\n    avg_len = 0\n\n    ## iterate over results\n    for line in res.split(\"\\n\"):\n        if \"insert size average\" in line:\n            avg_insert = float(line.split(\":\")[-1].strip())\n\n        elif \"insert size standard deviation\" in line:\n            ## hack to fix sim data when stdv is 0.0. Shouldn't\n            ## impact real data bcz stdv gets rounded up below\n            stdv_insert = float(line.split(\":\")[-1].strip()) + 0.1\n       \n        elif \"average length\" in line:\n            avg_len = float(line.split(\":\")[-1].strip())\n\n    LOGGER.debug(\"avg {} stdv {} avg_len {}\"\\\n                 .format(avg_insert, stdv_insert, avg_len))\n\n    ## If all values return successfully set the max inner mate distance.\n    ## This is tricky. avg_insert is the average length of R1+R2+inner mate\n    ## distance. 
avg_len is the average length of a read. If there are lots\n    ## of reads that overlap then avg_insert will be close to but bigger than\n    ## avg_len. We are looking for the right value for `bedtools merge -d`\n    ## which wants to know the max distance between reads. \n    if all([avg_insert, stdv_insert, avg_len]):\n        ## If 2 * the average length of a read is less than the average\n        ## insert size then most reads DO NOT overlap\n        if stdv_insert < 5:\n            stdv_insert = 5.\n        if (2 * avg_len) < avg_insert:\n            hack = avg_insert + (3 * np.math.ceil(stdv_insert)) - (2 * avg_len)\n\n        ## If it is > than the average insert size then most reads DO\n        ## overlap, so we have to calculate inner mate distance a little \n        ## differently.\n        else:\n            hack = (avg_insert - avg_len) + (3 * np.math.ceil(stdv_insert))\n            \n\n        ## set the hackerdict value\n        LOGGER.info(\"stdv: hacked insert size is %s\", hack)\n        data._hackersonly[\"max_inner_mate_distance\"] = int(np.math.ceil(hack))\n\n    else:\n        ## If something fsck then set a relatively conservative distance\n        data._hackersonly[\"max_inner_mate_distance\"] = 300\n        LOGGER.debug(\"inner mate distance for {} - {}\".format(sample.name,\\\n                    data._hackersonly[\"max_inner_mate_distance\"]))"}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nget all contiguous genomic regions with one or more overlapping reads. This is the shell command we'll eventually run bedtools bamtobed -i 1A_0.sorted.bam | bedtools merge [-d 100] -i <input_bam> : specifies the input file to bed'ize -d <int> : For PE set max distance between reads", "response": "def bedtools_merge(data, sample):\n    \"\"\" \n    Get all contiguous genomic regions with one or more overlapping\n    reads. This is the shell command we'll eventually run\n\n    bedtools bamtobed -i 1A_0.sorted.bam | bedtools merge [-d 100]\n        -i <input_bam>  :   specifies the input file to bed'ize\n        -d <int>        :   For PE set max distance between reads\n    \"\"\"\n    LOGGER.info(\"Entering bedtools_merge: %s\", sample.name)\n    mappedreads = os.path.join(data.dirs.refmapping, \n                               sample.name+\"-mapped-sorted.bam\")\n\n    ## command to call `bedtools bamtobed`, and pipe output to stdout\n    ## Usage:   bedtools bamtobed [OPTIONS] -i <bam> \n    ## Usage:   bedtools merge [OPTIONS] -i <bam> \n    cmd1 = [ipyrad.bins.bedtools, \"bamtobed\", \"-i\", mappedreads]\n    cmd2 = [ipyrad.bins.bedtools, \"merge\", \"-i\", \"-\"]\n\n    ## If PE the -d flag to tell bedtools how far apart to allow mate pairs.\n    ## If SE the -d flag is negative, specifying that SE reads need to\n    ## overlap by at least a specific number of bp. This prevents the\n    ## stairstep syndrome when a + and - read are both extending from\n    ## the same cutsite. 
Passing a negative number to `merge -d` gets this done.\n    if 'pair' in data.paramsdict[\"datatype\"]:\n        check_insert_size(data, sample)\n        #cmd2.insert(2, str(data._hackersonly[\"max_inner_mate_distance\"]))\n        cmd2.insert(2, str(data._hackersonly[\"max_inner_mate_distance\"]))\n        cmd2.insert(2, \"-d\")\n    else:\n        cmd2.insert(2, str(-1 * data._hackersonly[\"min_SE_refmap_overlap\"]))\n        cmd2.insert(2, \"-d\")\n\n    ## pipe output from bamtobed into merge\n    LOGGER.info(\"stdv: bedtools merge cmds: %s %s\", cmd1, cmd2)\n    proc1 = sps.Popen(cmd1, stderr=sps.STDOUT, stdout=sps.PIPE)\n    proc2 = sps.Popen(cmd2, stderr=sps.STDOUT, stdout=sps.PIPE, stdin=proc1.stdout)\n    result = proc2.communicate()[0]\n    proc1.stdout.close()\n\n    ## check for errors and do cleanup\n    if proc2.returncode:\n        raise IPyradWarningExit(\"error in %s: %s\", cmd2, result)\n\n    ## Write the bedfile out, because it's useful sometimes.\n    if os.path.exists(ipyrad.__debugflag__):\n        with open(os.path.join(data.dirs.refmapping, sample.name + \".bed\"), 'w') as outfile:\n            outfile.write(result)\n\n    ## Report the number of regions we're returning\n    nregions = len(result.strip().split(\"\\n\"))\n    LOGGER.info(\"bedtools_merge: Got # regions: %s\", nregions)\n    return result"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef trim_reference_sequence(fasta):\n    LOGGER.debug(\"pre - {}\".format(fasta[0]))\n\n    ## If the reads are merged then the reference sequence should be the\n    ## same length as the merged pair. If unmerged then we have to fix it.\n    if \"nnnn\" in fasta[1]:\n        r1_len = len(fasta[1].split(\"\\n\")[1].split(\"nnnn\")[0])\n        r2_len = len(fasta[1].split(\"\\n\")[1].split(\"nnnn\")[1])\n        new_seq = fasta[0].split(\"\\n\")[1][:r1_len]+(\"nnnn\")\\\n                    + revcomp(fasta[0].split(\"\\n\")[1][-r2_len:])\n        fasta[0] = fasta[0].split(\"\\n\")[0]+\"\\n\"+new_seq\n\n    LOGGER.debug(\"post - {}\".format(fasta[0]))\n    return fasta", "response": "This function trims the reference sequence to the length of the merged sequence."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef bam_region_to_fasta(data, sample, proc1, chrom, region_start, region_end):\n\n    ## output bam file handle for storing genome regions\n    bamf = sample.files.mapped_reads\n    if not os.path.exists(bamf):\n        raise IPyradWarningExit(\"  file not found - %s\", bamf)\n\n    # chrom = re.escape(repr(chrom))[1:-1].replace('\\\\\\\\', '\\\\')\n    #LOGGER.info(\"before: %s\", chrom)\n    chrom.replace(\"|\", r\"\\|\")\n    chrom.replace(\"(\", r\"\\(\")\n    chrom.replace(\")\", r\"\\)\")\n    #LOGGER.info(\"after: %s\", chrom)\n\n    ## What we want to do is have the num-chrom dict as an arg, then build this\n    ## string as three ints [chrom-int, pos-start, pos-end]\n    #cint = cdict[chrom]\n    #cpstring = \"__{}_{}_{}__\".format(cint, int(region_start)+1, region_end)\n\n    ## a string argument as input to commands, indexed at either 0 or 1, \n    ## and with pipe characters removed from chromo names\n    ## rstring_id1 is for fetching the reference sequence bcz faidx is\n    ## 1 indexed\n    rstring_id1 = \"{}:{}-{}\"\\\n        .format(chrom, str(int(region_start)+1), region_end)\n\n    ## rstring_id0 is just for printing out the reference CHROM/POS\n    ## in the read name\n    rstring_id0 = \"{}:{}-{}\"\\\n        .format(chrom, region_start, region_end)\n\n    ## If SE then we enforce the minimum overlap distance to avoid the \n    ## staircase syndrome of multiple reads overlapping just a little.\n    overlap_buffer = 0\n    if not \"pair\" in data.paramsdict[\"datatype\"]:\n        overlap_buffer = data._hackersonly[\"min_SE_refmap_overlap\"]\n\n    ## rstring_id0_buffered is for samtools view. 
We have to play patty\n    ## cake here with the two rstring_id0s because we want `view` to \n    ## enforce the buffer for SE, but we want the reference sequence\n    ## start and end positions to print correctly for downstream.\n    rstring_id0_buffered = \"{}:{}-{}\"\\\n        .format(chrom, int(region_start) + overlap_buffer,\\\n                int(region_end) - overlap_buffer)\n\n    ## The \"samtools faidx\" command will grab this region from reference \n    ## which we'll paste in at the top of each stack to aid alignment.\n    cmd1 = [ipyrad.bins.samtools, \"faidx\", \n            data.paramsdict[\"reference_sequence\"], \n            rstring_id1, \" ; echo __done__\"]\n\n    ## Call the command, I found that it doesn't work with shell=False if \n    ## the refstring is 'MT':100-200', but it works if it is MT:100-200. \n    LOGGER.info(\"Grabbing bam_region_to_fasta:\\n {}\".format(cmd1))\n    #proc1 = sps.Popen(cmd1, stderr=sps.STDOUT, stdout=sps.PIPE)\n    #ref = proc1.communicate()[0]\n    #if proc1.returncode:\n    #    raise IPyradWarningExit(\"  error in %s: %s\", cmd1, ref)\n\n    ## push the samtools faidx command to our subprocess, then accumulate\n    ## the results from stdout\n    print(\" \".join(cmd1), file=proc1.stdin)\n    ref = \"\"\n    for line in iter(proc1.stdout.readline, \"//\\n\"):\n        if \"__done__\" in line:\n            break\n        ref += line\n\n    ## initialize the fasta list.\n    fasta = []\n\n    ## parse sam to fasta. 
Save ref location to name.\n    ## Set size= an improbably large value so the REF sequence\n    ## sorts to the top for muscle aligning.\n    try:\n        name, seq = ref.strip().split(\"\\n\", 1)\n        seq = \"\".join(seq.split(\"\\n\"))\n        fasta = [\"{}_REF;size={};+\\n{}\".format(name, 1000000, seq)]\n    except ValueError as inst:\n        LOGGER.error(\"ref failed to parse - {}\".format(ref))\n        LOGGER.error(\" \".join(cmd1))\n\n    ## if PE then you have to merge the reads here\n    if \"pair\" in data.paramsdict[\"datatype\"]:\n        ## PE R1 can either be on the forward or the reverse strand.\n        ## Samtools view always outputs reads with respect to the\n        ## forward strand. This means that reads with R1 on reverse\n        ## end up with the R1 and R2 reference sequences swapped\n        ## in the clust.gz file. There is a way to fix it but it's\n        ## very annoying and i'm not sure if it's worth it...\n        ## Drop the reference sequence for now...\n        ##\n        ## If you ever fix this be sure to remove the reference sequence\n        ## from each cluster post alignment in cluster_within/align_and_parse()\n        fasta = []\n\n        ## Create temporary files for R1, R2 and merged, which we will pass to\n        ## the function merge_pairs() which calls vsearch to test merging.\n        ##\n        ## If you are on linux then creating the temp files in /dev/shm\n        ## should improve performance\n        if os.path.exists(\"/dev/shm\"):\n            prefix = os.path.join(\"/dev/shm\",\n                            \"{}-{}\".format(sample.name, rstring_id0))\n        else:\n            prefix = os.path.join(data.dirs.refmapping, \n                            \"{}-{}\".format(sample.name, rstring_id0))\n        read1 = \"{}-R1\".format(prefix)\n        read2 = \"{}-R2\".format(prefix)\n        merged = \"{}-merged\".format(prefix)\n\n        ## Grab all the reads that map to this genomic location and dump\n   
     ## fastq to R1 and R2 files.\n        ## `-v 45` sets the default qscore to something high\n        cmd1 = \" \".join([ipyrad.bins.samtools, \"view\", \"-b\", bamf, rstring_id0])\n        cmd2 = \" \".join([ipyrad.bins.samtools, \"bam2fq\", \"-v\", \"45\", \"-1\", read1, \"-2\", read2, \"-\", \"; echo __done__\"])\n        cmd = \" | \".join([cmd1, cmd2])\n\n        print(cmd, file=proc1.stdin)\n        for line in iter(proc1.stdout.readline, \"//\\n\"):\n            if \"__done__\" in line:\n                break\n\n        ## run commands, pipe 1 -> 2, then cleanup\n        ## proc1 = sps.Popen(cmd1, stderr=sps.STDOUT, stdout=sps.PIPE)\n        ## proc2 = sps.Popen(cmd2, stderr=sps.STDOUT, stdout=sps.PIPE, stdin=proc1.stdout)\n        ## res = proc2.communicate()[0]\n        ## if proc2.returncode:\n        ##     raise IPyradWarningExit(\"error {}: {}\".format(cmd2, res))\n        ## proc1.stdout.close()\n\n        ## merge the pairs. 0 means don't revcomp bcz samtools already\n        ## did it for us. 
1 means \"actually merge\".\n        try:\n            ## return number of merged reads, writes merged data to 'merged'\n            ## we don't yet do anything with the returned number of merged \n            _ = merge_pairs(data, [(read1, read2)], merged, 0, 1)\n\n            with open(merged, 'r') as infile:\n                quatro = itertools.izip(*[iter(infile)]*4)\n                while 1:\n                    try:\n                        bits = quatro.next()\n                    except StopIteration:\n                        break\n                    ## TODO: figure out a real way to get orientation for PE\n                    orient = \"+\"\n                    fullfast = \">{a};{b};{c};{d}\\n{e}\".format(\n                        a=bits[0].split(\";\")[0],\n                        b=rstring_id1,\n                        c=bits[0].split(\";\")[1], \n                        d=orient,\n                        e=bits[1].strip())\n                    #,e=bits[9])\n                    fasta.append(fullfast)\n\n                ## TODO: If you ever figure out a good way to get the reference\n                ## sequence included w/ PE then this commented call is useful\n                ## for trimming the reference sequence to be the right length.\n                ## If doing PE and R1/R2 don't overlap then the reference sequence\n                ## will be quite long and will cause indel hell during the \n                ## alignment stage. Here trim the reference sequence to the length\n                ## of the merged reads.\n                ## This is commented out because we aren't currently including the\n                ## ref seq for PE alignment.\n                #fasta = trim_reference_sequence(fasta)\n\n        except (OSError, ValueError, IPyradError) as inst:\n            ## ValueError raised inside merge_pairs() if it can't open one\n            ## or both of the files. 
Write this out, but ignore for now.\n            ## Failed merging, probably unequal number of reads in R1 and R2\n            ## IPyradError raised if merge_pairs can't read either R1 or R2\n            ## file.\n            ## Skip this locus?\n            LOGGER.debug(\"Failed to merge reads, continuing; %s\", inst)\n            LOGGER.error(\"cmd - {}\".format(cmd))\n            return \"\"\n        finally:\n            ## Only clean up the files if they exist otherwise it'll raise.\n            if os.path.exists(merged):\n                os.remove(merged)\n            if os.path.exists(read1):\n                os.remove(read1)\n            if os.path.exists(read2):\n                os.remove(read2)\n       \n    else:\n        try:\n            ## SE if faster than PE cuz it skips writing intermedidate files\n            ## rstring_id0_buffered is going to enforce the required \n            ## min_SE_refmap_overlap on either end of this region.\n            cmd2 = [ipyrad.bins.samtools, \"view\", bamf, rstring_id0_buffered]\n            proc2 = sps.Popen(cmd2, stderr=sps.STDOUT, stdout=sps.PIPE)\n    \n            ## run and check outputs\n            res = proc2.communicate()[0]\n            if proc2.returncode:\n                raise IPyradWarningExit(\"{} {}\".format(cmd2, res))\n\n            ## if the region string is malformated you'll get back a warning\n            ## from samtools\n            if \"[main_samview]\" in res:\n                raise IPyradError(\"Bad reference region {}\".format(rstring_id0_buffered))\n\n            ## do not join seqs that\n            for line in res.strip().split(\"\\n\"):\n                bits = line.split(\"\\t\")\n\n                ## Read in the 2nd field (FLAGS), convert to binary\n                ## and test if the 7th bit is set which indicates revcomp\n                orient = \"+\"\n                if int('{0:012b}'.format(int(bits[1]))[7]):\n                    orient = \"-\"\n                    ## Don't 
actually revcomp the sequence because samtools\n                    ## writes out reference sequence on the forward strand\n                    ## as well as reverse strand hits from the bam file.\n                    #bits[9] = revcomp(bits[9])\n\n                ## Rip insert the mapping position between the seq label and\n                ## the vsearch derep size.\n                fullfast = \">{a};{b};{c};{d}\\n{e}\".format(\n                    a=bits[0].split(\";\")[0],\n                    b=rstring_id0,\n                    c=bits[0].split(\";\")[1],\n                    d=orient,\n                    e=bits[9])\n                fasta.append(fullfast)\n        except IPyradError as inst:\n            ## If the mapped fragment is too short then the you'll get\n            ## regions that look like this: scaffold262:299039-299036\n            ## Just carry on, it's not a big deal.\n            LOGGER.debug(\"Got a bad region string: {}\".format(inst))\n            return \"\"\n        except (OSError, ValueError, Exception) as inst:\n            ## Once in a blue moon something fsck and it breaks the\n            ## assembly. No reason to give up if .001% of reads fail\n            ## so just skip this locus.\n            LOGGER.error(\"Failed get reads at a locus, continuing; %s\", inst)\n            LOGGER.error(\"cmd - {}\".format(cmd2))\n            return \"\"\n\n    return \"\\n\".join(fasta)", "response": "Take a chromosome position with start and end bases and return, as a FASTA-formatted string, the sequences of all reads that overlap that region."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef refmap_stats(data, sample):\n    ## shorter names\n    mapf = os.path.join(data.dirs.refmapping, sample.name+\"-mapped-sorted.bam\")\n    umapf = os.path.join(data.dirs.refmapping, sample.name+\"-unmapped.bam\")\n\n    ## get from unmapped\n    cmd1 = [ipyrad.bins.samtools, \"flagstat\", umapf]\n    proc1 = sps.Popen(cmd1, stderr=sps.STDOUT, stdout=sps.PIPE)\n    result1 = proc1.communicate()[0]\n\n    ## get from mapped\n    cmd2 = [ipyrad.bins.samtools, \"flagstat\", mapf]\n    proc2 = sps.Popen(cmd2, stderr=sps.STDOUT, stdout=sps.PIPE)\n    result2 = proc2.communicate()[0]\n\n    ## store results\n    ## If PE, samtools reports the _actual_ number of reads mapped, both \n    ## R1 and R2, so here if PE divide the results by 2 to stay consistent\n    ## with how we've been reporting R1 and R2 as one \"read pair\"\n    if \"pair\" in data.paramsdict[\"datatype\"]:\n        sample.stats[\"refseq_unmapped_reads\"] = int(result1.split()[0]) / 2\n        sample.stats[\"refseq_mapped_reads\"] = int(result2.split()[0]) / 2\n    else:\n        sample.stats[\"refseq_unmapped_reads\"] = int(result1.split()[0])\n        sample.stats[\"refseq_mapped_reads\"] = int(result2.split()[0])\n\n    sample_cleanup(data, sample)", "response": "Count the mapped and unmapped reads for a sample with samtools flagstat and store them in sample.stats (counts are halved for paired-end data)."}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\ninitializes the refmap files for the given sample", "response": "def refmap_init(data, sample, force):\n    \"\"\" create some file handles for refmapping \"\"\"\n    ## make some persistent file handles for the refmap reads files\n    sample.files.unmapped_reads = os.path.join(data.dirs.edits, \n                                  \"{}-refmap_derep.fastq\".format(sample.name))\n    sample.files.mapped_reads = os.path.join(data.dirs.refmapping,\n                                  \"{}-mapped-sorted.bam\".format(sample.name))"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef parse_command_line():\n\n    ## create the parser\n    parser = argparse.ArgumentParser(\n        formatter_class=argparse.RawDescriptionHelpFormatter,\n        epilog=\"\"\"\n  * Example command-line usage ---------------------------------------------- \n\n  * Read in sequence/SNP data file, provide linkage, output name, ambig option. \n     tetrad -s data.snps.phy -n test             ## input phylip and give name\n     tetrad -s data.snps.phy -l data.snps.map    ## sample one SNP per locus\n     tetrad -s data.snps.phy -n noambigs -r 0    ## do not use hetero sites\n\n  * Load saved/checkpointed analysis from '.tet.json' file, or force restart. \n     tetrad -j test.tet.json -b 100         ## continue 'test' until 100 boots\n     tetrad -j test.tet.json -b 100 -f      ## force restart of 'test'\n\n  * Sampling modes: 'equal' uses guide tree to sample quartets more efficiently \n     tetrad -s data.snps.phy -m all                       ## sample all quartets\n     tetrad -s data.snps.phy -m random -q 1e6 -x 123      ## sample 1M randomly\n     tetrad -s data.snps.phy -m equal -q 1e6 -t guide.tre ## sample 1M across tree\n\n  * Connect to N cores on a computer (default without -c arg is to use all avail.)\n     tetrad -s data.snps.phy -c 20\n\n  * Start an MPI cluster to connect to nodes across multiple available hosts.\n     tetrad -s data.snps.phy --MPI     \n\n  * Connect to a manually started ipcluster instance with default or named profile\n     tetrad -s data.snps.phy --ipcluster        ## connects to default profile\n     tetrad -s data.snps.phy --ipcluster pname  ## connects to profile='pname'\n\n  * Further documentation: http://ipyrad.readthedocs.io/analysis.html\n    \"\"\")\n\n\n    ## get version from ipyrad \n    ipyversion = str(pkg_resources.get_distribution('ipyrad'))\n    parser.add_argument('-v', '--version', action='version', \n       
 version=\"tetrad \"+ipyversion.split()[1])\n\n    parser.add_argument('-f', \"--force\", action='store_true',\n        help=\"force overwrite of existing data\")\n\n    parser.add_argument('-s', metavar=\"seq\", dest=\"seq\",\n        type=str, default=None,\n        help=\"path to input phylip file (only SNPs)\")\n\n    parser.add_argument('-j', metavar='json', dest=\"json\",\n        type=str, default=None,\n        help=\"load checkpointed/saved analysis from JSON file.\")\n\n    parser.add_argument('-m', metavar=\"method\", dest=\"method\",\n        type=str, default=\"all\",\n        help=\"method for sampling quartets (all, random, or equal)\")\n\n    parser.add_argument('-q', metavar=\"nquartets\", dest=\"nquartets\",\n        type=int, default=0,\n        help=\"number of quartets to sample (if not -m all)\")\n\n    parser.add_argument('-b', metavar=\"boots\", dest=\"boots\",\n        type=int, default=0,\n        help=\"number of non-parametric bootstrap replicates\")\n\n    parser.add_argument('-l', metavar=\"map_file\", dest=\"map\",\n        type=str, default=None,\n        help=\"map file of snp linkages (e.g., ipyrad .snps.map)\")\n\n    parser.add_argument('-r', metavar=\"resolve\", dest='resolve', \n        type=int, default=1, \n        help=\"randomly resolve heterozygous sites (default=1)\")\n\n    parser.add_argument('-n', metavar=\"name\", dest=\"name\",\n        type=str, default=\"test\",\n        help=\"output name prefix (default: 'test')\")\n\n    parser.add_argument('-o', metavar=\"workdir\", dest=\"workdir\",\n        type=str, default=\"./analysis-tetrad\",\n        help=\"output directory (default: creates ./analysis-tetrad)\")\n\n    parser.add_argument('-t', metavar=\"starting_tree\", dest=\"tree\",\n        type=str, default=None,\n        help=\"newick file starting tree for equal splits sampling\")\n\n    parser.add_argument(\"-c\", metavar=\"CPUs/cores\", dest=\"cores\",\n        type=int, default=0,\n        help=\"setting -c 
improves parallel efficiency with --MPI\")\n\n    parser.add_argument(\"-x\", metavar=\"random_seed\", dest=\"rseed\",\n        type=int, default=None,\n        help=\"random seed for quartet sampling and/or bootstrapping\")    \n\n    parser.add_argument('-d', \"--debug\", action='store_true',\n        help=\"print lots more info to debugger: ipyrad_log.txt.\")\n\n    parser.add_argument(\"--MPI\", action='store_true',\n        help=\"connect to parallel CPUs across multiple nodes\")\n\n    parser.add_argument(\"--invariants\", action='store_true',\n        help=\"save a (large) database of all invariants\")\n\n    parser.add_argument(\"--ipcluster\", metavar=\"ipcluster\", dest=\"ipcluster\",\n        type=str, nargs=\"?\", const=\"default\",\n        help=\"connect to ipcluster profile (default: 'default')\")\n\n    ## if no args then return help message\n    if len(sys.argv) == 1:\n        parser.print_help()\n        sys.exit(1)\n\n    ## parse args\n    args = parser.parse_args()\n\n    ## RAISE errors right away for some bad argument combinations:\n    if args.method not in [\"random\", \"equal\", \"all\"]:\n        raise IPyradWarningExit(\"  method argument (-m) must be one of\"+\\\n        \"\"\" \"all\", \"random\", or \"equal.\\n\"\"\")\n\n    ## if 'random' require nquarts argument\n    #if args.method == 'random':\n    #    if not args.nquartets:\n    #        raise IPyradWarningExit(\\\n    #        \"  Number of quartets (-q) is required with method = random\\n\")\n\n    ## if 'equal' method require starting tree and nquarts\n    # if args.method == 'equal':\n    #     raise IPyradWarningExit(\\\n    #         \"  The equal sampling method is currently for developers only.\\n\")\n    #     if not args.nquartets:\n    #         raise IPyradWarningExit(\\\n    #         \"  Number of quartets (-q) is required with method = equal\\n\")\n    #     if not args.tree:\n    #         raise IPyradWarningExit(\\\n    #         \"  Input guide tree (-t) is 
required with method = equal\\n\")\n\n    ## required args\n    if not any(x in [\"seq\", \"json\"] for x in vars(args).keys()):\n        print(\"\"\"\n    Bad arguments: tetrad command must include at least one of (-s or -j) \n    \"\"\")\n        parser.print_help()\n        sys.exit(1)\n\n    return args", "response": "Parse command-line arguments for the tetrad program and return the parsed args."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef main():\n\n    ## parse params file input (returns to stdout if --help or --version)\n    args = parse_command_line()\n    print(HEADER.format(ip.__version__))\n\n    ## set random seed\n    np.random.seed(args.rseed)\n\n    ## debugger----------------------------------------\n    if os.path.exists(ip.__debugflag__):\n        os.remove(ip.__debugflag__)\n    if args.debug:\n        print(\"\\n  ** Enabling debug mode ** \")\n        ip._debug_on()\n\n    ## if JSON, load existing Tetrad analysis -----------------------\n    if args.json:\n        data = ipa.tetrad(name=args.name, workdir=args.workdir, load=True)\n        ## if force then remove all results\n        if args.force:\n            data._refresh()\n\n    ## else create a new tmp assembly for the seqarray-----------------\n    else:\n        ## create new Tetrad class Object if it doesn't exist\n        newjson = os.path.join(args.workdir, args.name+'.tet.json')\n        ## if not quiet...\n        print(\"tetrad instance: {}\".format(args.name))\n\n        if (not os.path.exists(newjson)) or args.force:\n            ## purge any files associated with this name if forced\n            if args.force:\n                ## init an object in the correct location just to refresh\n                ipa.tetrad(name=args.name, \n                           workdir=args.workdir, \n                           data=args.seq, \n                           initarr=False, \n                           save_invariants=args.invariants,\n                           cli=True,\n                           quiet=True)._refresh()\n\n            ## create new tetrad object\n            data = ipa.tetrad(name=args.name, \n                              workdir=args.workdir, \n                              method=args.method, \n                              data=args.seq, \n                              resolve=args.resolve,\n        
                      mapfile=args.map, \n                              guidetree=args.tree, \n                              nboots=args.boots, \n                              nquartets=args.nquartets, \n                              cli=True,\n                              save_invariants=args.invariants,\n                              )\n        else:\n            raise SystemExit(QUARTET_EXISTS\\\n            .format(args.name, args.workdir, args.workdir, args.name, args.name))\n\n    ## boots can be set either for a new object or loaded JSON to continue it\n    if args.boots:\n        data.params.nboots = int(args.boots)\n\n    ## if ipyclient is running (and matched profile) then use that one\n    if args.ipcluster:\n        ipyclient = ipp.Client(profile=args.ipcluster)\n        data._ipcluster[\"cores\"] = len(ipyclient)\n\n    ## if not then we need to register and launch an ipcluster instance\n    else:\n        ## set CLI ipcluster terms\n        ipyclient = None\n        data._ipcluster[\"cores\"] = args.cores if args.cores else detect_cpus()\n        data._ipcluster[\"engines\"] = \"Local\"\n        if args.MPI:\n            data._ipcluster[\"engines\"] = \"MPI\"\n            if not args.cores:\n                raise IPyradWarningExit(\"must provide -c argument with --MPI\")\n        ## register to have a cluster-id with \"ip- name\"\n        data = register_ipcluster(data)\n\n    ## message about whether we are continuing from existing\n    if data.checkpoint.boots:\n        print(LOADING_MESSAGE.format(\n            data.name, data.params.method, data.checkpoint.boots))\n\n    ## run tetrad main function within a wrapper. The wrapper creates an \n    ## ipyclient view and appends to the list of arguments to run 'run'. \n    data.run(force=args.force, ipyclient=ipyclient)", "response": "main function for the sequence array analysis"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef _command_list(self):\n\n        ## base args\n        cmd = [self.params.binary, \n                \"-i\", OPJ(self.workdir, self.name+\".treemix.in.gz\"),\n                \"-o\", OPJ(self.workdir, self.name),\n                ]\n\n        ## addon params\n        args = []\n        for key, val in self.params:\n            if key not in [\"minmap\", \"binary\"]:\n                if key == \"g\":\n                    if val[0]:\n                        args += [\"-\"+key, str(val[0]), str(val[1])]\n                elif key == \"global_\":\n                    if val:\n                        args += [\"-\"+key[:-1]]\n                elif key in [\"se\", \"global\", \"noss\"]: \n                    if val:\n                        args += [\"-\"+key]\n                else:\n                    if val:\n                        args += [\"-\"+key, str(val)]\n\n        return cmd+args", "response": "Build the treemix command as a list of arguments from the object's parameters."}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nreturn a subsample of unlinked snp sites", "response": "def _subsample(self):\n        \"\"\" returns a subsample of unlinked snp sites \"\"\"\n        spans = self.maparr\n        samp = np.zeros(spans.shape[0], dtype=np.uint64)\n        for i in xrange(spans.shape[0]):\n            samp[i] = np.random.randint(spans[i, 0], spans[i, 1], 1)\n        return samp"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef copy(self, name):\n\n        ## make deepcopy of self.__dict__ but do not copy async objects\n        subdict = {i:j for i, j in self.__dict__.iteritems() if i != \"asyncs\"}\n        newdict = copy.deepcopy(subdict)\n\n        ## make back into a Treemix object\n        #if name == self.name:\n        #    raise Exception(\"new object name must be different from its parent\")\n\n        newobj = Treemix(\n            data=newdict[\"data\"],\n            name=name,\n            workdir=newdict[\"workdir\"],\n            imap={i:j for i, j in newdict[\"imap\"].items()},\n            mapfile=newdict['mapfile'],\n            minmap={i:j for i, j in newdict[\"minmap\"].items()},\n            seed=np.random.randint(0, int(1e9)),\n            )\n\n        ## update special dict attributes but not files\n        for key, val in newobj.params.__dict__.iteritems():\n            newobj.params.__setattr__(key, self.params.__getattribute__(key))\n        #for key, val in newobj.filters.__dict__.iteritems():\n        #    newobj.filters.__setattr__(key, self.filters.__getattribute__(key))\n\n        ## new object must have a different name than it's parent\n        return newobj", "response": "Returns a copy of the treemix object with the same parameter settings\n        but with the files attributes cleared and a new name attribute."}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nreturn a toyplot. axes object.", "response": "def draw(self, axes):\n        \"\"\"\n        Returns a treemix plot on a toyplot.axes object. \n        \"\"\"            \n        ## create a toytree object from the treemix tree result\n        tre = toytree.tree(newick=self.results.tree)\n        tre.draw(\n            axes=axes,\n            use_edge_lengths=True,\n            tree_style='c',\n            tip_labels_align=True,\n            edge_align_style={\"stroke-width\": 1}\n        );\n\n        ## get coords \n        for admix in self.results.admixture:\n            ## parse admix event\n            pidx, pdist, cidx, cdist, weight = admix\n            a = _get_admix_point(tre, pidx, pdist)\n            b = _get_admix_point(tre, cidx, cdist)\n\n            ## add line for admixture edge\n            mark = axes.plot(\n                a = (a[0], b[0]),\n                b = (a[1], b[1]),\n                style={\"stroke-width\": 10*weight,\n                       \"stroke-opacity\": 0.95, \n                       \"stroke-linecap\": \"round\"}\n            )\n\n            ## add points at admixture sink\n            axes.scatterplot(\n                a = (b[0]),\n                b = (b[1]),\n                size=8,\n                title=\"weight: {}\".format(weight),\n            )\n\n        ## add scale bar for edge lengths\n        axes.y.show=False\n        axes.x.ticks.show=True\n        axes.x.label.text = \"Drift parameter\"\n        return axes"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef _resolveambig(subseq):\n    N = []\n    for col in subseq:\n        rand = np.random.binomial(1, 0.5)\n        N.append([_AMBIGS[i][rand] for i in col])\n    return np.array(N)", "response": "Randomly resolve IUPAC ambiguity (heterozygous) codes to one of their bases."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\ncount the number of PIS in a loci sequence", "response": "def _count_PIS(seqsamp, N):\n    \"\"\" filters for loci with >= N PIS \"\"\"\n    counts = [Counter(col) for col in seqsamp.T if not (\"-\" in col or \"N\" in col)]\n    pis = [i.most_common(2)[1][1] > 1 for i in counts if len(i.most_common(2))>1]\n    if sum(pis) >= N:\n        return sum(pis)\n    else:\n        return 0"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef write_nexus_files(self, force=False, quiet=False):\n\n        ## clear existing files \n        existing = glob.glob(os.path.join(self.workdir, self.name, \"*.nex\"))\n        if any(existing):\n            if force:\n                for rfile in existing:\n                    os.remove(rfile)\n            else:\n                path = os.path.join(self.workdir, self.name)\n                raise IPyradWarningExit(EXISTING_NEX_FILES.format(path))\n\n        ## parse the loci or alleles file\n        with open(self.files.data) as infile:\n            loci = iter(infile.read().strip().split(\"|\\n\"))\n\n        ## use entered samples or parse them from the file\n        if not self.samples:\n            with open(self.files.data) as infile:\n                samples = set((i.split()[0] for i in infile.readlines() \\\n                               if \"//\" not in i))\n        else:\n            samples = set(self.samples)\n\n        ## keep track of how many loci pass filtering\n        totn = len(samples)\n        nloci = 0\n\n        ## this set is just used for matching, then we randomly\n        ## subsample for real within the locus so it varies \n        if self._alleles:\n            msamples = {i+rbin() for i in samples}\n        else:\n            msamples = samples\n\n        ## write subsampled set of loci\n        for loc in loci:\n            ## get names and seqs from locus\n            dat = loc.split(\"\\n\")[:-1]\n            try:\n                names = [i.split()[0] for i in dat]\n                snames = set(names)\n                seqs = np.array([list(i.split()[1]) for i in dat])\n            except IndexError:\n                print(ALLELESBUGFIXED)\n                continue\n\n            ## check name matches\n            if len(snames.intersection(msamples)) == totn:\n\n                ## prune sample names if alleles. 
Done here so it is randomly\n                ## different in every locus which allele is selected from \n                ## each sample (e.g., 0 or 1)\n                if self._alleles:\n                    _samples = [i+rbin() for i in samples]\n                else:\n                    _samples = samples\n\n                ## re-order seqs to be in set order\n                seqsamp = seqs[[names.index(tax) for tax in _samples]]\n\n                ## resolve ambiguities randomly if .loci file otherwise\n                ## sample one of the alleles if .alleles file.\n                if not self._alleles:\n                    seqsamp = _resolveambig(seqsamp)\n\n                ## find parsimony informative sites\n                if _count_PIS(seqsamp, self.params.minsnps):\n                    ## keep the locus\n                    nloci += 1\n\n                    ## remove empty columns given this sampling\n                    copied = seqsamp.copy()\n                    copied[copied == \"-\"] == \"N\"\n                    rmcol = np.all(copied == \"N\", axis=0)\n                    seqsamp = seqsamp[:, ~rmcol]\n\n                    ## write nexus file\n                    if self._alleles:\n                        ## trim off the allele number\n                        samps = [i.rsplit(\"_\", 1)[0] for i in _samples]\n                        mdict = dict(zip(samps, [i.tostring() for i in seqsamp]))\n                    else:\n                        mdict = dict(zip(_samples, [i.tostring() for i in seqsamp]))\n                    self._write_nex(mdict, nloci)\n\n                    ## quit early if using maxloci\n                    if nloci == self.params.maxloci: \n                        break\n\n\n        ## print data size\n        if not quiet:\n            path = os.path.join(self.workdir, self.name)\n            path = path.replace(os.path.expanduser(\"~\"), \"~\")\n            print(\"wrote {} nexus files to {}\".format(nloci, path))", "response": 
"Write a nexus file for each locus that passes filtering into the {workdir}/{name} directory."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\nrunning the nexus bucky analysis on the given set of parameters.", "response": "def run(self, steps=None, ipyclient=None, force=False, quiet=False):\n        \"\"\"\n        Submits an ordered list of jobs to a load-balancer to complete \n        the following tasks, and reports a progress bar:\n        (1) Write nexus files for each locus\n        (2) Run mrBayes on each locus to get a posterior of gene trees\n        (3) Run mbsum (a bucky tool) on the posterior set of trees\n        (4) Run Bucky on the summarized set of trees for all alpha values.\n\n        Parameters:\n        -----------\n        ipyclient (ipyparallel.Client())\n            A connected ipyparallel Client object used to distribute jobs\n        force (bool):\n            Whether to overwrite existing files with the same name and workdir\n            if they exist. Default is False.\n        quiet (bool):\n            Whether to suppress progress information. Default is False.\n        steps (list):\n            A list of integers of steps to perform. This is useful if a \n            job was interrupted, or you created a new bucky object copy, \n            or you wish to run an analysis under a new set of parameters, \n            after having run it once. For example, if you finished running\n            steps 1 and 2 (write nexus files and infer mrbayes posteriors), \n            but you want to rerun steps 3 and 4 with new settings, then you\n            could enter `steps=[3,4]` and also `force=True` to run steps 3 \n            and 4 with a new set of parameters. Default argument is None \n            which means run all steps. 
\n        \"\"\"\n\n        ## require ipyclient\n        if not ipyclient:\n            raise IPyradWarningExit(\"an ipyclient object is required\")\n\n        ## check the steps argument\n        if not steps:\n            steps = [1, 2, 3, 4]\n        if isinstance(steps, (int, str)):\n            steps = [int(i) for i in [steps]]\n        if isinstance(steps, list):\n            if not all(isinstance(i, int) for i in steps):\n                raise IPyradWarningExit(\"steps must be a list of integers\")\n\n        ## run steps ------------------------------------------------------\n        ## TODO: wrap this function so it plays nice when interrupted.\n        if 1 in steps:\n            self.write_nexus_files(force=force, quiet=quiet)\n        if 2 in steps:\n            self.run_mrbayes(force=force, quiet=quiet, ipyclient=ipyclient)\n        if 3 in steps:\n            self.run_mbsum(force=force, quiet=quiet, ipyclient=ipyclient)\n        if 4 in steps:\n            self.run_bucky(force=force, quiet=quiet, ipyclient=ipyclient)\n\n        ## make sure jobs are done if waiting (TODO: maybe make this optional)\n        ipyclient.wait()"}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nwrite a NEX file with a mrbayes analysis block given a set of mcmc arguments.", "response": "def _write_nex(self, mdict, nlocus):\n        \"\"\" \n        function that takes a dictionary mapping names to sequences, \n        and a locus number, and writes it as a NEXUS file with a mrbayes \n        analysis block given a set of mcmc arguments.\n        \"\"\"\n\n        ## create matrix as a string\n        max_name_len = max([len(i) for i in mdict])\n        namestring = \"{:<\" + str(max_name_len+1) + \"} {}\\n\"\n        matrix = \"\"\n        for i in mdict.items():\n            matrix += namestring.format(i[0], i[1])\n\n        ## ensure dir\n        minidir = os.path.realpath(os.path.join(self.workdir, self.name))\n        if not os.path.exists(minidir):\n            os.makedirs(minidir)\n\n        ## write nexus block\n        handle = os.path.join(minidir, \"{}.nex\".format(nlocus))\n        with open(handle, 'w') as outnex:\n            outnex.write(NEXBLOCK.format(**{\n                \"ntax\": len(mdict), \n                \"nchar\": len(mdict.values()[0]), \n                \"matrix\": matrix,\n                \"ngen\": self.params.mb_mcmc_ngen, \n                \"sfreq\": self.params.mb_mcmc_sample_freq, \n                \"burnin\": self.params.mb_mcmc_burnin, \n                }))"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef run_mbsum(self, ipyclient, force=False, quiet=False):\n        minidir = os.path.realpath(os.path.join(self.workdir, self.name))\n        trees1 = glob.glob(os.path.join(minidir, \"*.run1.t\"))\n        trees2 = glob.glob(os.path.join(minidir, \"*.run2.t\"))\n\n        ## clear existing files \n        existing = glob.glob(os.path.join(self.workdir, self.name, \"*.sumt\"))\n        if any(existing):\n            if force:\n                for rfile in existing:\n                    os.remove(rfile)\n            else:\n                path = os.path.join(self.workdir, self.name)\n                raise IPyradWarningExit(EXISTING_SUMT_FILES.format(path))\n\n        ## load balancer\n        lbview = ipyclient.load_balanced_view()\n\n        ## submit each to be processed\n        asyncs = []\n        for tidx in xrange(len(trees1)):\n            rep1 = trees1[tidx]\n            rep2 = trees2[tidx]\n            outname = os.path.join(minidir, str(tidx)+\".sumt\")\n            async = lbview.apply(_call_mbsum, *(rep1, rep2, outname))\n            asyncs.append(async)\n\n        ## track progress\n        start = time.time()\n        printstr = \"[mbsum] sum replicate runs      | {} | \"\n        while 1:\n            ready = [i.ready() for i in asyncs]\n            elapsed = datetime.timedelta(seconds=int(time.time()-start))\n            if not quiet:            \n                progressbar(len(ready), sum(ready), printstr.format(elapsed), spacer=\"\")\n            if len(ready) == sum(ready):\n                if not quiet:\n                    print(\"\")\n                break\n            else:\n                time.sleep(0.1)\n\n        ## check success\n        for async in asyncs:\n            if not async.successful():\n                raise IPyradWarningExit(async.result())", "response": "Sums two replicate mrbayes runs for each locus."}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\nruns the mrbayes block in each nexus file", "response": "def run_mrbayes(self, ipyclient, force=False, quiet=False):\n        \"\"\"\n        calls the mrbayes block in each nexus file.\n        \"\"\"\n\n        ## get all the nexus files for this object\n        minidir = os.path.realpath(os.path.join(self.workdir, self.name))\n        nexus_files = glob.glob(os.path.join(minidir, \"*.nex\"))\n\n        ## clear existing files \n        #existing = glob.glob(os.path.join(self.workdir, self.name, \"*.nex\"))\n        existing = glob.glob(os.path.join(minidir, \"*.nex.*\"))\n        if any(existing):\n            if force:\n                for rfile in existing:\n                    os.remove(rfile)\n            else:\n                raise IPyradWarningExit(EXISTING_NEXdot_FILES.format(minidir))\n\n        ## write new nexus files, or should users do that before this?\n        #self.write_nexus_files(force=True)\n\n        ## load balancer\n        lbview = ipyclient.load_balanced_view()\n\n        ## submit each to be processed\n        asyncs = []\n        for nex in nexus_files:\n            async = lbview.apply(_call_mb, nex)\n            asyncs.append(async)\n\n        ## track progress\n        start = time.time()\n        printstr = \"[mb] infer gene-tree posteriors | {} | \"        \n        while 1:\n            ready = [i.ready() for i in asyncs]\n            elapsed = datetime.timedelta(seconds=int(time.time()-start))\n            if not quiet:            \n                progressbar(len(ready), sum(ready), printstr.format(elapsed), spacer=\"\")\n            if len(ready) == sum(ready):\n                if not quiet:\n                    print(\"\")\n                break\n            else:\n                time.sleep(0.1)\n\n        ## check success\n        for async in asyncs:\n            if not async.successful():\n                raise 
IPyradWarningExit(async.result())"}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\nruns bucky on a set of parameters and stores the result in the ipa. bucky object.", "response": "def run_bucky(self, ipyclient, force=False, quiet=False, subname=False):\n        \"\"\"\n        Runs bucky for a given set of parameters and stores the result \n        to the ipa.bucky object. The results will be stored by default\n        with the name '{name}-{alpha}' unless a argument is passed for\n        'subname' to customize the output name. \n\n        Parameters:\n        -----------\n        subname (str):\n            A custom name prefix for the output files produced by the bucky\n            analysis and output into the {workdir}/{name} directory.\n        force (bool):\n            If True then existing result files with the same name prefix\n            will be overwritten. \n        quiet (bool):\n            If True the progress bars will be suppressed. \n        ipyclient (ipyparallel.Client)\n            An active ipyparallel client to distribute jobs to.\n\n        \"\"\"\n\n        ## check for existing results files\n        minidir = os.path.realpath(os.path.join(self.workdir, self.name))\n        infiles = glob.glob(os.path.join(minidir, \"*.sumt\"))\n        outroot = os.path.realpath(os.path.join(self.workdir, self.name))\n\n        ## build alpha list\n        if isinstance(self.params.bucky_alpha, list):\n            alphas = self.params.bucky_alpha\n        else:\n            alphas = [self.params.bucky_alpha]\n\n        ## load balancer\n        lbview = ipyclient.load_balanced_view()\n\n        ## submit each to be processed\n        asyncs = []\n        for alpha in alphas:\n            pathname = os.path.join(outroot, \"CF-a\"+str(alpha))\n            if (os.path.exists(pathname)) and (force!=True):\n                print(\"BUCKy results already exist for this object at alpha={}\\n\".format(alpha) +\\\n                      \"use force=True to 
overwrite existing results\")\n            else:\n                args = [\n                    alpha, \n                    self.params.bucky_nchains, \n                    self.params.bucky_nreps, \n                    self.params.bucky_niter, \n                    pathname,\n                    infiles]\n                ## `async` is a reserved keyword in Python 3.7+, so use `rasync`\n                rasync = lbview.apply(_call_bucky, *args)\n                asyncs.append(rasync)\n\n        ## track progress\n        start = time.time()\n        printstr = \"[bucky] infer CF posteriors     | {} | \"\n        while 1:\n            ready = [i.ready() for i in asyncs]\n            elapsed = datetime.timedelta(seconds=int(time.time()-start))\n            if not quiet:            \n                progressbar(len(ready), sum(ready), printstr.format(elapsed), spacer=\"\")\n            if len(ready) == sum(ready):\n                if not quiet:\n                    print(\"\")\n                break\n            else:\n                time.sleep(0.1)\n\n        ## check success\n        for rasync in asyncs:\n            if not rasync.successful():\n                raise IPyradWarningExit(rasync.result())"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef _get_samples(self, samples):\n    ## if samples not entered use all samples\n    if not samples:\n        samples = self.samples.keys()\n\n    ## Be nice and allow user to pass in only one sample as a string,\n    ## rather than a one element list. When you make the string into a list\n    ## you have to wrap it in square braces or else list makes a list of\n    ## each character individually.\n    if isinstance(samples, str):\n        samples = list([samples])\n\n    ## if sample keys, replace with sample obj\n    assert isinstance(samples, list), \\\n        \"to subselect samples enter as a list, e.g., [A, B].\"\n    newsamples = [self.samples.get(key) for key in samples \\\n                  if self.samples.get(key)]\n    strnewsamples = [i.name for i in newsamples]\n\n    ## are there any samples that did not make it into the dict?\n    badsamples = set(samples).difference(set(strnewsamples))\n    if badsamples:\n        outstring = \", \".join(badsamples)\n        raise IPyradError(\\\n        \"Unrecognized Sample name(s) not linked to {}: {}\"\\\n        .format(self.name, outstring))\n\n    ## require Samples\n    assert newsamples, \\\n           \"No Samples passed in and none in assembly {}\".format(self.name)\n\n    return newsamples", "response": "Internal function that converts a possibly empty list of sample names into a list of Sample objects, defaulting to all Samples in the Assembly and raising an error for unrecognized names."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef _name_from_file(fname, splitnames, fields):\n    ## allowed extensions\n    file_extensions = [\".gz\", \".fastq\", \".fq\", \".fasta\", \".clustS\", \".consens\"]\n    base, _ = os.path.splitext(os.path.basename(fname))\n\n    ## remove read number from name\n    base = base.replace(\"_R1_.\", \".\")\\\n               .replace(\"_R1_\", \"\")\\\n               .replace(\"_R1.\", \".\")\n\n    ## remove extensions, retains '.' in file names.\n    while 1:\n        tmpb, tmpext = os.path.splitext(base)\n        if tmpext in file_extensions:        \n            base = tmpb\n        else:\n            break\n\n    if fields:\n        namebits = base.split(splitnames)\n        base = []\n        for field in fields:\n            try:\n                base.append(namebits[field])\n            except IndexError:\n                pass\n        base = splitnames.join(base)\n\n    if not base:\n        raise IPyradError(\"\"\"\n    Found invalid/empty filename in link_fastqs. Check splitnames argument.\n    \"\"\")\n\n    return base", "response": "Internal function to derive a sample name from a file name by removing the read-number tag and known file extensions, optionally keeping only selected fields."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef _read_sample_names(fname):\n    try:\n        with open(fname, 'r') as infile:\n            subsamples = [x.split()[0] for x in infile.readlines() if x.strip()]\n\n    except Exception as inst:\n        print(\"Failed to read input file with sample names.\\n{}\".format(inst))\n        raise inst\n\n    return subsamples", "response": "Read in sample names from a plain text file."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef _expander(namepath):\n    if \"~\" in namepath:\n        namepath = os.path.expanduser(namepath)\n    else:\n        namepath = os.path.abspath(namepath)\n    return namepath", "response": "Expand ~, ./ and ../ designators in a path name."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef merge(name, assemblies):\n\n    ## checks\n    assemblies = list(assemblies)\n\n    ## create new Assembly as a branch (deepcopy)\n    merged = assemblies[0].branch(name)\n\n    ## get all sample names from all Assemblies\n    allsamples = set(merged.samples.keys())\n    for iterass in assemblies[1:]:\n        allsamples.update(set(iterass.samples.keys()))\n\n    ## Make sure we have the max of all values for max frag length\n    ## from all merging assemblies.\n    merged._hackersonly[\"max_fragment_length\"] =\\\n        max([x._hackersonly[\"max_fragment_length\"] for x in assemblies])\n\n    ## warning message?\n    warning = 0\n\n    ## iterate over assembly objects, skip first already copied\n    for iterass in assemblies[1:]:\n        ## iterate over allsamples, add if not in merged\n        for sample in iterass.samples:\n            ## iterate over stats, skip 'state'\n            if sample not in merged.samples:\n                merged.samples[sample] = copy.deepcopy(iterass.samples[sample])\n                ## if barcodes data present then keep it\n                if iterass.barcodes.get(sample):\n                    merged.barcodes[sample] = iterass.barcodes[sample]\n            else:\n                ## merge stats and files of the sample\n                for stat in merged.stats.keys()[1:]:\n                    merged.samples[sample].stats[stat] += \\\n                                iterass.samples[sample].stats[stat]\n                ## merge file references into a list\n                for filetype in ['fastqs', 'edits']:\n                    merged.samples[sample].files[filetype] += \\\n                                iterass.samples[sample].files[filetype]\n                if iterass.samples[sample].files[\"clusters\"]:\n                    warning += 1\n\n    ## print warning if clusters or later was present in merged assembly\n  
  if warning:\n        print(\"\"\"\\\n    Warning: the merged Assemblies contained Samples that are identically named,\n    and so ipyrad has attempted to merge these Samples. This is perfectly fine to\n    do up until step 3, but not after, because at step 3 all reads for a Sample\n    should be included during clustering/mapping. Take note, you can merge Assemblies\n    at any step *if they do not contain the same Samples*, however, here that is not\n    the case. If you wish to proceed with this merged Assembly you will have to\n    start from step 3, therefore the 'state' of the Samples in this new merged\n    Assembly ({}) have been set to 2.\n    \"\"\".format(name))\n        for sample in merged.samples:\n            merged.samples[sample].stats.state = 2\n            ## clear stats\n            for stat in [\"refseq_mapped_reads\", \"refseq_unmapped_reads\",\n                         \"clusters_total\", \"clusters_hidepth\", \"hetero_est\",\n                         \"error_est\", \"reads_consens\"]:\n                merged.samples[sample].stats[stat] = 0\n            ## clear files\n            for ftype in [\"mapped_reads\", \"unmapped_reads\", \"clusters\",\n                          \"consens\", \"database\"]:\n                merged.samples[sample].files[ftype] = []\n\n    ## Set the values for some params that don't make sense inside\n    ## merged assemblies\n    merged_names = \", \".join([x.name for x in assemblies])\n    merged.paramsdict[\"raw_fastq_path\"] = \"Merged: \" + merged_names\n    merged.paramsdict[\"barcodes_path\"] = \"Merged: \" + merged_names\n    merged.paramsdict[\"sorted_fastq_path\"] = \"Merged: \" + merged_names\n\n    ## return the new Assembly object\n    merged.save()\n    return merged", "response": "Creates and returns a new Assembly object in which samples from two or more Assemblies are merged."}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\ncounts the number of lines in a file.", "response": "def _bufcountlines(filename, gzipped):\n    \"\"\"\n    Fast line counter. Used to quickly sum number of input reads when running\n    link_fastqs to append files. \"\"\"\n    ## read bytes in both cases so the newline count works on Python 3\n    if gzipped:\n        fin = gzip.open(filename, 'rb')\n    else:\n        fin = open(filename, 'rb')\n    nlines = 0\n    buf_size = 1024 * 1024\n    read_f = fin.read # loop optimization\n    buf = read_f(buf_size)\n    while buf:\n        nlines += buf.count(b'\\n')\n        buf = read_f(buf_size)\n    fin.close()\n    return nlines"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\ntake a string argument and returns value as a tuple.", "response": "def _tuplecheck(newvalue, dtype=str):\n    \"\"\"\n    Takes a string argument and returns value as a tuple.\n    Needed for paramfile conversion from CLI to set_params args\n    \"\"\"\n\n    if isinstance(newvalue, list):\n        newvalue = tuple(newvalue)\n\n    if isinstance(newvalue, str):\n        newvalue = newvalue.rstrip(\")\").strip(\"(\")\n        try:\n            newvalue = tuple([dtype(i.strip()) for i in newvalue.split(\",\")])\n\n        ## Type error is thrown by tuple if it's applied to a non-iterable.\n        except TypeError:\n            newvalue = tuple(dtype(newvalue))\n\n        ## If dtype fails to cast any element of newvalue\n        except ValueError:\n            LOGGER.info(\"Assembly.tuplecheck() failed to cast to {} - {}\"\\\n                        .format(dtype, newvalue))\n            raise\n\n        except Exception as inst:\n            LOGGER.info(inst)\n            raise SystemExit(\\\n            \"\\nError: Param`{}` is not formatted correctly.\\n({})\\n\"\\\n                 .format(newvalue, inst))\n\n    return newvalue"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\ncheck if the given parameter is set to values.", "response": "def _paramschecker(self, param, newvalue):\n    \"\"\" Raises exceptions when params are set to values they should not be\"\"\"\n\n    if param == 'assembly_name':\n        ## Make sure somebody doesn't try to change their assembly_name, bad\n        ## things would happen. Calling set_params on assembly_name only raises\n        ## an informative error. Assembly_name is set at Assembly creation time\n        ## and is immutable.\n        raise IPyradWarningExit(CANNOT_CHANGE_ASSEMBLY_NAME)\n\n    elif param == 'project_dir':\n        expandpath = _expander(newvalue)\n        if not expandpath.startswith(\"/\"):\n            if os.path.exists(expandpath):\n                expandpath = _expander(expandpath)\n        ## Forbid spaces in path names\n        if \" \" in expandpath:\n            raise IPyradWarningExit(BAD_PROJDIR_NAME.format(expandpath))\n        self.paramsdict[\"project_dir\"] = expandpath\n        self.dirs[\"project\"] = expandpath\n\n    ## `Merged:` in newvalue for raw_fastq_path indicates that this\n    ## assembly is a merge of several others, so this param has no\n    ## value for this assembly\n    elif param == 'raw_fastq_path':\n        if newvalue and not \"Merged:\" in newvalue:\n            fullrawpath = _expander(newvalue)\n            if os.path.isdir(fullrawpath):\n                raise IPyradWarningExit(RAW_PATH_ISDIR.format(fullrawpath))\n            ## if something is found in path\n            elif glob.glob(fullrawpath):\n                self.paramsdict['raw_fastq_path'] = fullrawpath\n            ## else allow empty, tho it can still raise an error in step1\n            else:\n                raise IPyradWarningExit(NO_RAW_FILE.format(fullrawpath))\n        else:\n            self.paramsdict['raw_fastq_path'] = \"\"\n\n\n    ## `Merged:` in newvalue for barcodes_path indicates that this\n  
  ## assembly is a merge of several others, so this param has no\n    ## value for this assembly\n    elif param == 'barcodes_path':\n        ## if a value was entered check that it exists\n        if newvalue and not \"Merged:\" in newvalue:\n            ## also allow for fuzzy match in names using glob\n            fullbarpath = glob.glob(_expander(newvalue))[0]\n            ## raise error if file is not found\n            if not os.path.exists(fullbarpath):\n                raise IPyradWarningExit(BARCODE_NOT_FOUND.format(fullbarpath))\n            else:\n                self.paramsdict['barcodes_path'] = fullbarpath\n                self._link_barcodes()\n\n        ## if no path was entered then set barcodes path to empty.\n        ## this is checked again during step 1 and will raise an error\n        ## if you try demultiplexing without a barcodes file\n        else:\n            self.paramsdict['barcodes_path'] = newvalue\n\n\n    ## `Merged:` in newvalue for sorted_fastq_path indicates that this\n    ## assembly is a merge of several others, so this param has no\n    ## value for this assembly\n    elif param == 'sorted_fastq_path':\n        if newvalue and not \"Merged:\" in newvalue:\n            fullsortedpath = _expander(newvalue)\n\n            if os.path.isdir(fullsortedpath):\n                raise IPyradWarningExit(SORTED_ISDIR.format(fullsortedpath))\n            elif glob.glob(fullsortedpath):\n                self.paramsdict['sorted_fastq_path'] = fullsortedpath\n\n            else:\n                raise IPyradWarningExit(SORTED_NOT_FOUND.format(fullsortedpath))\n        ## if no value was entered then set to \"\".\n        else:\n            self.paramsdict['sorted_fastq_path'] = \"\"\n\n\n    elif param == 'assembly_method':\n        ## TEMPORARY BLOCK ON DENOVO+REFERENCE METHOD\n#        if newvalue == \"denovo+reference\":\n#            raise IPyradWarningExit(\"\"\"\n#    Error: The 'denovo+reference' method is temporarily blocked while we 
\n#    refactor it to greatly improve the speed. You can either revert to an\n#    older version (pre v.0.7.0) or wait for the next update to resume using\n#    this method. \n#    \"\"\")\n\n        methods = [\"denovo\", \"reference\", \"denovo+reference\", \"denovo-reference\"]\n        assert newvalue in methods, BAD_ASSEMBLY_METHOD.format(newvalue)\n        self.paramsdict['assembly_method'] = newvalue\n\n\n    elif param == 'reference_sequence':\n        if newvalue:\n            fullrawpath = _expander(newvalue)\n            if not os.path.isfile(fullrawpath):\n                LOGGER.info(\"reference sequence file not found.\")\n                raise IPyradWarningExit(REF_NOT_FOUND.format(fullrawpath))\n            self.paramsdict['reference_sequence'] = fullrawpath\n        ## if no value was entered the set to \"\". Will be checked again\n        ## at step3 if user tries to do refseq and raise error\n        else:\n            self.paramsdict['reference_sequence'] = \"\"\n\n\n    elif param == 'datatype':\n        ## list of allowed datatypes\n        datatypes = ['rad', 'gbs', 'ddrad', 'pairddrad',\n                     'pairgbs', 'merged', '2brad', 'pair3rad']\n        ## raise error if something else\n        if str(newvalue) not in datatypes:\n            raise IPyradError(\"\"\"\n    datatype {} not recognized, must be one of: {}\n    \"\"\".format(newvalue, datatypes))\n        else:\n            self.paramsdict['datatype'] = str(newvalue)\n            ## link_barcodes is called before datatypes is set\n            ## we need to know the datatype so we can read in\n            ## the multiplexed barcodes for 3rad. 
This seems\n            ## a little annoying, but it was better than any\n            ## alternatives I could think of.\n            if \"3rad\" in self.paramsdict['datatype'] and not \\\n            self.paramsdict['sorted_fastq_path'].strip():\n                if not \"Merged:\" in self.paramsdict['barcodes_path']:\n                    self._link_barcodes()\n\n    elif param == 'restriction_overhang':\n        newvalue = _tuplecheck(newvalue, str)\n        assert isinstance(newvalue, tuple), \"\"\"\n    cut site must be a tuple, e.g., (TGCAG, '') or (TGCAG, CCGG)\"\"\"\n        ## Handle the special case where the user has 1\n        ## restriction overhang and does not include the trailing comma\n        if len(newvalue) == 1:\n            ## for gbs users might not know to enter the second cut site\n            ## so we do it for them.\n            if self.paramsdict[\"datatype\"] == \"gbs\":\n                newvalue += newvalue\n            else:\n                newvalue += (\"\",)\n        #=======\n        #    newvalue = (newvalue[0], \"\")\n        #>>>>>>> d40a5d5086a0d0aace04dd08338ec4ba5341d1f2\n\n        ## Handle 3rad datatype with only 3 cutters\n        if len(newvalue) == 3:\n            newvalue = (newvalue[0], newvalue[1], newvalue[2], \"\")\n        assert len(newvalue) <= 4, \"\"\"\n    most datasets require 1 or 2 cut sites, e.g., (TGCAG, '') or (TGCAG, CCGG).\n    For 3rad/seqcap may be up to 4 cut sites.\"\"\"\n        self.paramsdict['restriction_overhang'] = newvalue\n\n    elif param == 'max_low_qual_bases':\n        assert isinstance(int(newvalue), int), \"\"\"\n    max_low_qual_bases must be an integer.\"\"\"\n        self.paramsdict['max_low_qual_bases'] = int(newvalue)\n\n    elif param == 'phred_Qscore_offset':\n        assert isinstance(int(newvalue), int), \\\n            \"phred_Qscore_offset must be an integer.\"\n        self.paramsdict['phred_Qscore_offset'] = int(newvalue)\n\n    elif param == 'mindepth_statistical':\n       
 assert isinstance(int(newvalue), int), \\\n            \"mindepth_statistical must be an integer.\"\n        ## do not allow values below 5\n        if int(newvalue) < 5:\n            raise IPyradError(\"\"\"\n    mindepth statistical cannot be set < 5. Use mindepth_majrule.\"\"\")\n        else:\n            self.paramsdict['mindepth_statistical'] = int(newvalue)\n\n    elif param == 'mindepth_majrule':\n        assert isinstance(int(newvalue), int), \\\n            \"mindepth_majrule must be an integer.\"\n        self.paramsdict['mindepth_majrule'] = int(newvalue)\n\n    elif param == 'maxdepth':\n        self.paramsdict['maxdepth'] = int(newvalue)\n\n    elif param == 'clust_threshold':\n        newvalue = float(newvalue)\n        assert (newvalue < 1) & (newvalue > 0), \\\n        \"clust_threshold must be a decimal value between 0 and 1.\"\n        self.paramsdict['clust_threshold'] = newvalue\n\n    elif param == 'max_barcode_mismatch':\n        self.paramsdict['max_barcode_mismatch'] = int(newvalue)\n\n    elif param == 'filter_adapters':\n        self.paramsdict['filter_adapters'] = int(newvalue)\n\n    elif param == 'filter_min_trim_len':\n        self.paramsdict[\"filter_min_trim_len\"] = int(newvalue)\n\n    elif param == 'max_alleles_consens':\n        self.paramsdict['max_alleles_consens'] = int(newvalue)\n\n    elif param == 'max_Ns_consens':\n        newvalue = _tuplecheck(newvalue, int)\n        assert isinstance(newvalue, tuple), \\\n        \"max_Ns_consens should be a tuple e.g., (8, 8)\"\n        self.paramsdict['max_Ns_consens'] = newvalue\n\n    elif param == 'max_Hs_consens':\n        newvalue = _tuplecheck(newvalue, int)\n        assert isinstance(newvalue, tuple), \\\n        \"max_Hs_consens should be a tuple e.g., (5, 5)\"\n        self.paramsdict['max_Hs_consens'] = newvalue\n\n    elif param == 'min_samples_locus':\n        self.paramsdict['min_samples_locus'] = int(newvalue)\n\n    elif param == 'max_shared_Hs_locus':\n        if 
isinstance(newvalue, str):\n            if newvalue.isdigit():\n                newvalue = int(newvalue)\n            else:\n                try:\n                    newvalue = float(newvalue)\n                except Exception as inst:\n                    raise IPyradParamsError(\"\"\"\n    max_shared_Hs_locus must be int or float, you put: {}\"\"\".format(newvalue))\n        self.paramsdict['max_shared_Hs_locus'] = newvalue\n\n    elif param == 'max_SNPs_locus':\n        newvalue = _tuplecheck(newvalue, int)\n        assert isinstance(newvalue, tuple), \\\n        \"max_SNPs_locus should be a tuple e.g., (20, 20)\"\n        self.paramsdict['max_SNPs_locus'] = newvalue\n\n    elif param == 'max_Indels_locus':\n        newvalue = _tuplecheck(newvalue, int)\n        assert isinstance(newvalue, tuple), \\\n        \"max_Indels_locus should be a tuple e.g., (5, 100)\"\n        self.paramsdict['max_Indels_locus'] = newvalue\n\n    ## deprecated but retained for legacy, now uses trim_reads (below)\n    elif param == 'edit_cutsites':\n        ## Force into a string tuple\n        newvalue = _tuplecheck(newvalue)\n        ## try converting each tup element to ints\n        newvalue = list(newvalue)\n        for i in range(2):\n            try:\n                newvalue[i] = int(newvalue[i])\n            except (ValueError, IndexError):\n                newvalue.append(0)\n                pass\n        newvalue = tuple(newvalue)\n        ## make sure we have a nice tuple\n        if not isinstance(newvalue, tuple):\n            raise IPyradWarningExit(\"\"\"\n    Error: edit_cutsites should be a tuple e.g., (0, 5) or ('TGCAG', 6),\n    you entered {}\n    \"\"\".format(newvalue))\n        self.paramsdict['edit_cutsites'] = newvalue\n\n    elif param == 'trim_reads':\n        ## Force into a string tuple\n        newvalue = _tuplecheck(newvalue)\n        ## try converting each tup element to ints\n        newvalue = list(newvalue)\n        for i in range(4):\n            
try:\n                newvalue[i] = int(newvalue[i])\n            except (ValueError, IndexError):\n                newvalue.append(0)\n                pass\n        newvalue = tuple(newvalue)\n        ## make sure we have a nice tuple\n        if not isinstance(newvalue, tuple):\n            raise IPyradWarningExit(\"\"\"\n    Error: trim_reads should be a tuple e.g., (0, -5, -5, 0) \n    or (0, 90, 0, 90), or (0, 0, 0, 0). \n    You entered {}\\n\"\"\".format(newvalue))\n        self.paramsdict['trim_reads'] = newvalue        \n\n    ## deprecated but retained for legacy, now named trim_loci \n    elif param == 'trim_overhang':\n        newvalue = _tuplecheck(newvalue, str)\n        assert isinstance(newvalue, tuple), \\\n        \"trim_overhang should be a tuple e.g., (4, *, *, 4)\"\n        self.paramsdict['trim_overhang'] = tuple([int(i) for i in newvalue])\n\n    elif param == 'trim_loci':\n        newvalue = _tuplecheck(newvalue, str)\n        assert isinstance(newvalue, tuple), \\\n        \"trim_overhang should be a tuple e.g., (0, -5, -5, 0)\"\n        self.paramsdict['trim_loci'] = tuple([int(i) for i in newvalue])\n\n\n    elif param == 'output_formats':\n        ## let's get whatever the user entered as a tuple of letters\n        allowed = assemble.write_outfiles.OUTPUT_FORMATS.keys()\n\n        #<<<<<<< HEAD\n        ## Handle the case where output formats is an empty string\n        if isinstance(newvalue, str):\n            ## strip commas and spaces from string so we have only letters\n            newvalue = newvalue.replace(\",\", \"\").replace(\" \", \"\")\n            newvalue = list(newvalue)\n            if not newvalue:\n                newvalue = [\"*\"]\n        if isinstance(newvalue, tuple):\n            newvalue = list(newvalue)\n        #=======\n        #if isinstance(newvalue, tuple):\n        #    newvalue = list(newvalue)\n        #if isinstance(newvalue, str):\n        #    newvalue = [i.strip() for i in newvalue.split(\",\")]\n   
     #    ## Handle the case where output formats is empty\n        #    if not any(newvalue):\n        #        newvalue = \"*\"\n        #>>>>>>> 488144d1d97240b8b6f6caf9cfb6c023bb6ebb36\n        if isinstance(newvalue, list):\n            ## if any entry is longer than one letter, log a warning\n            if any([len(i) > 1 for i in newvalue]):\n                LOGGER.warning(\"\"\"\n    'output_formats' params entry is malformed. Setting to * to avoid errors.\"\"\")\n                newvalue = allowed\n            newvalue = tuple(newvalue)\n            #newvalue = tuple([i for i in newvalue if i in allowed])\n        if \"*\" in newvalue:\n            newvalue = allowed\n\n        ## set the param\n        self.paramsdict['output_formats'] = newvalue\n\n\n    elif param == 'pop_assign_file':\n        fullpoppath = _expander(newvalue)\n\n        ## if a path is entered, raise exception if not found\n        if newvalue:\n            if not os.path.isfile(fullpoppath):\n                LOGGER.warning(\"Population assignment file not found.\")\n                raise IPyradWarningExit(\"\"\"\n    Warning: Population assignment file not found. This must be an\n    absolute path (/home/wat/ipyrad/data/my_popfile.txt) or relative to\n    the directory where you're running ipyrad (./data/my_popfile.txt)\n    You entered: {}\\n\"\"\".format(fullpoppath))\n        ## should we add a check here that all pop samples are in samples?\n\n            self.paramsdict['pop_assign_file'] = fullpoppath\n            self._link_populations()\n        else:\n            self.paramsdict['pop_assign_file'] = \"\"\n            ## Don't forget to possibly blank the populations dictionary\n            self.populations = {}\n\n    return self"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\nreturning a data frame with Sample data and state.", "response": "def stats(self):\n        \"\"\" Returns a data frame with Sample data and state. \"\"\"\n        ## dict.keys() returns a view in Python 3, so sort into a new list\n        nameordered = sorted(self.samples.keys())\n\n        ## Set pandas to display all samples instead of truncating\n        pd.options.display.max_rows = len(self.samples)\n        statdat = pd.DataFrame([self.samples[i].stats for i in nameordered],\n                      index=nameordered).dropna(axis=1, how='all')\n        # ensure non h,e columns print as ints\n        for column in statdat:\n            if column not in [\"hetero_est\", \"error_est\"]:\n                statdat[column] = np.nan_to_num(statdat[column]).astype(int)\n        return statdat"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef files(self):\n        nameordered = self.samples.keys()\n        nameordered.sort()\n        ## replace curdir with . for shorter printing\n        #fullcurdir = os.path.realpath(os.path.curdir)\n        return pd.DataFrame([self.samples[i].files for i in nameordered],\n                      index=nameordered).dropna(axis=1, how='all')", "response": "Returns a data frame with Sample files."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef _build_stat(self, idx):\n        nameordered = self.samples.keys()\n        nameordered.sort()\n        newdat = pd.DataFrame([self.samples[i].stats_dfs[idx] \\\n                               for i in nameordered], index=nameordered)\\\n                               .dropna(axis=1, how='all')\n        return newdat", "response": "Builds a data frame with Sample stats for each step"}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nlinking all fastq files in the specified path to Sample objects.", "response": "def _link_fastqs(self, path=None, force=False, append=False, splitnames=\"_\",\n        fields=None, ipyclient=None):\n        \"\"\"\n        Create Sample objects from demultiplexed fastq files in sorted_fastq_path,\n        or append additional fastq files to existing Samples. This provides\n        more flexible file input through the API than available in step1 of the\n        command line interface. If passed ipyclient it will run in parallel.\n\n        Note\n        ----\n        This function is called during step 1 if files are specified in\n        'sorted_fastq_path'.\n\n        Parameters\n        ----------\n        path : str\n            Path to the fastq files to be linked to Sample objects. The default\n            location is to select all files in the 'sorted_fastq_path'.\n            Alternatively a different path can be entered here.\n\n        append : bool\n            The default action is to overwrite fastq files linked to Samples if\n            they already have linked files. Use append=True to instead append\n            additional fastq files to a Sample (file names should be formatted\n            the same as usual, e.g., [name]_R1_[optional].fastq.gz).\n\n        splitnames : str\n            A string character used to split file names. In combination with the\n            fields argument it can be used to subselect fields of the file name.\n\n        fields : list\n            A list of indices for the fields to be included in names after\n            file names are split on the splitnames character. Useful for appending\n            sequence names which must match existing names. If the largest index\n            is greater than the number of split strings in the name the index\n            is ignored. 
e.g., [2,3,4] ## excludes 0, 1 and >4\n\n        force : bool\n            Overwrites existing Sample data and statistics.\n\n        Returns\n        -------\n        str\n            Prints the number of new Sample objects created and the number of\n            fastq files linked to Sample objects in the Assembly object.\n        \"\"\"\n\n        ## cannot both force and append at once\n        if force and append:\n            raise IPyradError(\"Cannot use force and append at the same time.\")\n\n        if self.samples and not (force or append):\n            raise IPyradError(\"Files already linked to `{}`.\".format(self.name)\\\n                +\" Use force=True to replace all files, or append=True to add\"\n                +\" additional files to existing Samples.\")\n\n        ## make sure there is a workdir and workdir/fastqdir\n        self.dirs.fastqs = os.path.join(self.paramsdict[\"project_dir\"],\n                                        self.name+\"_fastqs\")\n        if not os.path.exists(self.paramsdict[\"project_dir\"]):\n            os.mkdir(self.paramsdict[\"project_dir\"])\n\n        ## get path to data files\n        if not path:\n            path = self.paramsdict[\"sorted_fastq_path\"]\n\n        ## but grab fastq/fq/gz, and then sort\n        fastqs = glob.glob(path)\n        ## Assert files are not .bz2 format\n        if any([i for i in fastqs if i.endswith(\".bz2\")]):\n            raise IPyradError(NO_SUPPORT_FOR_BZ2.format(path))\n        fastqs = [i for i in fastqs if i.endswith(\".gz\") \\\n                                    or i.endswith(\".fastq\") \\\n                                    or i.endswith(\".fq\")]\n        fastqs.sort()\n        LOGGER.debug(\"Linking these fastq files:\\n{}\".format(fastqs))\n\n        ## raise error if no files are found\n        if not fastqs:\n            raise IPyradError(NO_FILES_FOUND_PAIRS\\\n                        .format(self.paramsdict[\"sorted_fastq_path\"]))\n\n        ## link pairs 
into tuples\n        if 'pair' in self.paramsdict[\"datatype\"]:\n            ## check that names fit the paired naming convention\n            ## trying to support flexible types (_R2_, _2.fastq)\n            r1_try1 = [i for i in fastqs if \"_R1_\" in i]\n            r1_try2 = [i for i in fastqs if i.endswith(\"_1.fastq.gz\")]\n            r1_try3 = [i for i in fastqs if i.endswith(\"_R1.fastq.gz\")]\n\n            r2_try1 = [i for i in fastqs if \"_R2_\" in i]\n            r2_try2 = [i for i in fastqs if i.endswith(\"_2.fastq.gz\")]\n            r2_try3 = [i for i in fastqs if i.endswith(\"_R2.fastq.gz\")]\n\n            r1s = [r1_try1, r1_try2, r1_try3]\n            r2s = [r2_try1, r2_try2, r2_try3]\n\n            ## check that something was found\n            if not r1_try1 + r1_try2 + r1_try3:\n                raise IPyradWarningExit(\n                    \"Paired filenames are improperly formatted. See Documentation\")\n            if not r2_try1 + r2_try2 + r2_try3:\n                raise IPyradWarningExit(\n                    \"Paired filenames are improperly formatted. 
See Documentation\")\n\n            ## find the one with the right number of R1s\n            for idx, tri in enumerate(r1s):\n                if len(tri) == len(fastqs)/2:\n                    break\n            r1_files = r1s[idx]\n            r2_files = r2s[idx]\n\n            if len(r1_files) != len(r2_files):\n                raise IPyradWarningExit(R1_R2_name_error\\\n                    .format(len(r1_files), len(r2_files)))\n\n            fastqs = [(i, j) for i, j in zip(r1_files, r2_files)]\n\n        ## data are not paired, create empty tuple pair\n        else:\n            ## print warning if _R2_ is in names when not paired\n            idx = 0\n            if any([\"_R2_\" in i for i in fastqs]):\n                print(NAMES_LOOK_PAIRED_WARNING)\n            fastqs = [(i, \"\") for i in fastqs]\n\n        ## counters for the printed output\n        linked = 0\n        appended = 0\n\n        ## clear samples if force\n        if force:\n            self.samples = {}\n\n        ## track parallel jobs\n        linkjobs = {}\n        if ipyclient:\n            lbview = ipyclient.load_balanced_view()\n\n        ## iterate over input files\n        for fastqtuple in list(fastqs):\n            assert isinstance(fastqtuple, tuple), \"fastqs not a tuple.\"\n            ## local counters\n            createdinc = 0\n            linkedinc = 0\n            appendinc = 0\n            ## remove file extension from name\n            if idx == 0:\n                sname = _name_from_file(fastqtuple[0], splitnames, fields)\n            elif idx == 1:\n                sname = os.path.basename(fastqtuple[0].rsplit(\"_1.fastq.gz\", 1)[0])\n            elif idx == 2:\n                sname = os.path.basename(fastqtuple[0].rsplit(\"_R1.fastq.gz\", 1)[0])\n            LOGGER.debug(\"New Sample name {}\".format(sname))\n\n            if sname not in self.samples:\n                ## create new Sample\n                LOGGER.debug(\"Creating new sample - \".format(sname))\n   
             self.samples[sname] = Sample(sname)\n                self.samples[sname].stats.state = 1\n                self.samples[sname].barcode = None\n                self.samples[sname].files.fastqs.append(fastqtuple)\n                createdinc += 1\n                linkedinc += 1\n            else:\n                ## if not forcing, shouldn't be here with existing Samples\n                if append:\n                    #if fastqtuple not in self.samples[sname].files.fastqs:\n                    self.samples[sname].files.fastqs.append(fastqtuple)\n                    appendinc += 1\n\n                elif force:\n                    ## overwrite/create new Sample\n                    LOGGER.debug(\"Overwriting sample - \".format(sname))\n                    self.samples[sname] = Sample(sname)\n                    self.samples[sname].stats.state = 1\n                    self.samples[sname].barcode = None\n                    self.samples[sname].files.fastqs.append(fastqtuple)\n                    createdinc += 1\n                    linkedinc += 1\n                else:\n                    print(\"\"\"\n        The files {} are already in Sample. 
Use append=True to append additional\n        files to a Sample or force=True to replace all existing Samples.\n        \"\"\".format(sname))\n\n            ## support serial execution w/o ipyclient\n            if not ipyclient:\n                if any([linkedinc, createdinc, appendinc]):\n                    gzipped = bool(fastqtuple[0].endswith(\".gz\"))\n                    nreads = 0\n                    for alltuples in self.samples[sname].files.fastqs:\n                        nreads += _zbufcountlines(alltuples[0], gzipped)\n                    self.samples[sname].stats.reads_raw = nreads/4\n                    self.samples[sname].stats_dfs.s1[\"reads_raw\"] = nreads/4\n                    self.samples[sname].state = 1\n\n                    LOGGER.debug(\"Got reads for sample - {} {}\".format(sname,\\\n                                    self.samples[sname].stats.reads_raw))\n                    #created += createdinc\n                    linked += linkedinc\n                    appended += appendinc\n\n            ## do counting in parallel\n            else:\n                if any([linkedinc, createdinc, appendinc]):\n                    gzipped = bool(fastqtuple[0].endswith(\".gz\"))\n                    for sidx, tup in enumerate(self.samples[sname].files.fastqs):\n                        key = sname+\"_{}\".format(sidx)\n                        linkjobs[key] = lbview.apply(_bufcountlines,\n                                                    *(tup[0], gzipped))\n                    LOGGER.debug(\"sent count job for {}\".format(sname))\n                    #created += createdinc\n                    linked += linkedinc\n                    appended += appendinc\n\n        ## wait for link jobs to finish if parallel\n        if ipyclient:\n            start = time.time()\n            printstr = ' loading reads         | {} | s1 |'\n            while 1:\n                fin = [i.ready() for i in linkjobs.values()]\n                elapsed = 
datetime.timedelta(seconds=int(time.time()-start))\n                progressbar(len(fin), sum(fin),\n                    printstr.format(elapsed), spacer=self._spacer)\n                time.sleep(0.1)\n                if len(fin) == sum(fin):\n                    print(\"\")\n                    break\n\n            ## collect link job results\n            sampdict = {i:0 for i in self.samples}\n            for result in linkjobs:\n                sname = result.rsplit(\"_\", 1)[0]\n                nreads = linkjobs[result].result()\n                sampdict[sname] += nreads\n\n            for sname in sampdict:\n                self.samples[sname].stats.reads_raw = sampdict[sname]/4\n                self.samples[sname].stats_dfs.s1[\"reads_raw\"] = sampdict[sname]/4\n                self.samples[sname].state = 1\n\n        ## print if data were linked\n        #print(\"  {} new Samples created in '{}'.\".format(created, self.name))\n        if linked:\n            ## double for paired data\n            if 'pair' in self.paramsdict[\"datatype\"]:\n                linked = linked*2\n            if self._headers:\n                print(\"{}{} fastq files loaded to {} Samples.\".\\\n                      format(self._spacer, linked, len(self.samples)))\n            ## save the location where these files are located\n            self.dirs.fastqs = os.path.realpath(os.path.dirname(path))\n\n        if appended:\n            if self._headers:\n                print(\"{}{} fastq files appended to {} existing Samples.\".\\\n                      format(self._spacer, appended, len(self.samples)))\n\n        ## save step-1 stats. We don't want to write this to the fastq dir, b/c\n        ## it is not necessarily inside our project dir. 
Instead, we'll write \n        ## this file into our project dir in the case of linked_fastqs.\n        self.stats_dfs.s1 = self._build_stat(\"s1\")\n        self.stats_files.s1 = os.path.join(self.paramsdict[\"project_dir\"],\n                                           self.name+\n                                           '_s1_demultiplex_stats.txt')\n        with open(self.stats_files.s1, 'w') as outfile:\n            self.stats_dfs.s1.fillna(value=0).astype(int).to_string(outfile)"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef _link_barcodes(self):\n\n        ## parse barcodefile\n        try:\n            ## allows fuzzy match to barcodefile name\n            barcodefile = glob.glob(self.paramsdict[\"barcodes_path\"])[0]\n\n            ## read in the file\n            bdf = pd.read_csv(barcodefile, header=None, delim_whitespace=1, dtype=str)\n            bdf = bdf.dropna()\n\n            ## make sure bars are upper case\n            bdf[1] = bdf[1].str.upper()\n\n            ## if replicates are present then print a warning\n            reps = bdf[0].unique().shape[0] != bdf[0].shape[0]\n            if reps:\n                print(\"{spacer}Warning: technical replicates (same name) will be combined.\"\\\n                      .format(**{'spacer': self._spacer}))\n                ## add -technical-replicate-N to replicate names\n                reps = [i for i in bdf[0] if list(bdf[0]).count(i) > 1]\n                ureps = list(set(reps))\n                for name in ureps:\n                    idxs = bdf[bdf[0] == name].index.tolist()\n                    for num, idx in enumerate(idxs):\n                        bdf.ix[idx][0] = bdf.ix[idx][0] + \"-technical-replicate-\" + str(num+1)\n\n            ## make sure chars are all proper\n            if not all(bdf[1].apply(set(\"RKSYWMCATG\").issuperset)):\n                LOGGER.warn(BAD_BARCODE)\n                raise IPyradError(BAD_BARCODE)\n\n            ## 3rad/seqcap use multiplexed barcodes\n            ## We'll concatenate them with a plus and split them later\n            if \"3rad\" in self.paramsdict[\"datatype\"]:\n                try:\n                    bdf[2] = bdf[2].str.upper()\n                    self.barcodes = dict(zip(bdf[0], bdf[1] + \"+\" + bdf[2]))\n                except KeyError as inst:\n                    msg = \"    3rad assumes multiplexed barcodes. 
Doublecheck your barcodes file.\"\n                    LOGGER.error(msg)\n                    raise IPyradError(msg)\n            else:\n                ## set attribute on Assembly object\n                self.barcodes = dict(zip(bdf[0], bdf[1]))\n\n        except (IOError, IndexError):\n            raise IPyradWarningExit(\\\n                \"    Barcodes file not found. You entered: {}\"\\\n                .format(self.paramsdict[\"barcodes_path\"]))\n\n        except ValueError as inst:\n            msg = \"    Barcodes file format error.\"\n            LOGGER.warn(msg)\n            raise IPyradError(inst)", "response": "Private function that parses the barcodes file and stores the mapping of sample names to barcodes as a dictionary in self.barcodes."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\ncreate a dictionary that contains the populations for a specific loci and the number of samples that are present in each locus.", "response": "def _link_populations(self, popdict=None, popmins=None):\n        \"\"\"\n        Creates self.populations dictionary to save mappings of individuals to\n        populations/sites, and checks that individual names match with Samples.\n        The self.populations dict keys are pop names and the values are lists\n        of length 2. The first element is the min number of samples per pop\n        for final filtering of loci, and the second element is the list of\n        samples per pop.\n\n        Population assignments are used for hierarchical clustering, for\n        generating summary stats, and for outputting some file types (.treemix\n        for example). Internally stored as a dictionary.\n\n        Note\n        ----\n        By default a File is read in from `pop_assign_file` with one individual\n        per line and space separated pairs of ind pop:\n\n            ind1 pop1\n            ind2 pop2\n            ind3 pop3\n            etc...\n\n        Parameters\n        ----------\n        TODO: NB: Using API and passing in popdict and popmins is currently \n        unimplemented, or at least looks like it doesn't work. Leaving\n        these docs cuz Deren might have ideas about it being useful.\n\n        popdict : dict\n            When using the API it may be easier to simply create a dictionary\n            to pass in as an argument instead of reading from an input file.\n            This can be done with the `popdict` argument like below:\n\n            pops = {'pop1': ['ind1', 'ind2', 'ind3'], 'pop2': ['ind4', 'ind5']}\n            [Assembly]._link_populations(popdict=pops).\n\n        popmins : dict\n            If you want to apply a minsamples filter based on populations\n            you can add a popmins dictionary. 
This indicates the number of \n            samples in each population that must be present in a locus for \n            the locus to be retained. Example:\n\n            popmins = {'pop1': 3, 'pop2': 2}\n\n        \"\"\"\n        if not popdict:\n            ## glob it in case of fuzzy matching\n            popfile = glob.glob(self.paramsdict[\"pop_assign_file\"])[0]\n            if not os.path.exists(popfile):\n                raise IPyradError(\"Population assignment file not found: {}\"\\\n                                  .format(self.paramsdict[\"pop_assign_file\"]))\n\n\n            try:\n                ## parse populations file\n                popdat = pd.read_csv(popfile, header=None,\n                                              delim_whitespace=1,\n                                              names=[\"inds\", \"pops\"], \n                                              comment=\"#\")\n                popdict = {key: group.inds.values.tolist() for key, group in \\\n                                                        popdat.groupby(\"pops\")}\n\n                ## parse minsamples per population if present (line with #)\n                mindat = [i.lstrip(\"#\").lstrip().rstrip() for i in \\\n                          open(popfile, 'r').readlines() if i.startswith(\"#\")]\n                if mindat:\n                    popmins = {}\n                    for i in range(len(mindat)):\n                        minlist = mindat[i].replace(\",\", \"\").split()\n                        popmins.update({i.split(':')[0]:int(i.split(':')[1]) \\\n                                        for i in minlist})\n                else:\n                    raise IPyradError(MIN_SAMPLES_PER_POP_MALFORMED)\n\n            except (ValueError, IOError):\n                LOGGER.warn(\"Populations file may be malformed.\")\n                raise IPyradError(MIN_SAMPLES_PER_POP_MALFORMED)\n\n        else:\n            ## pop dict is provided by user\n            pass\n\n        
## check popdict. Filter for bad samples\n        ## Warn user but don't bail out, could be setting the pops file\n        ## on a new assembly w/o any linked samples.\n        badsamples = [i for i in itertools.chain(*popdict.values()) \\\n                      if i not in self.samples.keys()]\n        if any(badsamples):\n            LOGGER.warn(\"Some names from population input do not match Sample \"\\\n                        + \"names: {}\".format(\", \".join(badsamples)))\n            LOGGER.warn(\"If this is a new assembly this is normal.\")\n\n        ## If popmins not set, just assume all mins are zero\n        if not popmins:\n            popmins = {i: 0 for i in popdict.keys()}\n\n        ## check popmins\n        ## cannot have higher min for a pop than there are samples in the pop\n        popmax = {i: len(popdict[i]) for i in popdict}\n        if not all([popmax[i] >= popmins[i] for i in popdict]):\n            raise IPyradWarningExit(\\\n                \" minsample per pop value cannot be greater than the \"+\n                \" number of samples in the pop. Modify the populations file.\")\n\n        ## return dict\n        self.populations = {i: (popmins[i], popdict[i]) for i in popdict}"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef set_params(self, param, newvalue):\n\n        ## this includes current params and some legacy params for conversion\n        legacy_params = [\"edit_cutsites\", \"trim_overhang\"]\n        current_params = list(self.paramsdict.keys())\n        allowed_params = current_params + legacy_params\n\n        ## require parameter recognition\n        #if not ((param in range(50)) or \\\n        #        (param in [str(i) for i in range(50)]) or \\\n        #        (param in allowed_params)):\n        if not param in allowed_params:\n            raise IPyradParamsError(\"Parameter key not recognized: {}\"\\\n                                    .format(param))\n\n        ## make string\n        param = str(param)\n        ## get index if param is keyword arg (this index is now zero based!)\n        if len(param) < 3:\n            param = list(self.paramsdict.keys())[int(param)]\n\n        ## run assertions on new param\n        try:\n            self = _paramschecker(self, param, newvalue)\n\n        except Exception as inst:\n            raise IPyradWarningExit(BAD_PARAMETER\\\n                                    .format(param, inst, newvalue))", "response": "Set a parameter in the current Assembly object to a new value. Raises error if newvalue is wrong type."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef write_params(self, outfile=None, force=False):\n        if outfile is None:\n            outfile = \"params-\"+self.name+\".txt\"\n\n        ## Test if params file already exists?\n        ## If not forcing, test for file and bail out if it exists\n        if not force:\n            if os.path.isfile(outfile):\n                raise IPyradWarningExit(PARAMS_EXISTS.format(outfile))\n\n        with open(outfile, 'w') as paramsfile:\n            ## Write the header. Format to 80 columns\n            header = \"------- ipyrad params file (v.{})\".format(ip.__version__)\n            header += (\"-\"*(80-len(header)))\n            paramsfile.write(header)\n\n            ## Whip through the current paramsdict and write out the current\n            ## param value, the ordered dict index number. Also,\n            ## get the short description from paramsinfo. Make it look pretty,\n            ## pad nicely if at all possible.\n            for key, val in self.paramsdict.items():\n                ## If multiple elements, write them out comma separated\n                if isinstance(val, list) or isinstance(val, tuple):\n                    paramvalue = \", \".join([str(i) for i in val])\n                else:\n                    paramvalue = str(val)\n\n                ## skip deprecated params\n                if key in [\"edit_cutsites\", \"trim_overhang\"]:\n                    continue\n\n                padding = (\" \"*(30-len(paramvalue)))\n                paramkey = list(self.paramsdict.keys()).index(key)\n                paramindex = \" ## [{}] \".format(paramkey)\n                LOGGER.debug(\"%s %s %s\", key, val, paramindex)\n                name = \"[{}]: \".format(paramname(paramkey))\n                description = paraminfo(paramkey, short=True)\n                paramsfile.write(\"\\n\" + paramvalue + padding + \\\n                                        paramindex + name + 
description)", "response": "Write the current parameters of this Assembly to a params file, one line per parameter with its value, index, name, and short description. Raises an error if the file already exists and force is False."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef branch(self, newname, subsamples=None, infile=None):\n        ## subsample by removal or keeping.\n        remove = 0\n\n        ## is there a better way to ask if it already exists?\n        if (newname == self.name or os.path.exists(\n                                    os.path.join(self.paramsdict[\"project_dir\"],\n                                    newname+\".assembly\"))):\n            print(\"{}Assembly object named {} already exists\"\\\n                  .format(self._spacer, newname))\n\n        else:\n            ## Make sure the new name doesn't have any wacky characters\n            self._check_name(newname)\n\n            ## Bozo-check. Carve off 'params-' if it's in the new name.\n            if newname.startswith(\"params-\"):\n                newname = newname.split(\"params-\")[1]\n\n            ## create a copy of the Assembly obj\n            newobj = copy.deepcopy(self)\n            newobj.name = newname\n            newobj.paramsdict[\"assembly_name\"] = newname\n\n            if subsamples and infile:\n                print(BRANCH_NAMES_AND_INPUT)\n\n            if infile:\n                if infile[0] == \"-\":\n                    remove = 1\n                    infile = infile[1:]\n                if os.path.exists(infile):\n                    subsamples = _read_sample_names(infile)\n\n            ## if remove then swap the samples\n            if remove:\n                subsamples = list(set(self.samples.keys()) - set(subsamples))\n\n            ## create copies of each subsampled Sample obj\n            if subsamples:\n                for sname in subsamples:\n                    if sname in self.samples:\n                        newobj.samples[sname] = copy.deepcopy(self.samples[sname])\n                    else:\n                        print(\"Sample name not found: {}\".format(sname))\n\n                ## reload sample dict w/o non 
subsamples\n                newobj.samples = {name:sample for name, sample in \\\n                           newobj.samples.items() if name in subsamples}\n\n            ## create copies of each subsampled Sample obj\n            else:\n                for sample in self.samples:\n                    newobj.samples[sample] = copy.deepcopy(self.samples[sample])\n\n            ## save json of new obj and return object\n            newobj.save()\n            return newobj", "response": "Branch the Assembly object with the new name."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\nhiding wrapped function to start the demultiplexing step 1", "response": "def _step1func(self, force, ipyclient):\n        \"\"\" hidden wrapped function to start step 1 \"\"\"\n\n        ## check input data files\n        sfiles = self.paramsdict[\"sorted_fastq_path\"]\n        rfiles = self.paramsdict[\"raw_fastq_path\"]\n\n        ## do not allow both a sorted_fastq_path and a raw_fastq\n        if sfiles and rfiles:\n            raise IPyradWarningExit(NOT_TWO_PATHS)\n\n        ## but also require that at least one exists\n        if not (sfiles or rfiles):\n            raise IPyradWarningExit(NO_SEQ_PATH_FOUND)\n\n        ## print headers\n        if self._headers:\n            if sfiles:\n                print(\"\\n{}Step 1: Loading sorted fastq data to Samples\"\\\n                      .format(self._spacer))\n            else:\n                print(\"\\n{}Step 1: Demultiplexing fastq data to Samples\"\\\n                    .format(self._spacer))\n\n        ## if Samples already exist then no demultiplexing\n        if self.samples:\n            if not force:\n                print(SAMPLES_EXIST.format(len(self.samples), self.name))\n            else:\n                ## overwrite existing data else do demux\n                if glob.glob(sfiles):\n                    self._link_fastqs(ipyclient=ipyclient, force=force)\n                else:\n                    assemble.demultiplex.run2(self, ipyclient, force)\n\n        ## Creating new Samples\n        else:\n            ## first check if demultiplexed files exist in sorted path\n            if glob.glob(sfiles):\n                self._link_fastqs(ipyclient=ipyclient)\n\n            ## otherwise do the demultiplexing\n            else:\n                assemble.demultiplex.run2(self, ipyclient, force)"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nhide wrapped function to start step 2", "response": "def _step2func(self, samples, force, ipyclient):\n        \"\"\" hidden wrapped function to start step 2\"\"\"\n\n        ## print header\n        if self._headers:\n            print(\"\\n  Step 2: Filtering reads \")\n\n        ## If no samples in this assembly then it means you skipped step1,\n        if not self.samples.keys():\n            raise IPyradWarningExit(FIRST_RUN_1)\n\n        ## Get sample objects from list of strings, if API.\n        samples = _get_samples(self, samples)\n\n        if not force:\n            ## print warning and skip if all are finished\n            if all([i.stats.state >= 2 for i in samples]):\n                print(EDITS_EXIST.format(len(samples)))\n                return\n\n        ## Run samples through rawedit\n        assemble.rawedit.run2(self, samples, force, ipyclient)"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\nrunning the clustering within step 3", "response": "def _step3func(self, samples, noreverse, maxindels, force, ipyclient):\n        \"\"\" hidden wrapped function to start step 3 \"\"\"\n        ## print headers\n        if self._headers:\n            print(\"\\n  Step 3: Clustering/Mapping reads\")\n\n        ## Require reference seq for reference-based methods\n        if self.paramsdict['assembly_method'] != \"denovo\":\n            if not self.paramsdict['reference_sequence']:\n                raise IPyradError(REQUIRE_REFERENCE_PATH\\\n                            .format(self.paramsdict[\"assembly_method\"]))\n            else:\n                ## index the reference sequence\n                ## Allow force to reindex the reference sequence\n                ## send to run on the cluster. \n                lbview = ipyclient.load_balanced_view()\n                rasync = lbview.apply(index_reference_sequence, *(self, force))\n\n                ## print a progress bar for the indexing\n                start = time.time()\n                while 1:\n                    elapsed = datetime.timedelta(seconds=int(time.time()-start))\n                    printstr = \" {}    | {} | s3 |\".format(\"indexing reference\", elapsed)\n                    finished = int(rasync.ready())\n                    progressbar(1, finished, printstr, spacer=self._spacer)\n                    if finished:\n                        print(\"\")\n                        break\n                    time.sleep(0.9)\n                ## error check\n                if not rasync.successful():\n                    raise IPyradWarningExit(rasync.result())\n\n        ## Get sample objects from list of strings\n        samples = _get_samples(self, samples)\n\n        ## Check if all/none in the right state\n        if not self._samples_precheck(samples, 3, force):\n            raise IPyradError(FIRST_RUN_2)\n\n        elif not 
force:\n            ## skip if all are finished\n            if all([i.stats.state >= 3 for i in samples]):\n                print(CLUSTERS_EXIST.format(len(samples)))\n                return\n\n        ## run the step function\n        assemble.cluster_within.run(self, samples, noreverse, maxindels,\n                                    force, ipyclient)"}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\nhides wrapped function to start step 4", "response": "def _step4func(self, samples, force, ipyclient):\n        \"\"\" hidden wrapped function to start step 4 \"\"\"\n\n        if self._headers:\n            print(\"\\n  Step 4: Joint estimation of error rate and heterozygosity\")\n\n        ## Get sample objects from list of strings\n        samples = _get_samples(self, samples)\n\n        ## Check if all/none in the right state\n        if not self._samples_precheck(samples, 4, force):\n            raise IPyradError(FIRST_RUN_3)\n\n        elif not force:\n            ## skip if all are finished\n            if all([i.stats.state >= 4 for i in samples]):\n                print(JOINTS_EXIST.format(len(samples)))\n                return\n\n        ## send to function\n        assemble.jointestimate.run(self, samples, force, ipyclient)"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\nhide wrapped function to start the step 5", "response": "def _step5func(self, samples, force, ipyclient):\n        \"\"\" hidden wrapped function to start step 5 \"\"\"\n        ## print header\n        if self._headers:\n            print(\"\\n  Step 5: Consensus base calling \")\n\n        ## Get sample objects from list of strings\n        samples = _get_samples(self, samples)\n\n        ## Check if all/none in the right state\n        if not self._samples_precheck(samples, 5, force):\n            raise IPyradError(FIRST_RUN_4)\n\n        elif not force:\n            ## skip if all are finished\n            if all([i.stats.state >= 5 for i in samples]):\n                print(CONSENS_EXIST.format(len(samples)))\n                return\n        ## pass samples to consens_se\n        assemble.consens_se.run(self, samples, force, ipyclient)"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\nhide function to start Step 6.", "response": "def _step6func(self, \n        samples, \n        noreverse, \n        force, \n        randomseed, \n        ipyclient, \n        **kwargs):\n        \"\"\" \n        Hidden function to start Step 6. \n        \"\"\"\n\n        ## Get sample objects from list of strings\n        samples = _get_samples(self, samples)\n\n        ## remove samples that aren't ready\n        csamples = self._samples_precheck(samples, 6, force)\n\n        ## print CLI header\n        if self._headers:\n            print(\"\\n  Step 6: Clustering at {} similarity across {} samples\".\\\n                  format(self.paramsdict[\"clust_threshold\"], len(csamples)))\n\n        ## Check if all/none in the right state\n        if not csamples:\n            raise IPyradError(FIRST_RUN_5)\n\n        elif not force:\n            ## skip if all are finished\n            if all([i.stats.state >= 6 for i in csamples]):\n                print(DATABASE_EXISTS.format(len(samples)))\n                return\n\n        ## run if this point is reached. We no longer check for existing\n        ## h5 file, since checking Sample states should suffice.\n        assemble.cluster_across.run(\n            self, \n            csamples, \n            noreverse,\n            force, \n            randomseed, \n            ipyclient, \n            **kwargs)"}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nstep 7: Filter and write output files", "response": "def _step7func(self, samples, force, ipyclient):\n        \"\"\" Step 7: Filter and write output files \"\"\"\n\n        ## Get sample objects from list of strings\n        samples = _get_samples(self, samples)\n\n        if self._headers:\n            print(\"\\n  Step 7: Filter and write output files for {} Samples\".\\\n                  format(len(samples)))\n\n        ## Check if all/none of the samples are in the self.database\n        try:\n            with h5py.File(self.clust_database, 'r') as io5:\n                dbset = set(io5[\"seqs\"].attrs['samples'])\n                iset = set([i.name for i in samples])\n\n                ## TODO: Handle the case where dbdiff is not empty?\n                ## This might arise if someone tries to branch and remove\n                ## samples at step 7.\n                dbdiff = dbset.difference(iset)\n                idiff = iset.difference(dbset)\n                if idiff:\n                    print(NOT_CLUSTERED_YET\\\n                    .format(self.database, \", \".join(list(idiff))))\n\n                    ## The old way failed unless all samples were\n                    ## clustered successfully in step 6. Adding some flexibility\n                    ## to allow writing output even if some samples failed.\n                    ## raise IPyradWarningExit(msg)\n\n                    ## Remove the samples that aren't ready for writing out\n                    ## i.e. 
only proceed with the samples that are actually\n                    ## present in the db\n                    samples = [x for x in samples if x.name not in idiff]\n        except (IOError, ValueError):\n            raise IPyradError(FIRST_RUN_6.format(self.clust_database))\n\n        if not force:\n            outdir = os.path.join(self.dirs.project, self.name+\"_outfiles\")\n            if os.path.exists(outdir):\n                raise IPyradWarningExit(OUTPUT_EXISTS.format(outdir))\n\n        ## Run step7\n        assemble.write_outfiles.run(self, samples, force, ipyclient)"}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nreturns a list of samples that are actually ready for the next step.", "response": "def _samples_precheck(self, samples, mystep, force):\n        \"\"\" Return a list of samples that are actually ready for the next step.\n            Each step runs this prior to calling run, makes it easier to\n            centralize and normalize how each step is checking sample states.\n            mystep is the state produced by the current step.\n        \"\"\"\n        subsample = []\n        ## filter by state\n        for sample in samples:\n            if sample.stats.state < mystep - 1:\n                LOGGER.debug(\"Sample {} not in proper state.\"\\\n                             .format(sample.name))\n            else:\n                subsample.append(sample)\n        return subsample"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef _compatible_params_check(self):\n\n        ## do not allow statistical < majrule\n        val1 = self.paramsdict[\"mindepth_statistical\"]\n        val2 = self.paramsdict['mindepth_majrule']\n        if val1 < val2:\n            msg = \"\"\"\n    Warning: mindepth_statistical cannot be < mindepth_majrule.\n    Forcing mindepth_majrule = mindepth_statistical = {}\n    \"\"\".format(val1)\n            LOGGER.warning(msg)\n            print(msg)\n            self.paramsdict[\"mindepth_majrule\"] = val1", "response": "Check the mindepth params after all params are set: if mindepth_statistical is lower than mindepth_majrule, log and print a warning and force mindepth_majrule down to the mindepth_statistical value."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef run(self, steps=0, force=False, ipyclient=None, \n        show_cluster=0, **kwargs):\n        \"\"\"\n        Run assembly steps of an ipyrad analysis. Enter steps as a string,\n        e.g., \"1\", \"123\", \"12345\". This step checks for an existing\n        ipcluster instance otherwise it raises an exception. The ipyparallel\n        connection is made using information from the _ipcluster dict of the\n        Assembly class object.\n        \"\"\"\n        ## check that mindepth params are compatible, fix and report warning.\n        self._compatible_params_check()\n\n        ## wrap everything in a try statement to ensure that we save the\n        ## Assembly object if it is interrupted at any point, and also\n        ## to ensure proper cleanup of the ipyclient.\n        inst = None\n        try:\n            ## use an existing ipcluster instance\n            if not ipyclient:\n                args = self._ipcluster.items() + [(\"spacer\", self._spacer)]\n                ipyclient = ip.core.parallel.get_client(**dict(args))\n\n            ## print a message about the cluster status\n            ## if MPI setup then we are going to wait until all engines are\n            ## ready so that we can print how many cores started on each\n            ## host machine exactly.\n            if (self._cli) or show_cluster:\n                ip.cluster_info(ipyclient=ipyclient, spacer=self._spacer)\n\n            ## get the list of steps to run\n            if isinstance(steps, int):\n                steps = str(steps)\n            steps = sorted(list(steps))\n\n            ## print an Assembly name header if inside API\n            if not self._cli:\n                print(\"Assembly: {}\".format(self.name))\n\n            ## store ipyclient engine pids to the Assembly so we can \n            ## hard-interrupt them later if assembly is interrupted. 
\n            ## Only stores pids of engines that aren't busy at this moment, \n            ## otherwise it would block here while waiting to find their pids.\n            self._ipcluster[\"pids\"] = {}\n            for eid in ipyclient.ids:\n                engine = ipyclient[eid]\n                if not engine.outstanding:\n                    pid = engine.apply(os.getpid).get()\n                    self._ipcluster[\"pids\"][eid] = pid\n            #ipyclient[:].apply(os.getpid).get_dict()\n\n            ## has many fixed arguments right now, but we may add these to\n            ## hackerz_only, or they may be accessed in the API.\n            if '1' in steps:\n                self._step1func(force, ipyclient)\n                self.save()\n                ipyclient.purge_everything()\n\n            if '2' in steps:\n                self._step2func(samples=None, force=force, ipyclient=ipyclient)\n                self.save()\n                ipyclient.purge_everything()\n\n            if '3' in steps:\n                self._step3func(samples=None, noreverse=0, force=force,\n                             maxindels=8, ipyclient=ipyclient)\n                self.save()\n                ipyclient.purge_everything()\n\n            if '4' in steps:\n                self._step4func(samples=None, force=force, ipyclient=ipyclient)\n                self.save()\n                ipyclient.purge_everything()\n\n            if '5' in steps:\n                self._step5func(samples=None, force=force, ipyclient=ipyclient)\n                self.save()\n                ipyclient.purge_everything()\n\n            if '6' in steps:\n                self._step6func(samples=None, noreverse=0, randomseed=12345,\n                                force=force, ipyclient=ipyclient, **kwargs)\n                self.save()\n                ipyclient.purge_everything()\n\n            if '7' in steps:\n                self._step7func(samples=None, force=force, ipyclient=ipyclient)\n                
self.save()\n                ipyclient.purge_everything()\n\n\n        ## handle exceptions so they will be raised after we clean up below\n        except KeyboardInterrupt as inst:\n            print(\"\\n  Keyboard Interrupt by user\")\n            LOGGER.info(\"assembly interrupted by user.\")\n\n        except IPyradWarningExit as inst:\n            LOGGER.error(\"IPyradWarningExit: %s\", inst)\n            print(\"\\n  Encountered an error (see details in ./ipyrad_log.txt)\"+\\\n                  \"\\n  Error summary is below -------------------------------\"+\\\n                  \"\\n{}\".format(inst))\n\n        except Exception as inst:\n            LOGGER.error(inst)\n            print(\"\\n  Encountered an unexpected error (see ./ipyrad_log.txt)\"+\\\n                  \"\\n  Error message is below -------------------------------\"+\\\n                  \"\\n{}\".format(inst))\n\n        ## close client when done or interrupted\n        finally:\n            try:\n                ## save the Assembly\n                self.save()\n\n                ## can't close client if it was never open\n                if ipyclient:\n\n                    ## send SIGINT (2) to all engines\n                    try:\n                        ipyclient.abort()\n                        time.sleep(1)\n                        for engine_id, pid in self._ipcluster[\"pids\"].items():\n                            if ipyclient.queue_status()[engine_id][\"tasks\"]:\n                                os.kill(pid, 2)\n                                LOGGER.info('interrupted engine {} w/ SIGINT to {}'\\\n                                        .format(engine_id, pid))\n                        time.sleep(1)\n                    except ipp.NoEnginesRegistered:\n                        pass\n\n                    ## if CLI, stop jobs and shutdown. 
Don't use _cli here \n                    ## because you can have a CLI object but use the --ipcluster\n                    ## flag, in which case we don't want to kill ipcluster.\n                    if 'ipyrad-cli' in self._ipcluster[\"cluster_id\"]:\n                        LOGGER.info(\"  shutting down engines\")\n                        ipyclient.shutdown(hub=True, block=False)\n                        ipyclient.close()\n                        LOGGER.info(\"  finished shutdown\")\n                    else:\n                        if not ipyclient.outstanding:\n                            ipyclient.purge_everything()\n                        else:\n                            ## nanny: kill everything, something bad happened\n                            ipyclient.shutdown(hub=True, block=False)\n                            ipyclient.close()\n                            print(\"\\nwarning: ipcluster shutdown and must be restarted\")\n                    \n            ## if exception is close and save, print and ignore\n            except Exception as inst2:\n                print(\"warning: error during shutdown:\\n{}\".format(inst2))\n                LOGGER.error(\"shutdown warning: %s\", inst2)", "response": "Run assembly steps of an ipyrad analysis."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef _to_fulldict(self):\n        \n        ## \n        returndict = OrderedDict([\n            (\"name\", self.name),\n            (\"barcode\", self.barcode),\n            (\"files\", self.files),\n            (\"stats_dfs\", {\n                \"s1\": self.stats_dfs.s1.to_dict(),\n                \"s2\": self.stats_dfs.s2.to_dict(),                \n                \"s3\": self.stats_dfs.s3.to_dict(),\n                \"s4\": self.stats_dfs.s4.to_dict(),\n                \"s5\": self.stats_dfs.s5.to_dict(),\n            }),\n            (\"stats\", self.stats.to_dict()),\n            (\"depths\", self.depths)\n            ])\n\n        return returndict", "response": "Return the Sample as an OrderedDict, including its stats data frames converted to dicts. All sample dicts are combined in dump."}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\npairs first and second read file names into a list of (R1, R2) tuples.", "response": "def combinefiles(filepath):\n    \"\"\" Joins first and second read file names \"\"\"\n    ## unpack seq files in filepath\n    fastqs = glob.glob(filepath)\n    firsts = [i for i in fastqs if \"_R1_\" in i]\n\n    ## check names\n    if not firsts:\n        raise IPyradWarningExit(\"First read file names must contain '_R1_'.\")\n\n    ## get paired reads\n    seconds = [ff.replace(\"_R1_\", \"_R2_\") for ff in firsts]\n    ## return a list (not a lazy zip) so callers can index it and take len()\n    return list(zip(firsts, seconds))"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nfind barcode sequence in the beginning of read", "response": "def findbcode(cutters, longbar, read1):\n    \"\"\" find barcode sequence in the beginning of read \"\"\"\n    ## default barcode string\n    for cutter in cutters[0]:\n        ## If the cutter is unambiguous there will only be one.\n        if not cutter:\n            continue\n        search = read1[1][:int(longbar[0]+len(cutter)+1)]\n        barcode = search.rsplit(cutter, 1)\n        if len(barcode) > 1:\n            return barcode[0]\n    ## No cutter found\n    return barcode[0]"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef find3radbcode(cutters, longbar, read1):\n    ## default barcode string\n    for ambigcuts in cutters:\n        for cutter in ambigcuts:\n            ## If the cutter is unambiguous there will only be one.\n            if not cutter:\n                continue\n            search = read1[1][:int(longbar[0]+len(cutter)+1)]\n            splitsearch = search.rsplit(cutter, 1)\n            if len(splitsearch) > 1:\n                return splitsearch[0]\n    ## No cutter found\n    return splitsearch[0]", "response": "find 3radbcode in the beginning of a read"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef make_stats(data, perfile, fsamplehits, fbarhits, fmisses, fdbars):\n\n    ## out file\n    outhandle = os.path.join(data.dirs.fastqs, 's1_demultiplex_stats.txt')\n    outfile = open(outhandle, 'w')\n\n    ## write the header for file stats ------------------------------------\n    outfile.write('{:<35}  {:>13}{:>13}{:>13}\\n'.\\\n                  format(\"raw_file\", \"total_reads\", \"cut_found\", \"bar_matched\"))\n\n    ## write the file stats\n    r1names = sorted(perfile)\n    for fname in r1names:\n        dat = perfile[fname]\n        #dat = [perfile[fname][i] for i in [\"ftotal\", \"fcutfound\", \"fmatched\"]]\n        outfile.write('{:<35}  {:>13}{:>13}{:>13}\\n'.\\\n            format(fname, dat[0], dat[1], dat[2]))\n        ## repeat for pairfile\n        if 'pair' in data.paramsdict[\"datatype\"]:\n            fname = fname.replace(\"_R1_\", \"_R2_\")\n            outfile.write('{:<35}  {:>13}{:>13}{:>13}\\n'.\\\n                format(fname, dat[0], dat[1], dat[2]))\n\n    ## spacer, how many records for each sample --------------------------\n    outfile.write('\\n{:<35}  {:>13}\\n'.format(\"sample_name\", \"total_reads\"))\n\n    ## names alphabetical. Write to file. 
Will save again below to Samples.\n    snames = set()\n    for sname in data.barcodes:\n        if \"-technical-replicate-\" in sname:\n            sname = sname.rsplit(\"-technical-replicate\", 1)[0]\n        snames.add(sname)\n        \n    for sname in sorted(list(snames)):\n        outfile.write(\"{:<35}  {:>13}\\n\".format(sname, fsamplehits[sname]))\n\n    ## spacer, which barcodes were found -----------------------------------\n    outfile.write('\\n{:<35}  {:>13} {:>13} {:>13}\\n'.\\\n                  format(\"sample_name\", \"true_bar\", \"obs_bar\", \"N_records\"))\n\n    ## write sample results\n    for sname in sorted(data.barcodes):\n        if \"-technical-replicate-\" in sname:\n            fname = sname.rsplit(\"-technical-replicate\", 1)[0]  \n        else:\n            fname = sname\n            \n        ## write perfect hit\n        hit = data.barcodes[sname]\n        offhitstring = \"\"\n    \n        ## write off-n hits\n        ## sort list of off-n hits  \n        if fname in fdbars:\n            offkeys = list(fdbars.get(fname))\n            for offhit in offkeys[::-1]:\n                ## exclude perfect hit\n                if offhit not in data.barcodes.values():\n                    offhitstring += '{:<35}  {:>13} {:>13} {:>13}\\n'.\\\n                        format(sname, hit, offhit, fbarhits[offhit]/2)\n                    #sumoffhits += fbarhits[offhit]\n        \n            ## write string to file\n            outfile.write('{:<35}  {:>13} {:>13} {:>13}\\n'.\\\n                #format(sname, hit, hit, fsamplehits[fname]-sumoffhits))\n                format(sname, hit, hit, fbarhits[hit]/2))\n            outfile.write(offhitstring)\n        \n    ## write misses\n    misskeys = list(fmisses.keys())\n    misskeys.sort(key=fmisses.get)\n    for key in misskeys[::-1]:\n        outfile.write('{:<35}  {:>13}{:>13}{:>13}\\n'.\\\n            format(\"no_match\", \"_\", key, fmisses[key]))\n    outfile.close()        \n\n\n    ## Link 
Sample with this data file to the Assembly object\n    for sname in snames:\n\n        ## make the sample\n        sample = Sample()\n        sample.name = sname\n\n        ## allow multiple barcodes if its a replicate. \n        barcodes = []\n        for n in xrange(500):\n            fname = sname+\"-technical-replicate-{}\".format(n)\n            fbar = data.barcodes.get(fname)\n            if fbar:\n                barcodes.append(fbar)\n        if barcodes:\n            sample.barcode = barcodes\n        else:\n            sample.barcode = data.barcodes[sname]\n\n        ## file names        \n        if 'pair' in data.paramsdict[\"datatype\"]:\n            sample.files.fastqs = [(os.path.join(data.dirs.fastqs,\n                                                  sname+\"_R1_.fastq.gz\"),\n                                     os.path.join(data.dirs.fastqs,\n                                                  sname+\"_R2_.fastq.gz\"))]\n        else:\n            sample.files.fastqs = [(os.path.join(data.dirs.fastqs,\n                                                  sname+\"_R1_.fastq.gz\"), \"\")]\n        ## fill in the summary stats\n        sample.stats[\"reads_raw\"] = int(fsamplehits[sname])\n        ## fill in the full df stats value\n        sample.stats_dfs.s1[\"reads_raw\"] = int(fsamplehits[sname])\n\n        ## Only link Sample if it has data\n        if sample.stats[\"reads_raw\"]:\n            sample.stats.state = 1\n            data.samples[sample.name] = sample\n        else:\n            print(\"Excluded sample: no data found for\", sname)\n\n    ## initiate s1 key for data object\n    data.stats_dfs.s1 = data._build_stat(\"s1\")\n    data.stats_files.s1 = outhandle", "response": "Write stats and stores to Assembly object."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef get_barcode_func(data, longbar):\n    ## build func for finding barcode\n    if longbar[1] == 'same':\n        if data.paramsdict[\"datatype\"] == '2brad':\n            def getbarcode(cutters, read1, longbar):\n                \"\"\" find barcode for 2bRAD data \"\"\"\n                return read1[1][:-(len(cutters[0][0]) + 1)][-longbar[0]:]\n\n        else:\n            def getbarcode(_, read1, longbar):\n                \"\"\" finds barcode for invariable length barcode data \"\"\"\n                return read1[1][:longbar[0]]\n    else:\n        def getbarcode(cutters, read1, longbar):\n            \"\"\" finds barcode for variable barcode lengths\"\"\"\n            return findbcode(cutters, longbar, read1)\n    return getbarcode", "response": "returns the fastest function given data & longbar"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef get_quart_iter(tups):\n\n    if tups[0].endswith(\".gz\"):\n        ofunc = gzip.open\n    else:\n        ofunc = open\n\n    ## create iterators \n    ofile1 = ofunc(tups[0], 'r')\n    fr1 = iter(ofile1) \n    quart1 = itertools.izip(fr1, fr1, fr1, fr1)\n    if tups[1]:\n        ofile2 = ofunc(tups[1], 'r')\n        fr2 = iter(ofile2)  \n        quart2 = itertools.izip(fr2, fr2, fr2, fr2)\n        quarts = itertools.izip(quart1, quart2)\n    else:\n        ofile2 = 0\n        quarts = itertools.izip(quart1, iter(int, 1))\n\n    ## make a generator\n    def feedme(quarts):\n        for quart in quarts:\n            yield quart\n    genquarts = feedme(quarts)\n\n    ## return generator and handles\n    return genquarts, ofile1, ofile2", "response": "returns an iterator to grab four lines at a time"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef writetofastq(data, dsort, read):\n    if read == 1:\n        rrr = \"R1\"\n    else:\n        rrr = \"R2\"\n\n    for sname in dsort:\n        ## skip writing if empty. Write to tmpname\n        handle = os.path.join(data.dirs.fastqs, \n                \"{}_{}_.fastq\".format(sname, rrr))\n        with open(handle, 'a') as out:\n            out.write(\"\".join(dsort[sname]))", "response": "Append each sample's sorted reads to its tmp R1 or R2 fastq file in data.dirs.fastqs."}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\ncollates temp fastq files in tmp - dir into 1 gzipped sample.", "response": "def collate_files(data, sname, tmp1s, tmp2s):\n    \"\"\" \n    Collate temp fastq files in tmp-dir into 1 gzipped sample.\n    \"\"\"\n    ## out handle\n    out1 = os.path.join(data.dirs.fastqs, \"{}_R1_.fastq.gz\".format(sname))\n    out = io.BufferedWriter(gzip.open(out1, 'w'))\n\n    ## build cmd\n    cmd1 = ['cat']\n    for tmpfile in tmp1s:\n        cmd1 += [tmpfile]\n\n    ## compression function\n    proc = sps.Popen(['which', 'pigz'], stderr=sps.PIPE, stdout=sps.PIPE).communicate()\n    if proc[0].strip():\n        compress = [\"pigz\"]\n    else:\n        compress = [\"gzip\"]\n\n    ## call cmd\n    proc1 = sps.Popen(cmd1, stderr=sps.PIPE, stdout=sps.PIPE)\n    proc2 = sps.Popen(compress, stdin=proc1.stdout, stderr=sps.PIPE, stdout=out)\n    err = proc2.communicate()\n    if proc2.returncode:\n        raise IPyradWarningExit(\"error in collate_files R1 %s\", err)\n    proc1.stdout.close()\n    out.close()\n\n    ## then cleanup\n    for tmpfile in tmp1s:\n        os.remove(tmpfile)\n\n    if 'pair' in data.paramsdict[\"datatype\"]:\n        ## out handle\n        out2 = os.path.join(data.dirs.fastqs, \"{}_R2_.fastq.gz\".format(sname))\n        out = io.BufferedWriter(gzip.open(out2, 'w'))\n\n        ## build cmd\n        cmd1 = ['cat']\n        for tmpfile in tmp2s:\n            cmd1 += [tmpfile]\n\n        ## call cmd\n        proc1 = sps.Popen(cmd1, stderr=sps.PIPE, stdout=sps.PIPE)\n        proc2 = sps.Popen(compress, stdin=proc1.stdout, stderr=sps.PIPE, stdout=out)\n        err = proc2.communicate()\n        if proc2.returncode:\n            raise IPyradWarningExit(\"error in collate_files R2 %s\", err)\n        proc1.stdout.close()\n        out.close()\n\n        ## then cleanup\n        for tmpfile in tmp2s:\n            os.remove(tmpfile)"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef prechecks2(data, force):\n\n    ## check for data using glob for fuzzy matching\n    if not glob.glob(data.paramsdict[\"raw_fastq_path\"]):\n        raise IPyradWarningExit(NO_RAWS.format(data.paramsdict[\"raw_fastq_path\"]))\n\n    ## find longest barcode\n    try:\n        ## Handle 3rad multi-barcodes. Gets len of the first one. \n        ## Should be harmless for single barcode data\n        barlens = [len(i.split(\"+\")[0]) for i in data.barcodes.values()]\n        if len(set(barlens)) == 1:\n            longbar = (barlens[0], 'same')\n        else:\n            longbar = (max(barlens), 'diff')\n\n        ## For 3rad we need to add the length info for barcodes_R2\n        if \"3rad\" in data.paramsdict[\"datatype\"]:\n            barlens = [len(i.split(\"+\")[1]) for i in data.barcodes.values()]\n            longbar = (longbar[0], longbar[1], max(barlens))\n    except ValueError:\n        raise IPyradWarningExit(NO_BARS.format(data.paramsdict[\"barcodes_path\"]))\n\n    ## setup dirs: [workdir] and a [workdir/name_fastqs]\n    opj = os.path.join\n\n    ## create project dir\n    pdir = os.path.realpath(data.paramsdict[\"project_dir\"])\n    if not os.path.exists(pdir):\n        os.mkdir(pdir)\n\n    ## create fastq dir\n    data.dirs.fastqs = opj(pdir, data.name+\"_fastqs\")\n    if os.path.exists(data.dirs.fastqs) and force:\n        print(OVERWRITING_FASTQS.format(**{\"spacer\":data._spacer}))\n        shutil.rmtree(data.dirs.fastqs)\n    if not os.path.exists(data.dirs.fastqs):\n        os.mkdir(data.dirs.fastqs)\n\n    ## insure no leftover tmp files from a previous run (there shouldn't be)\n    oldtmps = glob.glob(os.path.join(data.dirs.fastqs, \"tmp_*_R1_\"))\n    oldtmps += glob.glob(os.path.join(data.dirs.fastqs, \"tmp_*_R2_\"))    \n    for oldtmp in oldtmps:\n        os.remove(oldtmp)\n\n    ## gather raw sequence filenames (people want 
this to be flexible ...)\n    if 'pair' in data.paramsdict[\"datatype\"]:\n        raws = combinefiles(data.paramsdict[\"raw_fastq_path\"])\n    else:\n        raws = zip(glob.glob(data.paramsdict[\"raw_fastq_path\"]), iter(int, 1))\n\n    ## returns a list of both resolutions of cut site 1\n    ## (TGCAG, ) ==> [TGCAG, ]\n    ## (TWGC, ) ==> [TAGC, TTGC]\n    ## (TWGC, AATT) ==> [TAGC, TTGC]\n    cutters = [ambigcutters(i) for i in data.paramsdict[\"restriction_overhang\"]]\n    print(cutters)\n    assert cutters, \"Must enter a `restriction_overhang` for demultiplexing.\"\n\n    ## get matchdict\n    matchdict = inverse_barcodes(data)\n\n    ## return all\n    return raws, longbar, cutters, matchdict", "response": "Run the prechecks for demultiplexing (step 1): check that raw fastq files and barcodes exist, create the project and fastq directories, remove leftover tmp files, and return the raw file tuples, the longest-barcode info, the cutters, and the inverse barcode dictionary."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\nbuilding a full inverse barcodes dictionary", "response": "def inverse_barcodes(data):\n    \"\"\" Build full inverse barcodes dictionary \"\"\"\n\n    matchdict = {}\n    bases = set(\"CATGN\")\n    poss = set()\n\n    ## do perfect matches\n    for sname, barc in data.barcodes.items():\n        ## remove -technical-replicate-N if present\n        if \"-technical-replicate-\" in sname:\n            sname = sname.rsplit(\"-technical-replicate\", 1)[0]\n        matchdict[barc] = sname\n        poss.add(barc)\n\n        if data.paramsdict[\"max_barcode_mismatch\"] > 0:\n            ## get 1-base diffs\n            for idx1, base in enumerate(barc):\n                diffs = bases.difference(base)\n                for diff in diffs:\n                    lbar = list(barc)\n                    lbar[idx1] = diff\n                    tbar1 = \"\".join(lbar)\n                    if tbar1 not in poss:\n                        matchdict[tbar1] = sname                    \n                        poss.add(tbar1)\n                    else:\n                        if matchdict.get(tbar1) != sname:\n                            print(\"\"\"\\\n        Note: barcodes {}:{} and {}:{} are within {} base change of each other\n            Ambiguous barcodes that match to both samples will arbitrarily\n            be assigned to the first sample. 
If you do not like this idea \n            then lower the value of max_barcode_mismatch and rerun step 1\\n\"\"\"\\\n        .format(sname, barc, \n                matchdict[tbar1], data.barcodes[matchdict[tbar1]],\n                data.paramsdict[\"max_barcode_mismatch\"]))\n\n                ## if allowing two base difference things get big\n                ## for each modified bar, allow one modification to other bases\n                if data.paramsdict[\"max_barcode_mismatch\"] > 1:\n                    for idx2, _ in enumerate(tbar1):\n                        ## skip the base that is already modified\n                        if idx2 != idx1:\n                            for diff in bases.difference(tbar1[idx2]):\n                                ltbar = list(tbar1)\n                                ltbar[idx2] = diff\n                                tbar2 = \"\".join(ltbar)\n                                if tbar2 not in poss:\n                                    matchdict[tbar2] = sname                    \n                                    poss.add(tbar2)\n                                else:\n                                    if matchdict.get(tbar2) != sname:\n                                        print(\"\"\"\\\n        Note: barcodes {}:{} and {}:{} are within {} base change of each other\\\n             Ambiguous barcodes that match to both samples will arbitrarily\n             be assigned to the first sample. If you do not like this idea \n             then lower the value of max_barcode_mismatch and rerun step 1\\n\"\"\"\\\n        .format(sname, barc, \n                            matchdict[tbar2], data.barcodes[matchdict[tbar2]],\n                            data.paramsdict[\"max_barcode_mismatch\"]))\n    return matchdict"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef estimate_optim(data, testfile, ipyclient):\n    ## count the len of one file and assume all others are similar len\n    insize = os.path.getsize(testfile)\n    tmp_file_name = os.path.join(data.paramsdict[\"project_dir\"], \"tmp-step1-count.fq\")\n    if testfile.endswith(\".gz\"):\n        infile = gzip.open(testfile)\n        outfile = gzip.open(tmp_file_name, 'wb', compresslevel=5)\n    else:\n        infile = open(testfile)\n        outfile = open(tmp_file_name, 'w')\n        \n    ## We'll take the average of the size of a file based on the\n    ## first 10000 reads to approximate number of reads in the main file\n    outfile.write(\"\".join(itertools.islice(infile, 40000)))\n    outfile.close()\n    infile.close()\n\n    ## Get the size of the tmp file\n    tmp_size = os.path.getsize(tmp_file_name)\n\n    ## divide by the tmp file size and multiply by 10000 to approximate\n    ## the size of the input .fq files\n    inputreads = int(insize / tmp_size) * 10000\n    os.remove(tmp_file_name)\n\n    return inputreads", "response": "Estimate a reasonable optim value by grabbing a chunk of sequences decompressing and counting them to estimate the full file size."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef run2(data, ipyclient, force):\n\n    ## get file handles, name-lens, cutters, and matchdict\n    raws, longbar, cutters, matchdict = prechecks2(data, force)\n\n    ## wrap funcs to ensure we can kill tmpfiles\n    kbd = 0\n    try:\n        ## if splitting files, split files into smaller chunks for demuxing\n        chunkfiles = splitfiles(data, raws, ipyclient)\n\n        ## send chunks to be demux'd\n        statdicts = demux2(data, chunkfiles, cutters, longbar, matchdict, ipyclient)\n\n        ## concat tmp files\n        concat_chunks(data, ipyclient)\n\n        ## build stats from dictionaries\n        perfile, fsamplehits, fbarhits, fmisses, fdbars = statdicts    \n        make_stats(data, perfile, fsamplehits, fbarhits, fmisses, fdbars)\n\n\n    except KeyboardInterrupt:\n        print(\"\\n  ...interrupted, just a second while we ensure proper cleanup\")\n        kbd = 1\n\n    ## cleanup\n    finally:\n        ## cleaning up the tmpdir is safe from ipyclient\n        tmpdir = os.path.join(data.paramsdict[\"project_dir\"], \"tmp-chunks-\"+data.name)\n        if os.path.exists(tmpdir):\n            shutil.rmtree(tmpdir)\n\n        if kbd:\n            raise KeyboardInterrupt(\"s1\")\n        else:\n            _cleanup_and_die(data)", "response": "Run the step 1 demultiplexing workflow: run the prechecks, split the raw files into chunks, demultiplex the chunks in parallel, concatenate the tmp files per sample, and build the stats. The tmp chunk directory is always removed afterwards; a KeyboardInterrupt is re-raised after cleanup, otherwise the remaining tmp files are deleted."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef _cleanup_and_die(data):\n    tmpfiles = glob.glob(os.path.join(data.dirs.fastqs, \"tmp_*_R*.fastq\"))\n    tmpfiles += glob.glob(os.path.join(data.dirs.fastqs, \"tmp_*.p\"))\n    for tmpf in tmpfiles:            \n        os.remove(tmpf)", "response": "cleanup func for step 1"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\nsplitting a list of files into chunks", "response": "def splitfiles(data, raws, ipyclient):\n    \"\"\" sends raws to be chunked\"\"\"\n\n    ## create a tmpdir for chunked_files and a chunk optimizer \n    tmpdir = os.path.join(data.paramsdict[\"project_dir\"], \"tmp-chunks-\"+data.name)\n    if os.path.exists(tmpdir):\n        shutil.rmtree(tmpdir)\n    os.makedirs(tmpdir)\n\n    ## chunk into 8M reads\n    totalreads = estimate_optim(data, raws[0][0], ipyclient)\n    optim = int(8e6)\n    njobs = int(totalreads/(optim/4.)) * len(raws)\n\n    ## if more files than cpus: no chunking\n    nosplit = 0\n    if (len(raws) > len(ipyclient)) or (totalreads < optim):\n        nosplit = 1\n\n    ## send slices N at a time. The dict chunkfiles stores a tuple of rawpairs\n    ## dictionary to store asyncresults for sorting jobs\n    start = time.time()\n    chunkfiles = {}\n    for fidx, tups in enumerate(raws):\n        handle = os.path.splitext(os.path.basename(tups[0]))[0]\n        ## if number of lines is > 20M then just submit it\n        if nosplit:\n            chunkfiles[handle] = [tups]\n        else:\n            ## chunk the file using zcat_make_temps\n            chunklist = zcat_make_temps(data, tups, fidx, tmpdir, optim, njobs, start)\n            chunkfiles[handle] = chunklist\n    if not nosplit:\n        print(\"\")\n\n    return chunkfiles"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef concat_chunks(data, ipyclient):\n\n    ## collate files progress bar\n    start = time.time()\n    printstr = ' writing/compressing   | {} | s1 |'\n    lbview = ipyclient.load_balanced_view()\n    elapsed = datetime.timedelta(seconds=int(time.time()-start))\n    progressbar(10, 0, printstr.format(elapsed), spacer=data._spacer) \n    ## get all the files\n    ftmps = glob.glob(os.path.join(data.dirs.fastqs, \"tmp_*.fastq\"))\n\n    ## a dict to assign tmp files to names/reads\n    r1dict = {}\n    r2dict = {}\n    for sname in data.barcodes:\n        if \"-technical-replicate-\" in sname:\n            sname = sname.rsplit(\"-technical-replicate\", 1)[0]\n        r1dict[sname] = []\n        r2dict[sname] = []\n\n    ## assign to name keys\n    for ftmp in ftmps:\n        base, orient, _ = ftmp.rsplit(\"_\", 2)\n        sname = base.rsplit(\"/\", 1)[-1].split(\"tmp_\", 1)[1]\n        if orient == \"R1\":\n            r1dict[sname].append(ftmp)\n        else:\n            r2dict[sname].append(ftmp)\n\n    ## concatenate files\n    snames = []\n    for sname in data.barcodes:\n        if \"-technical-replicate-\" in sname:\n            sname = sname.rsplit(\"-technical-replicate\", 1)[0]\n        snames.append(sname)\n\n    writers = []\n    for sname in set(snames):\n        tmp1s = sorted(r1dict[sname])\n        tmp2s = sorted(r2dict[sname])\n        writers.append(lbview.apply(collate_files, *[data, sname, tmp1s, tmp2s]))\n\n    total = len(writers)\n    while 1:\n        ready = [i.ready() for i in writers]\n        elapsed = datetime.timedelta(seconds=int(time.time()-start))\n        progressbar(total, sum(ready), printstr.format(elapsed), spacer=data._spacer) \n        time.sleep(0.1)\n        if all(ready):\n            print(\"\")\n            break", "response": "Concatenate chunks into one sample name file."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\nsubmitting chunks to be sorted by the barmatch() function then calls putstats().", "response": "def demux2(data, chunkfiles, cutters, longbar, matchdict, ipyclient):\n    \"\"\" \n    Submit chunks to be sorted by the barmatch() function then \n    calls putstats().\n    \"\"\"\n\n    ## parallel stuff, limit to 1/4 of available cores for RAM limits.\n    start = time.time()\n    printstr = ' sorting reads         | {} | s1 |'\n    lbview = ipyclient.load_balanced_view(targets=ipyclient.ids[::4])\n\n    ## store statcounters and async results in dicts\n    perfile = {}\n    filesort = {}\n    total = 0\n    done = 0 \n\n    ## chunkfiles is a dict with {handle: chunkslist, ...}. The func barmatch\n    ## writes results to samplename files with PID number, and also writes a \n    ## pickle for chunk specific results with fidx suffix, which it returns.\n    for handle, rawtuplist in chunkfiles.items():\n        ## get args for job\n        for fidx, rawtuple in enumerate(rawtuplist):\n            #handle = os.path.splitext(os.path.basename(rawtuple[0]))[0]\n            args = (data, rawtuple, cutters, longbar, matchdict, fidx)\n\n            ## submit the job\n            async = lbview.apply(barmatch, *args)\n            filesort[total] = (handle, async)\n            total += 1\n\n            ## get ready to receive stats: 'total', 'cutfound', 'matched'\n            perfile[handle] = np.zeros(3, dtype=np.int)\n\n    ## stats for each sample\n    fdbars = {}\n    fsamplehits = Counter()\n    fbarhits = Counter()\n    fmisses = Counter()\n    ## a tuple to hold my dictionaries\n    statdicts = perfile, fsamplehits, fbarhits, fmisses, fdbars\n\n    ## wait for jobs to finish\n    while 1:\n        fin = [i for i, j in filesort.items() if j[1].ready()]\n        #fin = [i for i in jobs if i[1].ready()]\n        elapsed = datetime.timedelta(seconds=int(time.time()-start))\n        
progressbar(total, done, printstr.format(elapsed), spacer=data._spacer)\n        time.sleep(0.1)\n\n        ## should we break?\n        if total == done:\n            print(\"\")\n            break\n\n        ## cleanup\n        for key in fin:\n            tup = filesort[key]\n            if tup[1].successful():\n                pfile = tup[1].result()\n                handle = tup[0]\n                if pfile:\n                    ## check if this needs to return data\n                    putstats(pfile, handle, statdicts)\n                    ## purge to conserve memory\n                    del filesort[key]\n                    done += 1\n\n    return statdicts"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nsubmitting chunks to be sorted", "response": "def demux(data, chunkfiles, cutters, longbar, matchdict, ipyclient):\n    \"\"\" submit chunks to be sorted \"\"\"\n\n    ## parallel stuff\n    start = time.time()\n    printstr = ' sorting reads         | {} | s1 |'\n    lbview = ipyclient.load_balanced_view()\n\n    ## store statcounters and async results in dicts\n    perfile = {}\n    filesort = {}\n    for handle, rawtuplist in chunkfiles.items():\n        ## get args for job\n        for fidx, rawtuple in enumerate(rawtuplist):\n            #handle = os.path.splitext(os.path.basename(rawtuple[0]))[0]\n            args = (data, rawtuple, cutters, longbar, matchdict, fidx)\n\n            ## submit the job\n            filesort[handle] = lbview.apply(barmatch, *args)\n\n            ## get ready to receive stats: 'total', 'cutfound', 'matched'\n            perfile[handle] = np.zeros(3, dtype=np.int)\n\n    ## stats for each sample\n    fdbars = {}\n    fsamplehits = Counter()\n    fbarhits = Counter()\n    fmisses = Counter()\n    ## a tuple to hold my dictionaries\n    statdicts = perfile, fsamplehits, fbarhits, fmisses, fdbars\n\n    try:\n        kbd = 0\n        total = len(chunkfiles)\n        done = 0\n        ## wait for jobs to finish\n        while 1:\n            fin = [i for i, j in filesort.items() if j.ready()]\n            elapsed = datetime.timedelta(seconds=int(time.time()-start))\n            progressbar(total, done, printstr.format(elapsed), spacer=data._spacer)\n            time.sleep(0.1)\n\n            ## should we break?\n            if total == done:\n                print(\"\")\n                break\n\n            ## cleanup\n            for job in fin:\n                if filesort[job].successful():\n                    pfile = filesort[job].result()\n                    #if result:\n                    if pfile:\n                        ## check if this needs to 
return data\n                        putstats(pfile, handle, statdicts)\n                        \n                        ## purge to conserve memory\n                        del filesort[job]\n                        done += 1\n\n        ## keep tacking progreess during writing stage\n        start = time.time()\n        printstr = ' writing/compressing   | {} | s1 |'\n        elapsed = datetime.timedelta(seconds=int(time.time()-start))\n        progressbar(10, 0, printstr.format(elapsed), spacer=data._spacer)\n\n\n    except KeyboardInterrupt:\n        ## wait to cleanup\n        kbd = 1\n        raise\n\n\n    ## only proceed here if barmatch jobs were not interrupted\n    else:\n        ## collate files and do progress bar\n        ftmps = glob.glob(os.path.join(data.dirs.fastqs, \"tmp_*.fastq\"))\n\n        ## a dict to assign tmp files to names/reads\n        r1dict = {}\n        r2dict = {}\n        for sname in data.barcodes:\n            r1dict[sname] = []\n            r2dict[sname] = []\n\n        ## assign to name keys\n        for ftmp in ftmps:\n            ## split names\n            base, orient, _ = ftmp.rsplit(\"_\", 2)\n            sname = base.rsplit(\"/\", 1)[-1].split(\"tmp_\", 1)[1]\n            ## put into dicts\n            if orient == \"R1\":\n                r1dict[sname].append(ftmp)\n            else:\n                r2dict[sname].append(ftmp)\n\n        ## concatenate files\n        total = len(data.barcodes)\n        done = 0\n\n        ## store asyncs of collate jobs\n        writers = []\n        for sname in data.barcodes:\n            tmp1s = sorted(r1dict[sname])\n            tmp2s = sorted(r2dict[sname])\n            writers.append(lbview.apply(collate_files, \n                           *[data, sname, tmp1s, tmp2s]))\n        \n        ## track progress of collate jobs\n        while 1:\n            ready = [i.ready() for i in writers]\n            elapsed = datetime.timedelta(seconds=int(time.time()-start))\n            
progressbar(total, sum(ready), printstr.format(elapsed), spacer=data._spacer)  \n            time.sleep(0.1)\n            if all(ready):\n                print(\"\")\n                break\n\n    finally:\n        ## clean up junk files\n        tmpfiles = glob.glob(os.path.join(data.dirs.fastqs, \"tmp_*_R*.fastq\"))\n        tmpfiles += glob.glob(os.path.join(data.dirs.fastqs, \"tmp_*.p\"))\n        for tmpf in tmpfiles:\n            os.remove(tmpf)\n\n        if kbd:\n            raise KeyboardInterrupt()\n        else:\n            ## build stats from dictionaries\n            perfile, fsamplehits, fbarhits, fmisses, fdbars = statdicts    \n            make_stats(data, perfile, fsamplehits, fbarhits, fmisses, fdbars)"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef putstats(pfile, handle, statdicts):\n\n    ## load in stats\n    with open(pfile, 'rb') as infile:\n        filestats, samplestats = pickle.load(infile)\n\n    ## get dicts from statdicts tuple\n    perfile, fsamplehits, fbarhits, fmisses, fdbars = statdicts\n\n    ## pull new stats\n    #handle = os.path.splitext(os.path.basename(handle))[0]\n    perfile[handle] += filestats\n\n    ## update sample stats\n    samplehits, barhits, misses, dbars = samplestats\n    fsamplehits.update(samplehits)\n    fbarhits.update(barhits)\n    fmisses.update(misses)\n    fdbars.update(dbars)\n\n    ## repack the tuple and return\n    statdicts = perfile, fsamplehits, fbarhits, fmisses, fdbars\n    return statdicts", "response": "Load the stats stored in a pickle file and merge them into the running stats dictionaries."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef zcat_make_temps(data, raws, num, tmpdir, optim, njobs, start):\n\n    printstr = ' chunking large files  | {} | s1 |'\n\n    ## split args\n    tmpdir = os.path.realpath(tmpdir)\n    LOGGER.info(\"zcat is using optim = %s\", optim)\n\n    ## read it, is it gzipped?\n    catcmd = [\"cat\"]\n    if raws[0].endswith(\".gz\"):\n        catcmd = [\"gunzip\", \"-c\"]\n\n    ## get reading commands for r1s, r2s\n    cmd1 = catcmd + [raws[0]]\n    cmd2 = catcmd + [raws[1]]\n\n    ## second command splits and writes with name prefix\n    cmd3 = [\"split\", \"-a\", \"4\", \"-l\", str(int(optim)), \"-\", \n            os.path.join(tmpdir, \"chunk1_\"+str(num)+\"_\")]\n    cmd4 = [\"split\", \"-a\", \"4\", \"-l\", str(int(optim)), \"-\", \n            os.path.join(tmpdir, \"chunk2_\"+str(num)+\"_\")]\n\n    ### run splitter\n    proc1 = sps.Popen(cmd1, stderr=sps.STDOUT, stdout=sps.PIPE)\n    proc3 = sps.Popen(cmd3, stderr=sps.STDOUT, stdout=sps.PIPE, stdin=proc1.stdout)\n\n    ## wrap the actual call so we can kill it if anything goes awry\n    while 1:\n        try:\n            if not isinstance(proc3.poll(), int):\n                elapsed = datetime.timedelta(seconds=int(time.time()-start))\n                done = len(glob.glob(os.path.join(tmpdir, 'chunk1_*')))\n                progressbar(njobs, min(njobs, done), printstr.format(elapsed), spacer=data._spacer)\n                time.sleep(0.1)\n            else:\n                res = proc3.communicate()[0]\n                proc1.stdout.close()\n                break\n\n        except KeyboardInterrupt:\n            proc1.kill()\n            proc3.kill()\n            raise KeyboardInterrupt()\n\n    if proc3.returncode:\n        raise IPyradWarningExit(\" error in %s: %s\", cmd3, res)\n\n    ## grab output handles\n    chunks1 = glob.glob(os.path.join(tmpdir, \"chunk1_\"+str(num)+\"_*\"))\n    chunks1.sort()\n\n 
   if \"pair\" in data.paramsdict[\"datatype\"]:\n        proc2 = sps.Popen(cmd2, stderr=sps.STDOUT, stdout=sps.PIPE)\n        proc4 = sps.Popen(cmd4, stderr=sps.STDOUT, stdout=sps.PIPE, stdin=proc2.stdout)\n\n        ## wrap the actual call so we can kill it if anything goes awry\n        while 1:\n            try:\n                if not isinstance(proc4.poll(), int):\n                    elapsed = datetime.timedelta(seconds=int(time.time()-start))\n                    done = len(glob.glob(os.path.join(tmpdir, 'chunk1_*')))\n                    progressbar(njobs, min(njobs, done), printstr.format(elapsed), data._spacer)\n                    time.sleep(0.1)\n                else:\n                    res = proc4.communicate()[0]\n                    proc2.stdout.close()\n                    break\n\n            except KeyboardInterrupt:\n                proc2.kill()\n                proc4.kill()\n                raise KeyboardInterrupt()\n\n        if proc4.returncode:\n            raise IPyradWarningExit(\" error in %s: %s\", cmd4, res)\n\n        ## grab output handles\n        chunks2 = glob.glob(os.path.join(tmpdir, \"chunk2_\"+str(num)+\"_*\"))\n        chunks2.sort()\n    \n    else:\n        chunks2 = [0]*len(chunks1)\n\n    assert len(chunks1) == len(chunks2), \\\n        \"R1 and R2 files are not the same length.\"\n\n    ## ensure full progress bar b/c estimates njobs could be off\n    progressbar(10, 10, printstr.format(elapsed), spacer=data._spacer)\n    return zip(chunks1, chunks2)", "response": "Split large (optionally gzipped) raw read files into fixed-size chunks by piping cat/gunzip into split, and return the zipped R1 and R2 chunk file lists."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef _plotshare(share, names, **kwargs):\n\n    ## set the colormap\n    colormap = toyplot.color.LinearMap(toyplot.color.brewer.palette(\"Spectral\"), \n                                 domain_min=share.min(), domain_max=share.max())\n\n    ## set up canvas\n    if not kwargs.get('width'):\n        width=900\n    else:\n        width = kwargs['width']\n    canvas = toyplot.Canvas(width=width, height=width*0.77778)\n\n    ## order the dta\n    table = canvas.matrix((share, colormap), \n                          bounds=(50, canvas.height-100,\n                                  50, canvas.height-100), \n                          step=5, tshow=False, lshow=False)\n\n    ## put a box around the table\n    table.body.grid.vlines[..., [0, -1]] = 'single'\n    table.body.grid.hlines[[0, -1], ...] = 'single'\n\n    ## make hover info on grid\n    for i, j in itertools.product(range(len(share)), repeat=2):\n        table.body.cell(i,j).title = \"%s, %s : %s\" % (names[i], names[j], int(share[i,j]))\n\n    ## create barplot\n    axes = canvas.cartesian(bounds=(665, 800, 90, 560))\n\n    ## make a hover for barplot\n    zf = zip(names[::-1], share.diagonal()[::-1])\n    barfloater = [\"%s: %s\" % (i, int(j)) for i, j in zf]\n\n    ## plot bars\n    axes.bars(share.diagonal()[::-1], along='y', title=barfloater)\n\n    ## hide spine, move labels to the left, \n    ## use taxon names, rotate angle, align\n    axes.y.spine.show = False\n    axes.y.ticks.labels.offset = 0\n    axes.y.ticks.locator = toyplot.locator.Explicit(range(len(names)), \n                                                    labels=names[::-1])\n    axes.y.ticks.labels.angle = -90\n    axes.y.ticks.labels.style = {\"baseline-shift\":0,\n                                 \"text-anchor\":\"end\", \n                                 \"font-size\":\"8px\"}\n\n    ## rotate xlabels, align with ticks, change to 
thousands, move up on canvas\n    ## show ticks, and hide popup coordinates\n    axes.x.ticks.labels.angle = 90\n    axes.x.ticks.labels.offset = 20\n    axes.x.ticks.locator = toyplot.locator.Explicit(\n        range(0, int(share.max()), \n                 int(share.max() / 10)), \n        [\"{}\".format(i) for i in range(0, int(share.max()),\n                                           int(share.max() / 10))])\n    axes.x.ticks.labels.style = {\"baseline-shift\":0,\n                                 \"text-anchor\":\"end\", \n                                 \"-toyplot-anchor-shift\":\"15px\"}\n    axes.x.ticks.show = True\n\n    ## add labels\n    label_style = {\"font-size\": \"16px\", \"font-weight\": \"bold\"}\n    canvas.text(300, 60, \"Matrix of shared RAD loci\", style=label_style)\n    canvas.text(700, 60, \"N RAD loci per sample\", style=label_style)\n\n    return canvas, axes", "response": "make toyplot matrix fig"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\nparse the loci file list and return a presence/absence matrix", "response": "def _getarray(loci, tree):\n    \"\"\" \n    parse the loci file list and return presence/absence matrix\n    ordered by the tips on the tree\n    \"\"\"\n\n    ## order tips\n    tree.ladderize()\n\n    ## get tip names\n    snames = tree.get_leaf_names()\n\n    ## make an empty matrix\n    lxs = np.zeros((len(snames), len(loci)), dtype=int)\n\n    ## fill the matrix\n    for loc in range(len(loci)):\n        for seq in loci[loc].split(\"\\n\")[:-1]:\n            lxs[snames.index(seq.split()[0]), loc] += 1\n\n    return lxs, snames"}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\nfills a matrix with pairwise data sharing", "response": "def _countmatrix(lxs):\n    \"\"\" fill a matrix with pairwise data sharing \"\"\"\n    \n    ## an empty matrix\n    share = np.zeros((lxs.shape[0], lxs.shape[0]))\n\n    ## fill above\n    names = range(lxs.shape[0])\n    for row in lxs:\n        for samp1, samp2 in itertools.combinations(names, 2):\n            shared = lxs[samp1, lxs[samp2] > 0].sum()\n            share[samp1, samp2] = shared\n\n    ## mirror below\n    ##share[]\n\n    ## fill diagonal with total sample coverage\n    for row in range(len(names)):\n        share[row, row] = lxs[row].sum()\n\n    return share"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef paramname(param=\"\"):\n\n    try: \n        name = pinfo[str(param)][0].strip().split(\" \")[1]\n    except (KeyError, ValueError) as err:\n        ## TODO: paramsinfo get description by param string not working.\n        ## It would be cool to have an assembly object bcz then you could\n        ## just do this:\n        ##\n        ## print(pinfo[data.paramsinfo.keys().index(param)])\n        print(\"\\tKey name/number not recognized - {}\".format(param), err)\n        raise\n\n    return name", "response": "Return the name of the parameter identified by its index in pinfo; print a message and re-raise if the key is not recognized."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\nreturning detailed information for a parameter.", "response": "def paraminfo(param=\"\", short=False):\n    \"\"\" Returns detailed information for the numbered parameter. \n        Further information is available in the tutorial.\n        Unlike params() this function doesn't deal well with *\n        It only takes one parameter at a time and returns the desc\n    \"\"\"\n\n    ## If the short flag is set return the short description, otherwise\n    ## return the long.\n    if short:\n        desc = 1\n    else:\n        desc = 0\n\n    try: \n        description = pinfo[str(param)][desc]\n    except (KeyError, ValueError) as err:\n        ## TODO: paramsinfo get description by param string not working.\n        ## It would be cool to have an assembly object bcz then you could\n        ## just do this:\n        ##\n        ## print(pinfo[data.paramsinfo.keys().index(param)])\n        print(\"\\tKey name/number not recognized - {}\".format(param), err)\n        raise\n\n    return description"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef paramsinfo(param=\"\", short=False):\n    if short:\n        desc = 1\n    else:\n        desc = 0\n\n    if param == \"*\":\n        for key in pinfo:\n            print(pinfo[str(key)][desc])\n    elif param:\n        try:\n            print(pinfo[str(param)][desc])\n        except (KeyError, ValueError) as err:\n            ## TODO: paramsinfo get description by param string not working.\n            ## It would be cool to have an assembly object bcz then you could\n            ## just do this:\n            ##\n            ## print(pinfo[data.paramsinfo.keys().index(param)])\n            print(\"\\tKey name/number not recognized\", err)\n            raise\n    else:\n        print(\"Enter a name or number for explanation of the parameter\\n\")\n        for key in pinfo:\n            print(pinfo[str(key)][desc].split(\"\\n\")[1][2:-10])", "response": "This function prints out the information about the parameter."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef update_assembly(data):\n\n    print(\"##############################################################\")\n    print(\"Updating assembly to current version\")\n    ## New assembly object to update pdate from.\n    new_assembly = ip.Assembly(\"update\", quiet=True)\n\n    ## Hackersonly dict gets automatically overwritten\n    ## Always use the current version for params in this dict.\n    data._hackersonly = deepcopy(new_assembly._hackersonly)\n\n    new_params = set(new_assembly.paramsdict.keys())\n\n    my_params = set(data.paramsdict.keys())\n\n    ## Find all params in loaded assembly that aren't in the new assembly.\n    ## Make a new dict that doesn't include anything in removed_params\n    removed_params = my_params.difference(new_params)\n    for i in removed_params:\n        print(\"Removing parameter: {} = {}\".format(i, data.paramsdict[i]))\n        \n    ## Find all params that are in the new paramsdict and not in the old one.\n    ## If the set isn't emtpy then we create a new dictionary based on the new\n    ## assembly parameters and populated with currently loaded assembly values.\n    ## Conditioning on not including any removed params. Magic.\n    added_params = new_params.difference(my_params)\n    for i in added_params:\n        print(\"Adding parameter: {} = {}\".format(i, new_assembly.paramsdict[i]))\n\n    print(\"\\nPlease take note of these changes. Every effort is made to\\n\"\\\n            +\"ensure compatibility across versions of ipyrad. 
See online\\n\"\\\n            +\"documentation for further details about new parameters.\")\n    time.sleep(5)\n    print(\"##############################################################\")\n    \n    if added_params:\n        for i in data.paramsdict:\n            if i not in removed_params:\n                new_assembly.paramsdict[i] = data.paramsdict[i]\n        data.paramsdict = deepcopy(new_assembly.paramsdict)\n\n    data.save()\n    return data", "response": "Update the current assembly with the new parameters."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nsaving assembly and samples as json", "response": "def save_json(data):\n    \"\"\" Save assembly and samples as json \"\"\"\n\n    ## data as dict\n    #### skip _ipcluster because it's made new\n    #### skip _headers because it's loaded new\n    #### statsfiles save only keys\n    #### samples save only keys\n    datadict = OrderedDict([\n        (\"_version\", data.__dict__[\"_version\"]),\n        (\"_checkpoint\", data.__dict__[\"_checkpoint\"]),\n        (\"name\", data.__dict__[\"name\"]), \n        (\"dirs\", data.__dict__[\"dirs\"]),\n        (\"paramsdict\", data.__dict__[\"paramsdict\"]),\n        (\"samples\", data.__dict__[\"samples\"].keys()),\n        (\"populations\", data.__dict__[\"populations\"]),\n        (\"database\", data.__dict__[\"database\"]),\n        (\"clust_database\", data.__dict__[\"clust_database\"]),        \n        (\"outfiles\", data.__dict__[\"outfiles\"]),\n        (\"barcodes\", data.__dict__[\"barcodes\"]),\n        (\"stats_files\", data.__dict__[\"stats_files\"]),\n        (\"_hackersonly\", data.__dict__[\"_hackersonly\"]),\n        ])\n\n    ## sample dict\n    sampledict = OrderedDict([])\n    for key, sample in data.samples.iteritems():\n        sampledict[key] = sample._to_fulldict()\n\n    ## json format it using cumstom Encoder class\n    fulldumps = json.dumps({\n        \"assembly\": datadict,\n        \"samples\": sampledict\n        },\n        cls=Encoder,\n        sort_keys=False, indent=4, separators=(\",\", \":\"),\n        )\n\n    ## save to file\n    assemblypath = os.path.join(data.dirs.project, data.name+\".json\")\n    if not os.path.exists(data.dirs.project):\n        os.mkdir(data.dirs.project)\n    \n    ## protect save from interruption\n    done = 0\n    while not done:\n        try:\n            with open(assemblypath, 'w') as jout:\n                jout.write(fulldumps)\n            done = 1\n        except 
(KeyboardInterrupt, SystemExit): \n            print('.')\n            continue"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef load_json(path, quiet=False, cli=False):\n\n    ## load the JSON string and try with name+.json\n    checkfor = [path+\".json\", path]\n    for inpath in checkfor:\n        inpath = inpath.replace(\"~\", os.path.expanduser(\"~\"))\n        try:\n            with open(inpath, 'r') as infile:\n                ## uses _tup_and_byte to ensure ascii and tuples are correct\n                fullj = json.loads(infile.read(), object_hook=_tup_and_byte)\n        except IOError:\n            pass\n\n    ## create a new empty Assembly\n    try:\n        oldname = fullj[\"assembly\"].pop(\"name\")\n        olddir = fullj[\"assembly\"][\"dirs\"][\"project\"]\n        oldpath = os.path.join(olddir, os.path.splitext(oldname)[0]+\".json\")\n        null = ip.Assembly(oldname, quiet=True, cli=cli)\n\n    except (UnboundLocalError, AttributeError) as inst:\n        raise IPyradWarningExit(\"\"\"\n    Could not find saved Assembly file (.json) in expected location.\n    Checks in: [project_dir]/[assembly_name].json\n    Checked: {}\n    \"\"\".format(inpath))\n\n    ## print msg with shortpath\n    if not quiet:\n        oldpath = oldpath.replace(os.path.expanduser(\"~\"), \"~\")\n        print(\"{}loading Assembly: {}\".format(null._spacer, oldname))\n        print(\"{}from saved path: {}\".format(null._spacer, oldpath))\n\n    ## First get the samples. Create empty sample dict of correct length \n    samplekeys = fullj[\"assembly\"].pop(\"samples\")\n    null.samples = {name: \"\" for name in samplekeys}\n\n    ## Next get paramsdict and use set_params to convert values back to \n    ## the correct dtypes. 
Allow set_params to fail because the object will \n    ## be subsequently updated by the params from the params file, which may\n    ## correct any errors/incompatibilities in the old params file\n    oldparams = fullj[\"assembly\"].pop(\"paramsdict\")\n    for param, val in oldparams.iteritems():\n        ## a fix for backward compatibility with deprecated options\n        if param not in [\"assembly_name\", \"excludes\", \"outgroups\"]:\n            try:\n                null.set_params(param, val)\n            except IPyradWarningExit as inst:\n                #null.set_params(param, \"\")\n                LOGGER.warning(\"\"\" \n    Load assembly error setting params. Not critical b/c new params file may\n    correct the problem. Recorded here for debugging:\n    {}\n    \"\"\".format(inst))\n\n    ## Import the hackersonly dict. In this case we don't have the nice\n    ## set_params so we're shooting from the hip to reset the values\n    try:\n        oldhackersonly = fullj[\"assembly\"].pop(\"_hackersonly\")\n        for param, val in oldhackersonly.iteritems():\n            if val == None:\n                null._hackersonly[param] = None\n            else:\n                null._hackersonly[param] = val\n\n    except Exception as inst:\n        LOGGER.warning(\"\"\"\n    Load assembly error resetting hackersonly dict element. 
We will just use\n    the default value in the current assembly.\"\"\")\n\n    ## Check remaining attributes of Assembly and Raise warning if attributes\n    ## do not match up between old and new objects\n    newkeys = null.__dict__.keys()\n    oldkeys = fullj[\"assembly\"].keys()\n\n    ## find shared keys and deprecated keys\n    sharedkeys = set(oldkeys).intersection(set(newkeys))\n    lostkeys = set(oldkeys).difference(set(newkeys))\n\n    ## raise warning if there are lost/deprecated keys\n    if lostkeys:\n        LOGGER.warning(\"\"\"\n    load_json found {a} keys that are unique to the older Assembly.\n        - assembly [{b}] v.[{c}] has: {d}\n        - current assembly is v.[{e}]\n        \"\"\".format(a=len(lostkeys), \n                   b=oldname,\n                   c=fullj[\"assembly\"][\"_version\"],\n                   d=lostkeys,\n                   e=null._version))\n\n    ## load in remaining shared Assembly attributes to null\n    for key in sharedkeys:\n        null.__setattr__(key, fullj[\"assembly\"][key])\n\n    ## load in svd results if they exist\n    try:\n        if fullj[\"assembly\"][\"svd\"]:\n            null.__setattr__(\"svd\", fullj[\"assembly\"][\"svd\"])\n            null.svd = ObjDict(null.svd)\n    except Exception:\n        LOGGER.debug(\"skipping: no svd results present in old assembly\")\n\n    ## Now, load in the Sample objects json dicts\n    sample_names = fullj[\"samples\"].keys()\n    if not sample_names:\n        raise IPyradWarningExit(\"\"\"\n    No samples found in saved assembly. 
If you are just starting a new\n    assembly the file probably got saved erroneously, so it's safe to try \n    removing the assembly file (e.g., rm {}.json) and restarting.\n\n    If you fully completed step 1 and you see this message you should probably\n    contact the developers.\n    \"\"\".format(inpath))\n        \n    sample_keys = fullj[\"samples\"][sample_names[0]].keys()\n    stats_keys = fullj[\"samples\"][sample_names[0]][\"stats\"].keys()\n    stats_dfs_keys = fullj[\"samples\"][sample_names[0]][\"stats_dfs\"].keys()\n    ind_statkeys = \\\n        [fullj[\"samples\"][sample_names[0]][\"stats_dfs\"][i].keys() \\\n        for i in stats_dfs_keys]\n    ind_statkeys = list(itertools.chain(*ind_statkeys))\n\n    ## check against a null sample\n    nsamp = ip.Sample()\n    newkeys = nsamp.__dict__.keys()\n    newstats = nsamp.__dict__[\"stats\"].keys()\n    newstatdfs = nsamp.__dict__[\"stats_dfs\"].keys()\n    newindstats = [nsamp.__dict__[\"stats_dfs\"][i].keys() for i in newstatdfs]\n    newindstats = list(itertools.chain(*[i.values for i in newindstats]))\n\n    ## different in attributes?\n    diffattr = set(sample_keys).difference(newkeys)\n    diffstats = set(stats_keys).difference(newstats)\n    diffindstats = set(ind_statkeys).difference(newindstats)\n\n    ## Raise warning if any oldstats were lost or deprecated\n    alldiffs = diffattr.union(diffstats).union(diffindstats)\n    if any(alldiffs):\n        LOGGER.warning(\"\"\"\n    load_json found {a} keys that are unique to the older Samples.\n        - assembly [{b}] v.[{c}] has: {d}\n        - current assembly is v.[{e}]\n        \"\"\".format(a=len(alldiffs), \n                   b=oldname,\n                   c=fullj[\"assembly\"][\"_version\"],\n                   d=alldiffs,\n                   e=null._version))\n\n    ## save stats and statsfiles to Samples\n    for sample in null.samples:\n        ## create a null Sample\n        null.samples[sample] = ip.Sample()\n\n        ## save 
stats\n        sdat = fullj[\"samples\"][sample]['stats']\n        ## Reorder the keys so they ascend by step, only include\n        ## stats that are actually in the sample. newstats is a\n        ## list of the new sample stat names, and stats_keys\n        ## are the names of the stats from the json file.\n        newstats = [x for x in newstats if x in stats_keys]\n        null.samples[sample].stats = pd.Series(sdat).reindex(newstats)\n\n        ## save stats_dfs\n        for statskey in stats_dfs_keys:\n            null.samples[sample].stats_dfs[statskey] = \\\n                pd.Series(fullj[\"samples\"][sample][\"stats_dfs\"][statskey])\\\n                .reindex(nsamp.__dict__[\"stats_dfs\"][statskey].keys())\n\n        ## save Sample files\n        for filehandle in fullj[\"samples\"][sample][\"files\"].keys():\n            null.samples[sample].files[filehandle] = \\\n                fullj[\"samples\"][sample][\"files\"][filehandle]\n\n\n    ## build the Assembly object stats_dfs\n    for statskey in stats_dfs_keys:\n        indstat = null._build_stat(statskey)\n        if not indstat.empty:\n            null.stats_dfs[statskey] = indstat\n\n    ## add remaining attributes to null Samples\n    shared_keys = set(sample_keys).intersection(newkeys)\n    shared_keys.discard(\"stats\")\n    shared_keys.discard(\"files\")    \n    shared_keys.discard(\"stats_files\")\n    shared_keys.discard(\"stats_dfs\")\n\n    for sample in null.samples:\n        ## set the others\n        for key in shared_keys:\n            null.samples[sample].__setattr__(key, fullj[\"samples\"][sample][key])\n\n    ## ensure objects are object dicts\n    null.dirs = ObjDict(null.dirs)\n    null.stats_files = ObjDict(null.stats_files)\n    null.stats_dfs = ObjDict(null.stats_dfs)    \n    null.populations = ObjDict(null.populations)\n    null.outfiles = ObjDict(null.outfiles)\n\n    return null", "response": "Load a JSON-serialized Assembly object and reconcile it with the current ipyrad Assembly object."}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 that\nencodes an object as a JSON string while preserving tuples", "response": "def encode(self, obj):\n        \"\"\" function to encode json string\"\"\"\n        def hint_tuples(item):\n            \"\"\" embeds __tuple__ hinter in json strings \"\"\"\n            if isinstance(item, tuple):\n                return {'__tuple__': True, 'items': item}\n            if isinstance(item, list):\n                return [hint_tuples(e) for e in item]\n            if isinstance(item, dict):\n                return {\n                    key: hint_tuples(val) for key, val in item.items()\n                    }\n            else:\n                return item\n\n        return super(Encoder, self).encode(hint_tuples(obj))"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef baba_panel_plot(\n    ttree,\n    tests, \n    boots,\n    show_tip_labels=True, \n    show_test_labels=True, \n    use_edge_lengths=False, \n    collapse_outgroup=False, \n    pct_tree_x=0.4, \n    pct_tree_y=0.2,\n    alpha=3.0,\n    *args, \n    **kwargs):\n    \"\"\"\n    signature...\n    \"\"\"\n\n    ## create Panel plot object and set height & width\n    bootsarr = np.array(boots)\n    panel = Panel(ttree, tests, bootsarr, alpha)\n    if not kwargs.get(\"width\"):\n        panel.kwargs[\"width\"] = min(1000, 50*len(panel.tree))\n    if not kwargs.get(\"height\"):\n        panel.kwargs[\"height\"] = min(1000, 50*len(panel.tests))   \n\n    ## update defaults with kwargs & update size based on ntips & ntests\n    kwargs.update(dict(pct_tree_x=pct_tree_x, pct_tree_y=pct_tree_y))\n    panel.kwargs.update(kwargs)\n\n    ## create a canvas and a single cartesian coord system\n    canvas = toyplot.Canvas(height=panel.kwargs['height'], width=panel.kwargs['width'])\n    axes = canvas.cartesian(bounds=(\"10%\", \"90%\", \"5%\", \"95%\"))    \n    axes.show = False\n\n    ## add panels to axes\n    panel.panel_tree(axes)\n    panel.panel_test(axes)\n    panel.panel_tip_labels(axes)\n    if isinstance(boots, np.ndarray):\n        panel.panel_results(axes)\n    return canvas, axes, panel", "response": "Draw a panel plot of the tree, tests, and tip labels (plus results when boots is a NumPy array) on a toyplot canvas, and return the canvas, axes, and Panel object."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef depthplot(data, samples=None, dims=(None,None), canvas=(None,None), \n              xmax=50, log=False, outprefix=None, use_maxdepth=False):\n    \"\"\" plots histogram of coverages across clusters\"\"\"\n\n    ## select samples to be plotted, requires depths info\n    if not samples:\n        samples = data.samples.keys()\n        samples.sort()\n    subsamples = OrderedDict([(i, data.samples[i]) for i in samples])\n\n    ## get canvas dimensions based on n-samples\n    if any(dims):\n        ## user-supplied dimensions (...)\n        print(\"userdims\")\n    else:\n        if len(subsamples) <= 4:\n            ## set dimension to N samples \n            dims = (1, len(subsamples))\n        else:\n            dims = (len(subsamples)/4, 4)\n\n    ## create canvas\n    if any(canvas):\n        print(\"usercanvas\")\n        canvas = toyplot.Canvas(width=canvas[0], height=canvas[1])\n    else:\n        canvas = toyplot.Canvas(width=200*dims[1], height=150*dims[0])\n\n    ## get all of the data arrays\n    for panel, sample in enumerate(subsamples):\n        ## statistical called bins\n        statdat = subsamples[sample].depths\n        statdat = statdat[statdat >= data.paramsdict[\"mindepth_statistical\"]]\n        if use_maxdepth:\n            statdat = {i:j for (i, j) in statdat if \\\n                                        i < data.paramsdict[\"maxdepth\"]}\n        sdat = np.histogram(statdat, range(50))\n\n        ## majrule called bins\n        statdat = subsamples[sample].depths\n        statdat = statdat[statdat < data.paramsdict[\"mindepth_statistical\"]]\n        statdat = statdat[statdat >= data.paramsdict[\"mindepth_majrule\"]]\n        if use_maxdepth:\n            statdat = statdat[statdat < data.paramsdict[\"maxdepth\"]]\n        mdat = np.histogram(statdat, range(50))\n\n        ## excluded bins\n        tots = data.samples[sample].depths\n        tots = 
tots[tots < data.paramsdict[\"mindepth_majrule\"]]\n        if use_maxdepth:\n            tots = tots[tots < data.paramsdict[\"maxdepth\"]]\n        edat = np.histogram(tots, range(50))\n\n        ## fill in each panel of canvas with a sample\n        axes = canvas.cartesian(grid=(dims[0], dims[1], panel), gutter=25)\n        axes.x.domain.xmax = xmax\n        axes.label.text = sample\n        if log:\n            axes.y.scale = \"log\"\n\n        # heights = np.column_stack((sdat,mdat,edat))\n        axes.bars(sdat)\n        axes.bars(edat)\n        axes.bars(mdat)\n\n\n    ## return objects to be saved...\n    if outprefix:\n        toyplot.html.render(canvas, fobj=outprefix+\".html\")\n        toyplot.svg.render(canvas, fobj=outprefix+\".svg\")", "response": "plot histogram of coverages across clusters"}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nparsing the 00 outfile as a pandas DataFrame", "response": "def _parse_00(ofile):\n    \"\"\"\n    return 00 outfile as a pandas DataFrame\n    \"\"\"\n    with open(ofile) as infile:\n        ## read in the results summary from the end of the outfile\n        arr = np.array(\n            [\" \"] + infile.read().split(\"Summary of MCMC results\\n\\n\\n\")[1:][0]\\\n            .strip().split())\n\n        ## reshape array (integer division so reshape receives an int)\n        rows = 12\n        cols = (arr.shape[0] + 1) // rows\n        arr = arr.reshape(rows, cols)\n        \n        ## make into labeled data frame\n        df = pd.DataFrame(\n            data=arr[1:, 1:], \n            columns=arr[0, 1:], \n            index=arr[1:, 0],\n            ).T\n        return df"}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\nloads existing results files for an object with this workdir and name.", "response": "def _load_existing_results(self, name, workdir):\n        \"\"\"\n        Load existing results files for an object with this workdir and name. \n        This does NOT reload the parameter settings for the object...\n        \"\"\"\n        ## get mcmcs\n        path = os.path.realpath(os.path.join(self.workdir, self.name))\n        mcmcs = glob.glob(path+\"_r*.mcmc.txt\")\n        outs = glob.glob(path+\"_r*.out.txt\")\n        trees = glob.glob(path+\"_r*.tre\")\n\n        for mcmcfile in mcmcs:\n            if mcmcfile not in self.files.mcmcfiles:\n                self.files.mcmcfiles.append(mcmcfile)\n        for outfile in outs:\n            if outfile not in self.files.outfiles:\n                self.files.outfiles.append(outfile)\n        for tree in trees:\n            if tree not in self.files.treefiles:\n                self.files.treefiles.append(tree)"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nsubmitting bpp jobs to run on a cluster (ipyparallel Client). The seed for the random number generator if not set is randomly drawn, and if multiple reps are submitted (nreps>1) then each will draw a subsequent random seeds after that. An ipyclient connection is required. Asynchronous result objects are stored in the bpp object submitting the jobs. Parameters: ----------- nreps (int): submits nreps replicate jobs to the cluster each with a different random seed drawn starting from the starting seed. ipyclient (ipyparallel.Client) an ipyparallel.Client object connected to a running cluster. quiet (bool): whether to print that the jobs have been submitted randomize_order (bool): if True then when maxloci is set this will randomly sample a different set of N loci in each replicate, rather than sampling just the first N loci < maxloci. force (bool): Overwrite existing files with the same name. Default=False, skip over existing files.", "response": "def run(self,\n        ipyclient, \n        nreps=1, \n        quiet=False,\n        randomize_order=False,\n        force=False,\n        ):\n        \"\"\"\n        Submits bpp jobs to run on a cluster (ipyparallel Client). \n        The seed for the random number generator if not set is randomly \n        drawn, and if multiple reps are submitted (nreps>1) then each will \n        draw a subsequent random seeds after that. An ipyclient connection \n        is required. Asynchronous result objects are stored in the bpp \n        object submitting the jobs. \n        \n        Parameters:\n        -----------\n        nreps (int):\n            submits nreps replicate jobs to the cluster each with a different \n            random seed drawn starting from the starting seed. \n        ipyclient (ipyparallel.Client)\n            an ipyparallel.Client object connected to a running cluster. 
\n        quiet (bool):\n            whether to print that the jobs have been submitted\n        randomize_order (bool):\n            if True then when maxloci is set this will randomly sample a \n            different set of N loci in each replicate, rather than sampling\n            just the first N loci < maxloci. \n        force (bool):\n            Overwrite existing files with the same name. Default=False, skip\n            over existing files.\n        \"\"\"\n\n        ## is this running algorithm 00?\n        is_alg00 = (not self.params.infer_sptree) and (not self.params.infer_delimit)\n\n        ## clear out pre-existing files for this object\n        self.files.mcmcfiles = []\n        self.files.outfiles = []\n        self.files.treefiles = []\n        self.asyncs = []\n\n        ## initiate random seed\n        np.random.seed(self.params.seed)\n\n        ## load-balancer\n        lbview = ipyclient.load_balanced_view()\n\n        ## send jobs\n        for job in xrange(nreps):\n\n            ## make repname and make ctl filename\n            self._name = \"{}_r{}\".format(self.name, job)\n            ctlhandle = os.path.realpath(\n                os.path.join(self.workdir, \"{}.ctl.txt\".format(self._name)))\n\n            ## skip if ctlfile exists\n            if (not force) and (os.path.exists(ctlhandle)):\n                print(\"Named ctl file already exists. 
Use force=True to\" \\\n                    +\" overwrite\\nFilename:{}\".format(ctlhandle))\n            else:\n                ## change seed and ctl for each rep, this writes into the ctl\n                ## file the correct name for the other files which share the \n                ## same rep number in their names.\n                #self.params._seed = np.random.randint(0, 1e9, 1)[0]\n                self._write_mapfile()\n                #if randomize_order:\n                self._write_seqfile(randomize_order=randomize_order)\n                ctlfile = self._write_ctlfile()\n                \n                ## submit to engines\n                async = lbview.apply(_call_bpp, *(self._kwargs[\"binary\"], ctlfile, is_alg00))\n                self.asyncs.append(async)\n\n                ## save tree file if alg 00\n                if is_alg00:\n                    self.files.treefiles.append(\n                        ctlfile.rsplit(\".ctl.txt\", 1)[0] + \".tre\")\n\n        if self.asyncs and (not quiet):\n            sys.stderr.write(\"submitted {} bpp jobs [{}] ({} loci)\\n\"\\\n                             .format(nreps, self.name, self._nloci))"}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nwrite the bpp files for the current job to the working directory.", "response": "def write_bpp_files(self, randomize_order=False, quiet=False):\n        \"\"\" \n        Writes bpp files (.ctl, .seq, .imap) to the working directory. \n\n        Parameters:\n        ------------\n        randomize_order (bool):\n            whether to randomize the locus order, this will allow you to \n            sample different subsets of loci in different replicates when\n            using the filters.maxloci option.\n        quiet (bool):\n            whether to print info to stderr when finished.\n        \"\"\"\n\n        ## remove any old jobs with this same job name\n        self._name = self.name\n        oldjobs = glob.glob(os.path.join(self.workdir, self._name+\"*.ctl.txt\"))\n        for job in oldjobs:\n            os.remove(job)\n\n        ## check params types\n        ## ...\n\n        ## write tmp files for the job\n        self._write_seqfile(randomize_order=randomize_order)\n        self._write_mapfile()#name=True)\n        self._write_ctlfile()\n\n        if not quiet:\n            sys.stderr.write(\"input files created for job {} ({} loci)\\n\"\\\n                             .format(self._name, self._nloci))"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef _write_ctlfile(self):#, rep=None):\n\n        ## A string to store ctl info\n        ctl = []\n\n        ## write the top header info\n        ctl.append(\"seed = {}\".format(self.params.seed))\n        ctl.append(\"seqfile = {}\".format(self.seqfile))\n        ctl.append(\"Imapfile = {}\".format(self.mapfile))\n\n        path = os.path.realpath(os.path.join(self.workdir, self._name))\n        mcmcfile = \"{}.mcmc.txt\".format(path)\n        outfile = \"{}.out.txt\".format(path)\n        if mcmcfile not in self.files.mcmcfiles:\n            self.files.mcmcfiles.append(mcmcfile)\n        if outfile not in self.files.outfiles:\n            self.files.outfiles.append(outfile)\n\n        ctl.append(\"mcmcfile = {}\".format(mcmcfile))\n        ctl.append(\"outfile = {}\".format(outfile))\n\n        ## number of loci (checks that seq file exists and parses from there)\n        ctl.append(\"nloci = {}\".format(self._nloci))\n        ctl.append(\"usedata = {}\".format(self.params.usedata))\n        ctl.append(\"cleandata = {}\".format(self.params.cleandata))\n\n        ## infer species tree\n        if self.params.infer_sptree:\n            ctl.append(\"speciestree = 1 0.4 0.2 0.1\")\n        else:\n            ctl.append(\"speciestree = 0\")\n\n        ## infer delimitation (with algorithm 1 by default)\n        ctl.append(\"speciesdelimitation = {} {} {}\"\\\n                   .format(self.params.infer_delimit, \n                           self.params.delimit_alg[0],\n                           \" \".join([str(i) for i in self.params.delimit_alg[1:]]) \n                           )\n                   )\n        ## get tree values\n        nspecies = str(len(self.imap))\n        species = \" \".join(sorted(self.imap))\n        ninds = \" \".join([str(len(self.imap[i])) for i in sorted(self.imap)])\n        ctl.append(SPECIESTREE.format(nspecies, species, 
ninds, self.tree.write(format=9)))\n\n        ## priors\n        ctl.append(\"thetaprior = {} {}\".format(*self.params.thetaprior))\n        ctl.append(\"tauprior = {} {} {}\".format(*self.params.tauprior))\n\n        ## other values, fixed for now\n        ctl.append(\"finetune = 1: {}\".format(\" \".join([str(i) for i in self.params.finetune])))\n        #CTL.append(\"finetune = 1: 1 0.002 0.01 0.01 0.02 0.005 1.0\")\n        ctl.append(\"print = 1 0 0 0\")\n        ctl.append(\"burnin = {}\".format(self.params.burnin))\n        ctl.append(\"sampfreq = {}\".format(self.params.sampfreq))\n        ctl.append(\"nsample = {}\".format(self.params.nsample))\n\n        ## write out the ctl file\n        ctlhandle = os.path.realpath(\n                \"{}.ctl.txt\".format(os.path.join(self.workdir, self._name)))\n        # if isinstance(rep, int):\n        #     ctlhandle = os.path.realpath(\n        #         \"{}-r{}.ctl.txt\".format(os.path.join(self.workdir, self._name), rep))\n        # else:\n        #     ctlhandle = os.path.realpath(\n        #         \"{}.ctl.txt\".format(os.path.join(self.workdir, self._name)))\n        with open(ctlhandle, 'w') as out:\n            out.write(\"\\n\".join(ctl))\n\n        return ctlhandle", "response": "Write the ctl file."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef copy(self, name, load_existing_results=False):\n\n        ## make deepcopy of self.__dict__ but do not copy async objects\n        subdict = {i:j for i,j in self.__dict__.items() if i != \"asyncs\"}\n        newdict = copy.deepcopy(subdict)\n\n        ## make back into a bpp object\n        if name == self.name:\n            raise Exception(\"new object must have a different 'name' than its parent\")\n        newobj = Bpp(\n            name=name,\n            data=newdict[\"files\"].data,\n            workdir=newdict[\"workdir\"],\n            guidetree=newdict[\"tree\"].write(),\n            imap={i:j for i, j in newdict[\"imap\"].items()},\n            copied=True,\n            load_existing_results=load_existing_results,\n            )\n\n        ## update special dict attributes but not files\n        for key, val in newobj.params.__dict__.items():\n            newobj.params.__setattr__(key, self.params.__getattribute__(key))\n        for key, val in newobj.filters.__dict__.items():\n            newobj.filters.__setattr__(key, self.filters.__getattribute__(key))\n\n        ## new object must have a different name than its parent\n        return newobj", "response": "Returns a copy of the object with the same parameter settings\n        but with the files.mcmcfiles and files.outfiles attributes cleared, and with a new name attribute."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nprinting a summarized table of results from replicate runs, or returning a list of separate dataframes for each replicate run.", "response": "def summarize_results(self, individual_results=False):\n        \"\"\" \n        Prints a summarized table of results from replicate runs, or,\n        if individual_results=True, then returns a list of separate\n        dataframes for each replicate run. \n        \"\"\"\n\n        ## return results depending on algorithm\n\n        ## algorithm 00\n        if (not self.params.infer_delimit) & (not self.params.infer_sptree):\n            if individual_results:\n                ## return a list of parsed CSV results\n                return [_parse_00(i) for i in self.files.outfiles]\n            else:\n                ## concatenate each CSV and then get stats w/ describe\n                return pd.concat(\n                    [pd.read_csv(i, sep='\\t', index_col=0) \\\n                    for i in self.files.mcmcfiles]).describe().T\n\n        ## algorithm 01\n        if self.params.infer_delimit & (not self.params.infer_sptree):\n            return _parse_01(self.files.outfiles, individual=individual_results)\n\n        ## others\n        else:\n            return \"summary function not yet ready for this type of result\""}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef persistent_popen_align3(data, samples, chunk):\n\n    ## data are already chunked, read in the whole thing\n    with open(chunk, 'rb') as infile:\n        clusts = infile.read().split(\"//\\n//\\n\")[:-1]    \n\n    ## snames to ensure sorted order\n    samples.sort(key=lambda x: x.name)\n    snames = [sample.name for sample in samples]\n\n    ## make a tmparr to store metadata (this can get huge, consider using h5)\n    maxlen = data._hackersonly[\"max_fragment_length\"] + 20\n    indels = np.zeros((len(samples), len(clusts), maxlen), dtype=np.bool_)\n    duples = np.zeros(len(clusts), dtype=np.bool_)\n\n    ## create a persistent shell for running muscle in. \n    proc = sps.Popen([\"bash\"], \n                    stdin=sps.PIPE, \n                    stdout=sps.PIPE, \n                    universal_newlines=True)\n\n    ## iterate over clusters until finished\n    allstack = []\n    #istack = []    \n    for ldx in xrange(len(clusts)):\n        ## new alignment string for read1s and read2s\n        aligned = []\n        istack = []\n        lines = clusts[ldx].strip().split(\"\\n\")\n        names = lines[::2]\n        seqs = lines[1::2]        \n        align1 = \"\"\n        align2 = \"\"\n\n        ## we don't allow seeds with no hits to make it here, currently\n        #if len(names) == 1:\n        #    aligned.append(clusts[ldx].replace(\">\", \"\").strip())\n\n        ## find duplicates and skip aligning but keep it for downstream.\n        if len(names) != len(set([x.rsplit(\"_\", 1)[0] for x in names])):\n            duples[ldx] = 1\n            istack = [\"{}\\n{}\".format(i[1:], j) for i, j in zip(names, seqs)]\n            #aligned.append(clusts[ldx].replace(\">\", \"\").strip())\n        \n        else:\n            ## append counter to names because muscle doesn't retain order\n            names = [\">{};*{}\".format(j[1:], i) for i, j in 
enumerate(names)]\n\n            try:\n                ## try to split names on nnnn splitter\n                clust1, clust2 = zip(*[i.split(\"nnnn\") for i in seqs])\n\n                ## make back into strings\n                cl1 = \"\\n\".join(itertools.chain(*zip(names, clust1)))\n                cl2 = \"\\n\".join(itertools.chain(*zip(names, clust2)))\n\n                ## store allele (lowercase) info\n                shape = (len(seqs), max([len(i) for i in seqs]))\n                arrseqs = np.zeros(shape, dtype=\"S1\")\n                for row in range(arrseqs.shape[0]):\n                    seqsrow = seqs[row]\n                    arrseqs[row, :len(seqsrow)] = list(seqsrow)\n                amask = np.char.islower(arrseqs)\n                save_alleles = np.any(amask)\n\n                ## send align1 to the bash shell\n                ## TODO: check for pipe-overflow here and use files for i/o                \n                cmd1 = \"echo -e '{}' | {} -quiet -in - ; echo {}\"\\\n                       .format(cl1, ipyrad.bins.muscle, \"//\")\n                print(cmd1, file=proc.stdin)\n\n                ## read the stdout by line until splitter is reached\n                for line in iter(proc.stdout.readline, \"//\\n\"):\n                    align1 += line\n\n                ## send align2 to the bash shell\n                ## TODO: check for pipe-overflow here and use files for i/o                \n                cmd2 = \"echo -e '{}' | {} -quiet -in - ; echo {}\"\\\n                       .format(cl2, ipyrad.bins.muscle, \"//\")\n                print(cmd2, file=proc.stdin)\n\n                ## read the stdout by line until splitter is reached\n                for line in iter(proc.stdout.readline, \"//\\n\"):\n                    align2 += line\n\n                ## join the aligned read1 and read2 and ensure name order match\n                la1 = align1[1:].split(\"\\n>\")\n                la2 = align2[1:].split(\"\\n>\")\n                
dalign1 = dict([i.split(\"\\n\", 1) for i in la1])\n                dalign2 = dict([i.split(\"\\n\", 1) for i in la2])\n                keys = sorted(dalign1.keys(), key=DEREP)\n                keys2 = sorted(dalign2.keys(), key=DEREP)\n\n                ## Make sure R1 and R2 actually exist for each sample. If not\n                ## bail out of this cluster.\n                if not len(keys) == len(keys2):\n                    LOGGER.error(\"R1 and R2 results differ in length: \"\\\n                                    + \"\\nR1 - {}\\nR2 - {}\".format(keys, keys2))\n                    continue\n\n                ## impute allele (lowercase) info back into alignments\n                for kidx, key in enumerate(keys):\n                    concatseq = dalign1[key].replace(\"\\n\", \"\")+\\\n                                \"nnnn\"+dalign2[key].replace(\"\\n\", \"\")\n\n                    ## impute alleles\n                    if save_alleles:\n                        newmask = np.zeros(len(concatseq), dtype=np.bool_)                        \n                        ## check for indels and impute to amask\n                        indidx = np.where(np.array(list(concatseq)) == \"-\")[0]\n                        if indidx.size:\n                            allrows = np.arange(amask.shape[1])\n                            mask = np.ones(allrows.shape[0], dtype=np.bool_)\n                            for idx in indidx:\n                                if idx < mask.shape[0]:\n                                    mask[idx] = False\n                            not_idx = allrows[mask == 1]\n                            ## fill in new data into all other spots\n                            newmask[not_idx] = amask[kidx, :not_idx.shape[0]]\n                        else:\n                            newmask = amask[kidx]\n                        \n                        ## lower the alleles\n                        concatarr = np.array(list(concatseq))\n                        
concatarr[newmask] = np.char.lower(concatarr[newmask])\n                        concatseq = concatarr.tostring()\n                        #LOGGER.info(concatseq)\n                        \n                    ## fill list with aligned data\n                    aligned.append(\"{}\\n{}\".format(key, concatseq))\n\n                ## put into a dict for writing to file\n                #aligned = []\n                #for key in keys:\n                #    aligned.append(\"\\n\".join(\n                #        [key, \n                #         dalign1[key].replace(\"\\n\", \"\")+\"nnnn\"+\\\n                #         dalign2[key].replace(\"\\n\", \"\")]))\n            except IndexError as inst:\n                LOGGER.debug(\"Error in PE - ldx: {}\".format())\n                LOGGER.debug(\"Vars: {}\".format(dict(globals(), **locals())))\n                raise\n\n            except ValueError:\n                ## make back into strings\n                cl1 = \"\\n\".join([\"\\n\".join(i) for i in zip(names, seqs)])                \n\n                ## store allele (lowercase) info\n                shape = (len(seqs), max([len(i) for i in seqs]))\n                arrseqs = np.zeros(shape, dtype=\"S1\")\n                for row in range(arrseqs.shape[0]):\n                    seqsrow = seqs[row]\n                    arrseqs[row, :len(seqsrow)] = list(seqsrow)\n                amask = np.char.islower(arrseqs)\n                save_alleles = np.any(amask)\n\n                ## send align1 to the bash shell (TODO: check for pipe-overflow)\n                cmd1 = \"echo -e '{}' | {} -quiet -in - ; echo {}\"\\\n                       .format(cl1, ipyrad.bins.muscle, \"//\")\n                print(cmd1, file=proc.stdin)\n\n                ## read the stdout by line until splitter is reached\n                for line in iter(proc.stdout.readline, \"//\\n\"):\n                    align1 += line\n\n                ## ensure name order match\n                la1 = 
align1[1:].split(\"\\n>\")\n                dalign1 = dict([i.split(\"\\n\", 1) for i in la1])\n                keys = sorted(dalign1.keys(), key=DEREP)\n\n                ## put into dict for writing to file\n                for kidx, key in enumerate(keys):\n                    concatseq = dalign1[key].replace(\"\\n\", \"\")\n                    ## impute alleles\n                    if save_alleles:\n                        newmask = np.zeros(len(concatseq), dtype=np.bool_)                        \n                        ## check for indels and impute to amask\n                        indidx = np.where(np.array(list(concatseq)) == \"-\")[0]\n                        if indidx.size:\n                            allrows = np.arange(amask.shape[1])\n                            mask = np.ones(allrows.shape[0], dtype=np.bool_)\n                            for idx in indidx:\n                                if idx < mask.shape[0]:\n                                    mask[idx] = False\n                            not_idx = allrows[mask == 1]\n                            ## fill in new data into all other spots\n                            newmask[not_idx] = amask[kidx, :not_idx.shape[0]]\n                        else:\n                            newmask = amask[kidx]\n                        \n                        ## lower the alleles\n                        concatarr = np.array(list(concatseq))\n                        concatarr[newmask] = np.char.lower(concatarr[newmask])\n                        concatseq = concatarr.tostring()\n\n                    ## fill list with aligned data\n                    aligned.append(\"{}\\n{}\".format(key, concatseq))\n                ## put aligned locus in list\n                #aligned.append(\"\\n\".join(inner_aligned))\n\n            ## enforce maxlen on aligned seqs\n            aseqs = np.vstack([list(i.split(\"\\n\")[1]) for i in aligned])\n            LOGGER.info(\"\\naseqs here: %s\", aseqs)\n\n            ## index 
names by snames order\n            sidxs = [snames.index(key.rsplit(\"_\", 1)[0]) for key in keys]\n            thislen = min(maxlen, aseqs.shape[1])\n            for idx in xrange(aseqs.shape[0]):\n                ## enter into stack\n                newn = aligned[idx].split(\";\", 1)[0]\n                #newn = key[idx].split(\";\", 1)[0]\n                istack.append(\"{}\\n{}\".format(newn, aseqs[idx, :thislen].tostring()))\n                ## name index in sorted list (indels order)\n                sidx = sidxs[idx]\n                indels[sidx, ldx, :thislen] = aseqs[idx, :thislen] == \"-\"\n\n        if istack:\n            allstack.append(\"\\n\".join(istack))\n            #LOGGER.debug(\"\\n\\nSTACK (%s)\\n%s\\n\", duples[ldx], \"\\n\".join(istack))\n\n    ## cleanup\n    proc.stdout.close()\n    if proc.stderr:\n        proc.stderr.close()\n    proc.stdin.close()\n    proc.wait()\n\n    #LOGGER.info(\"\\n\\nALLSTACK %s\\n\", \"\\n\".join(i) for i in allstack[:5]])\n\n    ## write to file after\n    odx = chunk.rsplit(\"_\")[-1]\n    alignfile = os.path.join(data.tmpdir, \"align_{}.fa\".format(odx))\n    with open(alignfile, 'wb') as outfile:\n        outfile.write(\"\\n//\\n//\\n\".join(allstack)+\"\\n\")\n        os.remove(chunk)\n\n    ## save indels array to tmp dir\n    ifile = os.path.join(data.tmpdir, \"indels_{}.tmp.npy\".format(odx))\n    np.save(ifile, indels)\n    dfile = os.path.join(data.tmpdir, \"duples_{}.tmp.npy\".format(odx))\n    np.save(dfile, duples)", "response": "aligns the samples in chunk with the next cluster"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef multi_muscle_align(data, samples, ipyclient):\n    LOGGER.info(\"starting alignments\")\n\n    ## get client\n    lbview = ipyclient.load_balanced_view()\n    start = time.time()\n    printstr = \" aligning clusters     | {} | s6 |\"\n    elapsed = datetime.timedelta(seconds=int(time.time()-start))\n    progressbar(20, 0, printstr.format(elapsed), spacer=data._spacer)\n\n    ## submit clustbits as jobs to engines. The chunkfiles are removed when they\n    ## are finished so this job can even be restarted if it was half finished, \n    ## though that is probably rare. \n    path = os.path.join(data.tmpdir, data.name + \".chunk_*\")\n    clustbits = glob.glob(path)\n    jobs = {}\n    for idx in xrange(len(clustbits)):\n        args = [data, samples, clustbits[idx]]\n        jobs[idx] = lbview.apply(persistent_popen_align3, *args)\n    allwait = len(jobs)\n    elapsed = datetime.timedelta(seconds=int(time.time()-start))\n    progressbar(20, 0, printstr.format(elapsed), spacer=data._spacer)\n\n    ## print progress while bits are aligning\n    while 1:\n        finished = [i.ready() for i in jobs.values()]\n        fwait = sum(finished)\n        elapsed = datetime.timedelta(seconds=int(time.time()-start))\n        progressbar(allwait, fwait, printstr.format(elapsed), spacer=data._spacer)\n        time.sleep(0.1)\n        if all(finished):\n            break\n\n    ## check for errors in muscle_align_across\n    keys = jobs.keys()\n    for idx in keys:\n        if not jobs[idx].successful():\n            LOGGER.error(\"error in persistent_popen_align %s\", jobs[idx].exception())\n            raise IPyradWarningExit(\"error in step 6 {}\".format(jobs[idx].exception()))\n        del jobs[idx]\n    print(\"\")", "response": "This function is used to align the cluster bits with muscle."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef concatclusts(outhandle, alignbits):\n    with gzip.open(outhandle, 'wb') as out:\n        for fname in alignbits:\n            with open(fname) as infile:\n                out.write(infile.read()+\"//\\n//\\n\")", "response": "Concatenates the sorted aligned cluster tmpfiles into a single gzipped output file, appending a \"//\\n//\\n\" separator after each one."}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nbuild the indels array and catclust. gz file from the aligned clusters.", "response": "def build_indels(data, samples, ipyclient):\n    \"\"\"\n    Builds the indels array and catclust.gz file from the aligned clusters.\n    Building catclust is very fast. Entering indels into h5 array is a bit\n    slow but can probably be sped up. (todo). NOT currently parallelized.\n    \"\"\"\n\n    ## progress bars\n    lbview = ipyclient.load_balanced_view()\n    start = time.time()\n    printstr = \" database indels       | {} | s6 |\"\n    njobs = len(glob.glob(os.path.join(data.tmpdir, \"align_*.fa\"))) + 1\n\n    ## build tmparrs\n    async = lbview.apply(build_tmp_h5, *(data, samples))\n\n    ## track progress\n    while 1:\n        elapsed = datetime.timedelta(seconds=int(time.time()-start))\n        ready = bool(async.ready())\n        progressbar(njobs, ready, printstr.format(elapsed), spacer=data._spacer)\n        if ready:\n            break\n        else:\n            time.sleep(0.1)\n\n    ## check for errors\n    if not async.successful():\n        raise IPyradWarningExit(async.result())\n\n    ## start subfunc\n    async = lbview.apply(sub_build_indels, *(data, samples))\n    \n    prog = 1\n    while 1:\n        elapsed = datetime.timedelta(seconds=int(time.time()-start))\n        if async.stdout:\n            prog = int(async.stdout.split()[-1])+1\n        progressbar(njobs, prog, printstr.format(elapsed), spacer=data._spacer)\n        if async.ready():\n            break\n        else:\n            time.sleep(0.1)\n\n    ## check for errors\n    if not async.successful():\n        raise IPyradWarningExit(async.result())\n    print(\"\")\n\n    ## prepare for next substep by removing the singlecat result files if \n    ## they exist. 
\n    snames = [i.name for i in samples]\n    snames.sort()\n    smpios = [os.path.join(data.dirs.across, i+'.tmp.h5') for i in snames]\n    for smpio in smpios:\n        if os.path.exists(smpio):\n            os.remove(smpio)"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef sub_build_indels(data, samples):\n\n    ## get file handles\n    indelfiles = glob.glob(os.path.join(data.tmpdir, \"indels_*.tmp.npy\"))\n    alignbits = glob.glob(os.path.join(data.tmpdir, \"align_*.fa\"))\n\n    ## sort into input order by chunk names\n    indelfiles.sort(key=lambda x: int(x.rsplit(\"_\", 1)[-1][:-8]))\n    alignbits.sort(key=lambda x: int(x.rsplit(\"_\", 1)[-1][:-3]))\n    LOGGER.info(\"indelfiles %s\", indelfiles)\n    LOGGER.info(\"alignbits %s\", alignbits)\n    chunksize = int(indelfiles[0].rsplit(\"_\", 1)[-1][:-8])\n\n    ## concatenate finished seq clusters into a tmp file\n    outhandle = os.path.join(data.dirs.across, data.name+\"_catclust.gz\")\n    concatclusts(outhandle, alignbits)\n\n    ## get dims for full indel array\n    maxlen = data._hackersonly[\"max_fragment_length\"] + 20\n    nloci = get_nloci(data)\n    LOGGER.info(\"maxlen inside build is %s\", maxlen)\n    LOGGER.info(\"nloci for indels %s\", nloci)\n\n    ## INIT TEMP INDEL ARRAY\n    ## build an indel array for ALL loci in cat.clust.gz,\n    ## chunked so that individual samples can be pulled out\n    ipath = os.path.join(data.dirs.across, data.name+\".tmp.indels.hdf5\")\n    with h5py.File(ipath, 'w') as io5:\n        iset = io5.create_dataset(\n            \"indels\",\n            shape=(len(samples), nloci, maxlen),\n            dtype=np.bool_,\n            chunks=(1, chunksize, maxlen))\n\n        ## again make sure names are ordered right\n        samples.sort(key=lambda x: x.name)\n\n        #iset.attrs[\"chunksize\"] = (1, data.nloci, maxlen)\n        iset.attrs[\"samples\"] = [i.name for i in samples]\n\n        ## enter all tmpindel arrays into full indel array\n        done = 0\n        init = 0\n        for indf in indelfiles:\n            end = int(indf.rsplit(\"_\", 1)[-1][:-8])\n            inarr = np.load(indf)\n            
LOGGER.info('inarr shape %s', inarr.shape)\n            LOGGER.info('iset shape %s', iset.shape)\n            iset[:, init:end, :] = inarr[:, :end-init]\n            init += end-init\n            done += 1\n            print(done)", "response": "sub func in build_indels"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\ndistribute 'cluster()' function to an ipyclient to make sure it runs on a high memory node.", "response": "def call_cluster(data, noreverse, ipyclient):\n    \"\"\"\n    distributes 'cluster()' function to an ipyclient to make sure it runs\n    on a high memory node. \n    \"\"\"\n    ## Find host with the most engines, for now just using first.\n    lbview = ipyclient.load_balanced_view()\n\n    ## request engine data, skips busy engines.    \n    asyncs = {}\n    for eid in ipyclient.ids:\n        engine = ipyclient[eid]\n        if not engine.outstanding:\n            asyncs[eid] = engine.apply(socket.gethostname)\n    ## get results\n    hosts = {}\n    for key in asyncs:\n        hosts[key] = asyncs[key].get()\n    ## count them\n    results = {}\n    for eid, hostname in hosts.items():\n        if hostname in results:\n            results[hostname].append(eid)\n        else:\n            results[hostname] = [eid] \n\n    ## which is largest\n    hosts = sorted(results.items(), key=lambda x: len(x[1]), reverse=True)\n    _, eids = hosts[0]\n    bighost = ipyclient[eids[0]]\n\n    ## nthreads is len eids, or ipcluster.threads, unless ipcluster.threads \n    ## is really small, then we assume threads should not apply here.\n    ##    ipyrad -p params.txt -s 6 -c 20 would give:\n    ##    min(20, max(2, 10)) = 8\n    ## while \n    ##    ipyrad -p params.txt -s 6 -c 20 -t 4 would give:\n    ##    min(20, max(4, 10)) = 10\n    ## and \n    ##    ipyrad -p params.txt -s 6 -c 20 -t 15 would give:\n    ##    min(20, max(15, 10)) = 15\n    ## and\n    ##    ipyrad -p params.txt -s 6 -c 16 --MPI (on 2 X 8-core nodes) would give:\n    ##    min(8, max(2, 10)) = 8\n    nthreads = min(len(eids), max(data._ipcluster[\"threads\"], 10))\n\n    ## submit job to the host with the most\n    async = bighost.apply(cluster, *(data, noreverse, nthreads))\n    #async = lbview.apply(cluster, *(data, 
noreverse, nthreads))\n    \n    ## track progress\n    prog = 0\n    start = time.time()\n    printstr = \" clustering across     | {} | s6 |\"\n    \n    while 1:\n        if async.stdout:\n            prog = int(async.stdout.split()[-1])\n        elapsed = datetime.timedelta(seconds=int(time.time() - start))\n        progressbar(100, prog, printstr.format(elapsed), spacer=data._spacer)\n        if async.ready():\n            progressbar(100, prog, printstr.format(elapsed), spacer=data._spacer)\n            print(\"\")\n            break\n        else:\n            time.sleep(0.5)\n\n    ## store log result\n    ipyclient.wait()\n    data.stats_files.s6 = os.path.join(data.dirs.across, \"s6_cluster_stats.txt\")"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef cluster(data, noreverse, nthreads):\n\n    ## input and output file handles\n    cathaplos = os.path.join(data.dirs.across, data.name+\"_catshuf.tmp\")\n    uhaplos = os.path.join(data.dirs.across, data.name+\".utemp\")\n    hhaplos = os.path.join(data.dirs.across, data.name+\".htemp\")\n    logfile = os.path.join(data.dirs.across, \"s6_cluster_stats.txt\")\n\n    ## parameters that vary by datatype\n    ## (too low of cov values yield too many poor alignments)\n    strand = \"plus\"\n    cov = 0.75    ##0.90\n    if data.paramsdict[\"datatype\"] in [\"gbs\", \"2brad\"]:\n        strand = \"both\"\n        cov = 0.60\n    elif data.paramsdict[\"datatype\"] == \"pairgbs\":\n        strand = \"both\"\n        cov = 0.75   ##0.90\n\n    ## nthreads is calculated in 'call_cluster()'\n    cmd = [ipyrad.bins.vsearch,\n           \"-cluster_smallmem\", cathaplos,\n           \"-strand\", strand,\n           \"-query_cov\", str(cov),\n           \"-minsl\", str(0.5),\n           \"-id\", str(data.paramsdict[\"clust_threshold\"]),\n           \"-userout\", uhaplos,\n           \"-notmatched\", hhaplos,\n           \"-userfields\", \"query+target+qstrand\",\n           \"-maxaccepts\", \"1\",\n           \"-maxrejects\", \"0\",\n           \"-fasta_width\", \"0\",\n           \"-threads\", str(nthreads), #\"0\",\n           \"-fulldp\",\n           \"-usersort\",\n           \"-log\", logfile]\n\n    ## override reverse clustering option\n    if noreverse:\n        strand = \"plus\"  # -leftjust \"\n\n    try:\n        ## this seems to start vsearch on a different pid than the engine\n        ## and so it's hard to kill... 
\n        LOGGER.info(cmd)\n        (dog, owner) = pty.openpty()\n        proc = sps.Popen(cmd, stdout=owner, stderr=owner, close_fds=True)\n                                     \n        prog = 0\n        newprog = 0\n        while 1:\n            isdat = select.select([dog], [], [], 0)\n            if isdat[0]:\n                dat = os.read(dog, 80192)\n            else:\n                dat = \"\"\n            if \"Clustering\" in dat:\n                try:\n                    newprog = int(dat.split()[-1][:-1])\n                ## may raise value error when it gets to the end\n                except ValueError:\n                    pass\n\n            ## break if done\n            ## catches end chunk of printing if clustering went really fast\n            elif \"Clusters:\" in dat:\n                LOGGER.info(\"ended vsearch tracking loop\")\n                break\n            else:\n                time.sleep(0.1)\n            ## print progress\n            if newprog != prog:\n                print(newprog)\n                prog = newprog\n\n        ## another catcher to let vsearch cleanup after clustering is done\n        proc.wait()\n        print(100)\n\n\n    except KeyboardInterrupt:\n        LOGGER.info(\"interrupted vsearch here: %s\", proc.pid)\n        os.kill(proc.pid, 2)\n        raise KeyboardInterrupt()\n    except sps.CalledProcessError as inst:\n        raise IPyradWarningExit(\"\"\"\n        Error in vsearch: \\n{}\\n{}\"\"\".format(inst, sps.STDOUT))\n    except OSError as inst:\n        raise IPyradWarningExit(\"\"\"\n        Failed to allocate pty: \\n{}\"\"\".format(inst))\n\n    finally:\n        data.stats_files.s6 = logfile", "response": "Calls vsearch for clustering across samples."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef build_h5_array(data, samples, nloci):\n\n    ## sort to ensure samples will be in alphabetical order, tho they should be.\n    samples.sort(key=lambda x: x.name)\n\n    ## get maxlen dim\n    maxlen = data._hackersonly[\"max_fragment_length\"] + 20\n    LOGGER.info(\"maxlen inside build_h5_array is %s\", maxlen)\n    LOGGER.info(\"nloci inside build_h5_array is %s\", nloci)\n\n    ## open new h5 handle\n    data.clust_database = os.path.join(data.dirs.across, data.name+\".clust.hdf5\")\n    io5 = h5py.File(data.clust_database, 'w')\n\n    ## chunk to approximately 2 chunks per core\n    chunks = ((nloci // (data.cpus*2)) + (nloci % (data.cpus*2)))\n\n    ## Number of elements in hdf5 chunk may not exceed 500MB\n    ## This is probably not actually optimal, to have such\n    ## enormous chunk sizes, could probably explore efficiency\n    ## of smaller chunk sizes on very very large datasets\n    chunklen = chunks * len(samples) * maxlen * 4\n    while chunklen > int(500e6):\n        chunks = (chunks // 2) + (chunks % 2)\n        chunklen = chunks * len(samples) * maxlen * 4\n    LOGGER.info(\"chunks in build_h5_array: %s\", chunks)\n\n    data.chunks = chunks\n    LOGGER.info(\"nloci is %s\", nloci)\n    LOGGER.info(\"chunks is %s\", data.chunks)\n\n    ## INIT FULL CATG ARRAY\n    ## store catgs with a .10 loci chunk size\n    supercatg = io5.create_dataset(\"catgs\", (nloci, len(samples), maxlen, 4),\n                                    dtype=np.uint32,\n                                    chunks=(chunks, 1, maxlen, 4),\n                                    compression=\"gzip\")\n    superseqs = io5.create_dataset(\"seqs\", (nloci, len(samples), maxlen),\n                                    dtype=\"|S1\",\n                                    #dtype=np.uint8,\n                                    chunks=(chunks, len(samples), maxlen),\n                                    
compression='gzip')\n    superalls = io5.create_dataset(\"nalleles\", (nloci, len(samples)),\n                                    dtype=np.uint8,\n                                    chunks=(chunks, len(samples)),\n                                    compression=\"gzip\")\n    superchroms = io5.create_dataset(\"chroms\", (nloci, 3), \n                                     dtype=np.int64, \n                                     chunks=(chunks, 3),\n                                     compression=\"gzip\")\n\n    ## allele count storage\n    supercatg.attrs[\"chunksize\"] = (chunks, 1, maxlen, 4)\n    supercatg.attrs[\"samples\"] = [i.name for i in samples]\n    superseqs.attrs[\"chunksize\"] = (chunks, len(samples), maxlen)\n    superseqs.attrs[\"samples\"] = [i.name for i in samples]\n    superalls.attrs[\"chunksize\"] = (chunks, len(samples))\n    superalls.attrs[\"samples\"] = [i.name for i in samples]\n    superchroms.attrs[\"chunksize\"] = (chunks, len(samples))\n    superchroms.attrs[\"samples\"] = [i.name for i in samples]\n\n    ## array for pair splits locations, dup and ind filters\n    io5.create_dataset(\"splits\", (nloci, ), dtype=np.uint16)\n    io5.create_dataset(\"duplicates\", (nloci, ), dtype=np.bool_)\n\n    ## close the big boy\n    io5.close()", "response": "Build a new h5 array for pre - filtered loci."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef fill_dups_arr(data):\n    ## build the duplicates array\n    duplefiles = glob.glob(os.path.join(data.tmpdir, \"duples_*.tmp.npy\"))\n    duplefiles.sort(key=lambda x: int(x.rsplit(\"_\", 1)[-1][:-8]))\n\n    ## enter the duplicates filter into super h5 array\n    io5 = h5py.File(data.clust_database, 'r+')\n    dfilter = io5[\"duplicates\"]\n\n    ## enter all duple arrays into full duplicates array\n    init = 0\n    for dupf in duplefiles:\n        end = int(dupf.rsplit(\"_\", 1)[-1][:-8])\n        inarr = np.load(dupf)\n        dfilter[init:end] = inarr\n        init += end-init\n        #os.remove(dupf)\n    #del inarr\n\n    ## continued progress bar\n    LOGGER.info(\"all duplicates: %s\", dfilter[:].sum())\n    io5.close()", "response": "Fills the duplicates array in the clust database from the multi_muscle_align tmp files."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef build_tmp_h5(data, samples):\n    ## get samples and names, sorted\n    snames = [i.name for i in samples]\n    snames.sort()\n\n    ## Build an array for quickly indexing consens reads from catg files.\n    ## save as a npy int binary file.\n    uhandle = os.path.join(data.dirs.across, data.name+\".utemp.sort\")\n    bseeds = os.path.join(data.dirs.across, data.name+\".tmparrs.h5\")\n\n    ## send as first async1 job\n    get_seeds_and_hits(uhandle, bseeds, snames)", "response": "Builds the tmp h5 arrays (seeds and hits) from the utemp.sort file, which give quick access to nloci."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef get_nloci(data):\n    bseeds = os.path.join(data.dirs.across, data.name+\".tmparrs.h5\")\n    with h5py.File(bseeds) as io5:\n        return io5[\"seedsarr\"].shape[0]", "response": "Returns nloci, the number of rows in seedsarr, from the tmp h5 array."}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nbuild the seeds and hits (uarr) arrays of ints from the utemp.sort file and save them to an h5 file.", "response": "def get_seeds_and_hits(uhandle, bseeds, snames):\n    \"\"\"\n    builds a seeds and hits (uarr) array of ints from the utemp.sort file.\n    Saves outputs to files ...\n    \"\"\"\n    ## Get max name length. Allow for trailing _ + up to 9 digits\n    ## of numbers of loci (an astronomical number of unique loci)\n    maxlen_names = np.max(map(len, snames)) + 10\n    ## read in the utemp.sort file\n    updf = np.loadtxt(uhandle, dtype=\"S{}\".format(maxlen_names))\n\n    ## Get seeds for all matches from usort\n    seeds = np.unique(updf[:, 1])\n    seedsarr = np.column_stack([\n                    np.arange(len(seeds)),\n                   [i.rsplit(\"_\", 1)[0] for i in seeds],\n                   [i.rsplit(\"_\", 1)[1] for i in seeds]])\n    seedsarr[:, 1] = [snames.index(i) for i in seedsarr[:, 1]]\n    seedsarr = seedsarr.astype(np.int64)\n    LOGGER.info(\"got a seedsarr %s\", seedsarr.shape)\n\n    ## Get matches from usort and create an array for fast entry\n    uarr = np.zeros((updf.shape[0], 3), dtype=np.int64)\n    idx = -1\n    lastloc = None\n    for ldx in xrange(updf.shape[0]):\n        tloc = updf[ldx, 1]\n        if tloc != lastloc:\n            idx += 1\n        uarr[ldx, 0] = idx\n        lastloc = tloc\n    ## create a column with sample index\n    uarr[:, 1] = [int(snames.index(i.rsplit(\"_\", 1)[0])) for i in updf[:, 0]]\n    ## create a column with only consens index for sample\n    uarr[:, 2] = [int(i.rsplit(\"_\", 1)[1]) for i in updf[:, 0]]\n    uarr = uarr.astype(np.int64)\n    LOGGER.info(\"got a uarr %s\", uarr.shape)\n\n    ## save as h5 so we can grab by sample slices\n    with h5py.File(bseeds, 'w') as io5:\n        io5.create_dataset(\"seedsarr\", data=seedsarr, dtype=np.int64)\n        io5.create_dataset(\"uarr\", data=uarr, 
dtype=np.int64)"}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\ncreate a new multicat dataset.", "response": "def new_multicat(data, samples, ipyclient):\n    \"\"\"\n    Calls 'singlecat()' for all samples to build index files.\n    \"\"\"\n\n    ## track progress\n    LOGGER.info(\"in the multicat\")\n    start = time.time()\n    printstr = \" indexing clusters     | {} | s6 |\"\n\n    ## Build the large h5 array. This will write a new HDF5 file and overwrite\n    ## existing data. \n    nloci = get_nloci(data)\n    build_h5_array(data, samples, nloci)\n\n    ## parallel client (reserve engine 0 for data entry), if/else here in case\n    ## user has only one engine.\n    if len(ipyclient) > 1:\n        filler = ipyclient.load_balanced_view(targets=[0])\n        smallview = ipyclient.load_balanced_view(targets=ipyclient.ids[1::2])\n    else:\n        filler = ipyclient.load_balanced_view(targets=[0])\n        smallview = ipyclient.load_balanced_view(targets=[0])                \n\n    ## First submit a sleeper job as temp_flag for cleanups\n    last_sample = 0\n    cleanups = {}\n    cleanups[last_sample] = filler.apply(time.sleep, 0.0)\n\n    ## fill the duplicates filter array\n    async = smallview.apply(fill_dups_arr, data)\n    while 1:\n        elapsed = datetime.timedelta(seconds=int(time.time() - start))\n        progressbar(20, 0, printstr.format(elapsed), spacer=data._spacer)\n        time.sleep(0.1)\n        if async.ready():\n            break\n    if not async.successful():\n        raise IPyradWarningExit(async.result())\n\n    ## Get all existing .tmp.h5 files. If files exist then assume that we are\n    ## restarting an interrupted job. 
We need to check for each one whether it \n    ## has it finished being built, and whether it has been written to the \n    ## large array yet.\n    snames = [i.name for i in samples]\n    snames.sort()\n    smpios = {i:os.path.join(data.dirs.across, i+'.tmp.h5') for i in snames}\n\n    ## send 'singlecat()' jobs to engines\n    bseeds = os.path.join(data.dirs.across, data.name+\".tmparrs.h5\")\n    jobs = {}\n    for sample in samples:\n        sidx = snames.index(sample.name)\n        args = (data, sample, bseeds, sidx, nloci)\n\n        ## Only build it if it doesn't already exist. Singlecat removes\n        ## unfinished files if interrupted, so .tmp.h5 should not exist\n        ## unless the file is ready to be entered. \n        if not os.path.exists(smpios[sample.name]):\n            jobs[sample.name] = smallview.apply(singlecat, *args)\n\n    ## track progress of singlecat jobs and submit writing jobs for finished\n    ## singlecat files (.tmp.h5).\n    alljobs = len(jobs)\n    while 1:\n        ## check for finished jobs\n        curkeys = jobs.keys()\n        for key in curkeys:\n            async = jobs[key]\n            if async.ready():\n                if async.successful():\n                    ## submit cleanup for finished job\n                    args = (data, data.samples[key], snames.index(key))\n                    with filler.temp_flags(after=cleanups[last_sample]):\n                        cleanups[key] = filler.apply(write_to_fullarr, *args)\n                        last_sample = key\n                        del jobs[key]\n                else:\n                    if not async.successful():\n                        raise IPyradWarningExit(async.result())\n\n        ## print progress or break\n        elapsed = datetime.timedelta(seconds=int(time.time() - start))\n        progressbar(alljobs, alljobs-len(jobs), printstr.format(elapsed), spacer=data._spacer)\n        time.sleep(0.1)\n        if not jobs:\n            break        \n\n    ## 
add the dask_chroms func for reference data\n    if 'reference' in data.paramsdict[\"assembly_method\"]:\n        with filler.temp_flags(after=cleanups.values()):\n            cleanups['ref'] = filler.apply(dask_chroms, *(data, samples))\n\n    ## ------- print breakline between indexing and writing database ---------\n    print(\"\")\n\n    ## track progress of databaseing\n    start = time.time()\n    printstr = \" building database     | {} | s6 |\"\n    while 1:\n        finished = [i for i in cleanups.values() if i.ready()]\n        elapsed = datetime.timedelta(seconds=int(time.time() - start))\n        progressbar(len(cleanups), len(finished), printstr.format(elapsed), spacer=data._spacer)\n        time.sleep(0.1)\n        ## break if one failed, or if finished\n        if not all([i.successful() for i in finished]):\n            break\n        if len(cleanups) == len(finished):\n            break\n\n    ## check for errors\n    for job in cleanups:\n        if cleanups[job].ready():\n            if not cleanups[job].successful():\n                raise IPyradWarningExit((job, cleanups[job].result()))"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\nrun a singlecat and cleanup jobs for each sample.", "response": "def multicat(data, samples, ipyclient):\n    \"\"\"\n    Runs singlecat and cleanup jobs for each sample.\n    For each sample this fills its own hdf5 array with catg data & indels.\n    This is messy, could use simplifiying.\n    \"\"\"\n\n    ## progress ticker\n    start = time.time()\n    printstr = \" indexing clusters     | {} | s6 |\"\n    elapsed = datetime.timedelta(seconds=int(time.time() - start))\n    progressbar(20, 0, printstr.format(elapsed))\n\n    ## parallel client\n    lbview = ipyclient.load_balanced_view()\n\n    ## First submit a sleeper job as temp_flag for cleanups\n    last_sample = 0\n    cleanups = {}\n    cleanups[last_sample] = lbview.apply(time.sleep, 0.0)\n\n    ## get samples and names, sorted\n    snames = [i.name for i in samples]\n    snames.sort()\n\n    ## Build an array for quickly indexing consens reads from catg files.\n    ## save as a npy int binary file.\n    uhandle = os.path.join(data.dirs.across, data.name+\".utemp.sort\")\n    bseeds = os.path.join(data.dirs.across, data.name+\".tmparrs.h5\")\n\n    ## send as first async1 job\n    async1 = lbview.apply(get_seeds_and_hits, *(uhandle, bseeds, snames))\n    async2 = lbview.apply(fill_dups_arr, data)\n\n    ## progress bar for seed/hit sorting\n    while not (async1.ready() and async2.ready()):\n        elapsed = datetime.timedelta(seconds=int(time.time() - start))\n        progressbar(20, 0, printstr.format(elapsed))\n        time.sleep(0.1)\n    if not async1.successful():\n        raise IPyradWarningExit(\"error in get_seeds: %s\", async1.exception())\n    if not async2.successful():\n        raise IPyradWarningExit(\"error in fill_dups: %s\", async2.exception())\n\n    ## make a limited njobs view based on mem limits \n    ## is using smallview necessary? 
(yes, it is for bad libraries)\n    smallview = ipyclient.load_balanced_view(targets=ipyclient.ids[::2])\n\n    ## make sure there are no old tmp.h5 files \n    smpios = [os.path.join(data.dirs.across, sample.name+'.tmp.h5') \\\n              for sample in samples]\n    for smpio in smpios:\n        if os.path.exists(smpio):\n            os.remove(smpio)\n\n    ## send 'singlecat()' jobs to engines\n    jobs = {}\n    for sample in samples:\n        sidx = snames.index(sample.name)\n        jobs[sample.name] = smallview.apply(singlecat, *(data, sample, bseeds, sidx))\n\n    ## check for finished and submit disk-writing job when finished\n    alljobs = len(jobs)\n    while 1:\n        ## check for finished jobs\n        curkeys = jobs.keys()\n        for key in curkeys:\n            async = jobs[key]\n            if async.ready():\n                if async.successful():\n                    ## submit cleanup for finished job\n                    args = (data, data.samples[key], snames.index(key))\n                    with lbview.temp_flags(after=cleanups[last_sample]):\n                        cleanups[key] = lbview.apply(write_to_fullarr, *args)\n                    last_sample = key\n                    del jobs[key]\n                else:\n                    err = jobs[key].exception()\n                    errmsg = \"singlecat error: {} {}\".format(key, err)\n                    raise IPyradWarningExit(errmsg)\n        ## print progress or break\n        elapsed = datetime.timedelta(seconds=int(time.time() - start))\n        progressbar(alljobs, alljobs-len(jobs), printstr.format(elapsed))\n        time.sleep(0.1)\n        if not jobs:\n            break\n\n    ## add the dask_chroms func for reference data\n    if 'reference' in data.paramsdict[\"assembly_method\"]:\n        with lbview.temp_flags(after=cleanups.values()):\n            cleanups['ref'] = lbview.apply(dask_chroms, *(data, samples))\n\n    ## wait for \"write_to_fullarr\" jobs to finish\n    
print(\"\")\n    start = time.time()\n    printstr = \" building database     | {} | s6 |\"\n    while 1:\n        finished = [i for i in cleanups.values() if i.ready()]\n        elapsed = datetime.timedelta(seconds=int(time.time() - start))\n        progressbar(len(cleanups), len(finished), printstr.format(elapsed))\n        time.sleep(0.1)\n        ## break if one failed, or if finished\n        if not all([i.successful() for i in finished]):\n            break\n        if len(cleanups) == len(finished):\n            break\n\n    ## check for errors\n    for job in cleanups:\n        if cleanups[job].ready():\n            if not cleanups[job].successful():\n                err = \" error in write_to_fullarr ({}) {}\"\\\n                     .format(job, cleanups[job].result())\n                LOGGER.error(err)\n                raise IPyradWarningExit(err)\n\n    ## remove large indels array file and singlecat tmparr file\n    ifile = os.path.join(data.dirs.across, data.name+\".tmp.indels.hdf5\")\n    if os.path.exists(ifile):\n        os.remove(ifile)\n    if os.path.exists(bseeds):\n        os.remove(bseeds)\n    for sh5 in [os.path.join(data.dirs.across, i.name+\".tmp.h5\") for i in samples]:\n        os.remove(sh5)\n\n    ## print final progress\n    elapsed = datetime.timedelta(seconds=int(time.time() - start))\n    progressbar(10, 10, printstr.format(elapsed))\n    print(\"\")"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef singlecat(data, sample, bseeds, sidx, nloci):\n\n    LOGGER.info(\"in single cat here\")\n    ## enter ref data?\n    isref = 'reference' in data.paramsdict[\"assembly_method\"]\n\n    ## grab seeds and hits info for this sample\n    with h5py.File(bseeds, 'r') as io5:\n        ## get hits just for this sample and sort them by sample order index\n        hits = io5[\"uarr\"][:]\n        hits = hits[hits[:, 1] == sidx, :]\n        #hits = hits[hits[:, 2].argsort()]\n        ## get seeds just for this sample and sort them by sample order index\n        seeds = io5[\"seedsarr\"][:]\n        seeds = seeds[seeds[:, 1] == sidx, :]\n        #seeds = seeds[seeds[:, 2].argsort()]\n        full = np.concatenate((seeds, hits))\n        full = full[full[:, 0].argsort()]\n\n    ## still using max+20 len limit, rare longer merged reads get trimmed\n    ## we need to allow room for indels to be added too\n    maxlen = data._hackersonly[\"max_fragment_length\"] + 20\n\n    ## we'll fill a new catg and alleles arr for this sample in locus order,\n    ## which is known from seeds and hits\n    ocatg = np.zeros((nloci, maxlen, 4), dtype=np.uint32)\n    onall = np.zeros(nloci, dtype=np.uint8)\n    ochrom = np.zeros((nloci, 3), dtype=np.int64)\n    \n    ## grab the sample's data and write to ocatg and onall\n    if not sample.files.database:\n        raise IPyradWarningExit(\"missing catg file - {}\".format(sample.name))\n\n    with h5py.File(sample.files.database, 'r') as io5:\n        ## get it and delete it\n        catarr = io5[\"catg\"][:]\n        tmp = catarr[full[:, 2], :maxlen, :]\n        del catarr\n        ocatg[full[:, 0], :tmp.shape[1], :] = tmp\n        del tmp\n\n        ## get it and delete it\n        nall = io5[\"nalleles\"][:]\n        onall[full[:, 0]] = nall[full[:, 2]]\n        del nall\n\n        ## fill the reference data\n        if isref:\n 
           chrom = io5[\"chroms\"][:]\n            ochrom[full[:, 0]] = chrom[full[:, 2]]\n            del chrom\n\n    ## get indel locations for this sample\n    ipath = os.path.join(data.dirs.across, data.name+\".tmp.indels.hdf5\")\n    with h5py.File(ipath, 'r') as ih5:\n        indels = ih5[\"indels\"][sidx, :, :maxlen]\n\n    ## insert indels into ocatg\n    newcatg = inserted_indels(indels, ocatg)\n    del ocatg, indels\n    \n    ## save individual tmp h5 data\n    smpio = os.path.join(data.dirs.across, sample.name+'.tmp.h5')\n    with h5py.File(smpio, 'w') as oh5:\n        oh5.create_dataset(\"icatg\", data=newcatg, dtype=np.uint32)\n        oh5.create_dataset(\"inall\", data=onall, dtype=np.uint8)\n        if isref:\n            oh5.create_dataset(\"ichrom\", data=ochrom, dtype=np.int64)", "response": "Orders a single sample's catg, nalleles, and (for reference assemblies) chroms data by locus, inserts indels, and saves the result to the sample's .tmp.h5 file."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\nwriting arrays to h5 disk", "response": "def write_to_fullarr(data, sample, sidx):\n    \"\"\" writes arrays to h5 disk \"\"\"\n\n    ## enter ref data?\n    #isref = 'reference' in data.paramsdict[\"assembly_method\"]\n    LOGGER.info(\"writing fullarr %s %s\", sample.name, sidx)\n\n    ## save big arrays to disk temporarily\n    with h5py.File(data.clust_database, 'r+') as io5:\n        ## open views into the arrays we plan to fill\n        chunk = io5[\"catgs\"].attrs[\"chunksize\"][0]\n        catg = io5[\"catgs\"]\n        nall = io5[\"nalleles\"]\n\n        ## adding an axis to newcatg makes it write about 1000X faster.\n        smpio = os.path.join(data.dirs.across, sample.name+'.tmp.h5')\n        with h5py.File(smpio) as indat:\n\n            ## grab all of the data from this sample's arrays\n            newcatg = indat[\"icatg\"] #[:]\n            onall = indat[\"inall\"]   #[:]\n\n            ## enter it into the full array one chunk at a time\n            for cidx in xrange(0, catg.shape[0], chunk):\n                end = cidx + chunk\n                catg[cidx:end, sidx:sidx+1, :] = np.expand_dims(newcatg[cidx:end, :], axis=1)\n                nall[:, sidx:sidx+1] = np.expand_dims(onall, axis=1)"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef dask_chroms(data, samples):\n    \n    ## example concatenating with dask\n    h5s = [os.path.join(data.dirs.across, s.name+\".tmp.h5\") for s in samples]\n    handles = [h5py.File(i) for i in h5s]\n    dsets = [i['/ichrom'] for i in handles]\n    arrays = [da.from_array(dset, chunks=(10000, 3)) for dset in dsets]\n    stack = da.stack(arrays, axis=2)\n\n    ## max chrom (should we check for variable hits? if so, things can get wonk)\n    maxchrom = da.max(stack, axis=2)[:, 0]\n\n    ## max pos\n    maxpos = da.max(stack, axis=2)[:, 2]\n\n    ## min pos\n    mask = stack == 0\n    stack[mask] = 9223372036854775807  ## max int64 value\n    minpos = da.min(stack, axis=2)[:, 1]\n    final = da.stack([maxchrom, minpos, maxpos], axis=1)\n    final.to_hdf5(data.clust_database, \"/chroms\")\n\n    ## close the h5 handles\n    _ = [i.close() for i in handles]", "response": "A dask relay function to fill chroms for all samples"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef inserted_indels(indels, ocatg):\n    ## return copy with indels inserted\n    newcatg = np.zeros(ocatg.shape, dtype=np.uint32)\n\n    ## iterate over loci and make extensions for indels\n    for iloc in xrange(ocatg.shape[0]):\n        ## get indels indices\n        indidx = np.where(indels[iloc, :])[0]\n        if np.any(indidx):\n            ## which new (empty) rows will be added\n            allrows = np.arange(ocatg.shape[1])\n            mask = np.ones(allrows.shape[0], dtype=np.bool_)\n            for idx in indidx:\n                mask[idx] = False\n            not_idx = allrows[mask == 1]\n\n            ## fill in new data into all other spots\n            newcatg[iloc][not_idx] = ocatg[iloc, :not_idx.shape[0]]\n        else:\n            newcatg[iloc] = ocatg[iloc]\n    return newcatg", "response": "inserts indels into the catg array"}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\nfills the superseqs array with seq data from cat.clust and fills the edges array with information about paired split locations.", "response": "def fill_superseqs(data, samples):\n    \"\"\"\n    Fills the superseqs array with seq data from cat.clust\n    and fill the edges array with information about paired split locations.\n    \"\"\"\n\n    ## load super to get edges\n    io5 = h5py.File(data.clust_database, 'r+')\n    superseqs = io5[\"seqs\"]\n    splits = io5[\"splits\"]\n\n    ## samples are already sorted\n    snames = [i.name for i in samples]\n    LOGGER.info(\"snames %s\", snames)\n\n    ## get maxlen again\n    maxlen = data._hackersonly[\"max_fragment_length\"] + 20\n    LOGGER.info(\"maxlen inside fill_superseqs is %s\", maxlen)\n\n    ## data has to be entered in blocks\n    infile = os.path.join(data.dirs.across, data.name+\"_catclust.gz\")\n    clusters = gzip.open(infile, 'r')\n    pairdealer = itertools.izip(*[iter(clusters)]*2)\n\n    ## iterate over clusters\n    chunks = superseqs.attrs[\"chunksize\"]\n    chunksize = chunks[0]\n    done = 0\n    iloc = 0\n    cloc = 0\n    chunkseqs = np.zeros(chunks, dtype=\"|S1\")\n    chunkedge = np.zeros(chunksize, dtype=np.uint16)\n\n    while 1:\n        try:\n            done, chunk = clustdealer(pairdealer, 1)\n        except IndexError:\n            raise IPyradWarningExit(\"clustfile formatting error in %s\", chunk)\n\n        ## if chunk is full put into superseqs and reset counter\n        if cloc == chunksize:\n            LOGGER.info(\"cloc chunk writing %s\", cloc)\n            superseqs[iloc-cloc:iloc] = chunkseqs\n            splits[iloc-cloc:iloc] = chunkedge\n            ## reset chunkseqs, chunkedge, cloc\n            cloc = 0\n            chunkseqs = np.zeros((chunksize, len(samples), maxlen), dtype=\"|S1\")\n            chunkedge = np.zeros((chunksize), dtype=np.uint16)\n\n        ## get seq 
and split it\n        if chunk:\n            try:\n                fill = np.zeros((len(samples), maxlen), dtype=\"|S1\")\n                fill.fill(\"N\")\n                piece = chunk[0].strip().split(\"\\n\")\n                names = piece[0::2]\n                seqs = np.array([list(i) for i in piece[1::2]])\n                \n                ## fill in the separator if it exists\n                separator = np.where(np.all(seqs == 'n', axis=0))[0]\n                if np.any(separator):\n                    chunkedge[cloc] = separator.min()\n\n                ## fill in the hits\n                ## seqs will be (5,) IF the seqs are variable lengths, which \n                ## can happen if it had duplicaes AND there were indels, and \n                ## so the indels did not get aligned\n                try:\n                    shlen = seqs.shape[1]\n                except IndexError as inst:\n                    shlen = min([len(x) for x in seqs])\n\n                for name, seq in zip(names, seqs):\n                    sidx = snames.index(name.rsplit(\"_\", 1)[0])\n                    #fill[sidx, :shlen] = seq[:maxlen]\n                    fill[sidx, :shlen] = seq[:shlen]\n\n                ## PUT seqs INTO local ARRAY\n                chunkseqs[cloc] = fill\n\n            except Exception as inst:\n                LOGGER.info(inst)\n                LOGGER.info(\"\\nfill: %s\\nshlen %s\\nmaxlen %s\", fill.shape, shlen, maxlen)\n                LOGGER.info(\"dupe chunk \\n{}\".format(\"\\n\".join(chunk)))\n\n            ## increase counters if there was a chunk\n            cloc += 1\n            iloc += 1\n        if done:\n            break\n\n    ## write final leftover chunk\n    superseqs[iloc-cloc:,] = chunkseqs[:cloc]\n    splits[iloc-cloc:] = chunkedge[:cloc]\n\n    ## close super\n    io5.close()\n    clusters.close()\n\n    ## edges is filled with splits for paired data.\n    LOGGER.info(\"done filling superseqs\")"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef count_seeds(usort):\n    with open(usort, 'r') as insort:\n        cmd1 = [\"cut\", \"-f\", \"2\"]\n        cmd2 = [\"uniq\"]\n        cmd3 = [\"wc\"]\n        proc1 = sps.Popen(cmd1, stdin=insort, stdout=sps.PIPE, close_fds=True)\n        proc2 = sps.Popen(cmd2, stdin=proc1.stdout, stdout=sps.PIPE, close_fds=True)\n        proc3 = sps.Popen(cmd3, stdin=proc2.stdout, stdout=sps.PIPE, close_fds=True)\n        res = proc3.communicate()\n        nseeds = int(res[0].split()[0])\n        proc1.stdout.close()\n        proc2.stdout.close()\n        proc3.stdout.close()\n    return nseeds", "response": "Count the number of unique seeds (column 2) in a sorted utemp file."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef sort_seeds(uhandle, usort):\n    cmd = [\"sort\", \"-k\", \"2\", uhandle, \"-o\", usort]\n    proc = sps.Popen(cmd, close_fds=True)\n    proc.communicate()", "response": "sort seeds from cluster results"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef build_clustbits(data, ipyclient, force):\n\n    ## If you run this step then we clear all tmp .fa and .indel.h5 files\n    if os.path.exists(data.tmpdir):\n        shutil.rmtree(data.tmpdir)\n        os.mkdir(data.tmpdir)\n\n    ## parallel client\n    lbview = ipyclient.load_balanced_view()\n    start = time.time()\n    printstr = \" building clusters     | {} | s6 |\"\n    elapsed = datetime.timedelta(seconds=int(time.time()-start))\n    progressbar(3, 0, printstr.format(elapsed), spacer=data._spacer)\n\n    uhandle = os.path.join(data.dirs.across, data.name+\".utemp\")\n    usort = os.path.join(data.dirs.across, data.name+\".utemp.sort\")\n\n    async1 = \"\"\n    ## skip usorting if not force and already exists\n    if not os.path.exists(usort) or force:\n\n        ## send sort job to engines. Sorted seeds allows us to work through\n        ## the utemp file one locus at a time instead of reading all into mem.\n        LOGGER.info(\"building reads file -- loading utemp file into mem\")\n        async1 = lbview.apply(sort_seeds, *(uhandle, usort))\n        while 1:\n            elapsed = datetime.timedelta(seconds=int(time.time()-start))\n            progressbar(3, 0, printstr.format(elapsed), spacer=data._spacer)\n            if async1.ready():\n                break\n            else:\n                time.sleep(0.1)\n\n    ## send count seeds job to engines.\n    async2 = lbview.apply(count_seeds, usort)\n    while 1:\n        elapsed = datetime.timedelta(seconds=int(time.time()-start))\n        progressbar(3, 1, printstr.format(elapsed), spacer=data._spacer)\n        if async2.ready():\n            break\n        else:\n            time.sleep(0.1)\n\n    ## wait for both to finish while printing progress timer\n    nseeds = async2.result()\n\n    ## send the clust bit building job to work and track progress\n    async3 = 
lbview.apply(sub_build_clustbits, *(data, usort, nseeds))\n    while 1:\n        elapsed = datetime.timedelta(seconds=int(time.time()-start))\n        progressbar(3, 2, printstr.format(elapsed), spacer=data._spacer)\n        if async3.ready():\n            break\n        else:\n            time.sleep(0.1)\n    elapsed = datetime.timedelta(seconds=int(time.time()-start))\n    progressbar(3, 3, printstr.format(elapsed), spacer=data._spacer)\n    print(\"\")\n\n    ## check for errors\n    for job in [async1, async2, async3]:\n        try:\n            if not job.successful():\n                raise IPyradWarningExit(job.result())\n        except AttributeError:\n            ## If we skip usorting then async1 == \"\" so the call to\n            ## successful() raises, but we can ignore it.\n            pass", "response": "Builds the cluster from the utemp and htemp files and writes them to chunked files for aligning in muscle."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef sub_build_clustbits(data, usort, nseeds):\n\n    ## load FULL concat fasta file into a dict. This could cause RAM issues.\n    ## this file has iupac codes in it, not ambigs resolved, and is gzipped.\n    LOGGER.info(\"loading full _catcons file into memory\")\n    allcons = {}\n    conshandle = os.path.join(data.dirs.across, data.name+\"_catcons.tmp\")\n    with gzip.open(conshandle, 'rb') as iocons:\n        cons = itertools.izip(*[iter(iocons)]*2)\n        for namestr, seq in cons:\n            nnn, sss = [i.strip() for i in namestr, seq]\n            allcons[nnn[1:]] = sss\n\n    ## set optim to approximately 4 chunks per core. Smaller allows for a bit\n    ## cleaner looking progress bar. 40 cores will make 160 files.\n    optim = ((nseeds // (data.cpus*4)) + (nseeds % (data.cpus*4)))\n    LOGGER.info(\"building clustbits, optim=%s, nseeds=%s, cpus=%s\",\n                optim, nseeds, data.cpus)\n\n    ## iterate through usort grabbing seeds and matches\n    with open(usort, 'rb') as insort:\n        ## iterator, seed null, and seqlist null\n        isort = iter(insort)\n        loci = 0\n        lastseed = 0\n        fseqs = []\n        seqlist = []\n        seqsize = 0\n\n        while 1:\n            ## grab the next line\n            try:\n                hit, seed, ori = isort.next().strip().split()\n            except StopIteration:\n                break\n\n            try:\n                ## if same seed, append match\n                if seed != lastseed:\n                    ## store the last fseq, count it, and clear it\n                    if fseqs:\n                        seqlist.append(\"\\n\".join(fseqs))\n                        seqsize += 1\n                        fseqs = []\n                    ## occasionally write to file\n                    if seqsize >= optim:\n                        if seqlist:\n                            loci += 
seqsize\n                            with open(os.path.join(data.tmpdir,\n                                data.name+\".chunk_{}\".format(loci)), 'w') as clustsout:\n                                LOGGER.debug(\"writing chunk - seqsize {} loci {} {}\".format(seqsize, loci, clustsout.name))\n                                clustsout.write(\"\\n//\\n//\\n\".join(seqlist)+\"\\n//\\n//\\n\")\n                            ## reset list and counter\n                            seqlist = []\n                            seqsize = 0\n    \n                    ## store the new seed on top of fseq\n                    fseqs.append(\">{}\\n{}\".format(seed, allcons[seed]))\n                    lastseed = seed\n    \n                ## add match to the seed\n                seq = allcons[hit]\n                ## revcomp if orientation is reversed\n                if ori == \"-\":\n                    seq = fullcomp(seq)[::-1]\n                fseqs.append(\">{}\\n{}\".format(hit, seq))\n            except KeyError as inst:\n                ## Caught bad seed or hit? Log and continue.\n                LOGGER.error(\"Bad Seed/Hit: seqsize {}\\tloci {}\\tseed {}\\thit {}\".format(seqsize, loci, seed, hit))\n\n    ## write whatever is left over to the clusts file\n    if fseqs:\n        seqlist.append(\"\\n\".join(fseqs))\n        seqsize += 1\n        loci += seqsize\n    if seqlist:\n        with open(os.path.join(data.tmpdir,\n            data.name+\".chunk_{}\".format(loci)), 'w') as clustsout:\n            clustsout.write(\"\\n//\\n//\\n\".join(seqlist)+\"\\n//\\n//\\n\")\n\n    ## final progress and cleanup\n    del allcons\n    clustbits = glob.glob(os.path.join(data.tmpdir, data.name+\".chunk_*\"))\n\n    ## return stuff\n    return clustbits, loci", "response": "A subfunction of build_clustbits that groups the sorted seeds and their hits into unaligned clusters and writes them to chunk files in the tmp directory. Returns the list of chunk files and the number of loci."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef build_input_file(data, samples, randomseed):\n\n    ## get all of the consens handles for samples that have consens reads\n    ## this is better than using sample.files.consens for selecting files\n    ## b/c if they were moved we only have to edit data.dirs.consens\n\n    ## scratch the statement above, people shouldn't be moving files, \n    ## they should be using merge/branch, and so sample.files.consens\n    ## is needed to keep track of samples from different dirs if they\n    ## are later merged into the same assembly.\n    #conshandles = [os.path.join(data.dirs.consens, sample.name+\".consens.gz\") \\\n    #              for sample in samples if \\\n    #              sample.stats.reads_consens]\n    conshandles = [sample.files.consens[0] \\\n                  for sample in samples if \\\n                  sample.stats.reads_consens]\n    conshandles.sort()\n    assert conshandles, \"no consensus files found\"\n\n    ## concatenate all of the gzipped consens files\n    cmd = ['cat'] + conshandles\n    #allcons = os.path.join(data.dirs.consens, data.name+\"_catcons.tmp\")\n    allcons = os.path.join(data.dirs.across, data.name+\"_catcons.tmp\")\n    LOGGER.debug(\" \".join(cmd))\n    with open(allcons, 'w') as output:\n        call = sps.Popen(cmd, stdout=output, close_fds=True)\n        call.communicate()\n\n    ## a string of sed substitutions for temporarily replacing hetero sites\n    ## skips lines with '>', so it doesn't affect taxon names\n    subs = [\"/>/!s/W/A/g\", \"/>/!s/w/A/g\", \"/>/!s/R/A/g\", \"/>/!s/r/A/g\",\n            \"/>/!s/M/A/g\", \"/>/!s/m/A/g\", \"/>/!s/K/T/g\", \"/>/!s/k/T/g\",\n            \"/>/!s/S/C/g\", \"/>/!s/s/C/g\", \"/>/!s/Y/C/g\", \"/>/!s/y/C/g\"]\n    subs = \";\".join(subs)\n\n    ## impute pseudo-haplo information to avoid mismatch at hetero sites\n    ## the read data with hetero sites is put back into clustered data 
later.\n    ## pipe passed data from gunzip to sed.\n    cmd1 = [\"gunzip\", \"-c\", allcons]\n    cmd2 = [\"sed\", subs]\n    LOGGER.debug(\" \".join(cmd1))\n    proc1 = sps.Popen(cmd1, stdout=sps.PIPE, close_fds=True)\n    allhaps = allcons.replace(\"_catcons.tmp\", \"_cathaps.tmp\")\n    with open(allhaps, 'w') as output:\n        LOGGER.debug(\" \".join(cmd2))\n        proc2 = sps.Popen(cmd2, stdin=proc1.stdout, stdout=output, close_fds=True)\n        proc2.communicate()\n    proc1.stdout.close()\n\n    ## now sort the file using vsearch\n    allsort = allcons.replace(\"_catcons.tmp\", \"_catsort.tmp\")\n    cmd1 = [ipyrad.bins.vsearch,\n            \"--sortbylength\", allhaps,\n            \"--fasta_width\", \"0\",\n            \"--output\", allsort]\n    LOGGER.debug(\" \".join(cmd1))\n    proc1 = sps.Popen(cmd1, close_fds=True)\n    proc1.communicate()\n\n    ## shuffle sequences within size classes. Tested seed (8/31/2016)\n    ## shuffling works repeatably with seed.\n    random.seed(randomseed)\n\n    ## open an iterator to lengthsorted file and grab two lines at at time\n    allshuf = allcons.replace(\"_catcons.tmp\", \"_catshuf.tmp\")\n    outdat = open(allshuf, 'w')\n    indat = open(allsort, 'r')\n    idat = itertools.izip(iter(indat), iter(indat))\n    done = 0\n\n    chunk = [idat.next()]\n    while not done:\n        ## grab 2-lines until they become shorter (unless there's only one)\n        oldlen = len(chunk[-1][-1])\n        while 1:\n            try:\n                dat = idat.next()\n            except StopIteration:\n                done = 1\n                break\n            if len(dat[-1]) == oldlen:\n                chunk.append(dat)\n            else:\n                ## send the last chunk off to be processed\n                random.shuffle(chunk)\n                outdat.write(\"\".join(itertools.chain(*chunk)))\n                ## start new chunk\n                chunk = [dat]\n                break\n\n    ## do the last chunk\n    
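random.shuffle(chunk)\n    outdat.write(\"\".join(itertools.chain(*chunk)))\n\n    indat.close()\n    outdat.close()", "response": "Builds the clustering input file: concatenates the samples' consensus files, masks heterozygous sites, sorts the sequences by length with vsearch, and shuffles them within each length class."}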
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\nstep 6-1: Clears dirs and databases and calls 'build_input_file()'", "response": "def clean_and_build_concat(data, samples, randomseed, ipyclient):\n    \"\"\" \n    STEP 6-1:\n    Clears dirs and databases and calls 'build_input_file()'\n    \"\"\"\n    ## but check for new clust database name if this is a new branch\n    cleanup_tempfiles(data)\n    catclust = os.path.join(data.dirs.across, data.name+\"_catclust.gz\")\n    if os.path.exists(catclust):\n        os.remove(catclust)\n    if os.path.exists(data.clust_database):\n        os.remove(data.clust_database)\n\n    ## get parallel view\n    start = time.time()\n    printstr = \" concat/shuffle input  | {} | s6 |\"\n\n    ## make a vsearch input fasta file with all samples reads concat\n    async = ipyclient[0].apply(build_input_file, *[data, samples, randomseed])\n    while 1:\n        ready = int(async.ready())\n        elapsed = datetime.timedelta(seconds=int(time.time()-start))\n        progressbar(1, ready, printstr.format(elapsed), spacer=data._spacer)\n        if ready:\n            break\n        else:\n            time.sleep(0.1)\n    print(\"\")\n\n    ## store that this step was successful\n    if not async.successful():\n        raise IPyradWarningExit(async.result())"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef run(data, samples, noreverse, force, randomseed, ipyclient, **kwargs):\n\n    ## if force then set checkpoint to zero and run all substeps for just\n    ## the user specified steps. \n    if force:\n        data._checkpoint = 0\n        if kwargs.get('substeps'):\n            substeps = kwargs.get('substeps')\n        else:  \n            substeps = range(1, 8)\n\n    ## if {data}._checkpoint attribute exists then find the checkpoint where\n    ## this assembly left off (unless force) and build step list from there.\n    else:\n        if kwargs.get('substeps'):\n            substeps = kwargs.get('substeps')\n        else:  \n            if hasattr(data, '_checkpoint'):\n                substeps = range(max(1, data._checkpoint), 8)\n            else:\n                data._checkpoint = 0\n                substeps = range(1, 8)\n\n    ## build substeps list to subset which funtions need to be run\n    if isinstance(substeps, (int, float, str)):\n        substeps = [substeps]\n        substeps = [int(i) for i in substeps]\n\n    ## print continuation message\n    if substeps[0] != 1:\n        print(\"{}Continuing from checkpoint 6.{}\"\\\n              .format(data._spacer, substeps[0]))\n    LOGGER.info(\"checkpoint = %s\", data._checkpoint)\n    LOGGER.info(\"substeps = %s\", substeps)\n\n    ## Set variables on data that are needed for all steps;\n    data.dirs.across = os.path.realpath(\n        os.path.join(data.paramsdict[\"project_dir\"], data.name+\"_across\"))\n    data.tmpdir = os.path.join(data.dirs.across, data.name+\"-tmpalign\")\n    data.clust_database = os.path.join(data.dirs.across, data.name+\".clust.hdf5\")\n    if not os.path.exists(data.dirs.across):\n        os.mkdir(data.dirs.across)\n    if not os.path.exists(data.tmpdir):\n        os.mkdir(data.tmpdir)\n    data.cpus = data._ipcluster[\"cores\"]\n    if not data.cpus:\n        data.cpus 
= len(ipyclient)\n\n    ## STEP 6-1: Clean database and build input concat file for clustering\n    if 1 in substeps:\n        clean_and_build_concat(data, samples, randomseed, ipyclient)\n        data._checkpoint = 1\n\n    ## STEP 6-2: Cluster across w/ vsearch; uses all threads on largest host \n    if 2 in substeps:\n        call_cluster(data, noreverse, ipyclient)\n        data._checkpoint = 2\n\n    ## builds consens cluster bits and writes them to the tmp directory. These\n    ## will not be deleted until either step 6-6 is complete, or the force flag\n    ## is used. This will clear the tmpdir if it is run.\n    if 3 in substeps:\n        build_clustbits(data, ipyclient, force)\n        data._checkpoint = 3\n\n    ## muscle align the cluster bits and create tmp hdf5 indel arrays for the\n    ## next step. These will not be deleted until...\n    if 4 in substeps:\n        multi_muscle_align(data, samples, ipyclient)\n        data._checkpoint = 4\n\n    ## fill the indel array with the indel tmp arrays from aligning step.\n    if 5 in substeps:\n        build_indels(data, samples, ipyclient)\n        data._checkpoint = 5\n\n    if 6 in substeps:\n        ## builds the final HDF5 array which includes three main keys\n        ## /catg -- contains all indiv catgs and has indels inserted\n        ##   .attr['samples'] = [samples]\n        ## /filters -- filled for dups, left empty for others until step 7.\n        ##   .attr['filters'] = [f1, f2, f3, f4, f5]\n        ## /seqs -- contains the clustered sequence data as string arrays\n        ##   .attr['samples'] = [samples]\n        ## /edges -- gets the paired split locations for now.\n        ## /snps  -- left empty for now\n\n        ## FILL SUPERCATG and fills dupfilter, indfilter, and nalleles\n        ## this function calls singlecat() on each sample and enters their\n        ## resulting arrays into the superarray. 
If all singlecats are built\n        ## then it will continue to enter them into the database. \n        LOGGER.info(\"multicat -- building full database\")\n        new_multicat(data, samples, ipyclient)\n        data._checkpoint = 6\n\n    if 7 in substeps:\n        ## FILL SUPERSEQS and fills edges(splits) for paired-end data\n        fill_superseqs(data, samples)\n        data._checkpoint = 7\n\n        ## remove files but not dir (used in step 1 too)\n        cleanup_tempfiles(data)\n        ## remove the tmpdir\n        if os.path.exists(data.tmpdir):\n            shutil.rmtree(data.tmpdir)\n\n        ## set sample states\n        for sample in samples:\n            sample.stats.state = 6\n        print(\"\")", "response": "Run step 6 (clustering across samples): set up the output directories, then run the requested substeps (6.1 through 6.7 by default), resuming from the saved checkpoint unless force is set."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef cleanup_tempfiles(data):\n\n    ## remove align-related tmp files\n    tmps1 = glob.glob(os.path.join(data.tmpdir, \"*.fa\"))\n    tmps2 = glob.glob(os.path.join(data.tmpdir, \"*.npy\"))\n    for tmp in tmps1 + tmps2:\n        if os.path.exists(tmp):\n            os.remove(tmp)\n\n    ## remove cluster related files\n    removal = [\n        os.path.join(data.dirs.across, data.name+\".utemp\"),\n        os.path.join(data.dirs.across, data.name+\".htemp\"),\n        os.path.join(data.dirs.across, data.name+\"_catcons.tmp\"),\n        os.path.join(data.dirs.across, data.name+\"_cathaps.tmp\"),\n        os.path.join(data.dirs.across, data.name+\"_catshuf.tmp\"),\n        os.path.join(data.dirs.across, data.name+\"_catsort.tmp\"),\n        os.path.join(data.dirs.across, data.name+\".tmparrs.h5\"),\n        os.path.join(data.dirs.across, data.name+\".tmp.indels.hdf5\"),\n        ]\n    for rfile in removal:\n        if os.path.exists(rfile):\n            os.remove(rfile)\n\n    ## remove singlecat related h5 files\n    smpios = glob.glob(os.path.join(data.dirs.across, '*.tmp.h5'))\n    for smpio in smpios:\n        if os.path.exists(smpio):\n            os.remove(smpio)", "response": "Remove the temporary alignment, clustering, and singlecat files left in the tmp and across directories."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef assembly_cleanup(data):\n\n    ## build s2 results data frame\n    data.stats_dfs.s2 = data._build_stat(\"s2\")\n    data.stats_files.s2 = os.path.join(data.dirs.edits, 's2_rawedit_stats.txt')\n\n    ## write stats for all samples\n    with io.open(data.stats_files.s2, 'w', encoding='utf-8') as outfile:\n        data.stats_dfs.s2.fillna(value=0).astype(np.int).to_string(outfile)", "response": "Build the step 2 stats data frame for the assembly and write the per-sample stats to s2_rawedit_stats.txt."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\nparse results from cutadapt into sample data", "response": "def parse_single_results(data, sample, res1):\n    \"\"\" parse results from cutadapt into sample data\"\"\"\n\n    ## set default values \n    #sample.stats_dfs.s2[\"reads_raw\"] = 0\n    sample.stats_dfs.s2[\"trim_adapter_bp_read1\"] = 0\n    sample.stats_dfs.s2[\"trim_quality_bp_read1\"] = 0\n    sample.stats_dfs.s2[\"reads_filtered_by_Ns\"] = 0\n    sample.stats_dfs.s2[\"reads_filtered_by_minlen\"] = 0\n    sample.stats_dfs.s2[\"reads_passed_filter\"] = 0\n\n    ## parse new values from cutadapt results output\n    lines = res1.strip().split(\"\\n\")\n    for line in lines:\n\n        if \"Total reads processed:\" in line:\n            value = int(line.split()[3].replace(\",\", \"\"))\n            sample.stats_dfs.s2[\"reads_raw\"] = value\n\n        if \"Reads with adapters:\" in line:\n            value = int(line.split()[3].replace(\",\", \"\"))\n            sample.stats_dfs.s2[\"trim_adapter_bp_read1\"] = value\n\n        if \"Quality-trimmed\" in line:\n            value = int(line.split()[1].replace(\",\", \"\"))\n            sample.stats_dfs.s2[\"trim_quality_bp_read1\"] = value\n\n        if \"Reads that were too short\" in line:\n            value = int(line.split()[5].replace(\",\", \"\"))\n            sample.stats_dfs.s2[\"reads_filtered_by_minlen\"] = value\n\n        if \"Reads with too many N\" in line:\n            value = int(line.split()[5].replace(\",\", \"\"))\n            sample.stats_dfs.s2[\"reads_filtered_by_Ns\"] = value\n   \n        if \"Reads written (passing filters):\" in line:\n            value = int(line.split()[4].replace(\",\", \"\"))\n            sample.stats_dfs.s2[\"reads_passed_filter\"] = value\n\n    ## save to stats summary\n    if sample.stats_dfs.s2.reads_passed_filter:\n        sample.stats.state = 2\n        sample.stats.reads_passed_filter = sample.stats_dfs.s2.reads_passed_filter\n     
   sample.files.edits = [\n            (OPJ(data.dirs.edits, sample.name+\".trimmed_R1_.fastq.gz\"), 0)]\n        ## write the long form output to the log file.\n        LOGGER.info(res1)\n\n    else:\n        print(\"{}No reads passed filtering in Sample: {}\".format(data._spacer, sample.name))"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\nparse pair mod results from cutadapt", "response": "def parse_pair_results(data, sample, res):\n    \"\"\" parse results from cutadapt for paired data\"\"\"\n    LOGGER.info(\"in parse pair mod results\\n%s\", res)   \n    ## set default values\n    sample.stats_dfs.s2[\"trim_adapter_bp_read1\"] = 0\n    sample.stats_dfs.s2[\"trim_adapter_bp_read2\"] = 0    \n    sample.stats_dfs.s2[\"trim_quality_bp_read1\"] = 0\n    sample.stats_dfs.s2[\"trim_quality_bp_read2\"] = 0    \n    sample.stats_dfs.s2[\"reads_filtered_by_Ns\"] = 0\n    sample.stats_dfs.s2[\"reads_filtered_by_minlen\"] = 0\n    sample.stats_dfs.s2[\"reads_passed_filter\"] = 0\n\n    lines = res.strip().split(\"\\n\")\n    qprimed = 0\n    for line in lines:\n        ## set primer to catch next line\n        if \"Quality-trimmed\" in line:\n            qprimed = 1\n\n        ## grab read1 and read2 lines when qprimed\n        if \"Read 1:\" in line:\n            if qprimed:\n                value = int(line.split()[2].replace(\",\", \"\"))\n                sample.stats_dfs.s2[\"trim_quality_bp_read1\"] = value\n\n        if \"Read 2:\" in line:\n            if qprimed:\n                value = int(line.split()[2].replace(\",\", \"\"))\n                sample.stats_dfs.s2[\"trim_quality_bp_read2\"] = value\n                qprimed = 0\n\n        if \"Read 1 with adapter:\" in line:\n            value = int(line.split()[4].replace(\",\", \"\"))\n            sample.stats_dfs.s2[\"trim_adapter_bp_read1\"] = value\n\n        if \"Read 2 with adapter:\" in line:\n            value = int(line.split()[4].replace(\",\", \"\"))\n            sample.stats_dfs.s2[\"trim_adapter_bp_read2\"] = value\n\n        if \"Total read pairs processed:\" in line:\n            value = int(line.split()[4].replace(\",\", \"\"))\n            sample.stats_dfs.s2[\"reads_raw\"] = value\n\n        if \"Pairs that were too short\" in line:\n            value = 
int(line.split()[5].replace(\",\", \"\"))\n            sample.stats_dfs.s2[\"reads_filtered_by_minlen\"] = value\n\n        if \"Pairs with too many N\" in line:\n            value = int(line.split()[5].replace(\",\", \"\"))\n            sample.stats_dfs.s2[\"reads_filtered_by_Ns\"] = value\n\n        if \"Pairs written (passing filters):\" in line:\n            value = int(line.split()[4].replace(\",\", \"\"))\n            sample.stats_dfs.s2[\"reads_passed_filter\"] = value\n\n    ## save to stats summary\n    if sample.stats_dfs.s2.reads_passed_filter:\n        sample.stats.state = 2\n        sample.stats.reads_passed_filter = sample.stats_dfs.s2.reads_passed_filter\n        sample.files.edits = [(\n             OPJ(data.dirs.edits, sample.name+\".trimmed_R1_.fastq.gz\"), \n             OPJ(data.dirs.edits, sample.name+\".trimmed_R2_.fastq.gz\")\n             )]\n\n    else:\n        print(\"No reads passed filtering in Sample: {}\".format(sample.name))"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\napplying quality and adapter filters to reads using cutadapt. If the ipyrad filter param is set to 0 then it only filters to hard trim edges and uses mintrimlen. If filter=1, we add quality filters. If filter=2 we add adapter filters.", "response": "def cutadaptit_single(data, sample):\n    \"\"\" \n    Applies quality and adapter filters to reads using cutadapt. If the ipyrad\n    filter param is set to 0 then it only filters to hard trim edges and uses\n    mintrimlen. If filter=1, we add quality filters. If filter=2 we add\n    adapter filters. \n    \"\"\"\n\n    sname = sample.name\n    ## if (GBS, ddRAD) we look for the second cut site + adapter. For single-end\n    ## data we don't bother trying to remove the second barcode since it's not\n    ## as critical as with PE data.\n    if data.paramsdict[\"datatype\"] == \"rad\":\n        adapter = data._hackersonly[\"p3_adapter\"]\n    else:\n        ## if GBS then the barcode can also be on the other side. 
\n        if data.paramsdict[\"datatype\"] == \"gbs\":\n\n            ## make full adapter (-revcompcut-revcompbarcode-adapter)\n            ## and add adapter without revcompbarcode\n            if data.barcodes:\n                adapter = \\\n                    fullcomp(data.paramsdict[\"restriction_overhang\"][1])[::-1] \\\n                  + fullcomp(data.barcodes[sample.name])[::-1] \\\n                  + data._hackersonly[\"p3_adapter\"]\n                ## add incomplete adapter to extras (-recompcut-adapter)\n                data._hackersonly[\"p3_adapters_extra\"].append(\n                    fullcomp(data.paramsdict[\"restriction_overhang\"][1])[::-1] \\\n                  + data._hackersonly[\"p3_adapter\"])\n            else:\n                LOGGER.warning(\"No barcode information present, and is therefore not \"+\\\n                               \"being used for adapter trimming of SE gbs data.\")\n                ## else no search for barcodes on 3'\n                adapter = \\\n                    fullcomp(data.paramsdict[\"restriction_overhang\"][1])[::-1] \\\n                  + data._hackersonly[\"p3_adapter\"]\n        else:\n            adapter = \\\n                fullcomp(data.paramsdict[\"restriction_overhang\"][1])[::-1] \\\n              + data._hackersonly[\"p3_adapter\"]\n\n    ## get length trim parameter from new or older version of ipyrad params\n    trim5r1 = trim3r1 = []\n    if data.paramsdict.get(\"trim_reads\"):\n        trimlen = data.paramsdict.get(\"trim_reads\")\n        \n        ## trim 5' end\n        if trimlen[0]:\n            trim5r1 = [\"-u\", str(trimlen[0])]\n        if trimlen[1] < 0:\n            trim3r1 = [\"-u\", str(trimlen[1])]\n        if trimlen[1] > 0:\n            trim3r1 = [\"--length\", str(trimlen[1])]\n    else:\n        trimlen = data.paramsdict.get(\"edit_cutsites\")\n        trim5r1 = [\"--cut\", str(trimlen[0])]\n\n    ## testing new 'trim_reads' setting\n    cmdf1 = [\"cutadapt\"]\n    if 
trim5r1:\n        cmdf1 += trim5r1\n    if trim3r1:\n        cmdf1 += trim3r1\n    cmdf1 += [\"--minimum-length\", str(data.paramsdict[\"filter_min_trim_len\"]),\n              \"--max-n\", str(data.paramsdict[\"max_low_qual_bases\"]),\n              \"--trim-n\", \n              \"--output\", OPJ(data.dirs.edits, sname+\".trimmed_R1_.fastq.gz\"),\n              sample.files.concat[0][0]]\n\n    if int(data.paramsdict[\"filter_adapters\"]):\n        ## NEW: only quality trim the 3' end for SE data.\n        cmdf1.insert(1, \"20\")\n        cmdf1.insert(1, \"-q\")\n        cmdf1.insert(1, str(data.paramsdict[\"phred_Qscore_offset\"]))\n        cmdf1.insert(1, \"--quality-base\")\n\n    ## if filter_adapters==3 then p3_adapters_extra will already have extra\n    ## poly adapters added to its list. \n    if int(data.paramsdict[\"filter_adapters\"]) > 1:\n        ## first enter extra cuts (order of input is reversed)\n        for extracut in list(set(data._hackersonly[\"p3_adapters_extra\"]))[::-1]:\n            cmdf1.insert(1, extracut)\n            cmdf1.insert(1, \"-a\")\n        ## then put the main cut so it appears first in command\n        cmdf1.insert(1, adapter)\n        cmdf1.insert(1, \"-a\")\n\n\n    ## do modifications to read1 and write to tmp file\n    LOGGER.info(cmdf1)\n    proc1 = sps.Popen(cmdf1, stderr=sps.STDOUT, stdout=sps.PIPE, close_fds=True)\n    try:\n        res1 = proc1.communicate()[0]\n    except KeyboardInterrupt:\n        proc1.kill()\n        raise KeyboardInterrupt\n\n    ## raise errors if found\n    if proc1.returncode:\n        raise IPyradWarningExit(\" error in {}\\n {}\".format(\" \".join(cmdf1), res1))\n\n    ## return result string to be parsed outside of engine\n    return res1"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef cutadaptit_pairs(data, sample):\n    LOGGER.debug(\"Entering cutadaptit_pairs - {}\".format(sample.name))\n    sname = sample.name\n\n    ## applied to read pairs\n    #trim_r1 = str(data.paramsdict[\"edit_cutsites\"][0])\n    #trim_r2 = str(data.paramsdict[\"edit_cutsites\"][1])\n    finput_r1 = sample.files.concat[0][0]\n    finput_r2 = sample.files.concat[0][1]\n\n    ## Get adapter sequences. This is very important. For the forward adapter\n    ## we don't care all that much about getting the sequence just before the \n    ## Illumina adapter, b/c it will either be random (in RAD), or the reverse\n    ## cut site of cut1 or cut2 (gbs or ddrad). Either way, we can still trim it \n    ## off later in step7 with trim overhang if we want. And it should be invar-\n    ## iable unless the cut site has an ambiguous char. The reverse adapter is \n    ## super important, however b/c it can contain the inline barcode and \n    ## revcomp cut site. We def want to trim out the barcode, and ideally the \n    ## cut site too to be safe. Problem is we don't always know the barcode if \n    ## users demultiplexed their data elsewhere. So, if barcode is missing we \n    ## do a very fuzzy match before the adapter and trim it out. \n\n    ## this just got more complicated now that we allow merging technical\n    ## replicates in step 1 since a single sample might have multiple barcodes\n    ## associated with it and so we need to search for multiple adapter+barcode\n    ## combinations.\n    ## We will assume that if they are 'linking_barcodes()' here then there are\n    ## no technical replicates in the barcodes file. If there ARE technical\n    ## replicates, then they should run step1 so they are merged, in which case\n    ## the sample specific barcodes will be saved to each Sample under its\n    ## .barcode attribute as a list. 
\n\n    if not data.barcodes:\n        ## try linking barcodes again in case user just added a barcodes path\n        ## after receiving the warning. We assume no technical replicates here.\n        try:\n            data._link_barcodes()\n        except Exception as inst:\n            LOGGER.warning(\"  error adding barcodes info: %s\", inst)\n\n    ## barcodes are present meaning they were parsed to the samples in step 1.\n    if data.barcodes:\n        try:\n            adapter1 = fullcomp(data.paramsdict[\"restriction_overhang\"][1])[::-1] \\\n                        + data._hackersonly[\"p3_adapter\"]\n            if isinstance(sample.barcode, list):\n                bcode = fullcomp(sample.barcode[0])[::-1]\n            elif isinstance(data.barcodes[sample.name], list):\n                bcode = fullcomp(data.barcodes[sample.name][0][::-1])\n            else:\n                bcode = fullcomp(data.barcodes[sample.name])[::-1]\n            ## add full adapter (-revcompcut-revcompbcode-adapter)\n            adapter2 = fullcomp(data.paramsdict[\"restriction_overhang\"][0])[::-1] \\\n                        + bcode \\\n                        + data._hackersonly[\"p5_adapter\"]      \n        except KeyError as inst:\n            msg = \"\"\"\n    Sample name does not exist in the barcode file. The name in the barcode file\n    for each sample must exactly equal the raw file name for the sample minus\n    `_R1`. So for example a sample called WatDo_PipPrep_R1_100.fq.gz must\n    be referenced in the barcode file as WatDo_PipPrep_100. 
The name in your\n    barcode file for this sample must match: {}\n    \"\"\".format(sample.name)\n            LOGGER.error(msg)\n            raise IPyradWarningExit(msg)\n    else:\n        print(NO_BARS_GBS_WARNING)\n        #adapter1 = fullcomp(data.paramsdict[\"restriction_overhang\"][1])[::-1]+\\\n        #           data._hackersonly[\"p3_adapter\"]\n        #adapter2 = \"XXX\"\n        adapter1 = data._hackersonly[\"p3_adapter\"]\n        adapter2 = fullcomp(data._hackersonly[\"p5_adapter\"])\n\n\n    ## parse trim_reads\n    trim5r1 = trim5r2 = trim3r1 = trim3r2 = []\n    if data.paramsdict.get(\"trim_reads\"):\n        trimlen = data.paramsdict.get(\"trim_reads\")\n        \n        ## trim 5' end\n        if trimlen[0]:\n            trim5r1 = [\"-u\", str(trimlen[0])]\n        if trimlen[1] < 0:\n            trim3r1 = [\"-u\", str(trimlen[1])]\n        if trimlen[1] > 0:\n            trim3r1 = [\"--length\", str(trimlen[1])]\n\n        ## legacy support for trimlen = 0,0 default\n        if len(trimlen) > 2:\n            if trimlen[2]:\n                trim5r2 = [\"-U\", str(trimlen[2])]\n\n        if len(trimlen) > 3:\n            if trimlen[3]:\n                if trimlen[3] < 0:\n                    trim3r2 = [\"-U\", str(trimlen[3])]\n                if trimlen[3] > 0:            \n                    trim3r2 = [\"--length\", str(trimlen[3])]\n\n    else:\n        ## legacy support\n        trimlen = data.paramsdict.get(\"edit_cutsites\")\n        trim5r1 = [\"-u\", str(trimlen[0])]\n        trim5r2 = [\"-U\", str(trimlen[1])]\n\n    ## testing new 'trim_reads' setting\n    cmdf1 = [\"cutadapt\"]\n    if trim5r1:\n        cmdf1 += trim5r1\n    if trim3r1:\n        cmdf1 += trim3r1\n    if trim5r2:\n        cmdf1 += trim5r2\n    if trim3r2:\n        cmdf1 += trim3r2\n\n    cmdf1 += [\"--trim-n\",\n              \"--max-n\", str(data.paramsdict[\"max_low_qual_bases\"]),\n              \"--minimum-length\", 
str(data.paramsdict[\"filter_min_trim_len\"]),\n              \"-o\", OPJ(data.dirs.edits, sname+\".trimmed_R1_.fastq.gz\"),\n              \"-p\", OPJ(data.dirs.edits, sname+\".trimmed_R2_.fastq.gz\"),\n              finput_r1,\n              finput_r2]\n\n    ## additional args\n    if int(data.paramsdict[\"filter_adapters\"]) < 2:\n        ## add a dummy adapter to let cutadapt know whe are not using legacy-mode\n        cmdf1.insert(1, \"XXX\")\n        cmdf1.insert(1, \"-A\")\n\n    if int(data.paramsdict[\"filter_adapters\"]):\n        cmdf1.insert(1, \"20,20\")\n        cmdf1.insert(1, \"-q\")\n        cmdf1.insert(1, str(data.paramsdict[\"phred_Qscore_offset\"]))\n        cmdf1.insert(1, \"--quality-base\")\n\n    if int(data.paramsdict[\"filter_adapters\"]) > 1:\n        ## if technical replicates then add other copies\n        if isinstance(sample.barcode, list):\n            for extrabar in sample.barcode[1:]:\n                data._hackersonly[\"p5_adapters_extra\"] += \\\n                    fullcomp(data.paramsdict[\"restriction_overhang\"][0])[::-1] + \\\n                    fullcomp(extrabar)[::-1] + \\\n                    data._hackersonly[\"p5_adapter\"]\n                data._hackersonly[\"p5_adapters_extra\"] += \\\n                    fullcomp(data.paramsdict[\"restriction_overhang\"][1])[::-1] + \\\n                    data._hackersonly[\"p3_adapter\"]\n\n        ## first enter extra cuts\n        zcut1 = list(set(data._hackersonly[\"p3_adapters_extra\"]))[::-1]\n        zcut2 = list(set(data._hackersonly[\"p5_adapters_extra\"]))[::-1]\n        for ecut1, ecut2 in zip(zcut1, zcut2):\n            cmdf1.insert(1, ecut1)\n            cmdf1.insert(1, \"-a\")\n            cmdf1.insert(1, ecut2)\n            cmdf1.insert(1, \"-A\")\n        ## then put the main cut first\n        cmdf1.insert(1, adapter1)\n        cmdf1.insert(1, '-a')        \n        cmdf1.insert(1, adapter2)\n        cmdf1.insert(1, '-A')         \n\n    ## do modifications to 
read1 and write to tmp file\n    LOGGER.debug(\" \".join(cmdf1))\n    #sys.exit()\n    try:\n        proc1 = sps.Popen(cmdf1, stderr=sps.STDOUT, stdout=sps.PIPE, close_fds=True)\n        res1 = proc1.communicate()[0]\n    except KeyboardInterrupt:\n        proc1.kill()\n        LOGGER.info(\"this is where I want it to interrupt\")\n        raise KeyboardInterrupt()\n\n    ## raise errors if found\n    if proc1.returncode:\n        raise IPyradWarningExit(\" error [returncode={}]: {}\\n{}\"\\\n            .format(proc1.returncode, \" \".join(cmdf1), res1))\n\n    LOGGER.debug(\"Exiting cutadaptit_pairs - {}\".format(sname))\n    ## return results string to be parsed outside of engine\n    return res1", "response": "Build and run a paired-end cutadapt command for one sample. It assembles the R1 and R2 adapter sequences (including the reverse-complemented cut site and barcode when barcodes are available), applies the trim, quality, and adapter filters set in the params, writes the trimmed R1 and R2 fastq files, and returns cutadapt's output for parsing."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef run2(data, samples, force, ipyclient):\n\n    ## create output directories \n    data.dirs.edits = os.path.join(os.path.realpath(\n                                   data.paramsdict[\"project_dir\"]), \n                                   data.name+\"_edits\")\n    if not os.path.exists(data.dirs.edits):\n        os.makedirs(data.dirs.edits)\n\n    ## get samples\n    subsamples = choose_samples(samples, force)\n\n    ## only allow extra adapters in filters==3, \n    ## and add poly repeats if not in list of adapters\n    if int(data.paramsdict[\"filter_adapters\"]) == 3:\n        if not data._hackersonly[\"p3_adapters_extra\"]:\n            for poly in [\"A\"*8, \"T\"*8, \"C\"*8, \"G\"*8]:\n                data._hackersonly[\"p3_adapters_extra\"].append(poly)\n        if not data._hackersonly[\"p5_adapters_extra\"]:    \n            for poly in [\"A\"*8, \"T\"*8, \"C\"*8, \"G\"*8]:\n                data._hackersonly[\"p5_adapters_extra\"].append(poly)\n    else:\n        data._hackersonly[\"p5_adapters_extra\"] = []\n        data._hackersonly[\"p3_adapters_extra\"] = []\n\n    ## concat is not parallelized (since it's disk limited, generally)\n    subsamples = concat_reads(data, subsamples, ipyclient)\n\n    ## cutadapt is parallelized by ncores/2 because cutadapt spawns threads\n    lbview = ipyclient.load_balanced_view(targets=ipyclient.ids[::2])\n    run_cutadapt(data, subsamples, lbview)\n\n    ## cleanup is ...\n    assembly_cleanup(data)", "response": "Run cutadapt on the given samples."}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\nconcatenates the input files when a single sample has multiple input files", "response": "def concat_reads(data, subsamples, ipyclient):\n    \"\"\" concatenate the input files when a single sample has multiple input files \"\"\"\n\n    ## concatenate reads if they come from merged assemblies.\n    if any([len(i.files.fastqs) > 1 for i in subsamples]):\n        ## run on single engine for now\n        start = time.time()\n        printstr = \" concatenating inputs  | {} | s2 |\"\n        finished = 0\n        catjobs = {}\n        for sample in subsamples:\n            if len(sample.files.fastqs) > 1:\n                catjobs[sample.name] = ipyclient[0].apply(\\\n                                       concat_multiple_inputs, *(data, sample))\n            else:\n                sample.files.concat = sample.files.fastqs\n\n        ## wait for all to finish\n        while 1:\n            finished = sum([i.ready() for i in catjobs.values()])\n            elapsed = datetime.timedelta(seconds=int(time.time()-start))\n            progressbar(len(catjobs), finished, printstr.format(elapsed), spacer=data._spacer)\n            time.sleep(0.1)\n            if finished == len(catjobs):\n                print(\"\")\n                break\n\n        ## collect results, which are concat file handles.\n        ## (`async` is a reserved word in Python 3.7+, so use `sname` as the key)\n        for sname in catjobs:\n            if catjobs[sname].successful():\n                data.samples[sname].files.concat = catjobs[sname].result()\n            else:\n                error = catjobs[sname].result()#exception()\n                LOGGER.error(\"error in step2 concat %s\", error)\n                raise IPyradWarningExit(\"error in step2 concat: {}\".format(error))\n    else:\n        for sample in subsamples:\n            ## just copy fastqs handles to concat attribute\n            sample.files.concat = sample.files.fastqs\n\n    return subsamples"}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nrunning cutadapt on the subsamples in parallel and parsing the results", "response": "def run_cutadapt(data, subsamples, lbview):\n    \"\"\"\n    sends fastq files to cutadapt\n    \"\"\"\n    ## choose cutadapt function based on datatype\n    start = time.time()\n    printstr = \" processing reads      | {} | s2 |\"\n    finished = 0\n    rawedits = {}\n\n    ## sort subsamples so that the biggest files get submitted first\n    subsamples.sort(key=lambda x: x.stats.reads_raw, reverse=True)\n    LOGGER.info([i.stats.reads_raw for i in subsamples])\n\n    ## send samples to cutadapt filtering\n    if \"pair\" in data.paramsdict[\"datatype\"]:\n        for sample in subsamples:\n            rawedits[sample.name] = lbview.apply(cutadaptit_pairs, *(data, sample))\n    else:\n        for sample in subsamples:\n            rawedits[sample.name] = lbview.apply(cutadaptit_single, *(data, sample))\n\n    ## wait for all to finish\n    while 1:\n        finished = sum([i.ready() for i in rawedits.values()])\n        elapsed = datetime.timedelta(seconds=int(time.time()-start))\n        progressbar(len(rawedits), finished, printstr.format(elapsed), spacer=data._spacer)\n        time.sleep(0.1)\n        if finished == len(rawedits):\n            print(\"\")\n            break\n\n    ## collect results, report failures, and store stats. sname = sample.name\n    ## (`async` is a reserved word in Python 3.7+, so it cannot be a variable name)\n    for sname in rawedits:\n        if rawedits[sname].successful():\n            res = rawedits[sname].result()\n\n            ## if single cleanup is easy\n            if \"pair\" not in data.paramsdict[\"datatype\"]:\n                parse_single_results(data, data.samples[sname], res)\n            else:\n                parse_pair_results(data, data.samples[sname], res)\n        else:\n            print(\"  found an error in step2; see ipyrad_log.txt\")\n            LOGGER.error(\"error in run_cutadapt(): %s\", rawedits[sname].exception())"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef choose_samples(samples, force):\n\n    ## hold samples that pass\n    subsamples = []\n    ## filter the samples again\n    if not force:\n        for sample in samples:\n            if sample.stats.state >= 2:\n                print(\"\"\"\\\n    Skipping Sample {}; Already filtered. Use force argument to overwrite.\\\n    \"\"\".format(sample.name))\n            elif not sample.stats.reads_raw:\n                print(\"\"\"\\\n    Skipping Sample {}; No reads found in file {}\\\n    \"\"\".format(sample.name, sample.files.fastqs))\n            else:\n                subsamples.append(sample)\n\n    else:\n        for sample in samples:\n            if not sample.stats.reads_raw:\n                print(\"\"\"\\\n    Skipping Sample {}; No reads found in file {}\\\n    \"\"\".format(sample.name, sample.files.fastqs))\n            else:\n                subsamples.append(sample)\n    return subsamples", "response": "Select the samples to process in step 2. Samples with no raw reads are always skipped; unless force is set, samples that were already filtered (state >= 2) are skipped as well. Returns the list of samples that pass."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nconcatenate multiple fastq files into one.", "response": "def concat_multiple_inputs(data, sample):\n    \"\"\" \n    If multiple fastq files were appended into the list of fastqs for samples\n    then we merge them here before proceeding. \n    \"\"\"\n\n    ## if more than one tuple in fastq list\n    if len(sample.files.fastqs) > 1:\n        ## create a cat command to append them all (doesn't matter if they \n        ## are gzipped, cat still works). Grab index 0 of tuples for R1s.\n        cmd1 = [\"cat\"] + [i[0] for i in sample.files.fastqs]\n\n        isgzip = \".gz\"\n        if not sample.files.fastqs[0][0].endswith(\".gz\"):\n            isgzip = \"\"\n\n        ## write to new concat handle\n        conc1 = os.path.join(data.dirs.edits, sample.name+\"_R1_concat.fq{}\".format(isgzip))\n        with open(conc1, 'w') as cout1:\n            proc1 = sps.Popen(cmd1, stderr=sps.STDOUT, stdout=cout1, close_fds=True)\n            res1 = proc1.communicate()[0]\n        if proc1.returncode:\n            raise IPyradWarningExit(\"error in: {}, {}\".format(cmd1, res1))\n\n        ## Only set conc2 if R2 actually exists\n        conc2 = 0\n        if \"pair\" in data.paramsdict[\"datatype\"]:\n            cmd2 = [\"cat\"] + [i[1] for i in sample.files.fastqs]\n            conc2 = os.path.join(data.dirs.edits, sample.name+\"_R2_concat.fq{}\".format(isgzip))\n            with open(conc2, 'w') as cout2:\n                proc2 = sps.Popen(cmd2, stderr=sps.STDOUT, stdout=cout2, close_fds=True)\n                res2 = proc2.communicate()[0]\n            if proc2.returncode:\n                raise IPyradWarningExit(\"Error concatenating fastq files. Make sure all \"\\\n                    + \"these files exist: {}\\nError message: {}\".format(cmd2, proc2.returncode))\n\n        ## store new file handles\n        sample.files.concat = [(conc1, conc2)]\n    return sample.files.concat"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef make( data, samples ):\n    invcffile   =  os.path.join( data.dirs.consens, data.name+\".vcf\" )\n    outlocifile  =  os.path.join( data.dirs.outfiles, data.name+\".loci\" )\n\n    importvcf( invcffile, outlocifile )", "response": "Convert the step 6 VCF output to .loci format by passing the input .vcf path and the output .loci path to importvcf."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nimporting a vcf file into loci format.", "response": "def importvcf( vcffile, locifile ):\n    \"\"\" Function for importing a vcf file into loci format. Arguments\n        are the input vcffile and the loci file to write out. \"\"\"\n\n    try:\n        ## Get names of all individuals in the vcf\n        with open( vcffile, 'r' ) as invcf:\n            for line in invcf:\n                if line.split()[0] == \"#CHROM\":\n                    ## This is maybe a little clever. The names in the vcf are everything after\n                    ## the \"FORMAT\" column, so find that index, then slice everything after it.\n                    names_col = line.split().index( \"FORMAT\" ) + 1\n                    names = line.split()[ names_col:]\n                    LOGGER.debug( \"Got names - %s\", names )\n                    break\n\n            print( \"wat\" )\n        ## Get the column to start reading at\n    except Exception:\n        print( \"wat\" )"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef get_targets(ipyclient):\n    ## fill hosts with async[gethostname] \n    hosts = []\n    for eid in ipyclient.ids:\n        engine = ipyclient[eid]\n        if not engine.outstanding:\n            hosts.append(engine.apply(socket.gethostname))\n\n    ## capture results of asyncs\n    hosts = [i.get() for i in hosts]\n    hostset = set(hosts)\n    hostzip = zip(hosts, ipyclient.ids)\n    hostdict = {host: [i[1] for i in hostzip if i[0] == host] for host in hostset}\n    targets = list(itertools.chain(*[hostdict[i][:2] for i in hostdict]))\n\n    ## return first two engines from each host\n    return targets", "response": "A function to find 2 engines per hostname on the ipyclient."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\ncomputing stats for tree features and NHX tree features", "response": "def compute_tree_stats(self, ipyclient):\n    \"\"\" \n    compute stats for stats file and NHX tree features\n    \"\"\"\n\n    ## get name indices\n    names = self.samples\n\n    ## get majority rule consensus tree of weighted Q bootstrap trees\n    if self.params.nboots:\n        \n        ## Tree object\n        fulltre = ete3.Tree(self.trees.tree, format=0)\n        fulltre.unroot()\n\n        ## only grab as many boots as the last option said was max\n        with open(self.trees.boots, 'r') as inboots:\n            bb = [ete3.Tree(i.strip(), format=0) for i in inboots.readlines()]\n            wboots = [fulltre] + bb[-self.params.nboots:]\n        \n        ## infer consensus tree and write to file\n        wctre, wcounts = consensus_tree(wboots, names=names)\n        self.trees.cons = os.path.join(self.dirs, self.name + \".cons\")\n        with open(self.trees.cons, 'w') as ocons:\n            ocons.write(wctre.write(format=0))\n    else:\n        wctre = ete3.Tree(self.trees.tree, format=0)\n        wctre.unroot()\n\n    ## build stats file and write trees\n    self.trees.nhx = os.path.join(self.dirs, self.name + \".nhx\")\n    with open(self.files.stats, 'w') as ostats:\n\n        ## print Tetrad info\n        #ostats.write(STATS_STRING.format(**self.stats))\n\n        ## print bootstrap splits\n        if self.params.nboots:\n            ostats.write(\"## splits observed in {} trees\\n\".format(len(wboots)))\n            for i, j in enumerate(self.samples):\n                ostats.write(\"{:<3} {}\\n\".format(i, j))\n            ostats.write(\"\\n\")\n            for split, freq in wcounts:\n                if split.count('1') > 1:\n                    ostats.write(\"{}   {:.2f}\\n\".format(split, round(freq, 2)))\n            ostats.write(\"\\n\")\n\n    ## parallelized this function because it can be 
slogging\n    lbview = ipyclient.load_balanced_view()\n    \n    ## store results in dicts\n    qtots = {}\n    qsamp = {}\n    tots = sum(1 for i in wctre.iter_leaves())\n    totn = set(wctre.get_leaf_names())\n\n    ## iterate over node traversal. \n    for node in wctre.traverse():\n        ## this is slow, needs to look at every sampled quartet\n        ## so we send it be processed on an engine\n        qtots[node] = lbview.apply(_get_total, *(tots, node))\n        qsamp[node] = lbview.apply(_get_sampled, *(self, totn, node))\n\n    ## wait for jobs to finish\n    ipyclient.wait()\n\n    ## put results into tree\n    for node in wctre.traverse():\n        ## this is fast, just calcs n_choose_k\n        total = qtots[node].result()\n        sampled = qsamp[node].result()\n        ## store the results to the tree            \n        node.add_feature(\"quartets_total\", total)\n        node.add_feature(\"quartets_sampled\", sampled)\n    features = [\"quartets_total\", \"quartets_sampled\"]\n\n    ## return as NHX format with extra info\n    with open(self.trees.nhx, 'w') as outtre:\n        outtre.write(wctre.write(format=0, features=features))"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef random_product(iter1, iter2):\n    pool1 = tuple(iter1)\n    pool2 = tuple(iter2)\n    ind1 = random.sample(pool1, 2)\n    ind2 = random.sample(pool2, 2)\n    return tuple(ind1+ind2)", "response": "Random sampler for the equal_splits function: draws two items without replacement from each of the two iterables and returns the four items as a tuple."}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\ngets the number of quartets as n-choose-k", "response": "def n_choose_k(n, k):\n    \"\"\" get the number of quartets as n-choose-k. This is used\n    in equal splits to decide whether a split should be exhaustively sampled\n    or randomly sampled. Edges near tips can be exhaustive while highly nested\n    edges probably have too many quartets\n    \"\"\"\n    return int(reduce(MUL, (Fraction(n-i, i+1) for i in range(k)), 1))"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef count_snps(mat):\n\n    ## get [aabb, baba, abba, aaab] \n    snps = np.zeros(4, dtype=np.uint32)\n\n    ## get concordant (aabb) pis sites\n    snps[0] = np.uint32(\\\n           mat[0, 5] + mat[0, 10] + mat[0, 15] + \\\n           mat[5, 0] + mat[5, 10] + mat[5, 15] + \\\n           mat[10, 0] + mat[10, 5] + mat[10, 15] + \\\n           mat[15, 0] + mat[15, 5] + mat[15, 10])\n\n    ## get discordant (baba) sites\n    for i in range(16):\n        if i % 5:\n            snps[1] += mat[i, i]\n    \n    ## get discordant (abba) sites\n    snps[2] = mat[1, 4] + mat[2, 8] + mat[3, 12] +\\\n              mat[4, 1] + mat[6, 9] + mat[7, 13] +\\\n              mat[8, 2] + mat[9, 6] + mat[11, 14] +\\\n              mat[12, 3] + mat[13, 7] + mat[14, 11]\n\n    ## get autapomorphy sites\n    snps[3] = (mat.sum() - np.diag(mat).sum()) - snps[2]\n\n    return snps", "response": "Count site patterns in a 16x16 quartet count matrix and return a length-4 uint32 numpy array ordered as [aabb, baba, abba, aaab]: concordant sites, the two discordant patterns, and autapomorphies."}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nremoves ncolumns from snparray prior to matrix calculation, and subsamples 'linked' snps (those from the same RAD locus) such that for these four samples only 1 SNP per locus is kept. This information comes from the 'map' array (map file).", "response": "def subsample_snps_map(seqchunk, nmask, maparr):\n    \"\"\" \n    removes ncolumns from snparray prior to matrix calculation, and \n    subsamples 'linked' snps (those from the same RAD locus) such that\n    for these four samples only 1 SNP per locus is kept. This information\n    comes from the 'map' array (map file). \n    \"\"\"\n    ## mask columns that contain Ns\n    rmask = np.zeros(seqchunk.shape[1], dtype=np.bool_)\n\n    ## apply mask to the mapfile\n    last_loc = -1\n    for idx in range(maparr.shape[0]):\n        if maparr[idx] != last_loc:\n            if not nmask[idx]:\n                rmask[idx] = True\n            last_loc = maparr[idx]\n    \n    ## apply mask\n    #newarr = seqchunk[:, rmask]\n    \n    ## return the boolean column mask (True = column kept)\n    return rmask"}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nconverts a 4 x N sequence array chunk into three 16x16 site-pattern count matrices", "response": "def chunk_to_matrices(narr, mapcol, nmask):\n    \"\"\" \n    numba compiled code to get matrix fast.\n    narr is a 4 x N seq matrix converted to np.int8\n    I convert the numbers for ATGC into their respective index for the MAT\n    matrix, and leave all others as high numbers, i.e., -==45, N==78. \n    \"\"\"\n\n    ## get seq alignment and create an empty array for filling\n    mats = np.zeros((3, 16, 16), dtype=np.uint32)\n\n    ## replace ints with small ints that index their place in the \n    ## 16x16. This no longer checks for big ints to exclude, so resolve=True\n    ## is now the default, TODO. \n    last_loc = -1\n    for idx in range(mapcol.shape[0]):\n        if not nmask[idx]:\n            if not mapcol[idx] == last_loc:\n                i = narr[:, idx]\n                mats[0, (4*i[0])+i[1], (4*i[2])+i[3]] += 1      \n                last_loc = mapcol[idx]\n\n    ## fill the alternates\n    x = np.uint8(0)\n    for y in np.array([0, 4, 8, 12], dtype=np.uint8):\n        for z in np.array([0, 4, 8, 12], dtype=np.uint8):\n            mats[1, y:y+np.uint8(4), z:z+np.uint8(4)] = mats[0, x].reshape(4, 4)\n            mats[2, y:y+np.uint8(4), z:z+np.uint8(4)] = mats[0, x].reshape(4, 4).T\n            x += np.uint8(1)\n\n    return mats"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef calculate(seqnon, mapcol, nmask, tests):\n\n    ## create empty matrices\n    #LOGGER.info(\"tests[0] %s\", tests[0])\n    #LOGGER.info('seqnon[[tests[0]]] %s', seqnon[[tests[0]]])\n    mats = chunk_to_matrices(seqnon, mapcol, nmask)\n\n    ## empty svdscores for each arrangement of seqchunk\n    svds = np.zeros((3, 16), dtype=np.float64)\n    qscores = np.zeros(3, dtype=np.float64)\n    ranks = np.zeros(3, dtype=np.float64)\n\n    for test in range(3):\n        ## get svd scores\n        svds[test] = np.linalg.svd(mats[test].astype(np.float64))[1]\n        ranks[test] = np.linalg.matrix_rank(mats[test].astype(np.float64))\n\n    ## get minrank, or 11\n    minrank = int(min(11, ranks.min()))\n    for test in range(3):\n        qscores[test] = np.sqrt(np.sum(svds[test, minrank:]**2))\n\n    ## sort to find the best qorder\n    best = np.where(qscores == qscores.min())[0]\n    #best = qscores[qscores == qscores.min()][0]\n    bidx = tests[best][0]\n    qsnps = count_snps(mats[best][0])\n\n    return bidx, qsnps", "response": "Build the three 16x16 site-pattern matrices for a quartet, score each possible resolution from the singular values of its matrix, and return the index of the lowest-scoring (best) resolution together with its SNP counts."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef nworker(data, smpchunk, tests):\n    \n    ## tell engines to limit threads\n    #numba.config.NUMBA_DEFAULT_NUM_THREADS = 1\n    \n    ## open the seqarray view, the modified array is in bootsarr\n    with h5py.File(data.database.input, 'r') as io5:\n        seqview = io5[\"bootsarr\"][:]\n        maparr = io5[\"bootsmap\"][:]\n\n    ## create an N-mask array of all seq cols (this isn't really too slow)\n    nall_mask = seqview[:] == 78\n\n    ## tried numba compiling everything below here, but was not faster\n    ## than making nmask w/ axis arg in numpy\n\n    ## get the input arrays ready\n    rquartets = np.zeros((smpchunk.shape[0], 4), dtype=np.uint16)\n    rweights = None\n    #rweights = np.ones(smpchunk.shape[0], dtype=np.float64)\n    rdstats = np.zeros((smpchunk.shape[0], 4), dtype=np.uint32)\n\n    #times = []\n    ## fill arrays with results using numba funcs\n    for idx in range(smpchunk.shape[0]):\n        ## get seqchunk for 4 samples (4, ncols) \n        sidx = smpchunk[idx]\n        seqchunk = seqview[sidx]\n\n        ## get N-containing columns in 4-array, and invariant sites.\n        nmask = np.any(nall_mask[sidx], axis=0)\n        nmask += np.all(seqchunk == seqchunk[0], axis=0)  ## <- do we need this?\n\n        ## get matrices if there are any shared SNPs\n        ## returns best-tree index, qscores, and qstats\n        #bidx, qscores, qstats = calculate(seqchunk, maparr[:, 0], nmask, tests)\n        bidx, qstats = calculate(seqchunk, maparr[:, 0], nmask, tests)\n        \n        ## get weights from the three scores sorted. \n        ## Only save to file if the quartet has information\n        rdstats[idx] = qstats \n        rquartets[idx] = smpchunk[idx][bidx]\n\n    return rquartets, rweights, rdstats", "response": "Worker function run on an engine. For each sampled quartet in the chunk it masks N-containing and invariant sites, calls calculate to infer the best quartet resolution, and returns the resolved quartets, the weights (currently None), and the per-quartet SNP statistics."}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nshuffle the columns of a sequence array", "response": "def shuffle_cols(seqarr, newarr, cols):\n    \"\"\" used in bootstrap resampling without a map file \"\"\"\n    for idx in range(cols.shape[0]):\n        newarr[:, idx] = seqarr[:, cols[idx]]\n    return newarr"}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\nreturns a seq array with RSKWYM randomly replaced with resolved bases", "response": "def resolve_ambigs(tmpseq):\n    \"\"\" returns a seq array with 'RSKWYM' randomly replaced with resolved bases\"\"\"\n    ## iterate over the bases 'RSKWYM': [82, 83, 75, 87, 89, 77]\n    for ambig in np.uint8([82, 83, 75, 87, 89, 77]):\n        ## get all sites with this ambig\n        idx, idy = np.where(tmpseq == ambig)\n        ## get the two resolutions of the ambig\n        res1, res2 = AMBIGS[ambig.view(\"S1\")]\n        ## randomly sample half those sites\n        halfmask = np.random.choice([True, False], idx.shape[0])\n        ## replace ambig bases with their resolutions\n        for i in range(halfmask.shape[0]):\n            if halfmask[i]:\n                tmpseq[idx[i], idy[i]] = np.array(res1).view(np.uint8)\n            else:\n                tmpseq[idx[i], idy[i]] = np.array(res2).view(np.uint8)\n    return tmpseq"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nget the span distance for each locus in original seqarray", "response": "def get_spans(maparr, spans):\n    \"\"\" get span distance for each locus in original seqarray \"\"\"\n    ## start at 0, finds change at 1-index of map file\n    bidx = 1\n    spans = np.zeros((maparr[-1, 0], 2), np.uint64)\n    ## read through maparr and record when locus id changes\n    for idx in range(1, maparr.shape[0]):\n        cur = maparr[idx, 0]\n        if cur != bidx:\n            spans[cur-2, 1] = idx\n            spans[cur-1, 0] = idx\n            bidx = cur\n    spans[-1, 1] = maparr[-1, -1]\n    return spans"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef get_shape(spans, loci):\n    width = 0\n    for idx in xrange(loci.shape[0]):\n        width += spans[loci[idx], 1] - spans[loci[idx], 0]\n    return width", "response": "get shape of new bootstrap resampled locus array"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef fill_boot(seqarr, newboot, newmap, spans, loci):\n    ## column index\n    cidx = 0\n  \n    ## resample each locus\n    for i in xrange(loci.shape[0]):\n        \n        ## grab a random locus's columns\n        x1 = spans[loci[i]][0]\n        x2 = spans[loci[i]][1]\n        cols = seqarr[:, x1:x2]\n\n        ## randomize columns within cols\n        cord = np.random.choice(cols.shape[1], cols.shape[1], replace=False)\n        rcols = cols[:, cord]\n        \n        ## fill bootarr with n columns from seqarr\n        ## the required length was already measured\n        newboot[:, cidx:cidx+cols.shape[1]] = rcols\n\n        ## fill bootmap with new map info\n        newmap[cidx: cidx+cols.shape[1], 0] = i+1\n        \n        ## advance column index\n        cidx += cols.shape[1]\n\n    ## return the concatenated cols\n    return newboot, newmap", "response": "Fill the bootstrap array with the columns of the resampled loci, shuffling column order within each locus, and record the new 1-indexed locus ids in the bootstrap map. Returns the filled array and map."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef consensus_tree(trees, names=None, cutoff=0.0):\n\n    ## find which clades occurred with freq > cutoff\n    namedict, clade_counts = _find_clades(trees, names=names)\n\n    ## filter out the < cutoff clades\n    fclade_counts = _filter_clades(clade_counts, cutoff)\n\n    ## build tree\n    consens_tree, _ = _build_trees(fclade_counts, namedict)\n    ## make sure no singleton nodes were left behind\n    return consens_tree, clade_counts", "response": "Build a consensus tree from a list of trees, keeping only clades whose frequency exceeds the cutoff. Returns the consensus tree and the unfiltered clade counts."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef refresh(self):\n\n        ## clear any existing results files\n        oldfiles = [self.files.qdump] + \\\n                    self.database.__dict__.values() + \\\n                    self.trees.__dict__.values()\n        for oldfile in oldfiles:\n            if oldfile:\n                if os.path.exists(oldfile):\n                    os.remove(oldfile)\n\n        ## store old ipcluster info\n        oldcluster = copy.deepcopy(self._ipcluster)\n\n        ## reinit the tetrad object data.\n        self.__init__(\n            name=self.name, \n            data=self.files.data, \n            mapfile=self.files.mapfile,\n            workdir=self.dirs,\n            method=self.params.method,\n            guidetreefile=self.files.guidetreefile,\n            resolve=self._resolve, \n            nboots=self.params.nboots, \n            nquartets=self.params.nquartets, \n            initarr=True, \n            quiet=True,\n            cli=self.kwargs.get(\"cli\")\n            )\n\n        ## retain the same ipcluster info\n        self._ipcluster = oldcluster", "response": "Remove any existing result files (quartet dump, databases, and trees) and re-initialize the tetrad object with its current parameters, retaining the existing ipcluster info."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\nparsing sample names from the sequence file", "response": "def _parse_names(self):\n        \"\"\" parse sample names from the sequence file\"\"\"\n        self.samples = []\n        with open(self.files.data, 'r') as infile:\n            ## skip the header line (ntax, nbp)\n            next(infile)\n            ## the first field of each remaining line is the sample name\n            for line in infile:\n                self.samples.append(line.split()[0])"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef _init_seqarray(self, quiet=False):\n\n        ## read in the data (seqfile)\n        try:\n            spath = open(self.files.data, 'r')\n        except IOError:\n            raise IPyradWarningExit(NO_SNP_FILE.format(self.files.data))\n        line = spath.readline().strip().split()\n        ntax = int(line[0])\n        nbp = int(line[1])\n\n        ## make a tmp seq array\n        if not quiet:\n            print(\"loading seq array [{} taxa x {} bp]\".format(ntax, nbp))        \n        tmpseq = np.zeros((ntax, nbp), dtype=np.uint8)\n    \n        ## create array storage for real seq and the tmp bootstrap seqarray\n        with h5py.File(self.database.input, 'w') as io5:\n            io5.create_dataset(\"seqarr\", (ntax, nbp), dtype=np.uint8)\n            io5.create_dataset(\"bootsarr\", (ntax, nbp), dtype=np.uint8)\n            io5.create_dataset(\"bootsmap\", (nbp, 2), dtype=np.uint32)\n\n            ## if there is a map file, load it into the bootsmap\n            if self.files.mapfile:\n                with open(self.files.mapfile, 'r') as inmap:\n                    ## parse the map file from txt and save as dataset\n                    maparr = np.genfromtxt(inmap, dtype=np.uint64)\n                    io5[\"bootsmap\"][:] = maparr[:, [0, 3]]\n\n                    ## parse the span info from maparr and save to dataset\n                    spans = np.zeros((maparr[-1, 0], 2), np.uint64)\n                    spans = get_spans(maparr, spans)\n                    io5.create_dataset(\"spans\", data=spans)\n                    if not quiet:\n                        print(\"max unlinked SNPs per quartet (nloci): {}\"\\\n                              .format(spans.shape[0]))\n            else:\n                io5[\"bootsmap\"][:, 0] = np.arange(io5[\"bootsmap\"].shape[0])\n\n            ## fill the tmp array from the input phy\n            for line, 
seq in enumerate(spath.readlines()):\n                tmpseq[line] = np.array(list(seq.split()[-1])).view(np.uint8)\n\n            ## convert '-' or '_' into 'N'\n            tmpseq[tmpseq == 45] = 78\n            tmpseq[tmpseq == 95] = 78            \n\n            ## save array to disk so it can be easily accessed by slicing\n            ## This unmodified array is used again later for sampling boots\n            io5[\"seqarr\"][:] = tmpseq\n\n            ## resolve ambiguous IUPAC codes\n            if self._resolve:\n                tmpseq = resolve_ambigs(tmpseq)\n\n            ## convert CATG bases to matrix indices\n            tmpseq[tmpseq == 65] = 0\n            tmpseq[tmpseq == 67] = 1\n            tmpseq[tmpseq == 71] = 2\n            tmpseq[tmpseq == 84] = 3\n\n            ## save modified array to disk            \n            io5[\"bootsarr\"][:] = tmpseq\n\n            ## memory cleanup\n            #del tmpseq\n\n            ## get initial array\n            LOGGER.info(\"original seqarr \\n %s\", io5[\"seqarr\"][:, :20])\n            LOGGER.info(\"original bootsarr \\n %s\", io5[\"bootsarr\"][:, :20])\n            LOGGER.info(\"original bootsmap \\n %s\", io5[\"bootsmap\"][:20, :])", "response": "Initialize the seqarray with the full data set and creates a bootstrapmap with the real data set."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\nstore the number of samples in a large array.", "response": "def _store_N_samples(self, ncpus):\n        \"\"\" \n        Find all quartets of samples and store in a large array\n        Create a chunk size for sampling from the array of quartets. \n        This should be relatively large so that we don't spend a lot of time \n        doing I/O, but small enough that jobs finish often for checkpointing.\n        \"\"\"\n        breaks = 2\n        if self.params.nquartets < 5000:\n            breaks = 1\n        if self.params.nquartets > 100000:\n            breaks = 4\n        if self.params.nquartets > 500000:\n            breaks = 8\n\n        ## chunk up the data\n        self._chunksize = (self.params.nquartets // (breaks * ncpus) + \\\n                          (self.params.nquartets % (breaks * ncpus)))\n        LOGGER.info(\"nquarts = %s, chunk = %s\", self.params.nquartets, self._chunksize)\n\n        ## 'samples' stores the indices of the quartet. 
\n        ## `quartets` stores the correct quartet in the order (1,2|3,4)\n        ## `weights` stores the weight of the quartet in 'quartets'\n        ## we gzip this for now, but check later if this has a big speed cost\n\n        ## create h5 OUT empty arrays\n        with h5py.File(self.database.output, 'w') as io5:\n            io5.create_dataset(\"quartets\", \n                               (self.params.nquartets, 4), \n                               dtype=np.uint16, \n                               chunks=(self._chunksize, 4))\n            io5.create_dataset(\"qstats\", \n                               (self.params.nquartets, 4), \n                               dtype=np.uint32, \n                               chunks=(self._chunksize, 4))\n            io5.create_group(\"qboots\")\n\n\n        ## append to h5 IN array (which also has seqarray) and fill it\n        with h5py.File(self.database.input, 'a') as io5:\n            ## create data sets\n            io5.create_dataset(\"samples\", \n                               (self.params.nquartets, 4), \n                               dtype=np.uint16, \n                               chunks=(self._chunksize, 4),\n                               compression='gzip')\n\n            ## populate array with all possible quartets. 
This allows us to \n            ## sample from the total, and also to continue from a checkpoint\n            qiter = itertools.combinations(xrange(len(self.samples)), 4)\n            i = 0\n\n            ## fill chunksize at a time for efficiency\n            while i < self.params.nquartets:\n                if self.params.method != \"all\":\n                    ## grab the next random 1000\n                    qiter = []\n                    while len(qiter) < min(self._chunksize, io5[\"samples\"].shape[0]):\n                        qiter.append(\n                            random_combination(range(len(self.samples)), 4))\n                    dat = np.array(qiter)\n                else:\n                    ## grab the next ordered chunksize\n                    dat = np.array(list(itertools.islice(qiter, self._chunksize)))\n\n                ## store to h5 \n                io5[\"samples\"][i:i+self._chunksize] = dat[:io5[\"samples\"].shape[0] - i]\n                i += self._chunksize"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nstore the equal quartets in the database.", "response": "def _store_equal_samples(self, ncpus):\n        \"\"\" \n        sample quartets evenly across splits of the starting tree, and fills\n        in remaining samples with random quartet samples. Uses a hash dict to \n        not sample the same quartet twice, so for very large trees this can \n        take a few minutes to find millions of possible quartet samples. \n        \"\"\"\n        \n        ## choose chunker for h5 arr\n        breaks = 2\n        if self.params.nquartets < 5000:\n            breaks = 1\n        if self.params.nquartets > 100000:\n            breaks = 4\n        if self.params.nquartets > 500000:\n            breaks = 8\n\n        self._chunksize = (self.params.nquartets // (breaks * ncpus) + \\\n                         (self.params.nquartets % (breaks * ncpus)))\n        LOGGER.info(\"nquarts = %s, chunk = %s\", self.params.nquartets, self._chunksize)\n\n        ## create h5 OUT empty arrays\n        with h5py.File(self.database.output, 'w') as io5:\n            io5.create_dataset(\"quartets\", \n                               (self.params.nquartets, 4), \n                               dtype=np.uint16, \n                               chunks=(self._chunksize, 4))\n            io5.create_dataset(\"qstats\", \n                               (self.params.nquartets, 4), \n                               dtype=np.uint32, \n                               chunks=(self._chunksize, 4))\n            io5.create_group(\"qboots\")\n\n        ## get starting tree, unroot, randomly resolve, ladderize\n        tre = ete3.Tree(self.files.guidetreefile, format=0)\n        #tre = toytree.tree(self.files.guidetreefile, format=0)\n        tre.tree.unroot()\n        tre.tree.resolve_polytomy(recursive=True)\n        tre.tree.ladderize()\n\n        ## randomly sample all splits of tree and convert tip names to indices\n        
splits = [([self.samples.index(z.name) for z in i], \n                   [self.samples.index(z.name) for z in j]) \\\n                   for (i, j) in tre.get_edges()]\n    \n        ## only keep internal splits (no single tips edges)\n        ## this seemed to cause problems with unsampled tips\n        splits = [i for i in splits if all([len(j) > 1 for j in i])]\n\n        ## turn each into an iterable split sampler\n        ## if the nquartets for that split is small, then sample all of them\n        ## if it is big, then make it a random sampler from that split\n        qiters = []\n\n        ## how many min quartets are we gonna sample from each split?\n        squarts = self.params.nquartets // len(splits)\n\n        ## how many iterators can be sampled to saturation?\n        nsaturation = 0\n\n        for split in splits:\n            ## if small number at this split then sample all possible sets\n            ## we will exhaust this quickly and then switch to random for \n            ## the larger splits.\n            if n_choose_k(len(split[0]), 2) * n_choose_k(len(split[1]), 2) < squarts*2:\n                qiter = (i+j for (i, j) in itertools.product(\n                            itertools.combinations(split[0], 2), \n                            itertools.combinations(split[1], 2)))\n                nsaturation += 1\n\n            ## else create random sampler across that split, this is slower\n            ## because it can propose the same split repeatedly and so we \n            ## have to check it against the 'sampled' set.\n            else:\n                qiter = (random_product(split[0], split[1]) for _ \\\n                         in xrange(self.params.nquartets))\n                nsaturation += 1\n\n            ## store all iterators into a list\n            qiters.append(qiter)\n\n        #for split in splits:\n        #    print(split)\n\n        ## make qiters infinitely cycling\n        qiters = itertools.cycle(qiters)\n        cycler = 
itertools.cycle(range(len(splits)))\n\n        ## store visiting quartets\n        sampled = set()\n\n        ## iterate over qiters sampling from each, if one runs out, keep \n        ## sampling from remaining qiters. Keep going until samples is filled\n        with h5py.File(self.database.input, 'a') as io5:\n            ## create data sets\n            io5.create_dataset(\"samples\", \n                               (self.params.nquartets, 4), \n                               dtype=np.uint16, \n                               chunks=(self._chunksize, 4),\n                               compression='gzip')\n\n            ## fill chunksize at a time for efficiency\n            i = 0\n            empty = set()\n            edge_targeted = 0\n            random_target = 0\n\n            ## keep filling quartets until nquartets are sampled\n            while i < self.params.nquartets:\n                qdat = []\n                ## keep filling this chunk until its full\n                while len(qdat) < self._chunksize:\n                    ## grab the next iterator\n                    qiter = qiters.next()\n                    cycle = cycler.next()\n\n                    ## sample from iterator\n                    try:\n                        qrtsamp = qiter.next()\n                        if tuple(qrtsamp) not in sampled:\n                            qdat.append(qrtsamp)\n                            sampled.add(qrtsamp)\n                            edge_targeted += 1\n                        #else:\n                        #    print('repeat')\n                        \n                    ## unless iterator is empty, then skip it\n                    except StopIteration:\n                        empty.add(cycle)\n\n                    ## break when all edge samplers are empty\n                    if len(empty) == nsaturation:\n                        break\n\n                ## if array is not full then add random samples\n                while len(qdat) < 
self._chunksize:\n                    qrtsamp = random_combination(range(len(self.samples)), 4)\n                    if tuple(qrtsamp) not in sampled:\n                        qdat.append(qrtsamp)\n                        sampled.add(qrtsamp)\n                        random_target += 1\n\n                ## stick chunk into h5 array\n                dat = np.array(qdat, dtype=np.uint16)\n                io5[\"samples\"][i:i+self._chunksize] = dat[:io5[\"samples\"].shape[0] - i]\n                i += self._chunksize\n\n            print(\"  equal sampling: {} edge quartets, {} random quartets \"\\\n                  .format(edge_targeted, random_target))"}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nruns QMC and saves the result in self.trees.tree and self.trees.boots.", "response": "def _run_qmc(self, boot):\n        \"\"\" runs quartet max-cut on a quartets file \"\"\"\n\n        ## convert to txt file for wQMC\n        self._tmp = os.path.join(self.dirs, \".tmpwtre\")\n        cmd = [ip.bins.qmc, \"qrtt=\"+self.files.qdump, \"otre=\"+self._tmp] \n\n        ## run them\n        proc = subprocess.Popen(cmd, stderr=subprocess.STDOUT, stdout=subprocess.PIPE)\n        res = proc.communicate()\n        if proc.returncode:\n            #LOGGER.error(\"Error in QMC: \\n({}).\".format(res))\n            LOGGER.error(res)\n            ## stderr is redirected to stdout, so the message is in res[0]\n            raise IPyradWarningExit(res[0])\n\n        ## read in the tmp files since qmc does not pipe\n        with open(self._tmp) as intree:\n            ## convert int names back to str names renamer returns a newick str\n            #tmp = toytree.tree(intree.read().strip())\n            tmp = ete3.Tree(intree.read().strip())\n            tmpwtre = self._renamer(tmp)#.tree)\n\n        ## save the tree\n        if boot:\n            self.trees.boots = os.path.join(self.dirs, self.name+\".boots\")\n            with open(self.trees.boots, 'a') as outboot:\n                outboot.write(tmpwtre+\"\\n\")\n        else:\n            self.trees.tree = os.path.join(self.dirs, self.name+\".tree\")\n            with open(self.trees.tree, 'w') as outtree:\n                outtree.write(tmpwtre)\n\n        ## save JSON file checkpoint\n        self._save()"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef _dump_qmc(self):\n        ## open the h5 database\n        io5 = h5py.File(self.database.output, 'r')\n\n        ## create an output file for writing\n        self.files.qdump = os.path.join(self.dirs, self.name+\".quartets.txt\")\n        LOGGER.info(\"qdump file %s\", self.files.qdump)\n        outfile = open(self.files.qdump, 'w')\n\n        ## todo: should pull quarts order in randomly? or doesn't matter?\n        for idx in xrange(0, self.params.nquartets, self._chunksize):\n            ## get mask of zero weight quartets\n            #mask = io5[\"weights\"][idx:idx+self.chunksize] != 0\n            #weight = io5[\"weights\"][idx:idx+self.chunksize][mask]\n            #LOGGER.info(\"exluded = %s, mask shape %s\", \n            #            self._chunksize - mask.shape[0], mask.shape)\n            #LOGGER.info('q shape %s', io5[\"quartets\"][idx:idx+self._chunksize].shape)\n            masked_quartets = io5[\"quartets\"][idx:idx+self._chunksize, :]#[mask, :]\n            quarts = [list(j) for j in masked_quartets]\n\n            ## format and print\n            #chunk = [\"{},{}|{},{}:{}\".format(*i+[j]) for i, j \\\n            #                                        in zip(quarts, weight)]\n            chunk = [\"{},{}|{},{}\".format(*i) for i in quarts]\n            outfile.write(\"\\n\".join(chunk)+\"\\n\")\n\n\n        ## close output file and h5 database\n        outfile.close()\n        io5.close()", "response": "Writes the inferred quartets from the HDF5 output database to a text file, one chunk at a time, in the 'a,b|c,d' format read by QMC. Weights are not written (that code is commented out)."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef _renamer(self, tre):\n        ## get the tre with numbered tree tip labels\n        names = tre.get_leaves()\n\n        ## replace numbered names with snames\n        for name in names:\n            name.name = self.samples[int(name.name)]\n\n        ## return with only topology and leaf labels\n        return tre.write(format=9)", "response": "renames newick from numbers to sample names"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef _finalize_stats(self, ipyclient):\n\n        ## print stats file location:\n        #print(STATSOUT.format(opr(self.files.stats)))\n\n        ## print finished tree information ---------------------\n        print(FINALTREES.format(opr(self.trees.tree)))\n\n        ## print bootstrap information --------------------------\n        if self.params.nboots:\n            ## get consensus, map values to tree edges, record stats file\n            self._compute_tree_stats(ipyclient)\n            ## print bootstrap info\n            print(BOOTTREES.format(opr(self.trees.cons), opr(self.trees.boots))) \n\n        ## print the ASCII tree only if its small\n        if len(self.samples) < 20:\n            if self.params.nboots:\n                wctre = ete3.Tree(self.trees.cons, format=0)\n                wctre.ladderize()\n                print(wctre.get_ascii(show_internal=True, \n                                      attributes=[\"dist\", \"name\"]))\n                print(\"\")\n            else:\n                qtre = ete3.Tree(self.trees.tree, format=0)\n                qtre.ladderize()\n                #qtre = toytree.tree(self.trees.tree, format=0)\n                #qtre.tree.unroot()\n                print(qtre.get_ascii())\n                print(\"\")\n\n        ## print PDF filename & tips -----------------------------\n        docslink = \"https://toytree.readthedocs.io/\"    \n        citelink = \"https://ipyrad.readthedocs.io/tetrad.html\"\n        print(LINKS.format(docslink, citelink))", "response": "Print the location of the final tree file, compute consensus tree statistics and print the bootstrap file locations if bootstraps were run, print an ASCII tree when there are fewer than 20 samples, and print documentation and citation links."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\nsave a JSON file representation of the Tetrad Class for checkpoint", "response": "def _save(self):\n        \"\"\" save a JSON file representation of Tetrad Class for checkpoint\"\"\"\n\n        ## save each attribute as dict\n        fulldict = copy.deepcopy(self.__dict__)\n        for i, j in fulldict.items():\n            if isinstance(j, Params):\n                fulldict[i] = j.__dict__\n        fulldumps = json.dumps(fulldict,\n                               sort_keys=False, \n                               indent=4, \n                               separators=(\",\", \":\"),\n                               )\n\n        ## save to file, make dir if it wasn't made earlier\n        assemblypath = os.path.join(self.dirs, self.name+\".tet.json\")\n        if not os.path.exists(self.dirs):\n            os.mkdir(self.dirs)\n    \n        ## protect save from interruption\n        done = 0\n        while not done:\n            try:\n                with open(assemblypath, 'w') as jout:\n                    jout.write(fulldumps)\n                done = 1\n            except (KeyboardInterrupt, SystemExit): \n                print('.')\n                continue"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\ninsert the results from workers into hdf5 array", "response": "def _insert_to_array(self, start, results):\n        \"\"\" inputs results from workers into hdf5 array \"\"\"\n        qrts, wgts, qsts = results\n        #qrts, wgts = results\n        #print(qrts)\n\n        with h5py.File(self.database.output, 'r+') as out:\n            chunk = self._chunksize\n            out['quartets'][start:start+chunk] = qrts\n            ##out['weights'][start:start+chunk] = wgts\n\n            ## entered as 0-indexed !\n            if self.checkpoint.boots:\n                key = \"qboots/b{}\".format(self.checkpoint.boots-1)\n                out[key][start:start+chunk] = qsts\n            else:\n                out[\"qstats\"][start:start+chunk] = qsts"}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\nruns the quartet inference on a SNP alignment and distributes work across an ipyparallel cluster.", "response": "def run(self, force=0, verbose=2, ipyclient=None):\n        \"\"\" \n        Run quartet inference on a SNP alignment and distribute work\n        across an ipyparallel cluster (ipyclient). Unless passed an \n        ipyclient explicitly, it looks for a running ipcluster instance\n        running from the default (\"\") profile, and will raise an exception\n        if one is not found within a set time limit. If not using the default\n        profile then you can set \"profile\" as an argument to the tetrad object. \n        Parameter settings influencing the run (e.g., nquartets, method) should\n        be set on the tetrad Class object itself. \n\n        Parameters\n        ----------\n        force (bool):\n            Overwrite results for an object with this name if they exist.\n        verbose (int):\n            0=print nothing, 1=print progress bars, 2=print progress bars and\n            print cluster info. \n        ipyclient (ipyparallel.Client object):\n            Default is None (use running Default ipcluster instance). To use\n            a different ipcluster instance start a Client class object \n            and pass it in as an argument here. 
\n        \"\"\"\n\n        ## clear object results and data if force=True\n        if force:\n            self.refresh()\n\n        ## wrap everything in a try statement so we can ensure that it will\n        ## save if interrupted and we will clean up the \n        inst = None\n        try:\n            ## launch and connect to ipcluster instance if doesn't exist\n            if not ipyclient:\n                args = self._ipcluster.items() + [(\"spacer\", \"\")] \n                ipyclient = ip.core.parallel.get_client(**dict(args)) \n\n            ## print a message about the cluster status\n            if verbose == 2:\n                ip.cluster_info(ipyclient)\n\n            ## grab 2 engines from each host (2 multi-thread jobs per host)\n            ## skips over engines that are busy running something else to avoid\n            ## blocking if user is sharing an ipcluster.\n            targets = get_targets(ipyclient)\n            lbview = ipyclient.load_balanced_view(targets=targets)\n\n            ## store ipyclient pids to the ipcluster instance so we can \n            ## hard-kill them later. 
\n            self._ipcluster[\"pids\"] = ipyclient[:].apply(os.getpid).get_dict()\n\n            ## get or init quartet sampling ---------------------------\n            ## if load=True then chunksize will exist and this will skip\n            if not self._chunksize:\n                #self.nquartets = n_choose_k(len(self.samples), 4)\n                ## store N sampled quartets into the h5 array\n                if self.params.method != 'equal':\n                    self._store_N_samples(ncpus=len(lbview))\n                else:\n                    self._store_equal_samples(ncpus=len(lbview))\n\n            ## calculate invariants for the full array ----------------\n            start = time.time()            \n            if not self.trees.tree:\n                if verbose:\n                    print(\"inferring {} induced quartet trees\".format(self.params.nquartets))\n                self._inference(start, lbview, quiet=verbose == 0)\n                if verbose:\n                    print(\"\")\n            else:\n                if verbose:\n                    print(\"initial tree already inferred\")\n\n            ## calculate for bootstraps -------------------------------            \n            start = time.time()\n            if self.params.nboots:\n                if self.checkpoint.boots == self.params.nboots:\n                    if verbose:\n                        print(\"{} bootstrap trees already inferred\".format(self.params.nboots))\n                else:\n                    while self.checkpoint.boots < self.params.nboots:\n                        ## resample bootsstrap seqarray\n                        if self.files.mapfile:\n                            self._sample_bootseq_array_map()\n                        else:\n                            self._sample_bootseq_array() \n\n                        ## start boot inference, (1-indexed !!!)\n                        self.checkpoint.boots += 1\n                        self._inference(start, 
lbview, quiet=verbose == 0)\n                if verbose:\n                    print(\"\")\n\n            ## write outputs with bootstraps ---------------------------\n            self.files.stats = os.path.join(self.dirs, self.name+\"_stats.txt\")\n            if not self.kwargs.get(\"cli\"):\n                self._compute_tree_stats(ipyclient)\n            else:\n                self._finalize_stats(ipyclient)\n\n\n        ## handle exceptions so they will be raised after we clean up below\n        except KeyboardInterrupt as inst:\n            LOGGER.info(\"assembly interrupted by user.\")\n            print(\"\\nKeyboard Interrupt by user. Cleaning up...\")\n\n        except IPyradWarningExit as inst:\n            LOGGER.info(\"IPyradWarningExit: %s\", inst)\n            print(inst)\n\n        except Exception as inst:\n            LOGGER.info(\"caught an unknown exception %s\", inst)\n            print(\"\\n  Exception found: {}\".format(inst))\n\n        ## close client when done or interrupted\n        finally:\n            try:\n                ## save the Assembly\n                self._save()                \n                \n                ## can't close client if it was never open\n                if ipyclient:\n\n                    ## send SIGINT (2) to all engines\n                    ipyclient.abort()\n                    LOGGER.info(\"what %s\", self._ipcluster[\"pids\"])\n                    for engine_id, pid in self._ipcluster[\"pids\"].items():\n                        LOGGER.info(\"eid %s\", engine_id)\n                        LOGGER.info(\"pid %s\", pid)\n                        LOGGER.info(\"queue %s\", ipyclient.queue_status()[engine_id][\"queue\"])\n                        if ipyclient.queue_status()[engine_id][\"queue\"]:\n                            LOGGER.info('interrupting engine {} w/ SIGINT to {}'\\\n                                        .format(engine_id, pid))\n                            os.kill(pid, 2)\n                    
time.sleep(1)\n\n                    ## if CLI, stop jobs and shutdown\n                    if 'ipyrad-cli' in self._ipcluster[\"cluster_id\"]:\n                        LOGGER.info(\"  shutting down engines\")\n                        ipyclient.shutdown(hub=True, block=False)\n                        ipyclient.close()\n                        LOGGER.info(\"  finished shutdown\")\n                    else:\n                        if not ipyclient.outstanding:\n                            ipyclient.purge_everything()\n                        else:\n                            ## nanny: kill everything, something bad happened\n                            ipyclient.shutdown(hub=True, block=False)\n                            ipyclient.close()\n                            print(\"\\nwarning: ipcluster shutdown and must be restarted\")\n                \n                ## reraise the error now that we're cleaned up\n                if inst:\n                    raise inst\n            ## if exception is close and save, print and ignore\n            except Exception as inst2:\n                print(\"warning: error during shutdown:\\n{}\".format(inst2))\n                LOGGER.error(\"shutdown warning: %s\", inst2)"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef run(data, samples, force, ipyclient):\n\n    ## prepare dirs\n    data.dirs.outfiles = os.path.join(data.dirs.project, data.name+\"_outfiles\")\n    if not os.path.exists(data.dirs.outfiles):\n        os.mkdir(data.dirs.outfiles)\n\n    ## make the snps/filters data base, fills the dups and inds filters\n    ## and fills the splits locations\n    data.database = os.path.join(data.dirs.outfiles, data.name+\".hdf5\")\n    init_arrays(data)\n\n    ## Apply filters to supercatg and superhdf5 with selected samples\n    ## and fill the filters and edge arrays.\n    filter_all_clusters(data, samples, ipyclient)\n\n    ## Everything needed is in the now filled h5 database. Filters were applied\n    ## with 'samples' taken into account. Now we create the loci file (default)\n    ## output and build a stats file.\n    data.outfiles.loci = os.path.join(data.dirs.outfiles, data.name+\".loci\")\n    data.outfiles.alleles = os.path.join(data.dirs.outfiles, data.name+\".alleles.loci\")\n    make_loci_and_stats(data, samples, ipyclient)\n\n    ## OPTIONAL OUTPUTS:\n    output_formats = data.paramsdict[\"output_formats\"]\n\n    ## held separate from *output_formats cuz it's big and parallelized\n    if any([x in output_formats for x in [\"v\", \"V\"]]):\n        full = \"V\" in output_formats\n        try:\n            make_vcf(data, samples, ipyclient, full=full)\n        except IPyradWarningExit as inst:\n            ## Something fsck vcf build. Sometimes this is simply a memory\n            ## issue, so trap the exception and allow it to try building\n            ## the other output formats.\n            print(\"  Error building vcf. 
See ipyrad_log.txt for details.\")\n            LOGGER.error(inst)\n\n    ## make other array-based formats, recalcs keeps and arrays\n    make_outfiles(data, samples, output_formats, ipyclient)\n\n    ## print friendly message\n    shortpath = data.dirs.outfiles.replace(os.path.expanduser(\"~\"), \"~\")\n    print(\"{}Outfiles written to: {}\\n\".format(data._spacer, shortpath))", "response": "Run step 7: initialize the filter database, apply the locus filters for the selected samples, write the .loci and stats files, and build the optional output formats (including VCF)."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef make_stats(data, samples, samplecounts, locuscounts):\n\n    ## get meta info\n    with h5py.File(data.clust_database, 'r') as io5:\n        anames = io5[\"seqs\"].attrs[\"samples\"]\n        nloci = io5[\"seqs\"].shape[0]\n        optim = io5[\"seqs\"].attrs[\"chunksize\"][0]\n\n    ## open the out handle. This will have three data frames saved to it.\n    ## locus_filtering, sample_coverages, and snp_distributions\n    data.stats_files.s7 = os.path.join(data.dirs.outfiles,\n                                       data.name+\"_stats.txt\")\n    outstats = io.open(data.stats_files.s7, 'w', encoding=\"utf-8\")\n\n    ########################################################################\n    ## get stats for locus_filtering, use chunking.\n    filters = np.zeros(6, dtype=int)\n    passed = 0\n    start = 0\n\n    piscounts = Counter()\n    varcounts = Counter()\n    for i in range(200):\n        piscounts[i] = 0\n        varcounts[i] = 0\n\n\n    applied = pd.Series([0]*8,\n        name=\"applied_order\",\n        index=[\n        \"total_prefiltered_loci\",\n        \"filtered_by_rm_duplicates\",\n        \"filtered_by_max_indels\",\n        \"filtered_by_max_snps\",\n        \"filtered_by_max_shared_het\",\n        \"filtered_by_min_sample\",\n        \"filtered_by_max_alleles\",\n        \"total_filtered_loci\"])\n\n    ## load the h5 database\n    co5 = h5py.File(data.database, 'r')\n\n    while start < nloci:\n        hslice = [start, start+optim]\n        ## load each array\n        afilt = co5[\"filters\"][hslice[0]:hslice[1], ]\n        asnps = co5[\"snps\"][hslice[0]:hslice[1], ]\n\n        ## get subarray results from filter array\n        # max_indels, max_snps, max_hets, min_samps, bad_edges, max_alleles\n        filters += afilt.sum(axis=0)\n        applied[\"filtered_by_rm_duplicates\"] += afilt[:, 0].sum()\n        mask = afilt[:, 
0].astype(np.bool)\n        applied[\"filtered_by_max_indels\"] += afilt[~mask, 1].sum()\n        mask = afilt[:, 0:2].sum(axis=1).astype(np.bool)\n        applied[\"filtered_by_max_snps\"] += afilt[~mask, 2].sum()\n        mask = afilt[:, 0:3].sum(axis=1).astype(np.bool)\n        applied[\"filtered_by_max_shared_het\"] += afilt[~mask, 3].sum()\n        mask = afilt[:, 0:4].sum(axis=1).astype(np.bool)\n        applied[\"filtered_by_min_sample\"] += afilt[~mask, 4].sum()\n        mask = afilt[:, 0:5].sum(axis=1).astype(np.bool)\n        applied[\"filtered_by_max_alleles\"] += afilt[~mask, 5].sum()\n\n        passed += np.sum(afilt.sum(axis=1) == 0)\n\n        ## get filter to count snps for only passed loci\n        ## should we filter by all vars, or just by pis? doing all var now.\n        apply_filter = afilt.sum(axis=1).astype(np.bool)\n\n        ## get snps counts\n        snplocs = asnps[~apply_filter, :].sum(axis=1)\n        varlocs = snplocs.sum(axis=1)\n        varcounts.update(Counter(varlocs))\n        #snpcounts.update(Counter(snplocs[:, 0]))\n        piscounts.update(Counter(snplocs[:, 1]))\n\n        ## increase counter to advance through h5 database\n        start += optim\n\n    ## record filtering of loci from total to final\n    filtdat = pd.Series(np.concatenate([[nloci], filters, [passed]]),\n        name=\"total_filters\",\n        index=[\n        \"total_prefiltered_loci\",\n        \"filtered_by_rm_duplicates\",\n        \"filtered_by_max_indels\",\n        \"filtered_by_max_snps\",\n        \"filtered_by_max_shared_het\",\n        \"filtered_by_min_sample\",\n        \"filtered_by_max_alleles\",\n        \"total_filtered_loci\"])\n\n\n    retained = pd.Series([0]*8,\n        name=\"retained_loci\",\n        index=[\n        \"total_prefiltered_loci\",\n        \"filtered_by_rm_duplicates\",\n        \"filtered_by_max_indels\",\n        \"filtered_by_max_snps\",\n        \"filtered_by_max_shared_het\",\n        \"filtered_by_min_sample\",\n   
     \"filtered_by_max_alleles\",\n        \"total_filtered_loci\"])\n    retained[\"total_prefiltered_loci\"] = nloci\n    retained[\"filtered_by_rm_duplicates\"] = nloci - applied[\"filtered_by_rm_duplicates\"]\n    retained[\"filtered_by_max_indels\"] = retained[\"filtered_by_rm_duplicates\"] - applied[\"filtered_by_max_indels\"]\n    retained[\"filtered_by_max_snps\"] = retained[\"filtered_by_max_indels\"] - applied[\"filtered_by_max_snps\"]\n    retained[\"filtered_by_max_shared_het\"] = retained[\"filtered_by_max_snps\"] - applied[\"filtered_by_max_shared_het\"]\n    retained[\"filtered_by_min_sample\"] = retained[\"filtered_by_max_shared_het\"] - applied[\"filtered_by_min_sample\"]\n    retained[\"filtered_by_max_alleles\"] = retained[\"filtered_by_min_sample\"] - applied[\"filtered_by_max_alleles\"]\n    retained[\"total_filtered_loci\"] = passed\n\n\n    print(u\"\\n\\n## The number of loci caught by each filter.\"+\\\n          u\"\\n## ipyrad API location: [assembly].stats_dfs.s7_filters\\n\",\n          file=outstats)\n    data.stats_dfs.s7_filters = pd.DataFrame([filtdat, applied, retained]).T\n    data.stats_dfs.s7_filters.to_string(buf=outstats)\n\n\n    ########################################################################\n    ## make dataframe of sample_coverages\n    ## samplecounts is len of anames from db. 
Save only samples in samples.\n    #print(samplecounts)\n    #samples = [i.name for i in samples]\n    ## get sample names in the order of anames\n    #sids = [list(anames).index(i) for i in samples]\n    #covdict = {name: val for name, val in zip(np.array(samples)[sidx], samplecounts)}\n    #covdict = {name: val for name, val in zip(samples, samplecounts[sidx])}\n    covdict = pd.Series(samplecounts, name=\"sample_coverage\", index=anames)\n    covdict = covdict[covdict != 0]\n    print(u\"\\n\\n\\n## The number of loci recovered for each Sample.\"+\\\n          u\"\\n## ipyrad API location: [assembly].stats_dfs.s7_samples\\n\",\n          file=outstats)\n    data.stats_dfs.s7_samples = pd.DataFrame(covdict)\n    data.stats_dfs.s7_samples.to_string(buf=outstats)\n\n\n    ########################################################################\n    ## get stats for locus coverage\n    lrange = range(1, len(samples)+1)\n    locdat = pd.Series(locuscounts, name=\"locus_coverage\", index=lrange)\n    start = data.paramsdict[\"min_samples_locus\"]-1\n    locsums = pd.Series({i: np.sum(locdat.values[start:i]) for i in lrange},\n                        name=\"sum_coverage\", index=lrange)\n    print(u\"\\n\\n\\n## The number of loci for which N taxa have data.\"+\\\n          u\"\\n## ipyrad API location: [assembly].stats_dfs.s7_loci\\n\",\n          file=outstats)\n    data.stats_dfs.s7_loci = pd.concat([locdat, locsums], axis=1)\n    data.stats_dfs.s7_loci.to_string(buf=outstats)\n\n\n    #########################################################################\n    ## get stats for SNP_distribution\n    try:\n        smax = max([i+1 for i in varcounts if varcounts[i]])\n    except Exception as inst:\n        raise IPyradWarningExit(\"\"\"\n    Exception: empty varcounts array. 
This could be because no samples \n    passed filtering, or it could be because you have overzealous filtering.\n    Check the values for `trim_loci` and make sure you are not trimming the\n    edge too far\n    \"\"\")\n\n    vardat = pd.Series(varcounts, name=\"var\", index=range(smax)).fillna(0)\n    sumd = {}\n    for i in range(smax):\n        sumd[i] = np.sum([i*vardat.values[i] for i in range(i+1)])\n    varsums = pd.Series(sumd, name=\"sum_var\", index=range(smax))\n\n    pisdat = pd.Series(piscounts, name=\"pis\", index=range(smax)).fillna(0)\n    sumd = {}\n    for i in range(smax):\n        sumd[i] = np.sum([i*pisdat.values[i] for i in range(i+1)])\n    pissums = pd.Series(sumd, name=\"sum_pis\", index=range(smax))\n\n    print(u\"\\n\\n\\n## The distribution of SNPs (var and pis) per locus.\"+\\\n          u\"\\n## var = Number of loci with n variable sites (pis + autapomorphies)\"+\\\n          u\"\\n## pis = Number of loci with n parsimony informative site (minor allele in >1 sample)\"+\\\n          u\"\\n## ipyrad API location: [assembly].stats_dfs.s7_snps\\n\",\n          file=outstats)\n    data.stats_dfs.s7_snps = pd.concat([vardat, varsums, pisdat, pissums],\n                                        axis=1)\n    data.stats_dfs.s7_snps.to_string(buf=outstats)\n\n    ##########################################################################\n    ## print the stats summary (-r summary) with final sample loci data.\n    fullstat = data.stats\n    fullstat['state'] = 7\n    fullstat[\"loci_in_assembly\"] = data.stats_dfs.s7_samples\n\n    print(u\"\\n\\n\\n## Final Sample stats summary\\n\", file=outstats)\n    fullstat.to_string(buf=outstats)\n\n    ## close it\n    outstats.close()\n    co5.close()", "response": "make stats file and save to Assembly obj"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef select_samples(dbsamples, samples, pidx=None):\n    ## get index from dbsamples\n    samples = [i.name for i in samples]\n    if pidx:\n        sidx = [list(dbsamples[pidx]).index(i) for i in samples]\n    else:\n        sidx = [list(dbsamples).index(i) for i in samples]\n    sidx.sort()\n    return sidx", "response": "Return the sorted indices at which the given samples' names occur in dbsamples. If pidx is given, the names are looked up in dbsamples[pidx] instead of the full dbsamples array."}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\nfilters all clusters in the clust_database file with the given samples.", "response": "def filter_all_clusters(data, samples, ipyclient):\n    \"\"\"\n    Open the clust_database HDF5 array with seqs, catg, and filter data. \n    Fill the remaining filters.\n    \"\"\"\n    ## create loadbalanced ipyclient\n    lbview = ipyclient.load_balanced_view()\n\n    ## get chunk size from the HD5 array and close\n    with h5py.File(data.clust_database, 'r') as io5:\n        ## the size of chunks for reading/writing\n        optim = io5[\"seqs\"].attrs[\"chunksize\"][0]\n        ## the samples in the database in their locus order\n        dbsamples = io5[\"seqs\"].attrs[\"samples\"]\n        ## the total number of loci\n        nloci = io5[\"seqs\"].shape[0]\n\n    ## make a tmp directory for saving chunked arrays to\n    chunkdir = os.path.join(data.dirs.outfiles, data.name+\"_tmpchunks\")\n    if not os.path.exists(chunkdir):\n        os.mkdir(chunkdir)\n\n    ## get the indices of the samples that we are going to include\n    sidx = select_samples(dbsamples, samples)\n    ## do the same for the populations samples\n    if data.populations:\n        data._populations = {}\n        for pop in data.populations:\n            try:\n                _samps = [data.samples[i] for i in data.populations[pop][1]]\n                data._populations[pop] = (\n                    data.populations[pop][0],\n                    select_samples(dbsamples, _samps, sidx))\n            except:\n                print(\"    Sample in populations file not present in assembly - {}\".format(data.populations[pop][1]))\n                raise\n                                                \n    LOGGER.info(\"samples %s \\n, dbsamples %s \\n, sidx %s \\n\",\n                samples, dbsamples, sidx)\n\n    ## Put inside a try statement so we can delete tmpchunks\n    try:\n        ## load a list of args to 
send to Engines. Each arg contains the index\n        ## to sample optim loci from catg, seqs, filters &or edges, which will\n        ## be loaded on the remote Engine.\n\n        ## create job queue\n        start = time.time()\n        printstr = \" filtering loci        | {} | s7 |\"\n        fasyncs = {}\n        submitted = 0\n        while submitted < nloci:\n            hslice = np.array([submitted, submitted+optim])\n            fasyncs[hslice[0]] = lbview.apply(filter_stacks, *(data, sidx, hslice))\n            submitted += optim\n\n        ## run filter_stacks on all chunks\n        while 1:\n            readies = [i.ready() for i in fasyncs.values()]\n            elapsed = datetime.timedelta(seconds=int(time.time()-start))\n            progressbar(len(readies), sum(readies),\n                printstr.format(elapsed), spacer=data._spacer)\n            time.sleep(0.1)\n            if sum(readies) == len(readies):\n                print(\"\")\n                break\n\n        ## raise error if any jobs failed\n        ## (the loop variable is named job because async is a reserved word in Python 3.7+)\n        for job in fasyncs:\n            if not fasyncs[job].successful():\n                LOGGER.error(\"error in filter_stacks on chunk %s: %s\",\n                             job, fasyncs[job].exception())\n                raise IPyradWarningExit(\"error in filter_stacks on chunk {}: {}\"\\\n                             .format(job, fasyncs[job].exception()))\n        ipyclient.purge_everything()\n\n        ## get all the saved tmp arrays for each slice\n        tmpsnp = glob.glob(os.path.join(chunkdir, \"snpf.*.npy\"))\n        tmphet = glob.glob(os.path.join(chunkdir, \"hetf.*.npy\"))\n        tmpmin = glob.glob(os.path.join(chunkdir, \"minf.*.npy\"))\n        tmpedg = glob.glob(os.path.join(chunkdir, \"edgf.*.npy\"))\n        tmppld = glob.glob(os.path.join(chunkdir, \"pldf.*.npy\"))\n        tmpind = glob.glob(os.path.join(chunkdir, \"indf.*.npy\"))\n\n        ## sort array files within each group\n        arrdict = 
OrderedDict([('ind', tmpind),\n                               ('snp', tmpsnp), ('het', tmphet),\n                               ('min', tmpmin), ('edg', tmpedg),\n                               ('pld', tmppld)])\n        for arrglob in arrdict.values():\n            arrglob.sort(key=lambda x: int(x.rsplit(\".\")[-2]))\n\n        ## re-load the full filter array who's order is\n        ## [\"duplicates\", \"max_indels\", \"max_snps\", \"max_hets\", \"min_samps\", \"max_alleles\"]\n        io5 = h5py.File(data.database, 'r+')\n        superfilter = np.zeros(io5[\"filters\"].shape, io5[\"filters\"].dtype)\n\n        ## iterate across filter types (dups is already filled)\n        ## we have [4,4] b/c minf and edgf both write to minf\n        for fidx, ftype in zip([1, 2, 3, 4, 4, 5], arrdict.keys()):\n            ## fill in the edgefilters\n            for ffile in arrdict[ftype]:\n                ## grab a file and get it's slice\n                hslice = int(ffile.split(\".\")[-2])\n                ## load in the array\n                arr = np.load(ffile)\n                ## store slice into full array (we use += here because the minf\n                ## and edgf arrays both write to the same filter).\n                superfilter[hslice:hslice+optim, fidx] += arr\n\n        ## store to DB\n        io5[\"filters\"][:] += superfilter\n        del arr, superfilter\n\n        ## store the other arrayed values (edges, snps)\n        edgarrs = glob.glob(os.path.join(chunkdir, \"edgearr.*.npy\"))\n        snparrs = glob.glob(os.path.join(chunkdir, \"snpsarr.*.npy\"))\n\n        ## sort array files within each group\n        arrdict = OrderedDict([('edges', edgarrs), ('snps', snparrs)])\n        for arrglob in arrdict.values():\n            arrglob.sort(key=lambda x: int(x.rsplit(\".\")[-2]))\n\n        ## fill the edge array, splits are already in there.\n        superedge = np.zeros(io5['edges'].shape, io5['edges'].dtype)\n        for ffile in arrdict['edges']:\n         
   ## grab a file and get it's slice\n            hslice = int(ffile.split(\".\")[-2])\n            ## load in the array w/ shape (hslice, 5)\n            arr = np.load(ffile)\n            ## store slice into full array\n            superedge[hslice:hslice+optim, :] = arr\n        io5[\"edges\"][:, :] = superedge\n        del arr, superedge\n\n        ## fill the snps array. shape= (nloci, maxlen, 2)\n        supersnps = np.zeros(io5['snps'].shape, io5['snps'].dtype)\n        for ffile in arrdict['snps']:\n            ## grab a file and get it's slice\n            hslice = int(ffile.split(\".\")[-2])\n            ## load in the array w/ shape (hslice, maxlen, 2)\n            arr = np.load(ffile)\n            ## store slice into full array\n            LOGGER.info(\"shapes, %s %s\", supersnps.shape, arr.shape)\n            supersnps[hslice:hslice+optim, :, :] = arr\n        io5[\"snps\"][:] = supersnps\n        del arr\n        io5.close()\n\n    finally:\n        ## clean up the tmp files/dirs even if we failed.\n        try:\n            LOGGER.info(\"finished filtering\")\n            shutil.rmtree(chunkdir)\n        except (IOError, OSError):\n            pass"}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\npads names for loci output", "response": "def padnames(names):\n    \"\"\" pads names for loci output \"\"\"\n\n    ## get longest name\n    longname_len = max(len(i) for i in names)\n    ## Padding distance between name and seq.\n    padding = 5\n    ## add pad to names\n    pnames = [name + \" \" * (longname_len - len(name)+ padding) \\\n              for name in names]\n    snppad = \"//\" + \" \" * (longname_len - 2 + padding)\n    return np.array(pnames), snppad"}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\nmakes the .loci file from the h5py database and writes stats to file.", "response": "def make_loci_and_stats(data, samples, ipyclient):\n    \"\"\"\n    Makes the .loci file from h5 data base. Iterates by optim loci at a\n    time and writes to file. Also makes alleles file if requested.\n    \"\"\"\n    ## start vcf progress bar\n    start = time.time()\n    printstr = \" building loci/stats   | {} | s7 |\"\n    elapsed = datetime.timedelta(seconds=int(time.time()-start))\n    progressbar(20, 0, printstr.format(elapsed), spacer=data._spacer)\n\n    ## get some db info\n    with h5py.File(data.clust_database, 'r') as io5:\n        ## will iterate optim loci at a time\n        optim = io5[\"seqs\"].attrs[\"chunksize\"][0]\n        nloci = io5[\"seqs\"].shape[0]\n        anames = io5[\"seqs\"].attrs[\"samples\"]\n\n    ## get name and snp padding\n    pnames, snppad = padnames(anames)\n    snames = [i.name for i in samples]\n    smask = np.array([i not in snames for i in anames])\n\n    ## keep track of how many loci from each sample pass all filters\n    samplecov = np.zeros(len(anames), dtype=np.int32)\n\n    ## set initial value to zero for all values above min_samples_locus\n    #for cov in range(data.paramsdict[\"min_samples_locus\"], len(anames)+1):\n    locuscov = Counter()\n    for cov in range(len(anames)+1):\n        locuscov[cov] = 0\n\n    ## client for sending jobs to parallel engines\n    lbview = ipyclient.load_balanced_view()\n\n    ## send jobs in chunks\n    loci_asyncs = {}\n    for istart in range(0, nloci, optim):\n        args = [data, optim, pnames, snppad, smask, istart, samplecov, locuscov, 1]\n        loci_asyncs[istart] = lbview.apply(locichunk, args)\n\n    while 1:\n        done = [i.ready() for i in loci_asyncs.values()]\n        elapsed = datetime.timedelta(seconds=int(time.time()-start))\n        progressbar(len(done), sum(done), 
printstr.format(elapsed), spacer=data._spacer)\n        time.sleep(0.1)\n        if len(done) == sum(done):\n            print(\"\")\n            break\n\n    ## check for errors\n    for job in loci_asyncs:\n        if loci_asyncs[job].ready() and not loci_asyncs[job].successful():\n            LOGGER.error(\"error in building loci [%s]: %s\",\n                         job, loci_asyncs[job].exception())\n            raise IPyradWarningExit(loci_asyncs[job].exception())\n\n    ## concat and cleanup\n    results = [i.get() for i in loci_asyncs.values()]\n    ## update dictionaries\n    for chunk in results:\n        samplecov += chunk[0]\n        locuscov.update(chunk[1])\n\n    ## get all chunk files\n    tmploci = glob.glob(data.outfiles.loci+\".[0-9]*\")\n    ## sort by start value\n    tmploci.sort(key=lambda x: int(x.split(\".\")[-1]))\n\n    ## write tmpchunks to locus file\n    locifile = open(data.outfiles.loci, 'w')\n    for tmploc in tmploci:\n        with open(tmploc, 'r') as inloc:\n            locdat = inloc.read()\n            locifile.write(locdat)\n            os.remove(tmploc)\n    locifile.close()\n\n    ## make stats file from data\n    make_stats(data, samples, samplecov, locuscov)\n\n    ## repeat for alleles output\n    if \"a\" in data.paramsdict[\"output_formats\"]:\n\n        loci_asyncs = {}\n        for istart in range(0, nloci, optim):\n            args = [data, optim, pnames, snppad, smask, istart, samplecov, locuscov, 0]\n            loci_asyncs[istart] = lbview.apply(locichunk, args)\n\n        while 1:\n            done = [i.ready() for i in loci_asyncs.values()]\n            elapsed = datetime.timedelta(seconds=int(time.time()-start))\n            progressbar(len(done), sum(done),\n                \" building alleles      | {} | s7 |\".format(elapsed), \n                spacer=data._spacer)\n            time.sleep(0.1)\n            if len(done) == sum(done):\n                print(\"\")\n                break\n\n        ## check for 
errors\n        for job in loci_asyncs:\n            if loci_asyncs[job].ready() and not loci_asyncs[job].successful():\n                LOGGER.error(\"error in building alleles [%s]: %s\",\n                             job, loci_asyncs[job].exception())\n                raise IPyradWarningExit(loci_asyncs[job].exception())\n\n        ## concat and cleanup\n        #results = [i.get() for i in loci_asyncs.values()]\n\n        ## get all chunk files\n        tmploci = glob.glob(data.outfiles.loci+\".[0-9]*\")\n        ## sort by start value\n        tmploci.sort(key=lambda x: int(x.split(\".\")[-1]))\n\n        ## write tmpchunks to locus file\n        locifile = open(data.outfiles.alleles, 'w')\n        for tmploc in tmploci:\n            with open(tmploc, 'r') as inloc:\n                locdat = inloc.read()\n                inalleles = get_alleles(locdat)\n                locifile.write(inalleles)\n                os.remove(tmploc)\n        locifile.close()"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef locichunk(args):\n    ## parse args\n    data, optim, pnames, snppad, smask, start, samplecov, locuscov, upper = args\n\n    ## this slice\n    hslice = [start, start+optim]\n\n    ## get filter db info\n    co5 = h5py.File(data.database, 'r')\n    afilt = co5[\"filters\"][hslice[0]:hslice[1], ]\n    aedge = co5[\"edges\"][hslice[0]:hslice[1], ]\n    asnps = co5[\"snps\"][hslice[0]:hslice[1], ]\n\n    ## get seqs db\n    io5 = h5py.File(data.clust_database, 'r')\n    if upper:\n        aseqs = np.char.upper(io5[\"seqs\"][hslice[0]:hslice[1], ])\n    else:\n        aseqs = io5[\"seqs\"][hslice[0]:hslice[1], ]\n\n    ## which loci passed all filters\n    keep = np.where(np.sum(afilt, axis=1) == 0)[0]\n    store = []\n\n    ## write loci that passed after trimming edges, then write snp string\n    for iloc in keep:\n        edg = aedge[iloc]\n        #LOGGER.info(\"!!!!!! iloc edg %s, %s\", iloc, edg)\n        args = [iloc, pnames, snppad, edg, aseqs, asnps, smask, samplecov, locuscov, start]\n        if edg[4]:\n            outstr, samplecov, locuscov = enter_pairs(*args)\n            store.append(outstr)\n        else:\n            outstr, samplecov, locuscov = enter_singles(*args)\n            store.append(outstr)\n\n    ## write to file and clear store\n    tmpo = os.path.join(data.dirs.outfiles, data.name+\".loci.{}\".format(start))\n    with open(tmpo, 'w') as tmpout:\n        tmpout.write(\"\\n\".join(store) + \"\\n\")\n\n    ## close handles\n    io5.close()\n    co5.close()\n\n    ## return sample counter\n    return samplecov, locuscov, start", "response": "Build one chunk of the .loci output. Loads the filters, edges, snps, and seqs for loci in [start, start+optim), formats every locus that passed all filters (via enter_pairs or enter_singles), writes the result to a temporary .loci.{start} file, and returns the updated samplecov, locuscov, and start. smask is the mask of excluded samples, samplecov counts loci per sample, locuscov counts loci per number of samples with data, and upper selects whether the sequences are converted to uppercase."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef enter_pairs(iloc, pnames, snppad, edg, aseqs, asnps, smask, samplecov, locuscov, start):\n\n    ## snps was created using only the selected samples.\n    LOGGER.info(\"edges in enter_pairs %s\", edg)\n    seq1 = aseqs[iloc, :, edg[0]:edg[1]+1]\n    snp1 = asnps[iloc, edg[0]:edg[1]+1, ]\n\n    ## the 2nd read edges are +5 for the spacer\n    seq2 = aseqs[iloc, :, edg[2]:edg[3]+1]\n    snp2 = asnps[iloc, edg[2]:edg[3]+1, ]\n\n    ## remove rows with all Ns, seq has only selected samples\n    nalln = np.all(seq1 == \"N\", axis=1)\n\n    ## make mask of removed rows and excluded samples. Use the inverse\n    ## of this to save the coverage for samples\n    nsidx = nalln + smask\n    LOGGER.info(\"nsidx %s, nalln %s, smask %s\", nsidx, nalln, smask)\n    samplecov = samplecov + np.invert(nsidx).astype(np.int32)\n    LOGGER.info(\"samplecov %s\", samplecov)\n    idx = np.sum(np.invert(nsidx).astype(np.int32))\n    LOGGER.info(\"idx %s\", idx)\n    locuscov[idx] += 1\n\n    ## select the remaining names in order\n    seq1 = seq1[~nsidx, ]\n    seq2 = seq2[~nsidx, ]\n    names = pnames[~nsidx]\n\n    ## save string for printing, excluding names not in samples\n    outstr = \"\\n\".join(\\\n        [name + s1.tostring()+\"nnnn\"+s2.tostring() for name, s1, s2 in \\\n         zip(names, seq1, seq2)])\n\n    #LOGGER.info(\"s1 %s\", s1.tostring())\n    #LOGGER.info(\"s2 %s\", s2.tostring())\n\n    ## get snp string and add to store\n    snpstring1 = [\"-\" if snp1[i, 0] else \\\n                 \"*\" if snp1[i, 1] else \\\n                 \" \" for i in range(len(snp1))]\n    snpstring2 = [\"-\" if snp2[i, 0] else \\\n                 \"*\" if snp2[i, 1] else \\\n                 \" \" for i in range(len(snp2))]\n\n    #npis = str(snpstring1+snpstring2).count(\"*\")\n    #nvars = str(snpstring1+snpstring2).count(\"-\") + npis\n    outstr += \"\\n\" + snppad + 
\"\".join(snpstring1)+\\\n              \"    \"+\"\".join(snpstring2)+\"|{}|\".format(iloc+start)\n              #\"|LOCID={},DBID={},NVAR={},NPIS={}|\"\\\n              #.format(1+iloc+start, iloc, nvars, npis)\n\n    return outstr, samplecov, locuscov", "response": "enter_pairs is the function that enters the pairs of the input sequences"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\nentering funcs for SE or merged data", "response": "def enter_singles(iloc, pnames, snppad, edg, aseqs, asnps, smask, samplecov, locuscov, start):\n    \"\"\" enter funcs for SE or merged data \"\"\"\n\n    ## grab all seqs between edges\n    seq = aseqs[iloc, :, edg[0]:edg[1]+1]\n    ## snps was created using only the selected samples, and is edge masked.\n    ## The mask is for counting snps quickly, but trimming is still needed here\n    ## to make the snps line up with the seqs in the snp string.\n    snp = asnps[iloc, edg[0]:edg[1]+1, ]\n\n    ## remove rows with all Ns, seq has only selected samples\n    nalln = np.all(seq == \"N\", axis=1)\n\n    ## make mask of removed rows and excluded samples. Use the inverse\n    ## of this to save the coverage for samples\n    nsidx = nalln + smask\n    samplecov = samplecov + np.invert(nsidx).astype(np.int32)\n    idx = np.sum(np.invert(nsidx).astype(np.int32))\n    locuscov[idx] += 1\n\n    ## select the remaining names in order\n    seq = seq[~nsidx, ]\n    names = pnames[~nsidx]\n\n    ## save string for printing, excluding names not in samples\n    outstr = \"\\n\".join(\\\n        [name + s.tostring() for name, s in zip(names, seq)])\n\n    ## get snp string and add to store\n    snpstring = [\"-\" if snp[i, 0] else \\\n                 \"*\" if snp[i, 1] else \\\n                 \" \" for i in range(len(snp))]\n    outstr += \"\\n\" + snppad + \"\".join(snpstring) + \"|{}|\".format(iloc+start)\n    #LOGGER.info(\"outstr %s\", outstr)\n    return outstr, samplecov, locuscov"}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\ncreating array for final filtered snps data.", "response": "def init_arrays(data):\n    \"\"\"\n    Create database file for storing final filtered snps data as hdf5 array.\n    Copies splits and duplicates info from clust_database to database.\n    \"\"\"\n\n    ## get stats from step6 h5 and create new h5\n    co5 = h5py.File(data.clust_database, 'r')\n    io5 = h5py.File(data.database, 'w')\n\n    ## get maxlen and chunk len\n    maxlen = data._hackersonly[\"max_fragment_length\"] + 20\n    chunks = co5[\"seqs\"].attrs[\"chunksize\"][0]\n    nloci = co5[\"seqs\"].shape[0]\n\n    ## make array for snp string, 2 cols, - and *\n    snps = io5.create_dataset(\"snps\", (nloci, maxlen, 2),\n                              dtype=np.bool,\n                              chunks=(chunks, maxlen, 2),\n                              compression='gzip')\n    snps.attrs[\"chunksize\"] = chunks\n    snps.attrs[\"names\"] = [\"-\", \"*\"]\n\n    ## array for filters that will be applied in step7\n    filters = io5.create_dataset(\"filters\", (nloci, 6), dtype=np.bool)\n    filters.attrs[\"filters\"] = [\"duplicates\", \"max_indels\",\n                                \"max_snps\", \"max_shared_hets\",\n                                \"min_samps\", \"max_alleles\"]\n\n    ## array for edgetrimming\n    edges = io5.create_dataset(\"edges\", (nloci, 5),\n                               dtype=np.uint16,\n                               chunks=(chunks, 5),\n                               compression=\"gzip\")\n    edges.attrs[\"chunksize\"] = chunks\n    edges.attrs[\"names\"] = [\"R1_L\", \"R1_R\", \"R2_L\", \"R2_R\", \"sep\"]\n\n    ## xfer data from clustdb to finaldb\n    edges[:, 4] = co5[\"splits\"][:]\n    filters[:, 0] = co5[\"duplicates\"][:]\n\n    ## close h5s\n    io5.close()\n    co5.close()"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef filter_stacks(data, sidx, hslice):\n    LOGGER.info(\"Entering filter_stacks\")\n\n    ## open h5 handles\n    io5 = h5py.File(data.clust_database, 'r')\n    co5 = h5py.File(data.database, 'r')\n    ## get a chunk (hslice) of loci for the selected samples (sidx)\n    #superseqs = io5[\"seqs\"][hslice[0]:hslice[1], sidx,]\n    ## get an int view of the seq array\n    #superints = io5[\"seqs\"][hslice[0]:hslice[1], sidx, :].view(np.int8)\n\n    ## we need to use upper to skip lowercase allele storage\n    ## this slows down the rate of loading in data by a ton.\n    superints = np.char.upper(io5[\"seqs\"][hslice[0]:hslice[1], sidx,]).view(np.int8)\n    LOGGER.info(\"superints shape {}\".format(superints.shape))\n\n    ## fill edge filter\n    ## get edges of superseqs and supercats, since edges need to be trimmed\n    ## before counting hets, snps, inds. Technically, this could edge trim\n    ## clusters to the point that they are below the minlen, and so this\n    ## also constitutes a filter, though one that is uncommon. 
For this\n    ## reason we have another filter called edgfilter.\n    splits = co5[\"edges\"][hslice[0]:hslice[1], 4]\n    edgfilter, edgearr = get_edges(data, superints, splits)\n    del splits\n    LOGGER.info('passed edges %s', hslice[0])\n\n    ## minsamp coverages filtered from superseqs\n    minfilter = filter_minsamp(data, superints)\n    LOGGER.info('passed minfilt %s', hslice[0])\n\n    ## maxhets per site column from superseqs after trimming edges\n    hetfilter = filter_maxhet(data, superints, edgearr)\n    LOGGER.info('passed minhet %s', hslice[0])\n\n    ## ploidy filter\n    pldfilter = io5[\"nalleles\"][hslice[0]:hslice[1]].max(axis=1) > \\\n                                         data.paramsdict[\"max_alleles_consens\"]\n\n    ## indel filter, needs a fresh superints b/c get_edges does (-)->(N)\n    indfilter = filter_indels(data, superints, edgearr)\n    LOGGER.info('passed minind %s', hslice[0])\n\n    ## Build the .loci snpstring as an array (snps)\n    ## shape = (chunk, 1) dtype=S1, or should it be (chunk, 2) for [-,*] ?\n    snpfilter, snpsarr = filter_maxsnp(data, superints, edgearr)\n\n    LOGGER.info(\"edg %s\", edgfilter.sum())\n    LOGGER.info(\"min %s\", minfilter.sum())\n    LOGGER.info(\"het %s\", hetfilter.sum())\n    LOGGER.info(\"pld %s\", pldfilter.sum())\n    LOGGER.info(\"snp %s\", snpfilter.sum())\n    LOGGER.info(\"ind %s\", indfilter.sum())\n\n    ## SAVE FILTERS AND INFO TO DISK BY SLICE NUMBER (.0.tmp.h5)\n    chunkdir = os.path.join(data.dirs.outfiles, data.name+\"_tmpchunks\")\n\n    handle = os.path.join(chunkdir, \"edgf.{}.npy\".format(hslice[0]))\n    with open(handle, 'w') as out:\n        np.save(out, edgfilter)\n\n    handle = os.path.join(chunkdir, \"minf.{}.npy\".format(hslice[0]))\n    with open(handle, 'w') as out:\n        np.save(out, minfilter)\n\n    handle = os.path.join(chunkdir, \"hetf.{}.npy\".format(hslice[0]))\n    with open(handle, 'w') as out:\n        np.save(out, hetfilter)\n\n    handle = 
os.path.join(chunkdir, \"snpf.{}.npy\".format(hslice[0]))\n    with open(handle, 'w') as out:\n        np.save(out, snpfilter)\n\n    handle = os.path.join(chunkdir, \"pldf.{}.npy\".format(hslice[0]))\n    with open(handle, 'w') as out:\n        np.save(out, pldfilter)\n\n    handle = os.path.join(chunkdir, \"indf.{}.npy\".format(hslice[0]))\n    with open(handle, 'w') as out:\n        np.save(out, indfilter)\n\n    handle = os.path.join(chunkdir, \"snpsarr.{}.npy\".format(hslice[0]))\n    with open(handle, 'w') as out:\n        np.save(out, snpsarr)\n\n    handle = os.path.join(chunkdir, \"edgearr.{}.npy\".format(hslice[0]))\n    with open(handle, 'w') as out:\n        np.save(out, edgearr)\n\n    io5.close()\n    co5.close()", "response": "Apply the edge, minsamp, maxhet, ploidy, indel, and max-SNP filters to a chunk of loci and save each filter array, plus the snps and edges arrays, to .npy files in the tmpchunks directory."}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\ngets the edge-trimming positions for each locus in a chunk.", "response": "def get_edges(data, superints, splits):\n    \"\"\"\n    Gets edge trimming based on the overlap of sequences at the edges of\n    alignments and the tuple arg passed in for edge_trimming. Trims as\n    (R1 left, R1 right, R2 left, R2 right). We also trim off the restriction\n    site if it is present. This modifies superints, and so should be run on an\n    engine so it doesn't affect local copy. If this is changed to run locally\n    for some reason make sure we copy the superints instead.\n    \"\"\"\n    ## the filtering arg and parse it into minsamp numbers\n    if \"trim_overhang\" in data.paramsdict:\n        edgetrims = np.array(data.paramsdict[\"trim_overhang\"]).astype(np.int16)\n    else:\n        edgetrims = np.array(data.paramsdict[\"trim_loci\"]).astype(np.int16)\n\n    ## Cuts 3 and 4 are only for 3rad/radcap\n    ## TODO: This is moderately hackish, it's not using cut3/4\n    ## correctly, just assuming the length is the same as cut1/2\n    try:\n        cut1, cut2, _, _ = data.paramsdict[\"restriction_overhang\"]\n        LOGGER.debug(\"Found 3Rad cut sites\")\n    except ValueError:\n        cut1, cut2 = data.paramsdict[\"restriction_overhang\"]\n\n    cuts = np.array([len(cut1), len(cut2)], dtype=np.int16)\n\n    ## a local array for storing edge trims\n    edges = np.zeros((superints.shape[0], 5), dtype=np.int16)\n\n    ## a local array for storing edge filtered loci, these are stored\n    ## eventually as minsamp excludes.\n    edgefilter = np.zeros((superints.shape[0],), dtype=np.bool)\n\n    ## TRIM GUIDE. The cut site lengths are always trimmed. 
In addition,\n    ## edge overhangs are trimmed to min(4, minsamp), and then additional\n    ## number of columns is trimmed based on edgetrims values.\n    ## A special case, -1 value means no trim at all.\n    if data.paramsdict[\"min_samples_locus\"] <= 4:\n        minedge = np.int16(data.paramsdict[\"min_samples_locus\"])\n    else:\n        minedge = np.int16(max(4, data.paramsdict[\"min_samples_locus\"]))\n\n    ## convert all - to N to make this easier\n    nodashints = copy.deepcopy(superints)#.copy()\n    nodashints[nodashints == 45] = 78\n\n    ## trim overhanging edges\n    ## get the number not Ns in each site,\n    #ccx = np.sum(superseqs != \"N\", axis=1)\n    ccx = np.sum(nodashints != 78, axis=1, dtype=np.uint16)\n    efi, edg = edgetrim_numba(splits, ccx, edges, edgefilter, edgetrims, cuts, minedge)\n    return efi, edg"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef filter_minsamp(data, superints):\n    ## global minsamp\n    minsamp = data.paramsdict[\"min_samples_locus\"]\n\n    ## use population minsamps\n    if data.populations:\n        ## data._populations will look like this:\n        ## {'a': (3, [0, 1, 2, 3]),\n        ##  'b': (3, [4, 5, 6, 7]),\n        ##  'c': (3, [8, 9, 10, 11])}\n        LOGGER.info(\"POPULATIONS %s\", data.populations)\n        \n        ## superints has already been subsampled by sidx\n        ## get the sidx values for each pop\n        minfilters = []\n        for pop in data._populations:\n            samps = data._populations[pop][1]\n            minsamp = data._populations[pop][0]\n            mini = np.sum(~np.all(superints[:, samps, :] == 78, axis=2), axis=1) < minsamp\n            minfilters.append(mini)\n        ## a locus fails if it fails the minimum in any pop\n        minfilt = np.any(minfilters, axis=0)\n\n    else:\n        ## if not pop-file use global minsamp filter\n        minfilt = np.sum(~np.all(superints == 78, axis=2), axis=1) < minsamp\n        #LOGGER.info(\"Filtered by min_samples_locus - {}\".format(minfilt.sum()))\n    return minfilt", "response": "Flag loci in a chunk that have fewer samples with data than min_samples_locus, using per-population minimums when populations are defined."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef ucount(sitecol):\n\n    ## a list for only catgs\n    catg = [i for i in sitecol if i in \"CATG\"]\n\n    ## find sites that are ambigs\n    where = [sitecol[sitecol == i] for i in \"RSKYWM\"]\n\n    ## for each occurrence of RSKWYM add ambig resolution to catg\n    for ambig in where:\n        for _ in range(ambig.size):\n            catg += list(AMBIGS[ambig[0]])\n\n    ## if invariant return \" \"\n    if len(set(catg)) < 2:\n        return \" \"\n    else:\n        ## get second most common site\n        second = Counter(catg).most_common()[1][1]\n        if second > 1:\n            return \"*\"\n        else:\n            return \"-\"", "response": "Return the SNP-string character for a site column: ' ' if the site is invariant, '*' if the second most common base occurs more than once, and '-' otherwise."}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nfilters max SNPs per locus and save in the array", "response": "def filter_maxsnp(data, superints, edgearr):\n    \"\"\"\n    Filter max # of SNPs per locus. Do R1 and R2 separately if PE.\n    Also generate the snpsite line for the .loci format and save in the snp arr\n    This uses the edge filters that have been built based on trimming, and\n    saves the snps array with edges filtered. **Loci are not yet filtered.**\n    \"\"\"\n\n    ## an empty array to count with failed loci\n    snpfilt = np.zeros(superints.shape[0], dtype=np.bool)\n    snpsarr = np.zeros((superints.shape[0], superints.shape[2], 2), dtype=np.bool)\n    maxsnps = np.array(data.paramsdict['max_SNPs_locus'], dtype=np.int16)\n\n    ## get the per site snp string | shape=(chunk, maxlen)\n    # snpsarr[:, :, 0] = snps == \"-\"\n    # snpsarr[:, :, 1] = snps == \"*\"\n    snpsarr = snpcount_numba(superints, snpsarr)\n    LOGGER.info(\"---found the snps: %s\", snpsarr.sum())\n    snpfilt, snpsarr = snpfilter_numba(snpsarr, snpfilt, edgearr, maxsnps)\n    LOGGER.info(\"---filtered snps: %s\", snpfilt.sum())\n    return snpfilt, snpsarr"}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\ncounts the number of unique bases in a site.", "response": "def snpcount_numba(superints, snpsarr):\n    \"\"\"\n    Used to count the number of unique bases in a site for snpstring.\n    \"\"\"\n    ## iterate over all loci\n    for iloc in xrange(superints.shape[0]):\n        for site in xrange(superints.shape[2]):\n\n            ## make new array\n            catg = np.zeros(4, dtype=np.int16)\n\n            ## a list for only catgs\n            ncol = superints[iloc, :, site]\n            for idx in range(ncol.shape[0]):\n                if ncol[idx] == 67: #C\n                    catg[0] += 1\n                elif ncol[idx] == 65: #A\n                    catg[1] += 1\n                elif ncol[idx] == 84: #T\n                    catg[2] += 1\n                elif ncol[idx] == 71: #G\n                    catg[3] += 1\n                elif ncol[idx] == 82: #R\n                    catg[1] += 1        #A\n                    catg[3] += 1        #G\n                elif ncol[idx] == 75: #K\n                    catg[2] += 1        #T\n                    catg[3] += 1        #G\n                elif ncol[idx] == 83: #S\n                    catg[0] += 1        #C\n                    catg[3] += 1        #G\n                elif ncol[idx] == 89: #Y\n                    catg[0] += 1        #C\n                    catg[2] += 1        #T\n                elif ncol[idx] == 87: #W\n                    catg[1] += 1        #A\n                    catg[2] += 1        #T\n                elif ncol[idx] == 77: #M\n                    catg[0] += 1        #C\n                    catg[1] += 1        #A\n\n\n            ## get second most common site\n            catg.sort()\n            ## if invariant e.g., [0, 0, 0, 9], then nothing (\" \")\n            if not catg[2]:\n                pass\n            else:\n                if catg[2] > 1:\n                    snpsarr[iloc, site, 1] = 
True\n                else:\n                    snpsarr[iloc, site, 0] = True\n    return snpsarr"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef filter_maxhet(data, superints, edgearr):\n    ## the filter max\n    ## The type of max_shared_Hs_locus is determined and the cast to either\n    ## int or float is made at assembly load time\n    maxhet = data.paramsdict[\"max_shared_Hs_locus\"]\n    if isinstance(maxhet, float):\n        ## get an array with maxhet fraction * ntaxa with data for each locus\n        #maxhet = np.array(superints.shape[1]*maxhet, dtype=np.int16)\n        maxhet = np.floor(\n            maxhet * (superints.shape[1] - \n                np.all(superints == 78, axis=2).sum(axis=1))).astype(np.int16)\n    elif isinstance(maxhet, int):\n        maxhet = np.zeros(superints.shape[0], dtype=np.int16)\n        maxhet.fill(data.paramsdict[\"max_shared_Hs_locus\"])\n\n    ## an empty array to fill with failed loci\n    LOGGER.info(\"--------------maxhet mins %s\", maxhet)\n    hetfilt = np.zeros(superints.shape[0], dtype=np.bool)\n    hetfilt = maxhet_numba(superints, edgearr, maxhet, hetfilt)\n    LOGGER.info(\"--------------maxhet sums %s\", hetfilt.sum())\n    return hetfilt", "response": "Flag loci that exceed max_shared_Hs_locus, the maximum number (int) or fraction (float) of samples allowed to share a heterozygous site."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef filter_indels(data, superints, edgearr):\n\n    maxinds = np.array(data.paramsdict[\"max_Indels_locus\"]).astype(np.int64)\n\n    ## an empty array to fill with failed loci\n    ifilter = np.zeros(superints.shape[0], dtype=np.bool_)\n\n    ## if paired then worry about splits\n    if \"pair\" in data.paramsdict[\"datatype\"]:\n        for idx in xrange(superints.shape[0]):\n            block1 = superints[idx, :, edgearr[idx, 0]:edgearr[idx, 1]]\n            block2 = superints[idx, :, edgearr[idx, 2]:edgearr[idx, 3]]\n\n            sums1 = maxind_numba(block1)\n            ## If all loci are merged then block2 will be empty which will\n            ## cause maxind_numba to throw a very confusing ValueError\n            if np.any(block2):\n                sums2 = maxind_numba(block2)\n            else:\n                sums2 = 0\n\n            if (sums1 > maxinds[0]) or (sums2 > maxinds[1]):\n                ifilter[idx] = True\n\n    else:\n        for idx in xrange(superints.shape[0]):\n            ## get block based on edge filters\n            block = superints[idx, :, edgearr[idx, 0]:edgearr[idx, 1]]\n            ## shorten block to exclude terminal indels\n            ## if data at this locus (not already filtered by edges/minsamp)\n            if block.shape[1] > 1:\n                try:\n                    sums = maxind_numba(block)\n                except ValueError as inst:\n                    msg = \"All loci filterd by max_Indels_locus. 
Try increasing this parameter value.\"\n                    raise IPyradError(msg)\n                except Exception as inst:\n                    LOGGER.error(\"error in block {}\".format(block))\n                #LOGGER.info(\"maxind numba %s %s\", idx, sums)\n                #LOGGER.info(\"sums, maxinds[0], compare: %s %s %s\",\n                #             sums, maxinds[0], sums > maxinds[0])\n                if sums > maxinds[0]:\n                    ifilter[idx] = True\n\n    LOGGER.info(\"--------------maxIndels sums %s\", ifilter.sum())\n    return ifilter", "response": "Flag loci whose maximum number of internal indels in any sample exceeds max_Indels_locus."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function to\ncount the maximum number of internal indels (dashes, ASCII 45) in any row of a block", "response": "def maxind_numba(block):\n    \"\"\" filter for indels \"\"\"\n    ## remove terminal edges\n    inds = 0\n    for row in xrange(block.shape[0]):\n        where = np.where(block[row] != 45)[0]\n        if len(where) == 0:\n            obs = 100\n        else:\n            left = np.min(where)\n            right = np.max(where)\n            obs = np.sum(block[row, left:right] == 45)\n        if obs > inds:\n            inds = obs\n    return inds"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef make_outfiles(data, samples, output_formats, ipyclient):\n\n    ## will iterate optim loci at a time\n    with h5py.File(data.clust_database, 'r') as io5:\n        optim = io5[\"seqs\"].attrs[\"chunksize\"][0]\n        nloci = io5[\"seqs\"].shape[0]\n\n        ## get name and snp padding\n        anames = io5[\"seqs\"].attrs[\"samples\"]\n        snames = [i.name for i in samples]\n        ## get only snames in this data set sorted in the order they are in io5\n        names = [i for i in anames if i in snames]\n        pnames, _ = padnames(names)\n\n\n    ## get names boolean\n    sidx = np.array([i in snames for i in anames])\n    assert len(pnames) == sum(sidx)\n\n    ## get names index in order of pnames\n    #sindx = [list(anames).index(i) for i in snames]\n\n    ## send off outputs as parallel jobs\n    lbview = ipyclient.load_balanced_view()\n    start = time.time()\n    results = {}\n\n    ## build arrays and outputs from arrays.\n    ## these arrays are keys in the tmp h5 array: seqarr, snparr, bisarr, maparr\n    boss_make_arrays(data, sidx, optim, nloci, ipyclient)\n\n    start = time.time()\n    ## phy and partitions are a default output ({}.phy, {}.phy.partitions)\n    if \"p\" in output_formats:\n        data.outfiles.phy = os.path.join(data.dirs.outfiles, data.name+\".phy\")\n        async = lbview.apply(write_phy, *[data, sidx, pnames])\n        results['phy'] = async\n\n    ## nexus format includes ... 
additional information ({}.nex)\n    if \"n\" in output_formats:\n        data.outfiles.nexus = os.path.join(data.dirs.outfiles, data.name+\".nex\")\n        async = lbview.apply(write_nex, *[data, sidx, pnames])\n        results['nexus'] = async\n\n    ## snps is actually all snps written in phylip format ({}.snps.phy)\n    if \"s\" in output_formats:\n        data.outfiles.snpsmap = os.path.join(data.dirs.outfiles, data.name+\".snps.map\")\n        data.outfiles.snpsphy = os.path.join(data.dirs.outfiles, data.name+\".snps.phy\")\n        async = lbview.apply(write_snps, *[data, sidx, pnames])\n        results['snps'] = async\n        async = lbview.apply(write_snps_map, data)\n        results['snpsmap'] = async\n\n    ## usnps is one randomly sampled snp from each locus ({}.u.snps.phy)\n    if \"u\" in output_formats:\n        data.outfiles.usnpsphy = os.path.join(data.dirs.outfiles, data.name+\".u.snps.phy\")\n        async = lbview.apply(write_usnps, *[data, sidx, pnames])\n        results['usnps'] = async\n\n    ## str and ustr are for structure analyses. A fairly outdated format, six\n    ## columns of empty space. Full and subsample included ({}.str, {}.u.str)\n    if \"k\" in output_formats:\n        data.outfiles.str = os.path.join(data.dirs.outfiles, data.name+\".str\")\n        data.outfiles.ustr = os.path.join(data.dirs.outfiles, data.name+\".ustr\")        \n        async = lbview.apply(write_str, *[data, sidx, pnames])\n        results['structure'] = async\n\n    ## geno output is for admixture and other software. We include all SNPs,\n    ## but also a .map file which has \"distances\" between SNPs.\n    if 'g' in output_formats:\n        data.outfiles.geno = os.path.join(data.dirs.outfiles, data.name+\".geno\")\n        data.outfiles.ugeno = os.path.join(data.dirs.outfiles, data.name+\".u.geno\")\n        async = lbview.apply(write_geno, *[data, sidx])\n        results['geno'] = async\n\n    ## G-PhoCS output. 
Have to use cap G here cuz little g is already taken, lol.\n    if 'G' in output_formats:\n        data.outfiles.gphocs = os.path.join(data.dirs.outfiles, data.name+\".gphocs\")\n        async = lbview.apply(write_gphocs, *[data, sidx])\n        results['gphocs'] = async\n\n    ## wait for finished outfiles\n    while 1:\n        readies = [i.ready() for i in results.values()]\n        elapsed = datetime.timedelta(seconds=int(time.time()-start))\n        progressbar(len(readies), sum(readies),\n            \" writing outfiles      | {} | s7 |\".format(elapsed), \n            spacer=data._spacer)\n        time.sleep(0.1)\n        if all(readies):\n            break\n    print(\"\")\n\n    ## check for errors\n    for suff, async in results.items():\n        if not async.successful():\n            print(\"  Warning: error encountered while writing {} outfile: {}\"\\\n                  .format(suff, async.exception()))\n            LOGGER.error(\"  Warning: error in writing %s outfile: %s\", \\\n                         suff, async.exception())\n\n    ## remove the tmparrays\n    tmparrs = os.path.join(data.dirs.outfiles, \"tmp-{}.h5\".format(data.name))\n    os.remove(tmparrs)", "response": "Create output files for all samples in the data set."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef worker_make_arrays(data, sidx, hslice, optim, maxlen):\n    \n    ## big data arrays\n    io5 = h5py.File(data.clust_database, 'r')\n    co5 = h5py.File(data.database, 'r')\n    \n    ## temporary storage until writing to h5 array    \n    maxsnp = co5[\"snps\"][hslice:hslice+optim].sum()         ## concat later\n    maparr = np.zeros((maxsnp, 4), dtype=np.uint32)\n    snparr = np.zeros((sum(sidx), maxsnp), dtype=\"S1\")\n    bisarr = np.zeros((sum(sidx), maxsnp), dtype=\"S1\")\n    seqarr = np.zeros((sum(sidx), maxlen*optim), dtype=\"S1\")\n\n    ## apply all filters and write loci data\n    seqleft = 0\n    snpleft = 0\n    bis = 0                     \n\n    ## edge filter has already been applied to snps, but has not yet been\n    ## applied to seqs. The locus filters have not been applied to either yet.\n    mapsnp = 0\n    totloc = 0\n\n    afilt = co5[\"filters\"][hslice:hslice+optim, :]\n    aedge = co5[\"edges\"][hslice:hslice+optim, :]\n    asnps = co5[\"snps\"][hslice:hslice+optim, :]\n    #aseqs = io5[\"seqs\"][hslice:hslice+optim, sidx, :]\n    ## have to run upper on seqs b/c they have lowercase storage of alleles\n    aseqs = np.char.upper(io5[\"seqs\"][hslice:hslice+optim, sidx, :])\n\n    ## which loci passed all filters\n    keep = np.where(np.sum(afilt, axis=1) == 0)[0]\n\n    ## write loci that passed after trimming edges, then write snp string\n    for iloc in keep:\n        ## grab r1 seqs between edges\n        edg = aedge[iloc]\n\n        ## grab SNPs from seqs already sidx subsampled and edg masked.\n        ## needs to be done here before seqs are edgetrimmed.\n        getsnps = asnps[iloc].sum(axis=1).astype(np.bool)\n        snps = aseqs[iloc, :, getsnps].T\n\n        ## trim edges and split from seqs and concatenate for pairs.\n        ## this seq array will be the phy output.\n        if not \"pair\" in 
data.paramsdict[\"datatype\"]:\n            seq = aseqs[iloc, :, edg[0]:edg[1]+1]\n        else:\n            seq = np.concatenate([aseqs[iloc, :, edg[0]:edg[1]+1],\n                                  aseqs[iloc, :, edg[2]:edg[3]+1]], axis=1)\n\n        ## remove cols from seq (phy) array that are all N-\n        lcopy = seq\n        lcopy[lcopy == \"-\"] = \"N\"\n        bcols = np.all(lcopy == \"N\", axis=0)\n        seq = seq[:, ~bcols]\n\n        ## put into large array (could put right into h5?)\n        seqarr[:, seqleft:seqleft+seq.shape[1]] = seq\n        seqleft += seq.shape[1]\n\n        ## subsample all SNPs into an array\n        snparr[:, snpleft:snpleft+snps.shape[1]] = snps\n        snpleft += snps.shape[1]\n\n        ## Enter each snp into the map file\n        for i in xrange(snps.shape[1]):\n            ## 1-indexed loci in first column\n            ## actual locus number in second column\n            ## counter for this locus in third column\n            ## snp counter total in fourth column\n            maparr[mapsnp, :] = [totloc+1, hslice+iloc, i, mapsnp+1]\n            mapsnp += 1\n\n        ## subsample one SNP into an array\n        if snps.shape[1]:\n            samp = np.random.randint(snps.shape[1])\n            bisarr[:, bis] = snps[:, samp]\n            bis += 1\n            totloc += 1\n            \n    ## clean up\n    io5.close()\n    co5.close()\n    \n    ## trim trailing edges b/c we made the array bigger than needed.\n    ridx = np.all(seqarr == \"\", axis=0)\n    seqarr = seqarr[:, ~ridx]\n    ridx = np.all(snparr == \"\", axis=0)\n    snparr = snparr[:, ~ridx]\n    ridx = np.all(bisarr == \"\", axis=0)\n    bisarr = bisarr[:, ~ridx]\n    ridx = np.all(maparr == 0, axis=1)\n    maparr = maparr[~ridx, :]\n\n    ## return these three arrays which are pretty small\n    ## catg array gets to be pretty huge, so we return only\n    return seqarr, snparr, bisarr, maparr", "response": "Parallelized worker to build array chunks for 
output files."}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\nwrites the phylip output file from the tmparr [ seqarray ]", "response": "def write_phy(data, sidx, pnames):\n    \"\"\" \n    write the phylip output file from the tmparr[seqarray] \n    \"\"\"\n\n    ## grab seq data from tmparr\n    start = time.time()\n    tmparrs = os.path.join(data.dirs.outfiles, \"tmp-{}.h5\".format(data.name)) \n    with h5py.File(tmparrs, 'r') as io5:\n        seqarr = io5[\"seqarr\"]\n\n        ## trim to size b/c it was made longer than actual\n        end = np.where(np.all(seqarr[:] == \"\", axis=0))[0]\n        if np.any(end):\n            end = end.min()\n        else:\n            end = seqarr.shape[1] \n\n        ## write to phylip \n        with open(data.outfiles.phy, 'w') as out:\n            ## write header\n            out.write(\"{} {}\\n\".format(seqarr.shape[0], end))\n\n            ## write data rows\n            for idx, name in enumerate(pnames):\n                out.write(\"{}{}\\n\".format(name, \"\".join(seqarr[idx, :end])))\n    LOGGER.debug(\"finished writing phy in: %s\", time.time() - start)"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef write_nex(data, sidx, pnames):\n\n    ## grab seq data from tmparr\n    start = time.time()\n    tmparrs = os.path.join(data.dirs.outfiles, \"tmp-{}.h5\".format(data.name)) \n    with h5py.File(tmparrs, 'r') as io5:\n        seqarr = io5[\"seqarr\"]\n\n        ## trim to size b/c it was made longer than actual\n        end = np.where(np.all(seqarr[:] == \"\", axis=0))[0]\n        if np.any(end):\n            end = end.min()\n        else:\n            end = seqarr.shape[1] \n\n        ## write to nexus\n        data.outfiles.nex = os.path.join(data.dirs.outfiles, data.name+\".nex\")\n\n        with open(data.outfiles.nex, 'w') as out:\n\n            ## write nexus seq header\n            out.write(NEXHEADER.format(seqarr.shape[0], end))\n\n            ## grab a big block of data\n            chunksize = 100000  # this should be a multiple of 100\n            for bidx in xrange(0, end, chunksize):\n                bigblock = seqarr[:, bidx:bidx+chunksize]\n                lend = end-bidx\n                #LOGGER.info(\"BIG: %s %s %s %s\", bigblock.shape, bidx, lend, end)\n\n                ## write interleaved seqs 100 chars with longname+2 before\n                tmpout = []            \n                for block in xrange(0, min(chunksize, lend), 100):\n                    stop = min(block+100, end)\n\n                    for idx, name in enumerate(pnames):\n                        seqdat = bigblock[idx, block:stop]\n                        tmpout.append(\"  {}{}\\n\".format(name, \"\".join(seqdat)))\n                    tmpout.append(\"\\n\")\n\n                ## print intermediate result and clear\n                if any(tmpout):\n                    out.write(\"\".join(tmpout))\n            ## closer\n            out.write(NEXCLOSER)\n    LOGGER.debug(\"finished writing nex in: %s\", time.time() - start)", "response": "write the nexus output file from 
the tmparr seqarr."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\nwriting a map file with linkage information for SNPs file", "response": "def write_snps_map(data):\n    \"\"\" write a map file with linkage information for SNPs file\"\"\"\n\n    ## grab map data from tmparr\n    start = time.time()\n    tmparrs = os.path.join(data.dirs.outfiles, \"tmp-{}.h5\".format(data.name)) \n    with h5py.File(tmparrs, 'r') as io5:\n        maparr = io5[\"maparr\"][:]\n\n        ## get last data \n        end = np.where(np.all(maparr[:] == 0, axis=1))[0]\n        if np.any(end):\n            end = end.min()\n        else:\n            end = maparr.shape[0]\n\n        ## write to map file (this is too slow...)\n        outchunk = []\n        with open(data.outfiles.snpsmap, 'w') as out:\n            for idx in xrange(end):\n                ## build to list\n                line = maparr[idx, :]\n                #print(line)\n                outchunk.append(\\\n                    \"{}\\trad{}_snp{}\\t{}\\t{}\\n\"\\\n                    .format(line[0], line[1], line[2], 0, line[3]))\n                ## clear list\n                if not idx % 10000:\n                    out.write(\"\".join(outchunk))\n                    outchunk = []\n            ## write remaining\n            out.write(\"\".join(outchunk))\n    LOGGER.debug(\"finished writing snps_map in: %s\", time.time() - start)"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\nwriting the bisnp string", "response": "def write_usnps(data, sidx, pnames):\n    \"\"\" write the bisnp string \"\"\"\n\n    ## grab bis data from tmparr\n    tmparrs = os.path.join(data.dirs.outfiles, \"tmp-{}.h5\".format(data.name)) \n    with h5py.File(tmparrs, 'r') as io5:\n        bisarr = io5[\"bisarr\"]\n\n        ## trim to size b/c it was made longer than actual\n        end = np.where(np.all(bisarr[:] == \"\", axis=0))[0]\n        if np.any(end):\n            end = end.min()\n        else:\n            end = bisarr.shape[1]        \n\n        ## write to usnps file\n        with open(data.outfiles.usnpsphy, 'w') as out:\n            out.write(\"{} {}\\n\".format(bisarr.shape[0], end))\n            for idx, name in enumerate(pnames):\n                out.write(\"{}{}\\n\".format(name, \"\".join(bisarr[idx, :end])))"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef write_str(data, sidx, pnames):\n\n    ## grab snp and bis data from tmparr\n    start = time.time()\n    tmparrs = os.path.join(data.dirs.outfiles, \"tmp-{}.h5\".format(data.name)) \n    with h5py.File(tmparrs, 'r') as io5:\n        snparr = io5[\"snparr\"]\n        bisarr = io5[\"bisarr\"]\n\n        ## trim to size b/c it was made longer than actual\n        bend = np.where(np.all(bisarr[:] == \"\", axis=0))[0]\n        if np.any(bend):\n            bend = bend.min()\n        else:\n            bend = bisarr.shape[1]        \n\n        send = np.where(np.all(snparr[:] == \"\", axis=0))[0]       \n        if np.any(send):\n            send = send.min()\n        else:\n            send = snparr.shape[1]        \n\n        ## write to str and ustr\n        out1 = open(data.outfiles.str, 'w')\n        out2 = open(data.outfiles.ustr, 'w')\n        numdict = {'A': '0', 'T': '1', 'G': '2', 'C': '3', 'N': '-9', '-': '-9'}\n        if data.paramsdict[\"max_alleles_consens\"] > 1:\n            for idx, name in enumerate(pnames):\n                out1.write(\"{}\\t\\t\\t\\t\\t{}\\n\"\\\n                    .format(name,\n                    \"\\t\".join([numdict[DUCT[i][0]] for i in snparr[idx, :send]])))\n                out1.write(\"{}\\t\\t\\t\\t\\t{}\\n\"\\\n                    .format(name,\n                    \"\\t\".join([numdict[DUCT[i][1]] for i in snparr[idx, :send]])))\n                out2.write(\"{}\\t\\t\\t\\t\\t{}\\n\"\\\n                    .format(name,\n                    \"\\t\".join([numdict[DUCT[i][0]] for i in bisarr[idx, :bend]])))\n                out2.write(\"{}\\t\\t\\t\\t\\t{}\\n\"\\\n                    .format(name,\n                    \"\\t\".join([numdict[DUCT[i][1]] for i in bisarr[idx, :bend]])))\n        else:\n            ## haploid output\n            for idx, name in enumerate(pnames):\n                
out1.write(\"{}\\t\\t\\t\\t\\t{}\\n\"\\\n                    .format(name,\n                    \"\\t\".join([numdict[DUCT[i][0]] for i in snparr[idx, :send]])))\n                out2.write(\"{}\\t\\t\\t\\t\\t{}\\n\"\\\n                    .format(name,\n                    \"\\t\".join([numdict[DUCT[i][0]] for i in bisarr[idx, :bend]])))\n        out1.close()\n        out2.close()\n    LOGGER.debug(\"finished writing str in: %s\", time.time() - start)", "response": "Write STRUCTURE format for all SNPs and unlinked SNPs"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef write_geno(data, sidx):\n\n    ## grab snp and bis data from tmparr\n    start = time.time()\n    tmparrs = os.path.join(data.dirs.outfiles, \"tmp-{}.h5\".format(data.name)) \n    with h5py.File(tmparrs, 'r') as io5:\n        snparr = io5[\"snparr\"]\n        bisarr = io5[\"bisarr\"]\n\n        ## trim to size b/c it was made longer than actual\n        bend = np.where(np.all(bisarr[:] == \"\", axis=0))[0]\n        if np.any(bend):\n            bend = bend.min()\n        else:\n            bend = bisarr.shape[1]        \n\n        send = np.where(np.all(snparr[:] == \"\", axis=0))[0]       \n        if np.any(send):\n            send = send.min()\n        else:\n            send = snparr.shape[1]        \n\n        ## get most common base at each SNP as a pseudo-reference\n        ## and record 0,1,2 or missing=9 for counts of the ref allele\n        snpref = reftrick(snparr[:, :send].view(np.int8), GETCONS).view(\"S1\")\n        bisref = reftrick(bisarr[:, :bend].view(np.int8), GETCONS).view(\"S1\")\n\n        ## geno matrix to fill (9 is empty)\n        snpgeno = np.zeros((snparr.shape[0], send), dtype=np.uint8)\n        snpgeno.fill(9)\n        bisgeno = np.zeros((bisarr.shape[0], bend), dtype=np.uint8)\n        bisgeno.fill(9)\n\n        ##--------------------------------------------------------------------\n        ## fill in complete hits (match to first column ref base)\n        mask2 = np.array(snparr[:, :send] == snpref[:, 0])\n        snpgeno[mask2] = 2\n\n        ## fill in single hits (heteros) match to hetero of first+second column\n        ambref = np.apply_along_axis(lambda x: TRANSFULL[tuple(x)], 1, snpref[:, :2])\n        mask1 = np.array(snparr[:, :send] == ambref)\n        snpgeno[mask1] = 1\n\n        ## fill in zero hits, meaning a perfect match to the second column base\n        ## anything else is left at 9 (missing), b/c it's either 
missing or it\n        ## is not bi-allelic. \n        mask0 = np.array(snparr[:, :send] == snpref[:, 1])\n        snpgeno[mask0] = 0\n\n        ##--------------------------------------------------------------------\n\n        ## fill in complete hits\n        mask2 = np.array(bisarr[:, :bend] == bisref[:, 0])\n        bisgeno[mask2] = 2\n\n        ## fill in single hits (heteros)\n        ambref = np.apply_along_axis(lambda x: TRANSFULL[tuple(x)], 1, bisref[:, :2])\n        mask1 = np.array(bisarr[:, :bend] == ambref)\n        bisgeno[mask1] = 1\n\n        ## fill in zero hits (match to second base)\n        mask0 = np.array(bisarr[:, :bend] == bisref[:, 1])\n        bisgeno[mask0] = 0\n\n        ##---------------------------------------------------------------------\n        ## print to files\n        np.savetxt(data.outfiles.geno, snpgeno.T, delimiter=\"\", fmt=\"%d\")\n        np.savetxt(data.outfiles.ugeno, bisgeno.T, delimiter=\"\", fmt=\"%d\")\n    LOGGER.debug(\"finished writing geno in: %s\", time.time() - start)", "response": "Write the geno output for sNMF: genotype matrices for all SNPs (.geno) and unlinked SNPs (.ugeno), coded as 0, 1, or 2 copies of the pseudo-reference allele, with 9 for missing."}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\nwrites the g - phocs output for a given sample index", "response": "def write_gphocs(data, sidx):\n    \"\"\"\n    write the g-phocs output. This code is hella ugly bcz it's copy/pasted\n    directly from the old loci2gphocs script from pyrad. I figure having it\n    get done the stupid way is better than not having it done at all, at\n    least for the time being. This could probably be sped up significantly.\n    \"\"\"\n\n    outfile = data.outfiles.gphocs\n    infile = data.outfiles.loci\n\n    infile = open(infile)\n    outfile = open(outfile, 'w')\n\n    ## parse the loci\n    ## Each set of reads at a locus is appended with a line\n    ## beginning with // and ending with |x, where x in the locus id.\n    ## so after this call 'loci' will contain an array\n    ## of sets of each read per locus.\n    loci = re.compile(\"\\|[0-9]+\\|\").split(infile.read())[:-1]\n\n    # Print the header, the number of loci in this file\n    outfile.write(str(len(loci)) + \"\\n\\n\")\n\n    # iterate through each locus, print out the header for each locus:\n    # <locus_name> <n_samples> <locus_length>\n    # Then print the data for each sample in this format:\n    # <individual_name> <sequence>\n    for i, loc in enumerate(loci):\n        ## Get rid of the line that contains the snp info\n        loc = loc.rsplit(\"\\n\", 1)[0]\n\n        # Separate out each sequence within the loc block. 
'sequences'\n        # will now be a list strings containing name/sequence pairs.\n        # We select each line in the locus string that starts with \">\"\n        names = [line.split()[0] for line in loc.strip().split(\"\\n\")]\n        try:\n            sequences = [line.split()[1] for line in loc.strip().split(\"\\n\")]\n        except:\n            pass\n        # Strips off 'nnnn' separator for paired data\n        # replaces '-' with 'N'\n        editsequences = [seq.replace(\"n\",\"\").replace('-','N') for seq in sequences]\n        sequence_length = len(editsequences[0])\n\n        # get length of longest name and add 4 spaces\n        longname = max(map(len,names))+4\n\n        # Print out the header for this locus\n        outfile.write('locus{} {} {}\\n'.format(str(i), len(sequences), sequence_length))\n\n        # Iterate through each sequence read at this locus and write it to the file.\n        for name,sequence in zip(names, editsequences):\n            # Clean up the sequence data to make gphocs happy. Only accepts UPPER\n            # case chars for bases, and only accepts 'N' for missing data.\n            outfile.write(name+\" \"*(longname-len(name))+sequence + \"\\n\")\n        ## Separate loci with so it's prettier\n        outfile.write(\"\\n\")"}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\ncreates a vcf file for the loci passing filtering.", "response": "def make_vcf(data, samples, ipyclient, full=0):\n    \"\"\"\n    Write the full VCF for loci passing filtering. Other vcf formats are\n    possible, like SNPs-only, or with filtered loci included but the filter\n    explicitly labeled. These are not yet supported, however.\n    \"\"\"\n    ## start vcf progress bar\n    start = time.time()\n    printstr = \" building vcf file     | {} | s7 |\"\n    LOGGER.info(\"Writing .vcf file\")\n    elapsed = datetime.timedelta(seconds=int(time.time()-start))\n    progressbar(20, 0, printstr.format(elapsed), spacer=data._spacer)\n\n    ## create outputs for v and V, gzip V to be friendly\n    data.outfiles.vcf = os.path.join(data.dirs.outfiles, data.name+\".vcf\")\n    if full:\n        data.outfiles.VCF = os.path.join(data.dirs.outfiles, data.name+\".vcf.gz\")\n\n    ## get some db info\n    with h5py.File(data.clust_database, 'r') as io5:\n        ## will iterate optim loci at a time\n        optim = io5[\"seqs\"].attrs[\"chunksize\"][0]\n        nloci = io5[\"seqs\"].shape[0]\n        ## get name and snp padding\n        anames = io5[\"seqs\"].attrs[\"samples\"]\n        snames = [i.name for i in samples]\n        names = [i for i in anames if i in snames]\n\n    ## get names index\n    sidx = np.array([i in snames for i in anames])\n\n    ## client for sending jobs to parallel engines, for this step we'll limit\n    ## to half of the available cpus if\n    lbview = ipyclient.load_balanced_view()\n\n    ## send jobs in chunks\n    vasyncs = {}\n    total = 0\n    for chunk in xrange(0, nloci, optim):\n        vasyncs[chunk] = lbview.apply(vcfchunk, *(data, optim, sidx, chunk, full))\n        total += 1\n\n    ## tmp files get left behind and intensive processes are left running when a\n    ## a job is killed/interrupted during vcf build, so we try/except wrap.\n    
try:\n        while 1:\n            keys = [i for (i, j) in vasyncs.items() if j.ready()]\n            ## check for failures\n            for job in keys:\n                if not vasyncs[job].successful():\n                    ## raise exception\n                    err = \" error in vcf build chunk {}: {}\"\\\n                          .format(job, vasyncs[job].result())\n                    LOGGER.error(err)\n                    raise IPyradWarningExit(err)\n                else:\n                    ## free up memory\n                    del vasyncs[job]\n\n            finished = total - len(vasyncs) #sum([i.ready() for i in vasyncs.values()])\n            elapsed = datetime.timedelta(seconds=int(time.time()-start))\n            progressbar(total, finished, printstr.format(elapsed), spacer=data._spacer)\n            time.sleep(0.5)\n            if not vasyncs:\n                break\n        print(\"\")\n\n    except Exception as inst:\n        ## make sure all future jobs are aborted\n        keys = [i for (i, j) in vasyncs.items() if not j.ready()]\n        try:\n            for job in keys:\n                #vasyncs[job].abort()\n                vasyncs[job].cancel()\n        except Exception:\n            pass\n        ## make sure all tmp files are destroyed\n        vcfchunks = glob.glob(os.path.join(data.dirs.outfiles, \"*.vcf.[0-9]*\"))\n        h5chunks = glob.glob(os.path.join(data.dirs.outfiles, \".tmp.[0-9]*.h5\"))\n        for dfile in vcfchunks+h5chunks:\n            os.remove(dfile)\n        ## reraise the error\n        raise inst\n\n\n    ## writing full vcf file to disk\n    start = time.time()\n    printstr = \" writing vcf file      | {} | s7 |\"\n    res = lbview.apply(concat_vcf, *(data, names, full))\n    ogchunks = len(glob.glob(data.outfiles.vcf+\".*\"))\n    while 1:\n        elapsed = datetime.timedelta(seconds=int(time.time()-start))\n        curchunks = len(glob.glob(data.outfiles.vcf+\".*\"))\n        progressbar(ogchunks, 
ogchunks-curchunks, printstr.format(elapsed), spacer=data._spacer)\n        time.sleep(0.1)\n        if res.ready():\n            break\n    elapsed = datetime.timedelta(seconds=int(time.time()-start))\n    progressbar(1, 1, printstr.format(elapsed), spacer=data._spacer)\n    print(\"\")"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\nconcatenate a VCF file into a single file.", "response": "def concat_vcf(data, names, full):\n    \"\"\"\n    Sorts, concatenates, and gzips VCF chunks. Also cleans up chunks.\n    \"\"\"\n    ## open handle and write headers\n    if not full:\n        writer = open(data.outfiles.vcf, 'w')\n    else:\n        writer = gzip.open(data.outfiles.VCF, 'w')\n    vcfheader(data, names, writer)\n    writer.close()\n\n    ## get vcf chunks\n    vcfchunks = glob.glob(data.outfiles.vcf+\".*\")\n    vcfchunks.sort(key=lambda x: int(x.rsplit(\".\")[-1]))\n\n    ## concatenate\n    if not full:\n        writer = open(data.outfiles.vcf, 'a')\n    else:\n        writer = gzip.open(data.outfiles.VCF, 'a')\n\n    ## what order do users want? The order in the original ref file?\n    ## Sorted by the size of chroms? that is the order in faidx.\n    ## If reference mapping then it's nice to sort the vcf data by\n    ## CHROM and POS. This is doing a very naive sort right now, so the\n    ## CHROM will be ordered, but not the pos within each chrom.\n    if data.paramsdict[\"assembly_method\"] in [\"reference\", \"denovo+reference\"]:\n        ## Some unix sorting magic to get POS sorted within CHROM\n        ## First you sort by POS (-k 2,2), then you do a `stable` sort \n        ## by CHROM. 
You end up with POS ordered and grouped correctly by CHROM\n        ## but relatively unordered CHROMs (locus105 will be before locus11).\n        cmd = [\"cat\"] + vcfchunks + [\" | sort -k 2,2 -n | sort -k 1,1 -s\"]\n        cmd = \" \".join(cmd)\n        proc = sps.Popen(cmd, shell=True, stderr=sps.STDOUT, stdout=writer, close_fds=True)\n    else:\n        proc = sps.Popen([\"cat\"] + vcfchunks, stderr=sps.STDOUT, stdout=writer, close_fds=True)\n\n    err = proc.communicate()[0]\n    if proc.returncode:\n        raise IPyradWarningExit(\"err in concat_vcf: %s\" % err)\n    writer.close()\n\n    for chunk in vcfchunks:\n        os.remove(chunk)"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\nfunctioning called within make_vcf to run a single chunk of vcf files", "response": "def vcfchunk(data, optim, sidx, chunk, full):\n    \"\"\"\n    Function called within make_vcf to run chunks on separate engines.\n    \"\"\"\n    ## empty array to be filled before writing\n    ## will not actually be optim*maxlen, extra needs to be trimmed\n    maxlen = data._hackersonly[\"max_fragment_length\"] + 20\n\n    ## get data sliced (optim chunks at a time)\n    hslice = [chunk, chunk+optim]\n\n    ## read all taxa from disk (faster), then subsample taxa with sidx and\n    ## keepmask to greatly reduce the memory load\n    with h5py.File(data.database, 'r') as co5:\n        afilt = co5[\"filters\"][hslice[0]:hslice[1], :]\n        keepmask = afilt.sum(axis=1) == 0\n        ## apply mask to edges\n        aedge = co5[\"edges\"][hslice[0]:hslice[1], :]\n        aedge = aedge[keepmask, :]\n    del afilt\n    ## same memory subsampling.\n    with h5py.File(data.clust_database, 'r') as io5:\n        ## apply mask to edges to aseqs and acatg\n        #aseqs = io5[\"seqs\"][hslice[0]:hslice[1], :, :].view(np.uint8)\n        ## need to read in seqs with upper b/c lowercase allele info\n        aseqs = np.char.upper(io5[\"seqs\"][hslice[0]:hslice[1], :, :]).view(np.uint8)\n        aseqs = aseqs[keepmask, :]\n        aseqs = aseqs[:, sidx, :]\n        acatg = io5[\"catgs\"][hslice[0]:hslice[1], :, :, :]\n        acatg = acatg[keepmask, :]\n        acatg = acatg[:, sidx, :, :]\n        achrom = io5[\"chroms\"][hslice[0]:hslice[1]]\n        achrom = achrom[keepmask, :]        \n\n    LOGGER.info('acatg.shape %s', acatg.shape)\n    ## to save memory some columns are stored in diff dtypes until printing\n    if not full:\n        with h5py.File(data.database, 'r') as co5:\n            snps = co5[\"snps\"][hslice[0]:hslice[1], :]\n            snps = snps[keepmask, :]\n            snps = snps.sum(axis=2)\n        
snpidxs = snps > 0\n        maxsnplen = snps.sum()\n\n    ## vcf info to fill, this is bigger than the actual array\n    nrows = maxsnplen\n    cols0 = np.zeros(nrows, dtype=np.int64) #h5py.special_dtype(vlen=bytes))\n    cols1 = np.zeros(nrows, dtype=np.uint32)\n    cols34 = np.zeros((nrows, 2), dtype=\"S5\")\n    cols7 = np.zeros((nrows, 1), dtype=\"S20\")\n\n    ## when nsamples is high this blows up memory (e.g., dim=(5M x 500))\n    ## so we'll instead create a list of arrays with 10 samples at a time.\n    ## maybe later replace this with a h5 array\n    tmph = os.path.join(data.dirs.outfiles, \".tmp.{}.h5\".format(hslice[0]))\n    htmp = h5py.File(tmph, 'w')\n    htmp.create_dataset(\"vcf\", shape=(nrows, sum(sidx)), dtype=\"S24\")\n\n    ## which loci passed all filters\n    init = 0\n\n    ## write loci that passed after trimming edges, then write snp string\n    locindex = np.where(keepmask)[0]\n    for iloc in xrange(aseqs.shape[0]):\n        edg = aedge[iloc]\n        ## grab all seqs between edges\n        if not 'pair' in data.paramsdict[\"datatype\"]:\n            seq = aseqs[iloc, :, edg[0]:edg[1]+1]\n            catg = acatg[iloc, :, edg[0]:edg[1]+1]\n            if not full:\n                snpidx = snpidxs[iloc, edg[0]:edg[1]+1]\n                seq = seq[:, snpidx]\n                catg = catg[:, snpidx]\n        else:\n            seq = np.hstack([aseqs[iloc, :, edg[0]:edg[1]+1],\n                             aseqs[iloc, :, edg[2]:edg[3]+1]])\n            catg = np.hstack([acatg[iloc, :, edg[0]:edg[1]+1],\n                              acatg[iloc, :, edg[2]:edg[3]+1]])\n            if not full:\n                snpidx = np.hstack([snpidxs[iloc, edg[0]:edg[1]+1],\n                                    snpidxs[iloc, edg[2]:edg[3]+1]])\n                seq = seq[:, snpidx]\n                catg = catg[:, snpidx]\n\n        ## empty arrs to fill\n        alleles = np.zeros((nrows, 4), dtype=np.uint8)\n        genos = np.zeros((seq.shape[1], 
sum(sidx)), dtype=\"S4\")\n        genos[:] = \"./.:\"\n\n        ## ----  build string array ----\n        pos = 0\n        ## If any < 0 this indicates an anonymous locus in denovo+ref assembly\n        if achrom[iloc][0] > 0:\n            pos = achrom[iloc][1]\n            cols0[init:init+seq.shape[1]] = achrom[iloc][0]\n            cols1[init:init+seq.shape[1]] = pos + np.where(snpidx)[0] + 1\n        else:\n            if full:\n                cols1[init:init+seq.shape[1]] = pos + np.arange(seq.shape[1]) + 1\n            else:\n                cols1[init:init+seq.shape[1]] = pos + np.where(snpidx)[0] + 1\n                cols0[init:init+seq.shape[1]] = (chunk + locindex[iloc] + 1) * -1\n\n        ## fill reference base\n        alleles = reftrick(seq, GETCONS)\n\n        ## get the info string column\n        tmp0 = np.sum(catg, axis=2)\n        tmp1 = tmp0 != 0\n        tmp2 = tmp1.sum(axis=1) > 0\n        nsamp = np.sum(tmp1, axis=0)\n        depth = np.sum(tmp0, axis=0)\n        list7 = [[\"NS={};DP={}\".format(i, j)] for i, j in zip(nsamp, depth)]\n        if list7:\n            cols7[init:init+seq.shape[1]] = list7\n\n        ## default fill cons sites where no variants\n        genos[tmp1.T] = \"0/0:\"\n\n        ## fill cons genotypes for sites with alt alleles for taxa in order\n        mask = alleles[:, 1] == 46\n        mask += alleles[:, 1] == 45\n\n        obs = alleles[~mask, :]\n        alts = seq[:, ~mask]\n        who = np.where(mask == False)[0]\n        ## fill variable sites\n        for site in xrange(alts.shape[1]):\n            bases = alts[:, site]\n            #LOGGER.info(\"bases %s\", bases)\n            ohere = obs[site][obs[site] != 0]\n            #LOGGER.info(\"ohere %s\", ohere)\n            alls = np.array([DCONS[i] for i in bases], dtype=np.uint32)\n            #LOGGER.info(\"all %s\", alls)\n            for jdx in xrange(ohere.shape[0]):\n                alls[alls == ohere[jdx]] = jdx\n\n            #LOGGER.info(\"all2 %s\", 
alls)\n            ## fill into array\n            for cidx in xrange(catg.shape[0]):\n                if tmp2[cidx]:\n                    if alls[cidx][0] < 5:\n                        genos[who[site], cidx] = \"/\".join(alls[cidx].astype(\"S1\").tolist())+\":\"\n                    else:\n                        genos[who[site], cidx] = \"./.:\"\n                    #LOGGER.info(\"genos filled: %s %s %s\", who[site], cidx, genos)\n\n        ## build geno+depth strings\n        ## for each taxon enter 4 catg values\n        fulltmp = np.zeros((seq.shape[1], catg.shape[0]), dtype=\"S24\")\n        for cidx in xrange(catg.shape[0]):\n            ## fill catgs from catgs\n            tmp0 = [str(i.sum()) for i in catg[cidx]]\n            tmp1 = [\",\".join(i) for i in catg[cidx].astype(\"S4\").tolist()]\n            tmp2 = [\"\".join(i+j+\":\"+k) for i, j, k in zip(genos[:, cidx], tmp0, tmp1)]\n            ## fill tmp allcidx\n            fulltmp[:, cidx] = tmp2\n        ## write to h5 for this locus\n        htmp[\"vcf\"][init:init+seq.shape[1], :] = fulltmp\n\n        cols34[init:init+seq.shape[1], 0] = alleles[:, 0].view(\"S1\")\n        cols34[init:init+seq.shape[1], 1] = [\",\".join([j for j in i if j]) \\\n                                    for i in alleles[:, 1:].view(\"S1\").tolist()]\n\n        ## advance counter\n        init += seq.shape[1]\n\n    ## trim off empty rows if they exist\n    withdat = cols0 != 0\n    tot = withdat.sum()\n\n    ## get scaffold names\n    faidict = {}\n    if (data.paramsdict[\"assembly_method\"] in [\"reference\", \"denovo+reference\"]) and \\\n       (os.path.exists(data.paramsdict[\"reference_sequence\"])):\n        fai = pd.read_csv(data.paramsdict[\"reference_sequence\"] + \".fai\", \n                names=['scaffold', 'size', 'sumsize', 'a', 'b'],\n                sep=\"\\t\")\n        faidict = {i+1:j for i,j in enumerate(fai.scaffold)}\n    try:\n        ## This is hax, but it's the only way it will work. 
The faidict uses positive numbers\n        ## for reference sequence mapped loci for the CHROM/POS info, and it uses negative\n        ## numbers for anonymous loci. Both are 1 indexed, which is where that last `+ 2` comes from.\n        faidict.update({-i:\"locus_{}\".format(i-1) for i in xrange(chunk+1, chunk + optim + 2)})\n        chroms = [faidict[i] for i in cols0]\n    except Exception as inst:\n        LOGGER.error(\"Invalid chromosome dictionary indexwat: {}\".format(inst))\n        LOGGER.debug(\"faidict {}\".format([str(k)+\"/\"+str(v) for k, v in faidict.items() if \"locus\" in v]))\n        LOGGER.debug(\"chroms {}\".format([x for x in cols0 if x < 0]))\n        raise\n    cols0 = np.array(chroms)\n    #else:\n    #    cols0 = np.array([\"locus_{}\".format(i) for i in cols0-1])\n\n    ## Only write if there is some data that passed filtering\n    if tot:\n        LOGGER.debug(\"Writing data to vcf\")\n        if not full:\n            writer = open(data.outfiles.vcf+\".{}\".format(chunk), 'w')\n        else:\n            writer = gzip.open(data.outfiles.vcf+\".{}\".format(chunk), 'w')\n\n        try:\n            ## write in iterations b/c it can be freakin huge.\n            ## for cols0 and cols1 the 'newaxis' slice and the transpose\n            ## are for turning the 1d arrays into column vectors.\n            np.savetxt(writer,\n                np.concatenate(\n                   (cols0[:tot][np.newaxis].T,\n                    cols1[:tot][np.newaxis].T,\n                    np.array([[\".\"]]*tot, dtype=\"S1\"),\n                    cols34[:tot, :],\n                    np.array([[\"13\", \"PASS\"]]*tot, dtype=\"S4\"),\n                    cols7[:tot, :],\n                    np.array([[\"GT:DP:CATG\"]]*tot, dtype=\"S10\"),\n                    htmp[\"vcf\"][:tot, :],\n                    ),\n                    axis=1),\n                delimiter=\"\\t\", fmt=\"%s\")\n        except Exception as inst:\n            LOGGER.error(\"Error building 
vcf file - {}\".format(inst))\n            raise\n        writer.close()\n\n    ## close and remove tmp h5\n    htmp.close()\n    os.remove(tmph)"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\nreturn the most common base at each site in order.", "response": "def reftrick(iseq, consdict):\n    \"\"\" Returns the most common base at each site in order. \"\"\"\n\n    altrefs = np.zeros((iseq.shape[1], 4), dtype=np.uint8)\n    altrefs[:, 1] = 46\n\n    for col in xrange(iseq.shape[1]):\n        ## expand colums with ambigs and remove N-\n        fcounts = np.zeros(111, dtype=np.int64)\n        counts = np.bincount(iseq[:, col])#, minlength=90)\n        fcounts[:counts.shape[0]] = counts\n        ## set N and - to zero, wish numba supported minlen arg\n        fcounts[78] = 0\n        fcounts[45] = 0\n        ## add ambig counts to true bases\n        for aidx in xrange(consdict.shape[0]):\n            nbases = fcounts[consdict[aidx, 0]]\n            for _ in xrange(nbases):\n                fcounts[consdict[aidx, 1]] += 1\n                fcounts[consdict[aidx, 2]] += 1\n            fcounts[consdict[aidx, 0]] = 0\n\n        ## now get counts from the modified counts arr\n        who = np.argmax(fcounts)\n        altrefs[col, 0] = who\n        fcounts[who] = 0\n\n        ## if an alt allele fill over the \".\" placeholder\n        who = np.argmax(fcounts)\n        if who:\n            altrefs[col, 1] = who\n            fcounts[who] = 0\n\n            ## if 3rd or 4th alleles observed then add to arr\n            who = np.argmax(fcounts)\n            altrefs[col, 2] = who\n            fcounts[who] = 0\n\n            ## if 3rd or 4th alleles observed then add to arr\n            who = np.argmax(fcounts)\n            altrefs[col, 3] = who\n\n    return altrefs"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef vcfheader(data, names, ofile):\n    ## choose reference string\n    if data.paramsdict[\"reference_sequence\"]:\n        reference = data.paramsdict[\"reference_sequence\"]\n    else:\n        reference = \"pseudo-reference (most common base at site)\"\n\n\n    ##FILTER=<ID=minCov,Description=\"Data shared across <{mincov} samples\">\n    ##FILTER=<ID=maxSH,Description=\"Heterozygous site shared across >{maxsh} samples\">\n    header = \"\"\"\\\n##fileformat=VCFv4.0\n##fileDate={date}\n##source=ipyrad_v.{version}\n##reference={reference}\n##phasing=unphased\n##INFO=<ID=NS,Number=1,Type=Integer,Description=\"Number of Samples With Data\">\n##INFO=<ID=DP,Number=1,Type=Integer,Description=\"Total Depth\">\n##FORMAT=<ID=GT,Number=1,Type=String,Description=\"Genotype\">\n##FORMAT=<ID=DP,Number=1,Type=Integer,Description=\"Read Depth\">\n##FORMAT=<ID=CATG,Number=1,Type=String,Description=\"Base Counts (CATG)\">\n#CHROM\\tPOS\\tID\\tREF\\tALT\\tQUAL\\tFILTER\\tINFO\\tFORMAT\\t{names}\n\"\"\".format(date=time.strftime(\"%Y/%m/%d\"),\n           version=__version__,\n           reference=os.path.basename(reference),\n           mincov=data.paramsdict[\"min_samples_locus\"],\n           maxsh=data.paramsdict[\"max_shared_Hs_locus\"],\n           names=\"\\t\".join(names))\n    ## WRITE\n    ofile.write(header)", "response": "Write the VCF header, including the sample-name column line, to the given output file handle."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef loci2bpp(name, locifile, imap, guidetree,\n    minmap=None,\n    maxloci=None,\n    infer_sptree=0,\n    infer_delimit=0,\n    delimit_alg=(0, 5),\n    seed=12345,\n    burnin=1000,\n    nsample=10000,\n    sampfreq=2,\n    thetaprior=(5, 5),\n    tauprior=(4, 2, 1),\n    traits_df=None,\n    nu=0,\n    kappa=0,\n    useseqdata=1,\n    usetraitdata=1,\n    cleandata=0,\n    wdir=None,\n    finetune=(0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01),\n    verbose=0):\n    \"\"\"\n    Converts loci file format to bpp file format, i.e., concatenated phylip-like\n    format, and produces imap and ctl input files for bpp.\n\n    Parameters:\n    -----------\n    name:\n        A prefix name for output files that will be produced\n    locifile:\n        A .loci file produced by ipyrad.\n    imap:\n        A Python dictionary with 'species' names as keys, and lists of sample\n        names for the values. Any sample that is not included in the imap\n        dictionary will be filtered out of the data when converting the .loci\n        file into the bpp formatted sequence file. Each species in the imap\n        dictionary must also be present in the input 'guidetree'.\n    guidetree:\n        A newick string species tree hypothesis [e.g., (((a,b),(c,d)),e);]\n        All species in the imap dictionary must also be present in the guidetree\n\n    Optional parameters:\n    --------------------\n    infer_sptree:\n        Default=0, only infer parameters on a fixed species tree. If 1, then the\n        input tree is treated as a guidetree and tree search is employed to find\n        the best tree. The results will include support values for the inferred\n        topology.\n    infer_delimit:\n        Default=0, no delimitation. 
If 1 then splits in the tree that separate\n        'species' will be collapsed to test whether fewer species are a better\n        fit to the data than the number in the input guidetree.\n    delimit_alg:\n        Species delimitation algorithm. This is a tuple. The first value\n        is the algorithm (0 or 1) and the following values are arguments\n        for the given algorithm. See other ctl files for examples of what the\n        delimitation line looks like. This is where you can enter the params\n        (e.g., alpha, migration) for the two different algorithms.\n        For example, the following args would produce the following ctl lines:\n           alg=0, epsilon=5\n           > delimit_alg = (0, 5)\n           speciesdelimitation = 1 0 5\n\n           alg=1, alpha=2, migration=1\n           > delimit_alg = (1, 2, 1)\n           speciesdelimitation = 1 1 2 1\n\n           alg=1, alpha=2, migration=1, diagnosis=0, ?=1\n           > delimit_alg = (1, 2, 1, 0, 1)\n           speciesdelimitation = 1 1 2 1 0 1\n    seed:\n        A random number seed at start of analysis.\n    burnin:\n        Number of burnin generations in mcmc\n    nsample:\n        Number of mcmc generations to run.\n    sampfreq:\n        How often to sample from the mcmc chain.\n    thetaprior:\n        Prior on theta (4Neu), gamma distributed. mean = a/b. e.g., (5, 5)\n    tauprior\n        Prior on root tau, gamma distributed mean = a/b. Last number is \n        dirichlet prior for other taus. e.g., (4, 2, 1)\n    traits_df:\n        A pandas DataFrame with trait data properly formatted. This means only\n        quantitative traits are included, and missing values are NaN.\n        The first column contains sample names, with \"Indiv\" as the header.\n        The following columns have a header row with trait names. 
This script\n        will write a CSV trait file with trait values mean-standardized, with\n        NaN replaced by \"NA\", and with sample not present in IMAP removed.\n    nu:\n        A prior on phenotypic trait variance (0) for iBPP analysis.\n    kappa:\n        A prior on phenotypic trait mean (0) for iBPP analysis.\n    useseqdata:\n        If false inference proceeds without sequence data (can be used to test\n        the effect of priors on the tree distributions).\n    usetraitdata:\n        If false inference proceeds without trait data (can be used to test\n        the effect of priors on the trait distributions).\n    cleandata:\n        If 1 then sites with missing or hetero characters are removed.\n    wdir:\n        A working directory to write files to.\n    finetune:\n        See bpp documentation.\n    verbose:\n        If verbose=1 the ctl file text will also be written to screen (stderr).\n\n    \"\"\"\n    ## check args\n    if not imap:\n        raise IPyradWarningExit(IMAP_REQUIRED)\n    if minmap:\n        if minmap.keys() != imap.keys():\n            raise IPyradWarningExit(KEYS_DIFFER)\n\n    ## working directory, make sure it exists\n    if wdir:\n        wdir = os.path.abspath(wdir)\n        if not os.path.exists(wdir):\n            raise IPyradWarningExit(\" working directory (wdir) does not exist\")\n    else:\n        wdir = os.path.curdir\n\n    ## if traits_df then we make '.ibpp' files\n    prog = 'bpp'\n    if isinstance(traits_df, pd.DataFrame):\n        prog = 'ibpp'\n    outfile = OPJ(wdir, \"{}.{}.seq.txt\".format(name, prog))\n    mapfile = OPJ(wdir, \"{}.{}.imap.txt\".format(name, prog))\n\n    ## open outhandles\n    fout = open(outfile, 'w')\n    fmap = open(mapfile, 'w')\n\n    ## parse the loci file\n    with open(locifile, 'r') as infile:\n        ## split on \"//\" for legacy compatibility\n        loci = infile.read().strip().split(\"|\\n\")\n        nloci = len(loci)\n\n    ## all samples\n    samples = 
list(itertools.chain(*imap.values()))\n\n    ## iterate over loci, printing to outfile\n    nkept = 0\n    for iloc in xrange(nloci):\n        lines = loci[iloc].split(\"//\")[0].split()\n        names = lines[::2]\n        names = [\"^\"+i for i in names]\n        seqs = [list(i) for i in lines[1::2]]\n        seqlen = len(seqs[0])\n\n        ## whether to skip this locus based on filters below\n        skip = 0\n\n        ## if minmap filter for sample coverage\n        if minmap:\n            covd = {}\n            for group, vals in imap.items():\n                covd[group] = sum([\"^\"+i in names for i in vals])\n            ## check that coverage is good enough\n            if not all([covd[group] >= minmap[group] for group in minmap]):\n                skip = 1\n\n        ## too many loci?\n        if maxloci:\n            if nkept >= maxloci:\n                skip = 1\n\n        ## build locus as a string\n        if not skip:\n            ## convert to phylip with caret starter and replace - with N.\n            data = [\"{:<30} {}\".format(i, \"\".join(k).replace(\"-\", \"N\")) for \\\n                (i, k) in zip(names, seqs) if i[1:] in samples]\n\n            ## if not empty, write to the file\n            if data:\n                fout.write(\"{} {}\\n\\n{}\\n\\n\"\\\n                           .format(len(data), seqlen, \"\\n\".join(data)))\n                nkept += 1\n\n    ## close up shop\n    fout.close()\n\n    ## write the imap file:\n    data = [\"{:<30} {}\".format(val, key) for key \\\n            in sorted(imap) for val in imap[key]]\n    fmap.write(\"\\n\".join(data))\n    fmap.close()\n\n    ## write ctl file\n    write_ctl(name, imap, guidetree, nkept,\n              infer_sptree, infer_delimit, delimit_alg,\n              seed, burnin, nsample, sampfreq,\n              thetaprior, tauprior, traits_df, nu, kappa,\n              cleandata, useseqdata, usetraitdata, wdir,\n              finetune, verbose)\n\n    ## print message?\n    
sys.stderr.write(\"new files created ({} loci, {} species, {} samples)\\n\"\\\n                     .format(nkept, len(imap.keys()),\n                             sum([len(i) for i in imap.values()])))\n    sys.stderr.write(\"  {}.{}.seq.txt\\n\".format(name, prog))\n    sys.stderr.write(\"  {}.{}.imap.txt\\n\".format(name, prog))\n    sys.stderr.write(\"  {}.{}.ctl.txt\\n\".format(name, prog))\n    if isinstance(traits_df, pd.DataFrame):\n        sys.stderr.write(\"  {}.{}.traits.txt\\n\".format(name, prog))\n\n    ## return the ctl file string\n    return os.path.abspath(\n        \"{}.{}.ctl.txt\".format(OPJ(wdir, name), prog))", "response": "Converts a .loci file into bpp (or ibpp, if a traits DataFrame is given) input files: writes the sequence file, the imap file, and the ctl file to the working directory, and returns the absolute path to the ctl file."}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nwrite a control file", "response": "def write_ctl(name, imap, guidetree, nloci,\n              infer_sptree, infer_delimit, delimit_alg,\n              seed, burnin, nsample, sampfreq,\n              thetaprior, tauprior, traits_df, nu0, kappa0,\n              cleandata, useseqdata, usetraitdata, wdir,\n              finetune, verbose):\n\n    \"\"\" write outfile with any args in argdict \"\"\"\n\n    ## A string to store ctl info\n    ctl = []\n\n    ## check the tree (can do this better once we install ete3 w/ ipyrad)\n    if not guidetree.endswith(\";\"):\n        guidetree += \";\"\n\n    ## if traits_df then we make '.ibpp' files\n    prog = 'bpp'\n    if isinstance(traits_df, pd.DataFrame):\n        prog = 'ibpp'\n\n    ## write the top header info\n    ctl.append(\"seed = {}\".format(seed))\n    ctl.append(\"seqfile = {}.{}.seq.txt\".format(OPJ(wdir, name), prog))\n    ctl.append(\"Imapfile = {}.{}.imap.txt\".format(OPJ(wdir, name), prog))\n    ctl.append(\"mcmcfile = {}.{}.mcmc.txt\".format(OPJ(wdir, name), prog))\n    ctl.append(\"outfile = {}.{}.out.txt\".format(OPJ(wdir, name), prog))\n    if isinstance(traits_df, pd.DataFrame):\n        ctl.append(\"traitfile = {}.{}.traits.txt\".format(OPJ(wdir, name), prog))\n\n    ## number of loci (checks that seq file exists and parses from there)\n    ctl.append(\"nloci = {}\".format(nloci))\n    ctl.append(\"usedata = {}\".format(useseqdata))\n    ctl.append(\"cleandata = {}\".format(cleandata))\n\n    ## infer species tree\n    if infer_sptree:\n        ctl.append(\"speciestree = 1 0.4 0.2 0.1\")\n    else:\n        ctl.append(\"speciestree = 0\")\n\n    ## infer delimitation (with algorithm 1 by default)\n    ctl.append(\"speciesdelimitation = {} {} {}\"\\\n               .format(infer_delimit, delimit_alg[0],\n                       \" \".join([str(i) for i in delimit_alg[1:]])))\n\n    ## if using iBPP (if not 
traits_df, we assume you're using bpp (v.3.3+)\n    if isinstance(traits_df, pd.DataFrame):\n        ## check that the data frame is properly formatted\n        try:\n            traits_df.values.astype(float)\n        except Exception:\n            raise IPyradWarningExit(PDREAD_ERROR)\n\n        ## subsample to keep only samples that are in IMAP, we do not need to\n        ## standarize traits b/c ibpp does that for us.\n        samples = sorted(list(itertools.chain(*imap.values())))\n        didx = [list(traits_df.index).index(i) for i in traits_df.index \\\n                if i not in samples]\n        dtraits = traits_df.drop(traits_df.index[didx])\n\n        ## mean standardize traits values after excluding samples\n        straits = dtraits.apply(lambda x: (x - x.mean()) / (x.std()))\n\n        ## convert NaN to \"NA\" cuz that's what ibpp likes, and write to file\n        ftraits = straits.fillna(\"NA\")\n        traitdict = ftraits.T.to_dict(\"list\")\n\n        ## get reverse imap dict\n        rev = {val:key for key in sorted(imap) for val in imap[key]}\n\n        ## write trait file\n        traitfile = \"{}.{}.traits.txt\".format(os.path.join(wdir, name), prog)\n        with open(traitfile, 'w') as tout:\n            tout.write(\"Indiv\\n\")\n            tout.write(\"\\t\".join(\n                ['Species'] + list(ftraits.columns))+\"\\n\"\n                )\n            #for key in sorted(traitdict):\n            #    tout.write(\"\\t\".join([key, rev[key]] + \\\n            #        [\"^\"+str(i) for i in traitdict[key]])+\"\\n\"\n            #        )\n            nindT = 0\n            for ikey in sorted(imap.keys()):\n                samps = imap[ikey]\n                for samp in sorted(samps):\n                    if samp in traitdict:\n                        tout.write(\"\\t\".join([samp, rev[samp]] + \\\n                            [str(i) for i in traitdict[samp]])+\"\\n\"\n                        )\n                        nindT += 1\n\n   
     #    tout.write(\"Indiv\\n\"+\"\\t\".join([\"Species\"]+\\\n        #    [\"t_{}\".format(i) for i in range(len(traitdict.values()[0]))])+\"\\n\")\n        #    for key in sorted(traitdict):\n        #        print >>tout, \"\\t\".join([key, rev[key]] + \\\n        #                                [str(i) for i in traitdict[key]])\n        #ftraits.to_csv(traitfile)\n\n        ## write ntraits and nindT and traitfilename\n        ctl.append(\"ntraits = {}\".format(traits_df.shape[1]))\n        ctl.append(\"nindT = {}\".format(nindT))  #traits_df.shape[0]))\n        ctl.append(\"usetraitdata = {}\".format(usetraitdata))\n        ctl.append(\"useseqdata = {}\".format(useseqdata))\n\n        ## trait priors\n        ctl.append(\"nu0 = {}\".format(nu0))\n        ctl.append(\"kappa0 = {}\".format(kappa0))\n\n        ## remove ibpp incompatible options\n        ctl.remove(\"usedata = {}\".format(useseqdata))\n        ctl.remove(\"speciestree = {}\".format(infer_sptree))\n\n    ## get tree values\n    nspecies = str(len(imap))\n    species = \" \".join(sorted(imap))\n    ninds = \" \".join([str(len(imap[i])) for i in sorted(imap)])\n\n    ## write the tree\n    ctl.append(\"\"\"\\\nspecies&tree = {} {}\n                 {}\n                 {}\"\"\".format(nspecies, species, ninds, guidetree))\n\n\n    ## priors\n    ctl.append(\"thetaprior = {} {}\".format(*thetaprior))\n    ctl.append(\"tauprior = {} {} {}\".format(*tauprior))\n\n    ## other values, fixed for now\n    ctl.append(\"finetune = 1: {}\".format(\" \".join([str(i) for i in finetune])))\n    #CTL.append(\"finetune = 1: 1 0.002 0.01 0.01 0.02 0.005 1.0\")\n    ctl.append(\"print = 1 0 0 0\")\n    ctl.append(\"burnin = {}\".format(burnin))\n    ctl.append(\"sampfreq = {}\".format(sampfreq))\n    ctl.append(\"nsample = {}\".format(nsample))\n\n    ## write out the ctl file\n    with open(\"{}.{}.ctl.txt\".format(OPJ(wdir, name), prog), 'w') as out:\n        out.write(\"\\n\".join(ctl))\n\n    ## if verbose 
print ctl\n    if verbose:\n        sys.stderr.write(\"ctl file\\n--------\\n\"+\"\\n\".join(ctl)+\"\\n--------\\n\\n\")"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\ncollapsing outgroup in ete Tree for easier viewing", "response": "def _collapse_outgroup(tree, taxdicts):\n    \"\"\" collapse outgroup in ete Tree for easier viewing \"\"\"\n    ## check that all tests have the same outgroup\n    outg = taxdicts[0][\"p4\"]\n    if not all([i[\"p4\"] == outg for i in taxdicts]):\n        raise Exception(\"all tests must share the same outgroup (p4)\")\n   \n    ## prune tree, keep only one sample from outgroup\n    tre = ete.Tree(tree.write(format=1)) #tree.copy(method=\"deepcopy\")\n    alltax = [i for i in tre.get_leaf_names() if i not in outg]\n    alltax += [outg[0]]\n    tre.prune(alltax)\n    tre.search_nodes(name=outg[0])[0].name = \"outgroup\"\n    tre.ladderize()\n\n    ## remove other outgroups from taxdicts\n    taxd = copy.deepcopy(taxdicts)\n    newtaxdicts = []\n    for test in taxd:\n        #test[\"p4\"] = [outg[0]]\n        test[\"p4\"] = [\"outgroup\"]\n        newtaxdicts.append(test)\n\n    return tre, newtaxdicts"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef _decompose_tree(ttree, orient='right', use_edge_lengths=True): \n\n    ## set attributes\n    ttree._orient = orient\n    ttree._use_edge_lengths = use_edge_lengths\n    ult = use_edge_lengths == False\n\n    ## map numeric values to internal nodes from root to tips\n    names = {}\n    idx = 0\n    for node in ttree.tree.traverse(\"preorder\"):\n        if not node.is_leaf():\n            if node.name:\n                names[idx] = node.name\n            else:\n                names[idx] = idx\n                node.name = str(idx)\n            node.idx = idx\n            idx += 1\n            \n    ## map number to the tips, these will be the highest numbers\n    for node in ttree.tree.get_leaves(): \n        names[idx] = node.name\n        node.idx = idx\n        idx += 1\n\n    ## create empty edges and coords arrays\n    ttree.node_labels = names\n    ttree.tip_labels = ttree.tree.get_leaf_names()\n    #self.tip_labels = self.tree.get_leaf_names()[::-1]\n    #self.node_labels = self.names\n    ttree.edges = np.zeros((idx - 1, 2), dtype=int)\n    ttree.verts = np.zeros((idx, 2), dtype=float)\n    ttree._lines = []        # np.zeros((ntips-1), dtype=int)\n    ttree._coords = []       # np.zeros((idx * 2 - ntips), dtype=float)\n\n    ## postorder: first children and then parents. This moves up the list .\n    nidx = 0\n    tip_num = len(ttree.tree.get_leaves()) - 1\n    \n    ## tips to root to fill in the verts and edges\n    for node in ttree.tree.traverse(\"postorder\"):\n        if node.is_leaf():\n            ## set the xy-axis positions of the tips\n            node.y = ttree.tree.get_distance(node)\n            if ult:\n                node.y = 0. 
\n            node.x = tip_num\n            tip_num -= 1\n            \n            ## edges connect this vert to\n            ttree.verts[node.idx] = [node.x, node.y]\n            ttree.edges[nidx] = [node.up.idx, node.idx]\n\n        elif node.is_root():\n            node.y = ttree.tree.get_distance(node)\n            if ult:\n                node.y = -1 * node.get_farthest_leaf(True)[1] - 1\n            node.x = sum(i.x for i in node.children) / float(len(node.children))\n            ttree.verts[node.idx] = [node.x, node.y]\n        \n        else:\n            ## create new nodes left and right\n            node.y = ttree.tree.get_distance(node)\n            if ult:\n                node.y = -1 * node.get_farthest_leaf(True)[1] - 1\n            node.x = sum(i.x for i in node.children) / float(len(node.children))\n            ttree.edges[nidx, :] = [node.up.idx, node.idx]\n            ttree.verts[node.idx] = [node.x, node.y]\n        nidx += 1\n        \n    ## root to tips to fill in the coords and lines\n    cidx = 0\n    for node in ttree.tree.traverse():\n        ## add yourself\n        if not node.is_leaf():\n            ttree._coords += [[node.x, node.y]]\n            pidx = cidx\n            cidx += 1\n            for child in node.children:\n                ## add children\n                ttree._coords += [[child.x, node.y], [child.x, child.y]]\n                ttree._lines += [[pidx, cidx]]    ## connect yourself to newx\n                ttree._lines += [[cidx, cidx+1]]  ## connect newx to child\n                cidx += 2\n    ttree._coords = np.array(ttree._coords, dtype=float)\n    ttree._lines = np.array(ttree._lines, dtype=int)\n\n    ## invert for sideways trees\n    if ttree._orient in ['up', 0]:\n        pass\n    if ttree._orient in ['left', 1]:\n        ttree.verts[:, 1] = ttree.verts[:, 1] * -1\n        ttree.verts = ttree.verts[:, [1, 0]]\n        ttree._coords[:, 1] = ttree._coords[:, 1] * -1\n        ttree._coords = ttree._coords[:, [1, 
0]]\n    if ttree._orient in ['down', 0]:\n        ttree.verts[:, 1] = ttree.verts[:, 1] * -1\n        ttree._coords[:, 1] = ttree._coords[:, 1] * -1\n    if ttree._orient in ['right', 3]:\n        ttree.verts = ttree.verts[:, [1, 0]]\n        ttree._coords = ttree._coords[:, [1, 0]]", "response": "Decomposes a tree into the node labels, tip labels, edge and vertex arrays, and line coordinates used for plotting, then flips or transposes the coordinates to match the requested orientation."}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\ndrawing the tree using toyplot. graph. MacroMacro", "response": "def draw(\n        self, \n        show_tip_labels=True, \n        show_node_support=False,\n        use_edge_lengths=False, \n        orient=\"right\",\n        print_args=False,\n        *args,\n        **kwargs):\n        \"\"\"\n        plot the tree using toyplot.graph. \n\n        Parameters:\n        -----------\n            show_tip_labels: bool\n                Show tip names from tree.\n            use_edge_lengths: bool\n                Use edge lengths from newick tree.\n            show_node_support: bool\n                Show support values at nodes using a set of default \n                options. \n\n            ...\n        \"\"\"\n        ## re-decompose tree for new orient and edges args\n        self._decompose_tree(orient=orient, use_edge_lengths=use_edge_lengths)\n\n        ## update kwargs with entered args and all other kwargs\n        dwargs = {}\n        dwargs[\"show_tip_labels\"] = show_tip_labels\n        dwargs[\"show_node_support\"] = show_node_support\n        dwargs.update(kwargs)\n\n        ## pass to panel plotter\n        canvas, axes, panel = tree_panel_plot(self, print_args, **dwargs)\n        return canvas, axes, panel"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef tree_panel_plot(ttree,\n    print_args=False,\n    *args, \n    **kwargs):\n    \"\"\"\n    signature...\n    \"\"\"\n\n    ## create Panel plot object and set height & width\n    panel = Panel(ttree)          #tree, edges, verts, names)\n    if not kwargs.get(\"width\"):\n        panel.kwargs[\"width\"] = min(1000, 25*len(panel.tree))\n    if not kwargs.get(\"height\"):\n        panel.kwargs[\"height\"] = panel.kwargs[\"width\"]\n\n    ## update defaults with kwargs & update size based on ntips & ntests\n    panel.kwargs.update(kwargs)\n\n    ## magic node label arguments overrides others\n    if panel.kwargs[\"show_node_support\"]:\n        nnodes = sum(1 for i in panel.tree.traverse()) - len(panel.tree)\n        ## set node values\n        supps = [int(panel.tree.search_nodes(idx=j)[0].support) \\\n                 for j in range(nnodes)]\n        if not panel.kwargs[\"vsize\"]:\n            panel.kwargs[\"vsize\"] = 20\n        sizes = [panel.kwargs[\"vsize\"] for j in range(nnodes)]\n        ## add leaf values\n        supps += [\"\"] * len(panel.tree)\n        sizes += [0] * len(panel.tree)\n        ## override args\n        panel.kwargs[\"vlabel\"] = supps\n        panel.kwargs[\"vsize\"] = sizes\n        panel.kwargs[\"vlshow\"] = True\n        #panel.kwargs[\"vmarker\"] = 's'  ## square\n        ## if unrooted then hide root node scores\n        if len(panel.tree.children) > 2:\n            supps[0] = \"\"\n            sizes[0] = 0\n        #print(panel.kwargs[\"vlabels\"])\n        #print(panel.kwargs[\"vsize\"])\n    elif panel.kwargs.get(\"vlabel\"):\n        panel.kwargs[\"vlabel\"] = panel.kwargs[\"vlabel\"]\n        panel.kwargs[\"vlshow\"] = True\n    else:\n        panel.kwargs[\"vlabel\"] = panel.node_labels.keys() #names.keys()\n\n    ## debugger / see all options\n    if print_args:\n        print(panel.kwargs)\n\n    ## maybe add 
panels for plotting tip traits in the future\n    ## ...\n\n    ## create a canvas and a single cartesian coord system\n    canvas = toyplot.Canvas(height=panel.kwargs['height'], width=panel.kwargs['width'])\n    axes = canvas.cartesian(bounds=(\"10%\", \"90%\", \"10%\", \"90%\"))    \n    axes.show = panel.kwargs[\"show_axes\"]\n    \n    ## add panel plots to the axis\n    panel._panel_tree(axes)\n    if panel.kwargs[\"show_tip_labels\"]:\n        panel._panel_tip_labels(axes)\n\n    return canvas, axes, panel", "response": "create a Panel object and plot it"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef get_quick_depths(data, sample):\n\n    ## use existing sample cluster path if it exists, since this\n    ## func can be used in step 4 and that can occur after merging\n    ## assemblies after step3, and if we then referenced by data.dirs.clusts\n    ## the path would be broken.\n    ##\n    ## If branching at step 3 to test different clust thresholds, the\n    ## branched samples will retain the samples.files.clusters of the\n    ## parent (which have the clust_threshold value of the parent), so\n    ## it will look like nothing has changed. If we call this func\n    ## from step 3 then it indicates we are in a branch and should\n    ## reset the sample.files.clusters handle to point to the correct\n    ## data.dirs.clusts directory. See issue #229.\n    ## Easier to just always trust that samples.files.clusters is right,\n    ## no matter what step?\n    #if sample.files.clusters and not sample.stats.state == 3:\n    #    pass\n    #else:\n    #    ## set cluster file handles\n    sample.files.clusters = os.path.join(\n         data.dirs.clusts, sample.name+\".clustS.gz\")\n\n    ## get new clustered loci\n    fclust = data.samples[sample.name].files.clusters\n    clusters = gzip.open(fclust, 'r')\n    pairdealer = itertools.izip(*[iter(clusters)]*2)\n\n    ## storage\n    depths = []\n    maxlen = []\n\n    ## start with cluster 0\n    tdepth = 0\n    tlen = 0\n\n    ## iterate until empty\n    while 1:\n        ## grab next\n        try:\n            name, seq = pairdealer.next()\n        except StopIteration:\n            break\n\n        ## if not the end of a cluster\n        #print name.strip(), seq.strip()\n        if name.strip() == seq.strip():\n            depths.append(tdepth)\n            maxlen.append(tlen)\n            tlen = 0\n            tdepth = 0\n\n        else:\n            tdepth += int(name.split(\";\")[-2][5:])\n            tlen 
= len(seq)\n\n    ## return\n    clusters.close()\n    return np.array(maxlen), np.array(depths)", "response": "Reads a sample's .clustS.gz cluster file and returns two arrays with one entry per cluster: the sequence length and the total read depth."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef sample_cleanup(data, sample):\n\n    ## get maxlen and depths array from clusters\n    maxlens, depths = get_quick_depths(data, sample)\n\n    try:\n        depths.max()\n    except ValueError:\n        ## If depths is an empty array max() will raise\n        print(\"    no clusters found for {}\".format(sample.name))\n        return\n\n    ## Test if depths is non-empty, but just full of zeros.\n    if depths.max():\n        ## store which min was used to calculate hidepth here\n        sample.stats_dfs.s3[\"hidepth_min\"] = data.paramsdict[\"mindepth_majrule\"]\n\n        ## If our longest sequence is longer than the current max_fragment_length\n        ## then update max_fragment_length. For assurance we require that\n        ## max len is 4 greater than maxlen, to allow for pair separators.\n        hidepths = depths >= data.paramsdict[\"mindepth_majrule\"]\n        maxlens = maxlens[hidepths]\n\n        ## Handle the case where there are no hidepth clusters\n        if maxlens.any():\n            maxlen = int(maxlens.mean() + (2.*maxlens.std()))\n        else:\n            maxlen = 0\n        if maxlen > data._hackersonly[\"max_fragment_length\"]:\n            data._hackersonly[\"max_fragment_length\"] = maxlen + 4\n\n        ## make sense of stats\n        keepmj = depths[depths >= data.paramsdict[\"mindepth_majrule\"]]\n        keepstat = depths[depths >= data.paramsdict[\"mindepth_statistical\"]]\n\n        ## sample summary stat assignments\n        sample.stats[\"state\"] = 3\n        sample.stats[\"clusters_total\"] = depths.shape[0]\n        sample.stats[\"clusters_hidepth\"] = keepmj.shape[0]\n\n        ## store depths histogram as a dict. 
Limit to first 25 bins\n        bars, bins = np.histogram(depths, bins=range(1, 26))\n        sample.depths = {int(i):v for i, v in zip(bins, bars) if v}\n\n        ## sample stat assignments\n        ## Trap numpy warnings (\"mean of empty slice\") printed by samples\n        ## with few reads.\n        with warnings.catch_warnings():\n            warnings.simplefilter(\"ignore\", category=RuntimeWarning)\n            sample.stats_dfs.s3[\"merged_pairs\"] = sample.stats.reads_merged\n            sample.stats_dfs.s3[\"clusters_total\"] = depths.shape[0]\n            try:\n                sample.stats_dfs.s3[\"clusters_hidepth\"] = int(sample.stats[\"clusters_hidepth\"])\n            except ValueError:\n                ## Handle clusters_hidepth == NaN\n                sample.stats_dfs.s3[\"clusters_hidepth\"] = 0\n            sample.stats_dfs.s3[\"avg_depth_total\"] = depths.mean()\n            LOGGER.debug(\"total depth {}\".format(sample.stats_dfs.s3[\"avg_depth_total\"]))\n            sample.stats_dfs.s3[\"avg_depth_mj\"] = keepmj.mean()\n            LOGGER.debug(\"mj depth {}\".format(sample.stats_dfs.s3[\"avg_depth_mj\"]))\n            sample.stats_dfs.s3[\"avg_depth_stat\"] = keepstat.mean()\n            sample.stats_dfs.s3[\"sd_depth_total\"] = depths.std()\n            sample.stats_dfs.s3[\"sd_depth_mj\"] = keepmj.std()\n            sample.stats_dfs.s3[\"sd_depth_stat\"] = keepstat.std()\n\n    else:\n        print(\"    no clusters found for {}\".format(sample.name))\n\n    ## Get some stats from the bam files\n    ## This is moderately hackish. 
samtools flagstat returns\n    ## the number of reads in the bam file as the first element\n    ## of the first line, this call makes this assumption.\n    if not data.paramsdict[\"assembly_method\"] == \"denovo\":\n        refmap_stats(data, sample)\n\n    log_level = logging.getLevelName(LOGGER.getEffectiveLevel())\n\n    if not log_level == \"DEBUG\":\n        ## Clean up loose files only if not in DEBUG\n        ##- edits/*derep, utemp, *utemp.sort, *htemp, *clust.gz\n        derepfile = os.path.join(data.dirs.edits, sample.name+\"_derep.fastq\")\n        mergefile = os.path.join(data.dirs.edits, sample.name+\"_merged_.fastq\")\n        uhandle = os.path.join(data.dirs.clusts, sample.name+\".utemp\")\n        usort = os.path.join(data.dirs.clusts, sample.name+\".utemp.sort\")\n        hhandle = os.path.join(data.dirs.clusts, sample.name+\".htemp\")\n        clusters = os.path.join(data.dirs.clusts, sample.name+\".clust.gz\")\n\n        for f in [derepfile, mergefile, uhandle, usort, hhandle, clusters]:\n            try:\n                os.remove(f)\n            except:\n                pass", "response": "Finalizes a sample after within-sample clustering: computes cluster depth statistics and stores them on the sample, raises max_fragment_length if longer clusters were found, collects reference-mapping stats for non-denovo assemblies, and removes temporary files unless the log level is DEBUG."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\naligning the read1s and read2s in the file with muscle.", "response": "def persistent_popen_align3(clusts, maxseqs=200, is_gbs=False):\n    \"\"\" keeps a persistent bash shell open and feeds it muscle alignments \"\"\"\n\n    ## create a separate shell for running muscle in, this is much faster\n    ## than spawning a separate subprocess for each muscle call\n    proc = sps.Popen([\"bash\"], \n                     stdin=sps.PIPE, \n                     stdout=sps.PIPE, \n                     universal_newlines=True)\n\n    ## iterate over clusters in this file until finished\n    aligned = []\n    for clust in clusts:\n\n        ## new alignment string for read1s and read2s\n        align1 = \"\"\n        align2 = \"\"\n\n        ## don't bother aligning if only one seq\n        if clust.count(\">\") == 1:\n            aligned.append(clust.replace(\">\", \"\").strip())\n        else:\n\n            ## do we need to split the alignment? 
(is there a PE insert?)\n            try:\n                ## make into list (only read maxseqs lines, 2X cuz names)\n                lclust = clust.split()[:maxseqs*2]\n\n                ## try to split cluster list at nnnn separator for each read\n                lclust1 = list(itertools.chain(*zip(\\\n                     lclust[::2], [i.split(\"nnnn\")[0] for i in lclust[1::2]])))\n                lclust2 = list(itertools.chain(*zip(\\\n                     lclust[::2], [i.split(\"nnnn\")[1] for i in lclust[1::2]])))\n\n                ## put back into strings\n                clust1 = \"\\n\".join(lclust1)\n                clust2 = \"\\n\".join(lclust2)\n\n                ## Align the first reads.\n                ## The muscle command with alignment as stdin and // as splitter\n                cmd1 = \"echo -e '{}' | {} -quiet -in - ; echo {}\"\\\n                        .format(clust1, ipyrad.bins.muscle, \"//\")\n\n                ## send cmd1 to the bash shell\n                print(cmd1, file=proc.stdin)\n\n                ## read the stdout by line until splitter is reached\n                ## meaning that the alignment is finished.\n                for line in iter(proc.stdout.readline, '//\\n'):\n                    align1 += line\n\n                ## Align the second reads.\n                ## The muscle command with alignment as stdin and // as splitter\n                cmd2 = \"echo -e '{}' | {} -quiet -in - ; echo {}\"\\\n                        .format(clust2, ipyrad.bins.muscle, \"//\")\n\n                ## send cmd2 to the bash shell\n                print(cmd2, file=proc.stdin)\n\n                ## read the stdout by line until splitter is reached\n                ## meaning that the alignment is finished.\n                for line in iter(proc.stdout.readline, '//\\n'):\n                    align2 += line\n\n                ## join up aligned read1 and read2 and ensure names order matches\n                la1 = align1[1:].split(\"\\n>\")\n  
              la2 = align2[1:].split(\"\\n>\")\n                dalign1 = dict([i.split(\"\\n\", 1) for i in la1])\n                dalign2 = dict([i.split(\"\\n\", 1) for i in la2])\n                align1 = []\n                try:\n                    keys = sorted(dalign1.keys(), key=DEREP, reverse=True)\n                except ValueError as inst:\n                    ## Lines is empty. This means the call to muscle alignment failed.\n                    ## Not sure how to handle this, but it happens only very rarely.\n                    LOGGER.error(\"Muscle alignment failed: Bad clust - {}\\nBad lines - {}\"\\\n                                .format(clust, lines))\n                    continue\n\n                ## put seed at top of alignment\n                seed = [i for i in keys if i.split(\";\")[-1][0]==\"*\"][0]\n                keys.pop(keys.index(seed))\n                keys = [seed] + keys\n                for key in keys:\n                    align1.append(\"\\n\".join([key, \n                                    dalign1[key].replace(\"\\n\", \"\")+\"nnnn\"+\\\n                                    dalign2[key].replace(\"\\n\", \"\")]))\n\n                ## append aligned cluster string\n                aligned.append(\"\\n\".join(align1).strip())\n\n            ## Malformed clust. 
Dictionary creation with only 1 element will raise.\n            except ValueError as inst:\n                LOGGER.debug(\"Bad PE cluster - {}\\nla1 - {}\\nla2 - {}\".format(\\\n                                clust, la1, la2))\n\n            ## Either reads are SE, or at least some pairs are merged.\n            except IndexError:\n                    \n                ## limit the number of input seqs\n                lclust = \"\\n\".join(clust.split()[:maxseqs*2])\n\n                ## the muscle command with alignment as stdin and // as splitter\n                cmd = \"echo -e '{}' | {} -quiet -in - ; echo {}\"\\\n                            .format(lclust, ipyrad.bins.muscle, \"//\")\n\n                ## send cmd to the bash shell (TODO: PIPE could overflow here!)\n                print(cmd, file=proc.stdin)\n\n                ## read the stdout by line until // is reached. This BLOCKS.\n                for line in iter(proc.stdout.readline, '//\\n'):\n                    align1 += line\n\n                ## remove '>' from names, and '\\n' from inside long seqs                \n                lines = align1[1:].split(\"\\n>\")\n\n                try:\n                    ## find seed of the cluster and put it on top.\n                    seed = [i for i in lines if i.split(\";\")[-1][0]==\"*\"][0]\n                    lines.pop(lines.index(seed))\n                    lines = [seed] + sorted(lines, key=DEREP, reverse=True)\n                except ValueError as inst:\n                    ## Lines is empty. 
This means the call to muscle alignment failed.\n                    ## Not sure how to handle this, but it happens only very rarely.\n                    LOGGER.error(\"Muscle alignment failed: Bad clust - {}\\nBad lines - {}\"\\\n                                .format(clust, lines))\n                    continue\n\n                ## format remove extra newlines from muscle\n                aa = [i.split(\"\\n\", 1) for i in lines]\n                align1 = [i[0]+'\\n'+\"\".join([j.replace(\"\\n\", \"\") for j in i[1:]]) for i in aa]\n                \n                ## trim edges in sloppy gbs/ezrad data. Maybe relevant to other types too...\n                if is_gbs:\n                    align1 = gbs_trim(align1)\n\n                ## append to aligned\n                aligned.append(\"\\n\".join(align1).strip())\n               \n    # cleanup\n    proc.stdout.close()\n    if proc.stderr:\n        proc.stderr.close()\n    proc.stdin.close()\n    proc.wait()\n\n    ## return the aligned clusters\n    return aligned"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef gbs_trim(align1):\n    leftmost = rightmost = None\n    dd = {k:v for k,v in [j.rsplit(\"\\n\", 1) for j in align1]}\n    seed = [i for i in dd.keys() if i.rsplit(\";\")[-1][0] == \"*\"][0]\n    leftmost = [i != \"-\" for i in dd[seed]].index(True)\n    revs = [i for i in dd.keys() if i.rsplit(\";\")[-1][0] == \"-\"]\n    if revs:\n        subright = max([[i!=\"-\" for i in seq[::-1]].index(True) \\\n            for seq in [dd[i] for i in revs]])\n    else:\n        subright = 0\n    rightmost = len(dd[seed]) - subright\n\n    ## if locus got clobbered then print place-holder NNN\n    names, seqs = zip(*[i.rsplit(\"\\n\", 1) for i in align1])\n    if rightmost > leftmost:\n        newalign1 = [n+\"\\n\"+i[leftmost:rightmost] for i,n in zip(seqs, names)]\n    else:\n        newalign1 = [n+\"\\nNNN\" for i,n in zip(seqs, names)]\n    return newalign1", "response": "Trims the edges of an aligned GBS cluster: the left edge is set to the first non-gap base of the seed, and the right edge is pulled in by the largest number of trailing gaps found among the reverse-complement matches. If nothing remains after trimming, every sequence is replaced with an NNN placeholder."}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nalign the file with the given handle and parses the data into a single file.", "response": "def align_and_parse(handle, max_internal_indels=5, is_gbs=False):\n    \"\"\" much faster implementation for aligning chunks \"\"\"\n\n    ## data are already chunked, read in the whole thing. bail if no data.\n    try:\n        with open(handle, 'rb') as infile:\n            clusts = infile.read().split(\"//\\n//\\n\")\n            ## remove any empty spots\n            clusts = [i for i in clusts if i]\n            ## Skip entirely empty chunks\n            if not clusts:\n                raise IPyradError\n    except (IOError, IPyradError):\n        LOGGER.debug(\"skipping empty chunk - {}\".format(handle))\n        return 0\n\n    ## count discarded clusters for printing to stats later\n    highindels = 0\n\n    ## iterate over clusters sending each to muscle, splits and aligns pairs\n    try:\n        aligned = persistent_popen_align3(clusts, 200, is_gbs)\n    except Exception as inst:\n        LOGGER.debug(\"Error in handle - {} - {}\".format(handle, inst))\n        #raise IPyradWarningExit(\"error hrere {}\".format(inst))\n        aligned = []        \n\n    ## store good alignments to be written to file\n    refined = []\n\n    ## filter and trim alignments\n    for clust in aligned:\n\n        ## check for too many internal indels\n        filtered = aligned_indel_filter(clust, max_internal_indels)\n\n        ## reverse complement matches. 
No longer implemented.\n        #filtered = overshoot_filter(clust)\n\n        ## finally, add to outstack if alignment is good\n        if not filtered:\n            refined.append(clust)#\"\\n\".join(stack))\n        else:\n            highindels += 1\n\n    ## write to file after\n    if refined:\n        outhandle = handle.rsplit(\".\", 1)[0]+\".aligned\"\n        with open(outhandle, 'wb') as outfile:\n            outfile.write(\"\\n//\\n//\\n\".join(refined)+\"\\n\")\n\n    ## remove the old tmp file\n    log_level = logging.getLevelName(LOGGER.getEffectiveLevel())\n    if not log_level == \"DEBUG\":\n        os.remove(handle)\n    return highindels"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef aligned_indel_filter(clust, max_internal_indels):\n\n    ## make into list\n    lclust = clust.split()\n    \n    ## paired or not\n    try:\n        seq1 = [i.split(\"nnnn\")[0] for i in lclust[1::2]]\n        seq2 = [i.split(\"nnnn\")[1] for i in lclust[1::2]]\n        intindels1 = [i.rstrip(\"-\").lstrip(\"-\").count(\"-\") for i in seq1]\n        intindels2 = [i.rstrip(\"-\").lstrip(\"-\").count(\"-\") for i in seq2]\n        intindels = intindels1 + intindels2\n        if max(intindels) > max_internal_indels:\n            return 1\n       \n    except IndexError:\n        seq1 = lclust[1::2]\n        intindels = [i.rstrip(\"-\").lstrip(\"-\").count(\"-\") for i in seq1]\n        if max(intindels) > max_internal_indels:\n            return 1 \n    \n    return 0", "response": "checks for too many internal indels in muscle aligned clusters"}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\ncreating cluster files for the unaligned reads of the sample.", "response": "def build_clusters(data, sample, maxindels):\n    \"\"\"\n    Combines information from .utemp and .htemp files to create .clust files,\n    which contain un-aligned clusters. Hits to seeds are only kept in the\n    cluster if the number of internal indels is less than 'maxindels'.\n    By default, we set maxindels=6 for this step (within-sample clustering).\n    \"\"\"\n\n    ## If reference assembly then here we're clustering the unmapped reads\n    if \"reference\" in data.paramsdict[\"assembly_method\"]:\n        derepfile = os.path.join(data.dirs.edits, sample.name+\"-refmap_derep.fastq\")\n    else:\n        derepfile = os.path.join(data.dirs.edits, sample.name+\"_derep.fastq\")\n    ## i/o vsearch files\n    uhandle = os.path.join(data.dirs.clusts, sample.name+\".utemp\")\n    usort = os.path.join(data.dirs.clusts, sample.name+\".utemp.sort\")\n    hhandle = os.path.join(data.dirs.clusts, sample.name+\".htemp\")\n\n    ## create an output file to write clusters to\n    sample.files.clusters = os.path.join(data.dirs.clusts, sample.name+\".clust.gz\")\n    clustsout = gzip.open(sample.files.clusters, 'wb')\n\n    ## Sort the uhandle file so we can read through matches efficiently\n    cmd = [\"sort\", \"-k\", \"2\", uhandle, \"-o\", usort]\n    proc = sps.Popen(cmd, close_fds=True)\n    _ = proc.communicate()[0]\n\n    ## load ALL derep reads into a dictionary (this can be a few GB of RAM)\n    ## and is larger if names are larger. 
We are grabbing two lines at a time.\n    alldereps = {}\n    with open(derepfile, 'rb') as ioderep:\n        dereps = itertools.izip(*[iter(ioderep)]*2)\n        for namestr, seq in dereps:\n            nnn, sss = [i.strip() for i in namestr, seq]\n            alldereps[nnn[1:]] = sss\n\n    ## store observed seeds (this could count up to >million in bad data sets)\n    seedsseen = set()\n\n    ## Iterate through the usort file grabbing matches to build clusters\n    with open(usort, 'rb') as insort:\n        ## iterator, seed null, seqlist null\n        isort = iter(insort)\n        lastseed = 0\n        fseqs = []\n        seqlist = []\n        seqsize = 0\n        while 1:\n            ## grab the next line\n            try:\n                hit, seed, _, ind, ori, _ = isort.next().strip().split()\n                LOGGER.debug(\">{} {} {}\".format(hit, seed, ori, seq))\n            except StopIteration:\n                break\n\n            ## same seed, append match\n            if seed != lastseed:\n                seedsseen.add(seed)\n                ## store the last cluster (fseq), count it, and clear fseq\n                if fseqs:\n                    ## sort fseqs by derep after pulling out the seed\n                    fseqs = [fseqs[0]] + sorted(fseqs[1:], key=lambda x: \\\n                        int(x.split(\";size=\")[1].split(\";\")[0]), reverse=True)                    \n                    seqlist.append(\"\\n\".join(fseqs))\n                    seqsize += 1\n                    fseqs = []\n\n                ## occasionally write/dump stored clusters to file and clear mem\n                if not seqsize % 10000:\n                    if seqlist:\n                        clustsout.write(\"\\n//\\n//\\n\".join(seqlist)+\"\\n//\\n//\\n\")\n                        ## reset list and counter\n                        seqlist = []\n\n                ## store the new seed on top of fseq list\n                fseqs.append(\">{}*\\n{}\".format(seed, 
alldereps[seed]))\n                lastseed = seed\n\n            ## add match to the seed\n            ## revcomp if orientation is reversed (comp preserves nnnn)\n            if ori == \"-\":\n                seq = comp(alldereps[hit])[::-1]\n            else:\n                seq = alldereps[hit]\n            ## only save if not too many indels\n            if int(ind) <= maxindels:\n                fseqs.append(\">{}{}\\n{}\".format(hit, ori, seq))\n            else:\n                LOGGER.info(\"filtered by maxindels: %s %s\", ind, seq)\n\n    ## write whatever is left over to the clusts file\n    if fseqs:\n        seqlist.append(\"\\n\".join(fseqs))\n    if seqlist:\n        clustsout.write(\"\\n//\\n//\\n\".join(seqlist)+\"\\n//\\n//\\n\")\n\n    ## now write the seeds that had no hits. Make dict from htemp\n    with open(hhandle, 'rb') as iotemp:\n        nohits = itertools.izip(*[iter(iotemp)]*2)\n        seqlist = []\n        seqsize = 0\n        while 1:\n            try:\n                nnn, _ = [i.strip() for i in nohits.next()]\n            except StopIteration:\n                break\n\n            ## occasionally write to file\n            if not seqsize % 10000:\n                if seqlist:\n                    clustsout.write(\"\\n//\\n//\\n\".join(seqlist)+\"\\n//\\n//\\n\")\n                    ## reset list and counter\n                    seqlist = []\n\n            ## append to list if new seed\n            if nnn[1:] not in seedsseen:\n                seqlist.append(\"{}*\\n{}\".format(nnn, alldereps[nnn[1:]]))\n                seqsize += 1\n\n    ## write whatever is left over to the clusts file\n    if seqlist:\n        clustsout.write(\"\\n//\\n//\\n\".join(seqlist))#+\"\\n//\\n//\\n\")\n\n    ## close the file handle\n    clustsout.close()\n    del alldereps"}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nsets up directories for step 3 data", "response": "def setup_dirs(data):\n    \"\"\" sets up directories for step3 data \"\"\"\n    ## make output folder for clusters\n    pdir = os.path.realpath(data.paramsdict[\"project_dir\"])\n    data.dirs.clusts = os.path.join(pdir, \"{}_clust_{}\"\\\n                       .format(data.name, data.paramsdict[\"clust_threshold\"]))\n    if not os.path.exists(data.dirs.clusts):\n        os.mkdir(data.dirs.clusts)\n\n    ## make a tmpdir for align files\n    data.tmpdir = os.path.abspath(os.path.expanduser(\n        os.path.join(pdir, data.name+'-tmpalign')))\n    if not os.path.exists(data.tmpdir):\n        os.mkdir(data.tmpdir)\n\n    ## If ref mapping, init samples and make the refmapping output directory.\n    if not data.paramsdict[\"assembly_method\"] == \"denovo\":\n        ## make output directory for read mapping process\n        data.dirs.refmapping = os.path.join(pdir, \"{}_refmapping\".format(data.name))\n        if not os.path.exists(data.dirs.refmapping):\n            os.mkdir(data.dirs.refmapping)"}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\ncreate a DAG of pre - align jobs for each sample.", "response": "def new_apply_jobs(data, samples, ipyclient, nthreads, maxindels, force):\n    \"\"\"\n    Create a DAG of prealign jobs to be run in order for each sample. Track\n    Progress, report errors. Each assembly method has a slightly different\n    DAG setup, calling different functions.\n    \"\"\"\n\n    ## is datatype gbs? used in alignment-trimming by align_and_parse()\n    is_gbs = bool(\"gbs\" in data.paramsdict[\"datatype\"])\n\n    ## Two view objects, threaded and unthreaded\n    lbview = ipyclient.load_balanced_view()\n    start = time.time()\n    elapsed = datetime.timedelta(seconds=int(time.time()-start))\n    firstfunc = \"derep_concat_split\"\n    printstr = \" {}    | {} | s3 |\".format(PRINTSTR[firstfunc], elapsed)\n    #printstr = \" {}      | {} | s3 |\".format(PRINTSTR[], elapsed)\n    progressbar(10, 0, printstr, spacer=data._spacer)\n\n    ## TODO: for HPC systems this should be done to make sure targets are spread\n    ## among different nodes.\n    if nthreads:\n        if nthreads < len(ipyclient.ids):\n            thview = ipyclient.load_balanced_view(targets=ipyclient.ids[::nthreads])\n        elif nthreads == 1:\n            thview = ipyclient.load_balanced_view()\n        else:\n            if len(ipyclient) > 40:\n                thview = ipyclient.load_balanced_view(targets=ipyclient.ids[::4])\n            else:\n                thview = ipyclient.load_balanced_view(targets=ipyclient.ids[::2])\n\n\n    ## get list of jobs/dependencies as a DAG for all pre-align funcs.\n    dag, joborder = build_dag(data, samples)\n\n    ## dicts for storing submitted jobs and results\n    results = {}\n\n    ## submit jobs to the engines in single or threaded views. 
The topological\n    ## sort makes sure jobs are input with all dependencies found.\n    for node in nx.topological_sort(dag):\n        ## get list of async results leading to this job\n        deps = [results.get(n) for n in dag.predecessors(node)]\n        deps = ipp.Dependency(dependencies=deps, failure=True)\n\n        ## get func, sample, and args for this func (including [data, sample])\n        funcstr, chunk, sname = node.split(\"-\", 2)\n        func = FUNCDICT[funcstr]\n        sample = data.samples[sname]\n\n        ## args vary depending on the function\n        if funcstr in [\"derep_concat_split\", \"cluster\"]:\n            args = [data, sample, nthreads, force]\n        elif funcstr in [\"mapreads\"]:\n            args = [data, sample, nthreads, force]\n        elif funcstr in [\"build_clusters\"]:\n            args = [data, sample, maxindels]\n        elif funcstr in [\"muscle_align\"]:\n            handle = os.path.join(data.tmpdir, \n                        \"{}_chunk_{}.ali\".format(sample.name, chunk))\n            args = [handle, maxindels, is_gbs]\n        else:\n            args = [data, sample]\n\n        # submit and store AsyncResult object. Some jobs are threaded.\n        if nthreads and (funcstr in THREADED_FUNCS):\n            #LOGGER.info('submitting %s to %s-threaded view', funcstr, nthreads)\n            with thview.temp_flags(after=deps, block=False):\n                results[node] = thview.apply(func, *args)\n        else:\n            #LOGGER.info('submitting %s to single-threaded view', funcstr)\n            with lbview.temp_flags(after=deps, block=False):\n                results[node] = lbview.apply(func, *args)\n\n    ## track jobs as they finish, abort if someone fails. This blocks here\n    ## until all jobs are done. 
Keep track of which samples have failed so\n    ## we only print the first error message.\n    sfailed = set()\n    for funcstr in joborder + [\"muscle_align\", \"reconcat\"]:\n        errfunc, sfails, msgs = trackjobs(funcstr, results, spacer=data._spacer)\n        LOGGER.info(\"{}-{}-{}\".format(errfunc, sfails, msgs))\n        if errfunc:\n            for sidx in xrange(len(sfails)):\n                sname = sfails[sidx]\n                errmsg = msgs[sidx]\n                if sname not in sfailed:\n                    print(\"  sample [{}] failed. See error in ./ipyrad_log.txt\"\\\n                          .format(sname))\n                    LOGGER.error(\"sample [%s] failed in step [%s]; error: %s\",\n                                  sname, errfunc, errmsg)\n                    sfailed.add(sname)\n\n    ## Cleanup of successful samples, skip over failed samples\n    badaligns = {}\n    for sample in samples:\n        ## The muscle_align step returns the number of excluded bad alignments\n        for async in results:\n            func, chunk, sname = async.split(\"-\", 2)\n            if (func == \"muscle_align\") and (sname == sample.name):\n                if results[async].successful():\n                    badaligns[sample] = int(results[async].get())\n\n    ## for the samples that were successful:\n    for sample in badaligns:\n        ## store the result\n        sample.stats_dfs.s3.filtered_bad_align = badaligns[sample]\n        ## store all results\n        try:\n            sample_cleanup(data, sample)\n        except Exception as inst:\n            msg = \"  Sample {} failed this step. See ipyrad_log.txt.\\\n                  \".format(sample.name)\n            print(msg)\n            LOGGER.error(\"%s - %s\", sample.name, inst)\n\n    ## store the results to data\n    data_cleanup(data)"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\nbuilding a directed acyclic graph describing jobs to be run in order.", "response": "def build_dag(data, samples):\n    \"\"\"\n    build a directed acyclic graph describing jobs to be run in order.\n    \"\"\"\n\n    ## Create DAGs for the assembly method being used, store jobs in nodes\n    snames = [i.name for i in samples]\n    dag = nx.DiGraph()\n\n    ## get list of pre-align jobs from globals based on assembly method\n    joborder = JOBORDER[data.paramsdict[\"assembly_method\"]]\n\n    ## WHICH JOBS TO RUN: iterate over the sample names\n    for sname in snames:\n        ## append pre-align job for each sample to nodes list\n        for func in joborder:\n            dag.add_node(\"{}-{}-{}\".format(func, 0, sname))\n\n        ## append align func jobs, each will have max 10\n        for chunk in xrange(10):\n            dag.add_node(\"{}-{}-{}\".format(\"muscle_align\", chunk, sname))\n\n        ## append final reconcat jobs\n        dag.add_node(\"{}-{}-{}\".format(\"reconcat\", 0, sname))\n\n    ## ORDER OF JOBS: add edges/dependency between jobs: (first-this, then-that)\n    for sname in snames:\n        for sname2 in snames:\n            ## enforce that clust/map cannot start until derep is done for ALL\n            ## samples. This is b/c...\n            dag.add_edge(\"{}-{}-{}\".format(joborder[0], 0, sname2),\n                         \"{}-{}-{}\".format(joborder[1], 0, sname))\n\n        ## add remaining pre-align jobs \n        for idx in xrange(2, len(joborder)):\n            dag.add_edge(\"{}-{}-{}\".format(joborder[idx-1], 0, sname),\n                         \"{}-{}-{}\".format(joborder[idx], 0, sname))\n\n        ## Add 10 align jobs, none of which can start until all chunker jobs\n        ## are finished. 
Similarly, reconcat jobs cannot start until all align\n        ## jobs are finished.\n        for sname2 in snames:\n            for chunk in range(10):\n                dag.add_edge(\"{}-{}-{}\".format(\"muscle_chunker\", 0, sname2),\n                             \"{}-{}-{}\".format(\"muscle_align\", chunk, sname))\n                ## add that the final reconcat job can't start until after\n                ## each chunk of its own sample has finished aligning.\n                dag.add_edge(\"{}-{}-{}\".format(\"muscle_align\", chunk, sname),\n                             \"{}-{}-{}\".format(\"reconcat\", 0, sname))\n    ## return the dag\n    return dag, joborder"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\nplotting the DAG with the given results.", "response": "def _plot_dag(dag, results, snames):\n    \"\"\"\n    makes plot to help visualize the DAG setup. For developers only.\n    \"\"\"\n    try:\n        import matplotlib.pyplot as plt\n        from matplotlib.dates import date2num\n        from matplotlib.cm import gist_rainbow\n\n        ## first figure is dag layout\n        plt.figure(\"dag_layout\", figsize=(10, 10))\n        nx.draw(dag,\n                pos=nx.spring_layout(dag),\n                node_color='pink',\n                with_labels=True)\n        plt.savefig(\"./dag_layout.png\", bbox_inches='tight', dpi=200)\n\n        ## second figure is times for steps\n        pos = {}\n        colors = {}\n\n        for node in dag:\n            #jobkey = \"{}-{}\".format(node, sample)\n            mtd = results[node].metadata\n            start = date2num(mtd.started)\n            #runtime = date2num(md.completed)# - start\n            ## sample id to separate samples on x-axis\n            _, _, sname = node.split(\"-\", 2)\n            sid = snames.index(sname)\n            ## 1e6 to separate on y-axis\n            pos[node] = (start+sid, start*1e6)\n            colors[node] = mtd.engine_id\n\n        ## x just spaces out samples;\n        ## y is start time of each job with edge leading to next job\n        ## color is the engine that ran the job\n        ## all jobs were submitted as 3 second wait times\n        plt.figure(\"dag_starttimes\", figsize=(10, 16))\n        nx.draw(dag, pos,\n                node_list=colors.keys(),\n                node_color=colors.values(),\n                cmap=gist_rainbow,\n                with_labels=True)\n        plt.savefig(\"./dag_starttimes.png\", bbox_inches='tight', dpi=200)\n\n    except Exception as inst:\n        LOGGER.warning(inst)"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\ntrack the jobs that have been submitted from a list of engine jobs. Returns whether any of the jobs failed or not.", "response": "def trackjobs(func, results, spacer):\n    \"\"\"\n    Blocks and prints progress for just the func being requested from a list\n    of submitted engine jobs. Returns whether any of the jobs failed.\n\n    func = str\n    results = dict of asyncs\n    \"\"\"\n\n    ## TODO: try to insert a better way to break on KBD here.\n    LOGGER.info(\"inside trackjobs of %s\", func)\n\n    ## get just the jobs from results that are relevant to this func\n    asyncs = [(i, results[i]) for i in results if i.split(\"-\", 2)[0] == func]\n\n    ## progress bar\n    start = time.time()\n    while 1:\n        ## how many of this func have finished so far\n        ready = [i[1].ready() for i in asyncs]\n        elapsed = datetime.timedelta(seconds=int(time.time()-start))\n        printstr = \" {}    | {} | s3 |\".format(PRINTSTR[func], elapsed)\n        progressbar(len(ready), sum(ready), printstr, spacer=spacer)\n        time.sleep(0.1)\n        if len(ready) == sum(ready):\n            print(\"\")\n            break\n\n    sfails = []\n    errmsgs = []\n    for job in asyncs:\n        if not job[1].successful():\n            sfails.append(job[0])\n            errmsgs.append(job[1].result())\n\n    return func, sfails, errmsgs"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef data_cleanup(data):\n    data.stats_dfs.s3 = data._build_stat(\"s3\")\n    data.stats_files.s3 = os.path.join(data.dirs.clusts, \"s3_cluster_stats.txt\")\n    with io.open(data.stats_files.s3, 'w') as outfile:\n        data.stats_dfs.s3.to_string(\n            buf=outfile,\n            formatters={\n                'merged_pairs':'{:.0f}'.format,\n                'clusters_total':'{:.0f}'.format,\n                'clusters_hidepth':'{:.0f}'.format,\n                'filtered_bad_align':'{:.0f}'.format,\n                'avg_depth_stat':'{:.2f}'.format,\n                'avg_depth_mj':'{:.2f}'.format,\n                'avg_depth_total':'{:.2f}'.format,\n                'sd_depth_stat':'{:.2f}'.format,\n                'sd_depth_mj':'{:.2f}'.format,\n                'sd_depth_total':'{:.2f}'.format\n            })", "response": "Builds the step 3 stats table for the Assembly object and writes it to s3_cluster_stats.txt in the clusters directory."}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\nconcatenates a sample's multiple edited fastq files into one file per read direction.", "response": "def concat_multiple_edits(data, sample):\n    \"\"\"\n    if multiple fastq files were appended into the list of fastqs for samples\n    then we merge them here before proceeding.\n    \"\"\"\n\n    ## if more than one tuple in fastq list\n    if len(sample.files.edits) > 1:\n        ## create a cat command to append them all (doesn't matter if they\n        ## are gzipped, cat still works). Grab index 0 of tuples for R1s.\n        cmd1 = [\"cat\"] + [i[0] for i in sample.files.edits]\n\n        ## write to new concat handle\n        conc1 = os.path.join(data.dirs.edits, sample.name+\"_R1_concatedit.fq.gz\")\n        with open(conc1, 'w') as cout1:\n            proc1 = sps.Popen(cmd1, stderr=sps.STDOUT, stdout=cout1, close_fds=True)\n            res1 = proc1.communicate()[0]\n        if proc1.returncode:\n            raise IPyradWarningExit(\"error in: %s, %s\", cmd1, res1)\n\n        ## Only set conc2 if R2 actually exists\n        conc2 = 0\n        if os.path.exists(str(sample.files.edits[0][1])):\n            cmd2 = [\"cat\"] + [i[1] for i in sample.files.edits]\n            conc2 = os.path.join(data.dirs.edits, sample.name+\"_R2_concatedit.fq.gz\")\n            with gzip.open(conc2, 'w') as cout2:\n                proc2 = sps.Popen(cmd2, stderr=sps.STDOUT, stdout=cout2, close_fds=True)\n                res2 = proc2.communicate()[0]\n            if proc2.returncode:\n                raise IPyradWarningExit(\"error in: %s, %s\", cmd2, res2)\n\n        ## store new file handles\n        sample.files.edits = [(conc1, conc2)]\n    return sample.files.edits"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef cluster(data, sample, nthreads, force):\n\n    ## get the dereplicated reads\n    if \"reference\" in data.paramsdict[\"assembly_method\"]:\n        derephandle = os.path.join(data.dirs.edits, sample.name+\"-refmap_derep.fastq\")\n        ## In the event all reads for all samples map successfully then clustering\n        ## the unmapped reads makes no sense, so just bail out.\n        if not os.stat(derephandle).st_size:\n            ## In this case you do have to create empty, dummy vsearch output\n            ## files so building_clusters will not fail.\n            uhandle = os.path.join(data.dirs.clusts, sample.name+\".utemp\")\n            usort = os.path.join(data.dirs.clusts, sample.name+\".utemp.sort\")\n            hhandle = os.path.join(data.dirs.clusts, sample.name+\".htemp\")\n            for f in [uhandle, usort, hhandle]:\n                open(f, 'a').close()\n            return\n    else:\n        derephandle = os.path.join(data.dirs.edits, sample.name+\"_derep.fastq\")\n\n    ## create handles for the outfiles\n    uhandle = os.path.join(data.dirs.clusts, sample.name+\".utemp\")\n    temphandle = os.path.join(data.dirs.clusts, sample.name+\".htemp\")\n\n    ## If derep file doesn't exist then bail out\n    if not os.path.isfile(derephandle):\n        LOGGER.warn(\"Bad derephandle - {}\".format(derephandle))\n        raise IPyradError(\"Input file for clustering doesn't exist - {}\"\\\n                        .format(derephandle))\n\n    ## testing one sample fail\n    #if sample.name == \"1C_0\":\n    #    x\n\n    ## datatype specific optimization\n    ## minsl: the percentage of the seed that must be matched\n    ##    smaller values for RAD/ddRAD where we might want to combine, say 50bp\n    ##    reads and 100bp reads in the same analysis.\n    ## query_cov: the percentage of the query sequence that must match seed\n    ##    smaller values are needed 
for gbs where only the tips might overlap\n    ##    larger values for pairgbs where they should overlap near completely\n    ##    small minsl and high query cov allows trimmed reads to match to untrim\n    ##    seed for rad/ddrad/pairddrad.\n    strand = \"plus\"\n    cov = 0.75\n    minsl = 0.5\n    if data.paramsdict[\"datatype\"] in [\"gbs\", \"2brad\"]:\n        strand = \"both\"\n        cov = 0.5\n        minsl = 0.5\n    elif data.paramsdict[\"datatype\"] == 'pairgbs':\n        strand = \"both\"\n        cov = 0.75\n        minsl = 0.75\n\n    ## If this value is not null (which is the default) then override query cov\n    if data._hackersonly[\"query_cov\"]:\n        cov = str(data._hackersonly[\"query_cov\"])\n        assert float(cov) <= 1, \"query_cov must be <= 1.0\"\n\n    ## get call string\n    cmd = [ipyrad.bins.vsearch,\n           \"-cluster_smallmem\", derephandle,\n           \"-strand\", strand,\n           \"-query_cov\", str(cov),\n           \"-id\", str(data.paramsdict[\"clust_threshold\"]),\n           \"-minsl\", str(minsl),\n           \"-userout\", uhandle,\n           \"-userfields\", \"query+target+id+gaps+qstrand+qcov\",\n           \"-maxaccepts\", \"1\",\n           \"-maxrejects\", \"0\",\n           \"-threads\", str(nthreads),\n           \"-notmatched\", temphandle,\n           \"-fasta_width\", \"0\",\n           \"-fastq_qmax\", \"100\",\n           \"-fulldp\",\n           \"-usersort\"]\n\n    ## not sure what the benefit of this option is exactly, needs testing,\n    ## might improve indel detection on left side, but we don't want to enforce\n    ## aligning on left side if not necessarily, since quality trimmed reads\n    ## might lose bases on left side in step2 and no longer align.\n    #if data.paramsdict[\"datatype\"] in [\"rad\", \"ddrad\", \"pairddrad\"]:\n    #    cmd += [\"-leftjust\"]\n\n    ## run vsearch\n    LOGGER.debug(\"%s\", cmd)\n    proc = sps.Popen(cmd, stderr=sps.STDOUT, stdout=sps.PIPE, 
close_fds=True)\n\n    ## This is long running so we wrap it to make sure we can kill it\n    try:\n        res = proc.communicate()[0]\n    except KeyboardInterrupt:\n        proc.kill()\n        raise KeyboardInterrupt\n\n    ## check for errors\n    if proc.returncode:\n        LOGGER.error(\"error %s: %s\", cmd, res)\n        raise IPyradWarningExit(\"cmd {}: {}\".format(cmd, res))", "response": "Runs vsearch to cluster a sample's dereplicated reads, with strand, query coverage, and minsl settings chosen by datatype, and raises an error if vsearch fails."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef muscle_chunker(data, sample):\n    ## log our location for debugging\n    LOGGER.info(\"inside muscle_chunker\")\n\n    ## only chunk up denovo data, refdata has its own chunking method which \n    ## makes equal size chunks, instead of uneven chunks like in denovo\n    if data.paramsdict[\"assembly_method\"] != \"reference\":\n        ## get the number of clusters\n        clustfile = os.path.join(data.dirs.clusts, sample.name+\".clust.gz\")\n        with iter(gzip.open(clustfile, 'rb')) as clustio:\n            nloci = sum(1 for i in clustio if \"//\" in i) // 2\n            #tclust = clustio.read().count(\"//\")//2\n            optim = (nloci//20) + (nloci%20)\n            LOGGER.info(\"optim for align chunks: %s\", optim)\n\n        ## write optim clusters to each tmp file\n        clustio = gzip.open(clustfile, 'rb')\n        inclusts = iter(clustio.read().strip().split(\"//\\n//\\n\"))\n        \n        ## splitting loci so first file is smaller and last file is bigger\n        inc = optim // 10\n        for idx in range(10):\n            ## how big is this chunk?\n            this = optim + (idx * inc)\n            left = nloci-this\n            if idx == 9:\n                ## grab everything left\n                grabchunk = list(itertools.islice(inclusts, int(1e9)))\n            else:\n                ## grab next chunks-worth of data\n                grabchunk = list(itertools.islice(inclusts, this))\n                nloci = left\n\n            ## write the chunk to file\n            tmpfile = os.path.join(data.tmpdir, sample.name+\"_chunk_{}.ali\".format(idx))\n            with open(tmpfile, 'wb') as out:\n                out.write(\"//\\n//\\n\".join(grabchunk))\n\n        ## write the chunk to file\n        #grabchunk = list(itertools.islice(inclusts, left))\n        #if grabchunk:\n        #    tmpfile = os.path.join(data.tmpdir, 
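sample.name+\"_chunk_9.ali\")\n        #    with open(tmpfile, 'a') as out:\n        #        out.write(\"\\n//\\n//\\n\".join(grabchunk))\n        clustio.close()", "response": "Splits a sample's unaligned clusters (.clust.gz) into 10 chunk files of increasing size in the tmpdir so they can be aligned with muscle in parallel. Only runs when the assembly method is not reference."}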
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\nconcatenates a sample's aligned chunk files into a single gzipped cluster file and removes the chunk files", "response": "def reconcat(data, sample):\n    \"\"\" takes aligned chunks (usually 10) and concatenates them \"\"\"\n\n    try:\n        ## get chunks\n        chunks = glob.glob(os.path.join(data.tmpdir,\n                 sample.name+\"_chunk_[0-9].aligned\"))\n\n        ## sort by chunk number, cuts off last 8 =(aligned)\n        chunks.sort(key=lambda x: int(x.rsplit(\"_\", 1)[-1][:-8]))\n        LOGGER.info(\"chunk %s\", chunks)\n        ## concatenate finished reads\n        sample.files.clusters = os.path.join(data.dirs.clusts,\n                                             sample.name+\".clustS.gz\")\n        ## reconcats aligned clusters\n        with gzip.open(sample.files.clusters, 'wb') as out:\n            for fname in chunks:\n                with open(fname) as infile:\n                    dat = infile.read()\n                    ## avoids mess if last chunk was empty\n                    if dat.endswith(\"\\n\"):\n                        out.write(dat+\"//\\n//\\n\")\n                    else:\n                        out.write(dat+\"\\n//\\n//\\n\")\n                os.remove(fname)\n    except Exception as inst:\n        LOGGER.error(\"Error in reconcat {}\".format(inst))\n        raise"}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nconcatenate the reads in the Assembly into a single Assembly.", "response": "def derep_concat_split(data, sample, nthreads, force):\n    \"\"\"\n    Running on remote Engine. Refmaps, then merges, then dereplicates,\n    then denovo clusters reads.\n    \"\"\"\n\n    ## report location for debugging\n    LOGGER.info(\"INSIDE derep %s\", sample.name)\n\n    ## MERGED ASSEMBIES ONLY:\n    ## concatenate edits files within Samples. Returns a new sample.files.edits \n    ## with the concat file. No change if not merged Assembly.\n    mergefile = os.path.join(data.dirs.edits, sample.name+\"_merged_.fastq\")\n    if not force:\n        if not os.path.exists(mergefile):\n            sample.files.edits = concat_multiple_edits(data, sample)\n        else:\n            LOGGER.info(\"skipped concat_multiple_edits: {} exists\"\\\n                        .format(mergefile))\n    else:\n        sample.files.edits = concat_multiple_edits(data, sample)\n\n    ## PAIRED DATA ONLY:\n    ## Denovo: merge or concat fastq pairs [sample.files.pairs]\n    ## Reference: only concat fastq pairs  []\n    ## Denovo + Reference: ...\n    if 'pair' in data.paramsdict['datatype']:\n        ## the output file handle for merged reads\n        \n\n        ## modify behavior of merging vs concating if reference\n        if \"reference\" in data.paramsdict[\"assembly_method\"]:\n            nmerged = merge_pairs(data, sample.files.edits, mergefile, 0, 0)\n        else:\n            nmerged = merge_pairs(data, sample.files.edits, mergefile, 1, 1)\n\n        ## store results\n        sample.files.edits = [(mergefile, )]\n        sample.stats.reads_merged = nmerged\n\n    ## 3rad uses random adapters to identify pcr duplicates. We will\n    ## remove pcr dupes here. Basically append the radom adapter to\n    ## each sequence, do a regular old vsearch derep, then trim\n    ## off the adapter, and push it down the pipeline. 
This will\n    ## remove all identical seqs with identical random i5 adapters.\n    if \"3rad\" in data.paramsdict[\"datatype\"]:\n        declone_3rad(data, sample)\n        derep_and_sort(data,\n                os.path.join(data.dirs.edits, sample.name+\"_declone.fastq\"),\n                os.path.join(data.dirs.edits, sample.name+\"_derep.fastq\"),\n                nthreads)\n    else:\n        ## convert fastq to fasta, then derep and sort reads by their size.\n        ## we pass in only one file b/c paired should be merged by now.\n        derep_and_sort(data,\n                sample.files.edits[0][0],\n                os.path.join(data.dirs.edits, sample.name+\"_derep.fastq\"),\n                nthreads)"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nrun the major functions for clustering within samples", "response": "def run(data, samples, noreverse, maxindels, force, ipyclient):\n    \"\"\" run the major functions for clustering within samples \"\"\"\n\n    ## list of samples to submit to queue\n    subsamples = []\n\n    ## if sample is already done skip\n    for sample in samples:\n        ## If sample not in state 2 don't try to cluster it.\n        if sample.stats.state < 2:\n            print(\"\"\"\\\n    Sample not ready for clustering. First run step2 on sample: {}\"\"\".\\\n    format(sample.name))\n            continue\n\n        if not force:\n            if sample.stats.state >= 3:\n                print(\"\"\"\\\n    Skipping {}; already clustered. Use force to re-cluster\"\"\".\\\n    format(sample.name))\n            else:\n                if sample.stats.reads_passed_filter:\n                    subsamples.append(sample)\n        else:\n            ## force to overwrite\n            if sample.stats.reads_passed_filter:\n                subsamples.append(sample)\n\n    ## run subsamples\n    if not subsamples:\n        print(\"  No Samples ready to be clustered. 
First run step2().\")\n\n    else:\n        ## arguments to apply_jobs, inst catches exceptions\n        try:\n            ## make dirs that are needed including tmpdir\n            setup_dirs(data)\n\n            ## if refmapping make filehandles that will be persistent\n            if not data.paramsdict[\"assembly_method\"] == \"denovo\":\n                for sample in subsamples:\n                    refmap_init(data, sample, force)\n\n                    ## set thread-count to 2 for paired-data\n                    nthreads = 2\n            ## set thread-count to 1 for single-end data          \n            else:\n                nthreads = 1\n\n            ## overwrite nthreads if value in _ipcluster dict\n            if \"threads\" in data._ipcluster.keys():\n                nthreads = int(data._ipcluster[\"threads\"])\n\n                ## if more CPUs than there are samples then increase threads\n                _ncpus = len(ipyclient)\n                if _ncpus > 2*len(data.samples):\n                    nthreads *= 2\n\n            ## submit jobs to be run on cluster\n            args = [data, subsamples, ipyclient, nthreads, maxindels, force]\n            new_apply_jobs(*args)\n\n\n        finally:\n            ## this can fail if jobs were not stopped properly and are still\n            ## writing to tmpdir. 
don't cleanup if debug is on.\n            try:\n                log_level = logging.getLevelName(LOGGER.getEffectiveLevel())\n                if not log_level == \"DEBUG\":\n\n                    if os.path.exists(data.tmpdir):\n                        shutil.rmtree(data.tmpdir)\n                    ## get all refmap_derep.fastqs\n                    rdereps = glob.glob(os.path.join(data.dirs.edits, \"*-refmap_derep.fastq\"))\n                    ## Remove the unmapped fastq files\n                    for rmfile in rdereps:\n                        os.remove(rmfile)\n\n            except Exception as _:\n                LOGGER.warning(\"failed to cleanup files/dirs\")"}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\nparses the params.txt file named in args and returns a dict of parameter values.", "response": "def parse_params(args):\n    \"\"\" Parse the params file named in args and return a dict of parameter values.\"\"\"\n\n    ## check that params.txt file is correctly formatted.\n    try:\n        with open(args.params) as paramsin:\n            plines = paramsin.readlines()\n    except IOError as _:\n        sys.exit(\"  No params file found\")\n\n    ## check header: big version changes can be distinguished by the header\n    legacy_version = 0\n    try:\n        ## try to update the Assembly ...\n        legacy_version = 1\n        if not len(plines[0].split()[0]) == 7:\n            raise IPyradWarningExit(\"\"\"\n        Error: file '{}' is not compatible with ipyrad v.{}.\n        Please create and update a new params file using the -n argument. \n        For info on which parameters have changed see the changelog:\n        (http://ipyrad.readthedocs.io/releasenotes.html)\n        \"\"\".format(args.params, ip.__version__))\n\n    except IndexError:\n        raise IPyradWarningExit(\"\"\"\n        Error: Params file should not have any empty lines at the top\n        of the file. Verify there are no blank lines and rerun ipyrad.\n        Offending file - {}\n        \"\"\".format(args.params))\n\n    ## update and backup\n    if legacy_version:\n        #which version...\n        #update_to_6()\n        pass\n\n    ## make into a dict. Ignore blank lines at the end of file\n    ## Really this will ignore all blank lines\n    items = [i.split(\"##\")[0].strip() for i in plines[1:] if not i.strip() == \"\"]\n\n    #keys = [i.split(\"]\")[-2][-1] for i in plines[1:]]\n    #keys = range(len(plines)-1)\n    keys = ip.Assembly('null', quiet=True).paramsdict.keys()\n    parsedict = {str(i):j for i, j in zip(keys, items)}\n    return parsedict"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\nload the Assembly and print the stats to screen.", "response": "def showstats(parsedict):\n    \"\"\" loads assembly or dies, and print stats to screen \"\"\"\n\n    #project_dir = parsedict['1']\n    project_dir = parsedict[\"project_dir\"]\n    if not project_dir:\n        project_dir = \"./\"\n    ## Be nice if somebody also puts in the file extension\n    #assembly_name = parsedict['0']\n    assembly_name = parsedict[\"assembly_name\"]\n    my_assembly = os.path.join(project_dir, assembly_name)\n\n    ## If the project_dir doesn't exist don't even bother trying harder.\n    if not os.path.isdir(project_dir):\n        msg = \"\"\"\n    Trying to print stats for Assembly ({}) that doesn't exist. You must \n    first run steps before you can show results.\n    \"\"\".format(project_dir)\n        sys.exit(msg)\n\n    if not assembly_name:\n        msg = \"\"\"\n    Assembly name is not set in params.txt, meaning it was either changed or\n    erased since the Assembly was started. Please restore the original name. 
\n    You can find the name of your Assembly in the \"project dir\": {}.\n    \"\"\".format(project_dir)\n        raise IPyradError(msg)\n\n    data = ip.load_json(my_assembly, quiet=True, cli=True)\n\n    print(\"\\nSummary stats of Assembly {}\".format(data.name) \\\n         +\"\\n------------------------------------------------\")\n    \n    if not data.stats.empty:\n        print(data.stats)\n        print(\"\\n\\nFull stats files\"\\\n         +\"\\n------------------------------------------------\")\n\n        fullcurdir = os.path.realpath(os.path.curdir)\n        for i in range(1, 8):\n            #enumerate(sorted(data.stats_files)):\n            key = \"s\"+str(i)\n            try:\n                val = data.stats_files[key]\n                val = val.replace(fullcurdir, \".\")                \n                print(\"step {}: {}\".format(i, val))\n            except (KeyError, AttributeError):\n                print(\"step {}: None\".format(i))\n        print(\"\\n\")\n    else:\n        print(\"No stats to display\")"}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\nloads the passed in assembly and create a branch.", "response": "def branch_assembly(args, parsedict):\n    \"\"\" \n    Load the passed in assembly and create a branch. Copy it\n    to a new assembly, and also write out the appropriate params.txt\n    \"\"\"\n\n    ## Get the current assembly\n    data = getassembly(args, parsedict)\n\n\n    ## get arguments to branch command\n    bargs = args.branch\n\n    ## get new name, trim off .txt if it was accidentally added\n    newname = bargs[0]\n    if newname.endswith(\".txt\"):\n        newname = newname[:-4]\n\n    ## look for subsamples\n    if len(bargs) > 1:\n        ## Branching and subsampling at step 6 is a bad idea, it messes up\n        ## indexing into the hdf5 cluster file. Warn against this.\n        if any([x.stats.state == 6 for x in data.samples.values()]):\n            pass\n            ## TODODODODODO\n            #print(\"wat\")\n\n        ## are we removing or keeping listed samples?\n        subsamples = bargs[1:]\n\n        ## drop the matching samples\n        if bargs[1] == \"-\":\n            ## check drop names\n            fails = [i for i in subsamples[1:] if i not in data.samples.keys()]\n            if any(fails):\n                raise IPyradWarningExit(\"\\\n                    \\n  Failed: unrecognized names requested, check spelling:\\n  {}\"\\\n                    .format(\"\\n  \".join([i for i in fails])))\n            print(\"  dropping {} samples\".format(len(subsamples)-1))\n            subsamples = list(set(data.samples.keys()) - set(subsamples))\n\n        ## If the arg after the new param name is a file that exists\n        if os.path.exists(bargs[1]):\n            new_data = data.branch(newname, infile=bargs[1])\n        else:\n            new_data = data.branch(newname, subsamples)\n\n    ## keeping all samples\n    else:\n        new_data = data.branch(newname, None)\n\n    print(\"  
creating a new branch called '{}' with {} Samples\".\\\n             format(new_data.name, len(new_data.samples)))\n\n    print(\"  writing new params file to {}\"\\\n            .format(\"params-\"+new_data.name+\".txt\\n\"))\n    new_data.write_params(\"params-\"+new_data.name+\".txt\", force=args.force)"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nmerging all given assemblies into a new assembly. Copies the params from the first passed in extant assembly. this function is called with the ipyrad -m flag. You must pass it at least 3 values, the first is a new assembly name (a new `param-newname.txt` will be created). The second and third args must be params files for currently existing assemblies. Any args beyond the third must also be params file for extant assemblies.", "response": "def merge_assemblies(args):\n    \"\"\" \n    merge all given assemblies into a new assembly. Copies the params\n    from the first passed in extant assembly. this function is called \n    with the ipyrad -m flag. You must pass it at least 3 values, the first\n    is a new assembly name (a new `param-newname.txt` will be created).\n    The second and third args must be params files for currently existing\n    assemblies. Any args beyond the third must also be params file for\n    extant assemblies.\n    \"\"\"\n    print(\"\\n  Merging assemblies: {}\".format(args.merge[1:]))\n\n    ## Make sure there are the right number of args\n    if len(args.merge) < 3:\n        sys.exit(_WRONG_NUM_CLI_MERGE)\n\n    ## Make sure the first arg isn't a params file, i could see someone doing it\n    newname = args.merge[0]\n    if os.path.exists(newname) and \"params-\" in newname:\n        sys.exit(_WRONG_ORDER_CLI_MERGE) \n\n    ## Make sure first arg will create a param file that doesn't already exist\n    if os.path.exists(\"params-\" + newname + \".txt\") and not args.force:\n        sys.exit(_NAME_EXISTS_MERGE.format(\"params-\" + newname + \".txt\"))\n\n    ## Make sure the rest of the args are params files that already exist\n    assemblies_to_merge = args.merge[1:]\n    for assembly in assemblies_to_merge:\n        if not os.path.exists(assembly):\n            sys.exit(_DOES_NOT_EXIST_MERGE.format(assembly))\n\n    ## Get assemblies for each of the passed in 
params files.\n    ## We're recycling some of the machinery for loading assemblies here\n    assemblies = []\n    for params_file in args.merge[1:]:\n        args.params = params_file\n        parsedict = parse_params(args)\n        assemblies.append(getassembly(args, parsedict))\n\n    ## Do the merge\n    merged_assembly = ip.merge(newname, assemblies)\n\n    ## Write out the merged assembly params file and report success\n    merged_assembly.write_params(\"params-{}.txt\".format(newname), force=args.force)\n\n    print(\"\\n  Merging succeeded. New params file for merged assembly:\")\n    print(\"\\n    params-{}.txt\\n\".format(newname))"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef getassembly(args, parsedict):\n\n    ## Creating an assembly with a full path in the name will \"work\"\n    ## but it is potentially dangerous, so here we have assembly_name\n    ## and assembly_file, name is used for creating new in cwd, file is\n    ## used for loading existing.\n    ##\n    ## Be nice if the user includes the extension.\n    #project_dir = ip.core.assembly._expander(parsedict['1'])\n    #assembly_name = parsedict['0']\n    project_dir = ip.core.assembly._expander(parsedict['project_dir'])\n    assembly_name = parsedict['assembly_name']\n    assembly_file = os.path.join(project_dir, assembly_name)\n\n    ## Assembly creation will handle error checking  on\n    ## the format of the assembly_name\n\n    ## make sure the working directory exists.\n    if not os.path.exists(project_dir):\n        os.mkdir(project_dir)\n\n    try:\n        ## If 1 and force then go ahead and create a new assembly\n        if ('1' in args.steps) and args.force:\n            data = ip.Assembly(assembly_name, cli=True)\n        else:\n            data = ip.load_json(assembly_file, cli=True)\n            data._cli = True\n\n    except IPyradWarningExit as _:\n        ## if no assembly is found then go ahead and make one\n        if '1' not in args.steps:\n            raise IPyradWarningExit(\\\n                \"  Error: You must first run step 1 on the assembly: {}\"\\\n                .format(assembly_file))\n        else:\n            ## create a new assembly object\n            data = ip.Assembly(assembly_name, cli=True)\n\n    ## for entering some params...\n    for param in parsedict:\n\n        ## trap assignment of assembly_name since it is immutable.\n        if param == \"assembly_name\":\n            ## Raise error if user tried to change assembly name\n            if parsedict[param] != data.name:\n                data.set_params(param, 
parsedict[param])\n        else:\n            ## all other params should be handled by set_params\n            try:\n                data.set_params(param, parsedict[param])\n            except IndexError as _:\n                print(\"  Malformed params file: {}\".format(args.params))\n                print(\"  Bad parameter {} - {}\".format(param, parsedict[param]))\n                sys.exit(-1)\n    return data", "response": "Loads an existing assembly or creates a new one, then sets its params from parsedict. Does not launch ipcluster."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\ntesting if there's a newer version and nagging the user to upgrade.", "response": "def _check_version():\n    \"\"\" Test if there's a newer version and nag the user to upgrade.\"\"\"\n    import urllib2\n    from distutils.version import LooseVersion\n\n    header = \\\n    \"\\n -------------------------------------------------------------\"+\\\n    \"\\n  ipyrad [v.{}]\".format(ip.__version__)+\\\n    \"\\n  Interactive assembly and analysis of RAD-seq data\"+\\\n    \"\\n -------------------------------------------------------------\"\n\n    try:\n        htmldat = urllib2.urlopen(\"https://anaconda.org/ipyrad/ipyrad\").readlines()\n        curversion = next((x for x in htmldat if \"subheader\" in x), None).split(\">\")[1].split(\"<\")[0]\n        if LooseVersion(ip.__version__) < LooseVersion(curversion):\n            msg = \"\"\"\n  A new version of ipyrad is available (v.{}). To upgrade run:\n\n    conda install -c ipyrad ipyrad\\n\"\"\".format(curversion)\n            print(header + \"\\n\" + msg)\n        else:\n            pass\n            #print(\"You are up to date\")\n    except Exception as inst:\n        ## Always fail silently\n        pass"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\ncalculate the binomial probability of a base call", "response": "def get_binom(base1, base2, estE, estH):\n    \"\"\"\n    return whether the site is called heterozygous, and the probability\n    of the best-supported base call\n    \"\"\"\n        \n    prior_homo = (1. - estH) / 2.\n    prior_hete = estH\n    \n    ## calculate probs\n    bsum = base1 + base2\n    hetprob = scipy.special.comb(bsum, base1)/(2. **(bsum))\n    homoa = scipy.stats.binom.pmf(base2, bsum, estE)\n    homob = scipy.stats.binom.pmf(base1, bsum, estE)\n    \n    ## weight probs by priors\n    hetprob *= prior_hete\n    homoa *= prior_homo\n    homob *= prior_homo\n    \n    ## final \n    probabilities = [homoa, homob, hetprob]\n    bestprob = max(probabilities)/float(sum(probabilities))\n\n    ## return\n    if hetprob > homoa:\n        return True, bestprob\n    else:\n        return False, bestprob"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\nremove all non - overlapping consensus seqs from the arrayed list.", "response": "def removerepeats(consens, arrayed):\n    \"\"\"\n    Checks for interior Ns in consensus seqs and removes those that are at\n    low depth, here defined as less than 1/3 of the average depth. The prop 1/3\n    is chosen so that mindepth=6 requires 2 base calls that are not in [N,-].\n    \"\"\"\n\n    ## default trim no edges\n    consens = \"\".join(consens).replace(\"-\", \"N\")\n\n    ## split for pairs\n    try:\n        cons1, cons2 = consens.split(\"nnnn\")\n        split = consens.index(\"nnnn\")\n        arr1 = arrayed[:, :split]\n        arr2 = arrayed[:, split+4:]\n    except ValueError:\n        cons1 = consens\n        cons2 = \"\"\n        arr1 = arrayed\n\n    ## trim from left and right of cons1\n    edges = [None, None]\n    lcons = len(cons1)\n    cons1 = cons1.lstrip(\"N\")\n    edges[0] = lcons - len(cons1)\n\n    ## trim from right if nonzero\n    lcons = len(cons1)\n    cons1 = cons1.rstrip(\"N\")\n    if lcons - len(cons1):\n        edges[1] = -1*(lcons - len(cons1))\n\n    ## trim same from arrayed\n    arr1 = arr1[:, edges[0]:edges[1]]\n\n    ## trim from left and right of cons2 if present\n    if cons2:\n        ## trim from left and right of cons1\n        edges = [None, None]\n        lcons = len(cons2)\n        cons2 = cons2.lstrip(\"N\")\n        edges[0] = lcons - len(cons2)\n\n        ## trim from right if nonzero\n        lcons = len(cons2)\n        cons2 = cons2.rstrip(\"N\")\n        if lcons - len(cons2):\n            edges[1] = -1*(lcons - len(cons2))\n\n        ## trim same from arrayed\n        arr2 = arr2[:, edges[0]:edges[1]]\n\n        ## reconstitute pairs\n        consens = cons1 + \"nnnn\" + cons2\n        consens = np.array(list(consens))\n        sep = np.array(arr1.shape[0]*[list(\"nnnn\")])\n        arrayed = np.hstack([arr1, sep, arr2])\n\n    ## if single-end...\n 
   else:\n        consens = np.array(list(cons1))\n        arrayed = arr1\n\n    ## get column counts of Ns and -s\n    ndepths = np.sum(arrayed == 'N', axis=0)\n    idepths = np.sum(arrayed == '-', axis=0)\n\n    ## get proportion of bases that are N- at each site\n    nons = ((ndepths + idepths) / float(arrayed.shape[0])) >= 0.75\n    ## boolean of whether base was called N\n    isn = consens == \"N\"\n    ## make ridx\n    ridx = nons * isn\n\n    ## apply filter\n    consens = consens[~ridx]\n    arrayed = arrayed[:, ~ridx]\n\n    return consens, arrayed"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef newconsensus(data, sample, tmpchunk, optim):\n    ## do reference map funcs?\n    isref = \"reference\" in data.paramsdict[\"assembly_method\"]\n\n    ## temporarily store the mean estimates to Assembly\n    data._este = data.stats.error_est.mean()\n    data._esth = data.stats.hetero_est.mean()\n\n    ## get number relative to tmp file\n    tmpnum = int(tmpchunk.split(\".\")[-1])\n\n    ## prepare data for reading\n    clusters = open(tmpchunk, 'rb')\n    pairdealer = itertools.izip(*[iter(clusters)]*2)    \n    maxlen = data._hackersonly[\"max_fragment_length\"]\n\n    ## write to tmp cons to file to be combined later\n    consenshandle = os.path.join(\n        data.dirs.consens, sample.name+\"_tmpcons.\"+str(tmpnum))\n    tmp5 = consenshandle.replace(\"_tmpcons.\", \"_tmpcats.\")\n    with h5py.File(tmp5, 'w') as io5:\n        io5.create_dataset(\"cats\", (optim, maxlen, 4), dtype=np.uint32)\n        io5.create_dataset(\"alls\", (optim, ), dtype=np.uint8)\n        io5.create_dataset(\"chroms\", (optim, 3), dtype=np.int64)\n\n        ## local copies to use to fill the arrays\n        catarr = io5[\"cats\"][:]\n        nallel = io5[\"alls\"][:]\n        refarr = io5[\"chroms\"][:]\n\n    ## if reference-mapped then parse the fai to get index number of chroms\n    if isref:\n        fai = pd.read_csv(data.paramsdict[\"reference_sequence\"] + \".fai\", \n                names=['scaffold', 'size', 'sumsize', 'a', 'b'],\n                sep=\"\\t\")\n        faidict = {j:i for i,j in enumerate(fai.scaffold)}\n\n    ## store data for stats counters\n    counters = {\"name\" : tmpnum,\n                \"heteros\": 0,\n                \"nsites\" : 0,\n                \"nconsens\" : 0}\n\n    ## store data for what got filtered\n    filters = {\"depth\" : 0,\n               \"maxh\" : 0,\n               \"maxn\" : 0}\n\n    ## store data for writing\n    storeseq = 
{}\n\n    ## set max limits\n    if 'pair' in data.paramsdict[\"datatype\"]:\n        maxhet = sum(data.paramsdict[\"max_Hs_consens\"])\n        maxn = sum(data.paramsdict[\"max_Ns_consens\"])\n    else:\n        maxhet = data.paramsdict[\"max_Hs_consens\"][0]\n        maxn = data.paramsdict[\"max_Ns_consens\"][0]\n\n    ## load the refmap dictionary if refmapping\n    done = 0\n    while not done:\n        try:\n            done, chunk = clustdealer(pairdealer, 1)\n        except IndexError:\n            raise IPyradError(\"clustfile formatting error in %s\", chunk)\n\n        if chunk:\n            ## get names and seqs\n            piece = chunk[0].strip().split(\"\\n\")\n            names = piece[0::2]\n            seqs = piece[1::2]\n\n            ## pull replicate read info from seqs\n            reps = [int(sname.split(\";\")[-2][5:]) for sname in names]\n\n            ## IF this is a reference mapped read store the chrom and pos info\n            ## -1 defaults to indicating an anonymous locus, since we are using\n            ## the faidict as 0 indexed. 
If chrompos fails it defaults to -1\n            ref_position = (-1, 0, 0)\n            if isref:\n                try:\n                    ## parse position from name string\n                    name, _, _ = names[0].rsplit(\";\", 2)\n                    chrom, pos0, pos1 = name.rsplit(\":\", 2)\n                    \n                    ## pull idx from .fai reference dict \n                    chromint = faidict[chrom] + 1\n                    ref_position = (int(chromint), int(pos0), int(pos1))\n                    \n                except Exception as inst:\n                    LOGGER.debug(\"Reference sequence chrom/pos failed for {}\".format(names[0]))\n                    LOGGER.debug(inst)\n                    \n            ## apply read depth filter\n            if nfilter1(data, reps):\n\n                ## get stacks of base counts\n                sseqs = [list(seq) for seq in seqs]\n                arrayed = np.concatenate(\n                    [[seq]*rep for seq, rep in zip(sseqs, reps)])\n                arrayed = arrayed[:, :maxlen]\n                \n                ## get consens call for each site, applies paralog-x-site filter\n                #consens = np.apply_along_axis(basecall, 0, arrayed, data)\n                consens = basecaller(\n                    arrayed, \n                    data.paramsdict[\"mindepth_majrule\"], \n                    data.paramsdict[\"mindepth_statistical\"],\n                    data._esth, \n                    data._este,\n                    )\n\n                ## apply a filter to remove low coverage sites/Ns that\n                ## are likely sequence repeat errors. 
This is only applied to\n                ## clusters that already passed the read-depth filter (1)\n                if \"N\" in consens:\n                    try:\n                        consens, arrayed = removerepeats(consens, arrayed)\n\n                    except ValueError as _:\n                        LOGGER.info(\"Caught a bad chunk w/ all Ns. Skip it.\")\n                        continue\n\n                ## get hetero sites\n                hidx = [i for (i, j) in enumerate(consens) \\\n                            if j in list(\"RKSYWM\")]\n                nheteros = len(hidx)\n                \n                ## filter for max number of hetero sites\n                if nfilter2(nheteros, maxhet):\n                    ## filter for maxN, & minlen\n                    if nfilter3(consens, maxn):\n                        ## counter right now\n                        current = counters[\"nconsens\"]\n                        ## get N alleles and get lower case in consens\n                        consens, nhaps = nfilter4(consens, hidx, arrayed)\n                        ## store the number of alleles observed\n                        nallel[current] = nhaps\n\n                        ## store a reduced array with only CATG\n                        catg = np.array(\\\n                            [np.sum(arrayed == i, axis=0)  \\\n                            for i in list(\"CATG\")],\n                            dtype='uint32').T\n                        catarr[current, :catg.shape[0], :] = catg\n                        refarr[current] = ref_position\n\n                        ## store the seqdata for tmpchunk\n                        storeseq[counters[\"name\"]] = \"\".join(list(consens))\n                        counters[\"name\"] += 1\n                        counters[\"nconsens\"] += 1\n                        counters[\"heteros\"] += nheteros\n                    else:\n                        #LOGGER.debug(\"@haplo\")\n                        
filters['maxn'] += 1\n                else:\n                    #LOGGER.debug(\"@hetero\")\n                    filters['maxh'] += 1\n            else:\n                #LOGGER.debug(\"@depth\")\n                filters['depth'] += 1\n                \n    ## close infile io\n    clusters.close()\n\n    ## write final consens string chunk\n    if storeseq:\n        with open(consenshandle, 'wb') as outfile:\n            outfile.write(\"\\n\".join([\">\"+sample.name+\"_\"+str(key)+\"\\n\"+\\\n                                   str(storeseq[key]) for key in storeseq]))\n\n    ## write to h5 array, this can be a bit slow on big data sets and is not \n    ## currently covered by progressbar movement.\n    with h5py.File(tmp5, 'a') as io5:\n        io5[\"cats\"][:] = catarr\n        io5[\"alls\"][:] = nallel\n        io5[\"chroms\"][:] = refarr\n    del catarr\n    del nallel\n    del refarr\n\n    ## return stats\n    counters['nsites'] = sum([len(i) for i in storeseq.itervalues()])\n    return counters, filters", "response": "Call consensus sequences for one chunk of a sample's clusters: apply the depth, heterozygosity, and N filters, write the passing consensus sequences and their base-count arrays to temporary files, and return the stats counters and the filter counts."}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\ncall all sites in a locus array.", "response": "def basecaller(arrayed, mindepth_majrule, mindepth_statistical, estH, estE):\n    \"\"\"\n    call all sites in a locus array.\n    \"\"\"\n\n    ## an array to fill with consensus site calls\n    cons = np.zeros(arrayed.shape[1], dtype=np.uint8)\n    cons.fill(78)\n    arr = arrayed.view(np.uint8)\n    \n    ## iterate over columns\n    for col in xrange(arr.shape[1]):\n        ## the site of focus\n        carr = arr[:, col]\n        \n        ## make mask of N and - sites\n        mask = carr == 45\n        mask += carr == 78        \n        marr = carr[~mask]\n        \n        ## skip if only empties (e.g., N-)\n        if not marr.shape[0]:\n            cons[col] = 78\n        \n        ## skip if not variable\n        elif np.all(marr == marr[0]):\n            cons[col] = marr[0]\n        \n        ## estimate variable site call\n        else:\n            ## get allele freqs (first-most, second, third = p, q, r)\n            counts = np.bincount(marr)\n            \n            pbase = np.argmax(counts)\n            nump = counts[pbase]\n            counts[pbase] = 0\n\n            qbase = np.argmax(counts)\n            numq = counts[qbase]\n            counts[qbase] = 0\n\n            rbase = np.argmax(counts)\n            numr = counts[rbase]\n            \n            ## based on biallelic depth\n            bidepth = nump + numq \n            if bidepth < mindepth_majrule:\n                cons[col] = 78\n            \n            else:\n                ## if depth is too high, reduce to sampled int\n                if bidepth > 500:\n                    base1 = int(500 * (nump / float(bidepth)))\n                    base2 = int(500 * (numq / float(bidepth)))\n                else:\n                    base1 = nump\n                    base2 = numq\n\n                ## make statistical base call  \n                
if bidepth >= mindepth_statistical:\n                    ishet, prob = get_binom(base1, base2, estE, estH)\n                    #LOGGER.info(\"ishet, prob, b1, b2: %s %s %s %s\", ishet, prob, base1, base2)\n                    if prob < 0.95:\n                        cons[col] = 78\n                    else:\n                        if ishet:\n                            cons[col] = TRANS[(pbase, qbase)]\n                        else:\n                            cons[col] = pbase\n                \n                ## make majrule base call\n                else: #if bidepth >= mindepth_majrule:\n                    if nump == numq:\n                        cons[col] = TRANS[(pbase, qbase)]\n                    else:\n                        cons[col] = pbase\n    return cons.view(\"S1\")"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef nfilter1(data, reps):\n    if sum(reps) >= data.paramsdict[\"mindepth_majrule\"] and \\\n        sum(reps) <= data.paramsdict[\"maxdepth\"]:\n        return 1\n    else:\n        return 0", "response": "Apply the read-depth filter: return 1 if the total depth (the sum of reps) lies between mindepth_majrule and maxdepth, inclusive; otherwise return 0."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef nfilter4(consens, hidx, arrayed):\n\n    ## if less than two Hs then there is only one allele\n    if len(hidx) < 2:\n        return consens, 1\n\n    ## store base calls for hetero sites\n    harray = arrayed[:, hidx]\n\n    ## remove any reads that have N or - base calls at hetero sites\n    ## these cannot be used when calling alleles currently.\n    harray = harray[~np.any(harray == \"-\", axis=1)]\n    harray = harray[~np.any(harray == \"N\", axis=1)]\n\n    ## get counts of each allele (e.g., AT:2, CG:2)\n    ccx = Counter([tuple(i) for i in harray])\n\n    ## Two possibilities we would like to distinguish, but we can't. Therefore,\n    ## we just throw away low depth third alleles that are within seq. error.\n    ## 1) a third base came up as a sequencing error but is not a unique allele\n    ## 2) a third or more unique allele is there but at low frequency\n\n    ## remove low freq alleles if more than 2, since they may reflect\n    ## sequencing errors at hetero sites, making a third allele, or a new\n    ## allelic combination that is not real.\n    if len(ccx) > 2:\n        totdepth = harray.shape[0]\n        cutoff = max(1, totdepth // 10)\n        alleles = [i for i in ccx if ccx[i] > cutoff]\n    else:\n        alleles = ccx.keys()\n\n    ## how many high depth alleles?\n    nalleles = len(alleles)\n\n    ## if 2 alleles then save the phase using lowercase coding\n    if nalleles == 2:\n        try:\n            consens = storealleles(consens, hidx, alleles)\n        except (IndexError, KeyError):\n            ## the H sites do not form good alleles\n            LOGGER.info(\"failed at phasing loc, skipping\")\n            LOGGER.info(\"\"\"\n    consens %s\n    hidx %s\n    alleles %s\n                \"\"\", consens, hidx, alleles)\n        return consens, nalleles\n    ## just return the info for later filtering\n    else:\n        
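return consens, nalleles", "response": "Apply the max-haplotypes filter: count high-depth alleles at heterozygous sites, store phase when there are exactly two, and return (consens, nalleles)."}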
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef storealleles(consens, hidx, alleles):\n    ## find the first hetero site and choose the priority base\n    ## example, if W: then priority base in A and not T. PRIORITY=(order: CATG)\n    bigbase = PRIORITY[consens[hidx[0]]]\n\n    ## find which allele has priority based on bigbase\n    bigallele = [i for i in alleles if i[0] == bigbase][0]\n\n    ## uplow other bases relative to this one and the priority list\n    ## e.g., if there are two hetero sites (WY) and the two alleles are\n    ## AT and TC, then since bigbase of (W) is A second hetero site should\n    ## be stored as y, since the ordering is swapped in this case; the priority\n    ## base (C versus T) is C, but C goes with the minor base at h site 1.\n    #consens = list(consens)\n    for hsite, pbase in zip(hidx[1:], bigallele[1:]):\n        if PRIORITY[consens[hsite]] != pbase:\n            consens[hsite] = consens[hsite].lower()\n\n    ## return consens\n    return consens", "response": "store phased allele data for diploids"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef cleanup(data, sample, statsdicts):\n    LOGGER.info(\"in cleanup for: %s\", sample.name)\n    isref = 'reference' in data.paramsdict[\"assembly_method\"]\n\n    ## collect consens chunk files\n    combs1 = glob.glob(os.path.join(\n                        data.dirs.consens,\n                        sample.name+\"_tmpcons.*\"))\n    combs1.sort(key=lambda x: int(x.split(\".\")[-1]))\n\n    ## collect tmpcat files\n    tmpcats = glob.glob(os.path.join(\n                        data.dirs.consens,\n                        sample.name+\"_tmpcats.*\"))\n    tmpcats.sort(key=lambda x: int(x.split(\".\")[-1]))\n\n    ## get shape info from the first cat, (optim, maxlen, 4)\n    with h5py.File(tmpcats[0], 'r') as io5:\n        optim, maxlen, _ = io5['cats'].shape\n\n    ## save as a chunked compressed hdf5 array\n    handle1 = os.path.join(data.dirs.consens, sample.name+\".catg\")\n    with h5py.File(handle1, 'w') as ioh5:\n        nloci = len(tmpcats) * optim\n        dcat = ioh5.create_dataset(\"catg\", (nloci, maxlen, 4),\n                                   dtype=np.uint32,\n                                   chunks=(optim, maxlen, 4),\n                                   compression=\"gzip\")\n        dall = ioh5.create_dataset(\"nalleles\", (nloci, ),\n                                   dtype=np.uint8,\n                                   chunks=(optim, ),\n                                   compression=\"gzip\")\n        ## only create chrom for reference-aligned data\n        if isref:\n            dchrom = ioh5.create_dataset(\"chroms\", (nloci, 3), \n                                         dtype=np.int64, \n                                         chunks=(optim, 3), \n                                         compression=\"gzip\")\n\n        ## Combine all those tmp cats into the big cat\n        start = 0\n        for icat in tmpcats:\n            
io5 = h5py.File(icat, 'r')\n            end = start + optim\n            dcat[start:end] = io5['cats'][:]\n            dall[start:end] = io5['alls'][:]\n            if isref:\n                dchrom[start:end] = io5['chroms'][:]\n            start += optim\n            io5.close()\n            os.remove(icat)\n\n    ## store the handle to the Sample\n    sample.files.database = handle1\n\n    ## record results\n    xcounters = {\"nconsens\": 0,\n                 \"heteros\": 0,\n                 \"nsites\": 0}\n    xfilters = {\"depth\": 0,\n               \"maxh\": 0,\n               \"maxn\": 0}\n\n    ## merge finished consens stats\n    for counters, filters in statsdicts:\n        ## sum individual counters\n        for key in xcounters:\n            xcounters[key] += counters[key]\n        for key in xfilters:\n            xfilters[key] += filters[key]\n\n    ## merge consens read files\n    handle1 = os.path.join(data.dirs.consens, sample.name+\".consens.gz\")\n    with gzip.open(handle1, 'wb') as out:\n        for fname in combs1:\n            with open(fname) as infile:\n                out.write(infile.read()+\"\\n\")\n            os.remove(fname)\n    sample.files.consens = [handle1]\n\n    ## set Sample stats_dfs values\n    if int(xcounters['nsites']):\n        prop = int(xcounters[\"heteros\"]) / float(xcounters['nsites'])\n    else:\n        prop = 0\n\n    sample.stats_dfs.s5.nsites = int(xcounters[\"nsites\"])\n    sample.stats_dfs.s5.nhetero = int(xcounters[\"heteros\"])\n    sample.stats_dfs.s5.filtered_by_depth = xfilters['depth']\n    sample.stats_dfs.s5.filtered_by_maxH = xfilters['maxh']\n    sample.stats_dfs.s5.filtered_by_maxN = xfilters['maxn']\n    sample.stats_dfs.s5.reads_consens = int(xcounters[\"nconsens\"])\n    sample.stats_dfs.s5.clusters_total = sample.stats_dfs.s3.clusters_total\n    sample.stats_dfs.s5.heterozygosity = float(prop)\n\n    ## set the Sample stats summary value\n    sample.stats.reads_consens = 
int(xcounters[\"nconsens\"])\n\n    ## save state to Sample if successful\n    if sample.stats.reads_consens:\n        sample.stats.state = 5\n    else:\n        print(\"No clusters passed filtering in Sample: {}\".format(sample.name))\n    return sample", "response": "Merge a sample's temporary consens and catg chunk files into final output files, sum the per-chunk stats, and update the Sample's step 5 stats and state."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef chunk_clusters(data, sample):\n\n    ## counter for split job submission\n    num = 0\n\n    ## set optim size for chunks in N clusters. The first few chunks take longer\n    ## because they contain larger clusters, so we create 4X as many chunks as\n    ## processors so that they are split more evenly.\n    optim = int((sample.stats.clusters_total // data.cpus) + \\\n                (sample.stats.clusters_total % data.cpus))\n\n    ## break up the file into smaller tmp files for each engine\n    ## chunking by cluster is a bit trickier than chunking by N lines\n    chunkslist = []\n\n    ## open to clusters\n    with gzip.open(sample.files.clusters, 'rb') as clusters:\n        ## create iterator to sample 2 lines at a time\n        pairdealer = itertools.izip(*[iter(clusters)]*2)\n\n        ## Use iterator to sample til end of cluster\n        done = 0\n        while not done:\n            ## grab optim clusters and write to file.\n            done, chunk = clustdealer(pairdealer, optim)\n            chunkhandle = os.path.join(data.dirs.clusts,\n                                    \"tmp_\"+str(sample.name)+\".\"+str(num*optim))\n            if chunk:\n                chunkslist.append((optim, chunkhandle))\n                with open(chunkhandle, 'wb') as outchunk:\n                    outchunk.write(\"//\\n//\\n\".join(chunk)+\"//\\n//\\n\")\n                num += 1\n\n    return chunkslist", "response": "chunk the cluster file into smaller files"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef get_subsamples(data, samples, force):\n\n    subsamples = []\n    for sample in samples:\n        if not force:\n            if sample.stats.state >= 5:\n                print(\"\"\"\\\n    Skipping Sample {}; Already has consens reads. Use force arg to overwrite.\\\n    \"\"\".format(sample.name))\n            elif not sample.stats.clusters_hidepth:\n                print(\"\"\"\\\n    Skipping Sample {}; No clusters found.\"\"\"\\\n    .format(sample.name, int(sample.stats.clusters_hidepth)))\n            elif sample.stats.state < 4:\n                print(\"\"\"\\\n    Skipping Sample {}; not yet finished step4 \"\"\"\\\n    .format(sample.name))\n            else:\n                subsamples.append(sample)\n\n        else:\n            if not sample.stats.clusters_hidepth:\n                print(\"\"\"\\\n    Skipping Sample {}; No clusters found in {}.\"\"\"\\\n    .format(sample.name, sample.files.clusters))\n            elif sample.stats.state < 4:\n                print(\"\"\"\\\n    Skipping Sample {}; not yet finished step4\"\"\"\\\n    .format(sample.name))\n            else:\n                subsamples.append(sample)\n\n    if len(subsamples) == 0:\n        raise IPyradWarningExit(\"\"\"\n    No samples to cluster, exiting.\n    \"\"\")\n\n    ## if sample is already done skip\n    if \"hetero_est\" not in data.stats:\n        print(\"  No estimates of heterozygosity and error rate. 
Using default \"\\\n              \"values\")\n        for sample in subsamples:\n            sample.stats.hetero_est = 0.001\n            sample.stats.error_est = 0.0001\n\n    if data._headers:\n        print(u\"\"\"\\\n  Mean error  [{:.5f} sd={:.5f}]\n  Mean hetero [{:.5f} sd={:.5f}]\"\"\"\\\n  .format(data.stats.error_est.mean(), data.stats.error_est.std(),\n          data.stats.hetero_est.mean(), data.stats.hetero_est.std()))\n\n    return subsamples", "response": "Return the samples ready for consensus calling: those that finished step 4 and have high-depth clusters, skipping samples already at step 5 unless force is set. Raises IPyradWarningExit if none qualify."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef run(data, samples, force, ipyclient):\n    ## prepare dirs\n    data.dirs.consens = os.path.join(data.dirs.project, data.name+\"_consens\")\n    if not os.path.exists(data.dirs.consens):\n        os.mkdir(data.dirs.consens)\n\n    ## zap any tmp files that might be leftover\n    tmpcons = glob.glob(os.path.join(data.dirs.consens, \"*_tmpcons.*\"))\n    tmpcats = glob.glob(os.path.join(data.dirs.consens, \"*_tmpcats.*\"))\n    for tmpfile in tmpcons+tmpcats:\n        os.remove(tmpfile)\n\n    ## filter through samples for those ready\n    samples = get_subsamples(data, samples, force)\n\n    ## set up parallel client: how many cores?\n    lbview = ipyclient.load_balanced_view()\n    data.cpus = data._ipcluster[\"cores\"]\n    if not data.cpus:\n        data.cpus = len(ipyclient.ids)\n\n    ## wrap everything to ensure destruction of temp files\n    inst = \"\"\n    try:\n        ## calculate depths, if they changed.\n        samples = calculate_depths(data, samples, lbview)\n\n        ## chunk clusters into bits for parallel processing\n        lasyncs = make_chunks(data, samples, lbview)\n\n        ## process chunks and cleanup\n        process_chunks(data, samples, lasyncs, lbview)\n\n    except KeyboardInterrupt as inst:\n        raise inst\n\n    finally:\n        ## if process failed at any point delete tmp files\n        tmpcons = glob.glob(os.path.join(data.dirs.clusts, \"tmp_*.[0-9]*\"))\n        tmpcons += glob.glob(os.path.join(data.dirs.consens, \"*_tmpcons.*\"))\n        tmpcons += glob.glob(os.path.join(data.dirs.consens, \"*_tmpcats.*\"))\n        for tmpchunk in tmpcons:\n            os.remove(tmpchunk)\n\n        ## Finished step 5. Set step 6 checkpoint to 0 to force\n        ## re-running from scratch.\n        data._checkpoint = 0", "response": "Run step 5 (consensus calling): prepare the consens directory, select ready samples, calculate depths, chunk clusters, process the chunks in parallel, and remove temp files on exit."}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\ncalculates the depths of the cluster by running the lbview function on each of the samples.", "response": "def calculate_depths(data, samples, lbview):\n    \"\"\"\n    check whether mindepth has changed, and thus whether clusters_hidepth\n    needs to be recalculated, and get new maxlen for new highdepth clusts.\n    if mindepth not changed then nothing changes.\n    \"\"\"\n\n    ## send jobs to be processed on engines\n    start = time.time()\n    printstr = \" calculating depths    | {} | s5 |\"\n    recaljobs = {}\n    maxlens = []\n    for sample in samples:\n        recaljobs[sample.name] = lbview.apply(recal_hidepth, *(data, sample))\n\n    ## block until finished\n    while 1:\n        ready = [i.ready() for i in recaljobs.values()]\n        elapsed = datetime.timedelta(seconds=int(time.time()-start))\n        progressbar(len(ready), sum(ready), printstr.format(elapsed), spacer=data._spacer)\n        time.sleep(0.1)\n        if len(ready) == sum(ready):\n            print(\"\")\n            break\n\n    ## check for failures and collect results\n    modsamples = []\n    for sample in samples:\n        if not recaljobs[sample.name].successful():\n            LOGGER.error(\"  sample %s failed: %s\", sample.name, recaljobs[sample.name].exception())\n        else:\n            modsample, _, maxlen, _, _ = recaljobs[sample.name].result()\n            modsamples.append(modsample)\n            maxlens.append(maxlen)\n\n    ## reset global maxlen if something changed\n    data._hackersonly[\"max_fragment_length\"] = int(max(maxlens)) + 4\n\n    return samples"}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\nmakes a chunk of the data into a single node", "response": "def make_chunks(data, samples, lbview):\n    \"\"\"\n    calls chunk_clusters and tracks progress.\n    \"\"\"\n    ## first progress bar\n    start = time.time()\n    printstr = \" chunking clusters     | {} | s5 |\"\n    elapsed = datetime.timedelta(seconds=int(time.time()-start))\n    progressbar(10, 0, printstr.format(elapsed), spacer=data._spacer)\n\n    ## send off samples to be chunked\n    lasyncs = {}\n    for sample in samples:\n        lasyncs[sample.name] = lbview.apply(chunk_clusters, *(data, sample))\n\n    ## block until finished\n    while 1:\n        ready = [i.ready() for i in lasyncs.values()]\n        elapsed = datetime.timedelta(seconds=int(time.time()-start))\n        progressbar(len(ready), sum(ready), printstr.format(elapsed), spacer=data._spacer)\n        time.sleep(0.1)\n        if len(ready) == sum(ready):\n            print(\"\")\n            break\n\n    ## check for failures\n    for sample in samples:\n        if not lasyncs[sample.name].successful():\n            LOGGER.error(\"  sample %s failed: %s\", sample.name, \n                        lasyncs[sample.name].exception())\n\n    return lasyncs"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef process_chunks(data, samples, lasyncs, lbview):\n\n    ## send chunks to be processed\n    start = time.time()\n    asyncs = {sample.name:[] for sample in samples}\n    printstr = \" consens calling       | {} | s5 |\"\n\n    ## get chunklist from results\n    for sample in samples:\n        clist = lasyncs[sample.name].result()\n        for optim, chunkhandle in clist:\n            args = (data, sample, chunkhandle, optim)\n            #asyncs[sample.name].append(lbview.apply_async(consensus, *args))\n            asyncs[sample.name].append(lbview.apply_async(newconsensus, *args))\n            elapsed = datetime.timedelta(seconds=int(time.time()-start))\n            progressbar(10, 0, printstr.format(elapsed), spacer=data._spacer)\n\n    ## track progress\n    allsyncs = list(itertools.chain(*[asyncs[i.name] for i in samples]))\n    while 1:\n        ready = [i.ready() for i in allsyncs]\n        elapsed = datetime.timedelta(seconds=int(time.time()-start))\n        progressbar(len(ready), sum(ready), printstr.format(elapsed), spacer=data._spacer)\n        time.sleep(0.1)\n        if len(ready) == sum(ready):\n            break\n\n    ## get clean samples\n    casyncs = {}\n    for sample in samples:\n        rlist = asyncs[sample.name]\n        statsdicts = [i.result() for i in rlist]\n        casyncs[sample.name] = lbview.apply(cleanup, *(data, sample, statsdicts))\n    while 1:\n        ready = [i.ready() for i in casyncs.values()]\n        elapsed = datetime.timedelta(seconds=int(time.time()-start))\n        progressbar(10, 10, printstr.format(elapsed), spacer=data._spacer)\n        time.sleep(0.1)\n        if len(ready) == sum(ready):\n            print(\"\")\n            break\n\n    ## check for failures:\n    for key in asyncs:\n        asynclist = asyncs[key]\n        for async in asynclist:\n            if not async.successful():\n                
LOGGER.error(\"  async error: %s \\n%s\", key, async.exception())\n    for key in casyncs:\n        if not casyncs[key].successful():\n            LOGGER.error(\"  casync error: %s \\n%s\", key, casyncs[key].exception())\n\n    ## get samples back\n    subsamples = [i.result() for i in casyncs.values()]\n    for sample in subsamples:\n        data.samples[sample.name] = sample\n\n    ## build Assembly stats\n    data.stats_dfs.s5 = data._build_stat(\"s5\")\n\n    ## write stats file\n    data.stats_files.s5 = os.path.join(data.dirs.consens, 's5_consens_stats.txt')\n    with io.open(data.stats_files.s5, 'w') as out:\n        #out.write(data.stats_dfs.s5.to_string())\n        data.stats_dfs.s5.to_string(\n            buf=out,\n            formatters={\n                'clusters_total':'{:.0f}'.format,\n                'filtered_by_depth':'{:.0f}'.format,\n                'filtered_by_maxH':'{:.0f}'.format,\n                'filtered_by_maxN':'{:.0f}'.format,\n                'reads_consens':'{:.0f}'.format,\n                'nsites':'{:.0f}'.format,\n                'nhetero':'{:.0f}'.format,\n                'heterozygosity':'{:.5f}'.format\n            })", "response": "Submit consensus-calling jobs for each cluster chunk, run cleanup per sample, log any failures, and write the step 5 stats file."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nreading in .loci and building alleles from case characters", "response": "def make(data, samples):\n    \"\"\" reads in .loci and builds alleles from case characters \"\"\"\n    \n    #read in loci file\n    outfile = open(os.path.join(data.dirs.outfiles, data.name+\".alleles\"), 'w')\n    lines = open(os.path.join(data.dirs.outfiles, data.name+\".loci\"), 'r')\n\n    ## Get the longest sample name for pretty printing\n    longname = max(len(x) for x in data.samples.keys())\n\n    ## Padding between name and sequence in output file. This should be the \n    ## same as write_outfiles.write_tmp_loci.name_padding\n    name_padding = 5\n    writing = []\n    loc = 0\n    for line in lines:\n        if \">\" in line:\n            name, seq = line.split(\" \")[0], line.split(\" \")[-1]\n            allele1, allele2 = splitalleles(seq.strip())\n\n            ## Format the output string. the \"-2\" below accounts for the additional\n            ## 2 characters added to the sample name that don't get added to the\n            ## snpsites line, so you gotta bump this line back 2 to make it\n            ## line up right.\n            writing.append(name+\"_0\"+\" \"*(longname-len(name)-2+name_padding)+allele1)\n            writing.append(name+\"_1\"+\" \"*(longname-len(name)-2+name_padding)+allele2)\n        else:\n            writing.append(line.strip())\n        loc += 1\n\n        ## print every 10K loci \"\n        if not loc % 10000:\n            outfile.write(\"\\n\".join(writing)+\"\\n\")\n            writing = []\n\n    outfile.write(\"\\n\".join(writing))\n    outfile.close()"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\nbuild SNPs and Stats files for a single sample.", "response": "def make(data, samples):\n    \"\"\" builds snps output \"\"\"\n\n    ## get attr \n    ploidy = data.paramsdict[\"max_alleles_consens\"]\n    names = [i.name for i in samples]\n    longname = max([len(i) for i in names])\n\n    ## TODO: use iter cuz of super huge files\n    inloci = open(os.path.join(\\\n                  data.dirs.outfiles, data.name+\".loci\" ), 'r').read()\n\n    ## Potential outfiles\n    snpsout = os.path.join(data.dirs.outfiles, data.name+\".snps\")\n    usnpsout = os.path.join(data.dirs.outfiles, data.name+\".usnps\")\n    structout = os.path.join(data.dirs.outfiles, data.name+\".str\")\n    genoout = os.path.join(data.dirs.outfiles, data.name+\".snps.geno\")\n    ugenoout = os.path.join(data.dirs.outfiles, data.name+\".usnps.geno\")\n\n    ## Output file for writing some stats\n    statsfile= os.path.join(data.dirs.outfiles, data.name+\".snps.stats\")\n\n    ## The output formats to write\n    formats = data.paramsdict[\"output_formats\"]\n\n    seed = data._hackersonly[\"random_seed\"]\n    np.random.seed(int(seed))\n\n    ## output .snps and .unlinked_snps\"\n    S = {}      ## snp dict\n    Si = {}     ## unlinked snp dict\n    for name in list(names):\n        S[name] = []\n        Si[name] = []\n\n    ## record bi-allelic snps\"\n    bis = 0\n\n    ## for each locus select out the SNPs\"\n    for loc in inloci.strip().split(\"|\")[:-1]:\n        pis = \"\"\n        ns = []\n        ss = []\n        cov = {}  ## record coverage for each SNP\n        for line in loc.split(\"\\n\"):\n            if \">\" in line:\n                ns.append(line.split()[0].replace(\">\",\"\"))\n                ss.append(line.split()[-1])\n            else:\n                pis = [i[0] for i in enumerate(line) if i[1] in list('*-')]\n\n        ## assign snps to S, and record coverage for usnps\"\n        for tax in S:\n     
       if tax in ns:\n                if pis:\n                    for snpsite in pis:\n                        snpsite -= (longname+5)\n                        S[tax].append(ss[ns.index(tax)][snpsite])\n                        if snpsite not in cov:\n                            cov[snpsite] = 1\n                        else:\n                            cov[snpsite] += 1\n                        ## downweight selection of gap sites \"\n                        if ss[ns.index(tax)][snpsite] != '-':\n                           cov[snpsite] += 1\n            else:\n                if pis:\n                    for snpsite in pis:\n                        S[tax].append(\"N\")\n                    Si[tax].append(\"N\")\n\n        ## randomly select among snps w/ greatest coverage for unlinked snp \"\n        maxlist = []\n        for j,k in cov.items():\n            if k == max(cov.values()):\n                maxlist.append(j)\n\n        ## Is bi-allelic ? \"\n        bisnps = []\n        for i in maxlist:\n            if len(set([ss[ns.index(tax)][i] for tax in S if tax in ns])) < 3:\n                bisnps.append(i)\n\n        #rando = pis[np.random.randint(len(pis))]\n        #rando -= (longname+5)\n        if bisnps:\n            rando = bisnps[np.random.randint(len(bisnps))]\n        elif maxlist:\n            rando = maxlist[np.random.randint(len(maxlist))]\n        tbi = 0\n        for tax in S:\n            if tax in ns:\n                if pis:\n                    ## if none are bi-allelic \"\n                    if not bisnps:\n                        tbi += 1\n                    Si[tax].append(ss[ns.index(tax)][rando])\n            if pis:\n                ## add spacer between loci \"                \n                S[tax].append(\" \")\n            else:\n                ## invariable locus \"\n                S[tax].append(\"_ \")\n        bis += tbi\n    ## names\n    SF = list(S.keys())\n    SF.sort()\n\n    ## Write linked snps format\n    if \"snps\" 
in formats:\n        with open(snpsout, 'w') as outfile:\n            print >>outfile, \"## %s taxa, %s loci, %s snps\" % \\\n            (len(S), len(\"\".join(S.values()[0]).split(\" \"))-1,\n                    len(\"\".join(S[SF[0]]).replace(\" \", \"\")))\n            for i in SF:\n                print >>outfile, i+(\" \"*(longname-len(i)+3))+\"\".join(S[i])\n\n    ## Write unlinked snps format\n    if \"usnps\" in formats:\n        with open(usnpsout, 'w') as outfile:\n            print >>outfile, len(Si), len(\"\".join(Si.values()[0]))\n            for i in SF:\n                print >>outfile, i+(\" \"*(longname-len(i)+3))+\"\".join(Si[i])\n\n        with open(statsfile, 'a') as statsout:\n            print >>statsout, \"sampled unlinked SNPs=\",len(Si.values()[0])\n            print >>statsout, \"sampled unlinked bi-allelic SNPs=\",len(Si.values()[0])-bis\n\n    ## Write STRUCTURE format\n    if \"str\" in formats:\n        with open(structout, 'w') as outfile:\n        \n            B = {'A': '0',\n                 'T': '1',\n                 'G': '2',\n                 'C': '3',\n                 'N': '-9',\n                 '-': '-9'}\n            if ploidy > 1:\n                for line in SF:\n                    print >>outfile, line+(\" \"*(longname-len(line)+3))+\\\n                          \"\\t\"*6+\"\\t\".join([B[unstruct(j)[0]] for j in Si[line]])\n                    print >>outfile, line+(\" \"*(longname-len(line)+3))+\\\n                          \"\\t\"*6+\"\\t\".join([B[unstruct(j)[1]] for j in Si[line]])\n            else:\n                for line in SF:\n                    print >>outfile, line+(\" \"*(longname-len(line)+3))+\\\n                          \"\\t\"*6+\"\\t\".join([B[unstruct(j)[1]] for j in Si[line]])\n\n    ## Do linked and unlinked snps in .geno format\n    if \"geno\" in formats:\n        with open(ugenoout, 'w') as outfile:\n\n            for i in range(len(Si.values()[0])):\n                getref = 0\n             
   ref = \"N\"\n                while ref == \"N\":\n                    ref = unstruct(Si[SF[getref]][i])[0]\n                    getref += 1\n                SNProw = \"\".join(map(str,[unstruct(Si[j][i]).count(ref) if Si[j][i] != \"N\" \\\n                                          else \"9\" for j in SF]))\n                ## print ref,SNProw\n                if len(set(SNProw)) > 1:\n                    print >>outfile, SNProw \n\n        with open(genoout, 'w') as outfile:\n            for i in range(len(S.values()[0])):\n                if S[SF[0]][i].strip(\"_\").strip():\n                    getref = 0\n                    ref = \"N\"\n                    while ref == \"N\":\n                        #print i, S[SF[0]][i]\n                        ref = unstruct(S[SF[getref]][i])[0]\n                        getref += 1\n                        SNProw = \"\".join(map(str,[unstruct(S[j][i]).count(ref) if \\\n                                                  S[j][i] != \"N\" else \"9\" for j in SF]))\n                    ## print ref,SNProw\n                    if len(set(SNProw)) > 1:\n                        print >>outfile, SNProw"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef cluster_info(ipyclient, spacer=\"\"):\n    ## get engine data, skips busy engines.    \n    hosts = []\n    for eid in ipyclient.ids:\n        engine = ipyclient[eid]\n        if not engine.outstanding:\n            hosts.append(engine.apply(_socket.gethostname))\n\n    ## report it\n    hosts = [i.get() for i in hosts]\n    result = []\n    for hostname in set(hosts):\n        result.append(\"{}host compute node: [{} cores] on {}\"\\\n            .format(spacer, hosts.count(hostname), hostname))\n    print(\"\\n\".join(result))", "response": "Print the number of idle engines (cores) on each host connected to an ipyclient."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef _debug_on():\n    ## make tmp file and set loglevel for top-level init\n    with open(__debugflag__, 'w') as dfile:\n        dfile.write(\"wat\")\n    __loglevel__ = \"DEBUG\"\n    _LOGGER.info(\"debugging turned on and registered to be turned off at exit\")\n    _set_debug_dict(__loglevel__)", "response": "Turn on debugging by creating a hidden tmp file and setting the log level to DEBUG."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef _set_debug_dict(__loglevel__):\n\n    _lconfig.dictConfig({\n    'version': 1,\n    'disable_existing_loggers': False,\n\n    'formatters': {\n        'standard': {\n            'format': \"%(asctime)s \\t\"\\\n                     +\"pid=%(process)d \\t\"\\\n                     +\"[%(filename)s]\\t\"\\\n                     +\"%(levelname)s \\t\"\\\n                     +\"%(message)s\"\n        },\n    },\n    'handlers': {\n        __name__: {\n            'level':__loglevel__,\n            'class':'logging.FileHandler',\n            'filename':__debugfile__,\n            'formatter':\"standard\",\n            'mode':'a+'\n        }\n    },\n    'loggers':{\n        __name__: {\n            'handlers': [__name__],\n            'level': __loglevel__,\n            'propagate': True\n        }\n    }\n    })", "response": "Configure logging with dictConfig: attach a file handler that writes to the debug file at the given log level."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef _debug_off():\n    if _os.path.exists(__debugflag__):\n        _os.remove(__debugflag__)\n    __loglevel__ = \"ERROR\"\n    _LOGGER.info(\"debugging turned off\")\n    _set_debug_dict(__loglevel__)", "response": "Turn off debugging by removing hidden tmp file."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\ncheck if dependency program exists", "response": "def _cmd_exists(cmd):\n    \"\"\" check if dependency program is there \"\"\"\n    return _subprocess.call(\"type \" + cmd,\n                           shell=True,\n                           stdout=_subprocess.PIPE,\n                           stderr=_subprocess.PIPE) == 0"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef _getbins():\n\n    # Return error if system is 32-bit arch.\n    # This is straight from the python docs:\n    # https://docs.python.org/2/library/platform.html#cross-platform\n    if not _sys.maxsize > 2**32:\n        _sys.exit(\"ipyrad requires 64bit architecture\")\n\n    ## get platform mac or linux\n    _platform = _sys.platform\n\n    ## get current location\n    if 'VIRTUAL_ENV' in _os.environ:\n        ipyrad_path = _os.environ['VIRTUAL_ENV']\n    else:\n        path = _os.path.abspath(_os.path.dirname(__file__))\n        ipyrad_path = _os.path.dirname(path)\n\n    ## find bin directory\n    ipyrad_path = _os.path.dirname(path)\n    bin_path = _os.path.join(ipyrad_path, \"bin\")\n\n    ## get the correct binaries\n    if 'linux' in _platform:\n        vsearch = _os.path.join(\n                       _os.path.abspath(bin_path),\n                       \"vsearch-linux-x86_64\")\n        muscle = _os.path.join(\n                       _os.path.abspath(bin_path),\n                       \"muscle-linux-x86_64\")\n        smalt = _os.path.join(\n                       _os.path.abspath(bin_path),\n                       \"smalt-linux-x86_64\")\n        bwa = _os.path.join(\n                       _os.path.abspath(bin_path),\n                       \"bwa-linux-x86_64\")\n        samtools = _os.path.join(\n                       _os.path.abspath(bin_path),\n                       \"samtools-linux-x86_64\")\n        bedtools = _os.path.join(\n                       _os.path.abspath(bin_path),\n                       \"bedtools-linux-x86_64\")\n        qmc = _os.path.join(\n                       _os.path.abspath(bin_path),\n                       \"QMC-linux-x86_64\")\n    else:\n        vsearch = _os.path.join(\n                       _os.path.abspath(bin_path),\n                       \"vsearch-osx-x86_64\")\n        muscle = _os.path.join(\n                       
_os.path.abspath(bin_path),\n                       \"muscle-osx-x86_64\")\n        smalt = _os.path.join(\n                       _os.path.abspath(bin_path),\n                       \"smalt-osx-x86_64\")\n        bwa = _os.path.join(\n                       _os.path.abspath(bin_path),\n                       \"bwa-osx-x86_64\")\n        samtools = _os.path.join(\n                       _os.path.abspath(bin_path),\n                       \"samtools-osx-x86_64\")\n        bedtools = _os.path.join(\n                       _os.path.abspath(bin_path),\n                       \"bedtools-osx-x86_64\")\n        ## only one compiled version available, works for all?\n        qmc = _os.path.join(\n                       _os.path.abspath(bin_path),\n                       \"QMC-osx-x86_64\")\n\n    # Test for existence of binaries\n    assert _cmd_exists(muscle), \"muscle not found here: \"+muscle\n    assert _cmd_exists(vsearch), \"vsearch not found here: \"+vsearch\n    assert _cmd_exists(smalt), \"smalt not found here: \"+smalt\n    assert _cmd_exists(bwa), \"bwa not found here: \"+bwa\n    assert _cmd_exists(samtools), \"samtools not found here: \"+samtools\n    assert _cmd_exists(bedtools), \"bedtools not found here: \"+bedtools\n    #assert _cmd_exists(qmc), \"wQMC not found here: \"+qmc\n    return vsearch, muscle, smalt, bwa, samtools, bedtools, qmc", "response": "Return paths to the bundled vsearch, muscle, smalt, bwa, samtools, bedtools, and QMC binaries for Linux or macOS."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef store_all(self):\n\n    with h5py.File(self.database.input, 'a') as io5:\n        fillsets = io5[\"quartets\"]\n\n        ## generator for all quartet sets\n        qiter = itertools.combinations(xrange(len(self.samples)), 4)\n        i = 0\n        while i < self.params.nquartets:\n            ## sample a chunk of the next ordered N set of quartets\n            dat = np.array(list(itertools.islice(qiter, self._chunksize)))\n            end = min(self.params.nquartets, dat.shape[0]+i)\n            fillsets[i:end] = dat[:end-i]\n            i += self._chunksize\n\n            ## send progress update to stdout on engine\n            print(min(i, self.params.nquartets))", "response": "Populate the quartets array in the input database with all possible quartet sets, written in chunks while printing progress to stdout."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef store_random(self):\n\n    with h5py.File(self.database.input, 'a') as io5:\n        fillsets = io5[\"quartets\"]\n\n        ## set generators\n        qiter = itertools.combinations(xrange(len(self.samples)), 4)\n        rand = np.arange(0, n_choose_k(len(self.samples), 4))\n        np.random.shuffle(rand)\n        rslice = rand[:self.params.nquartets]\n        rss = np.sort(rslice)\n        riter = iter(rss)\n        del rand, rslice\n\n        ## print progress update 1 to the engine stdout\n        print(self._chunksize)\n\n        ## set to store\n        rando = riter.next()\n        tmpr = np.zeros((self.params.nquartets, 4), dtype=np.uint16)\n        tidx = 0\n        while 1:\n            try:\n                for i, j in enumerate(qiter):\n                    if i == rando:\n                        tmpr[tidx] = j\n                        tidx += 1\n                        rando = riter.next()\n\n                    ## print progress bar update to engine stdout\n                    if not i % self._chunksize:\n                        print(min(i, self.params.nquartets))\n\n            except StopIteration:\n                break\n        ## store into database\n        fillsets[:] = tmpr\n        del tmpr", "response": "Populate array with random quartets sampled from a generator and store them into the database."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\ntaking a tetrad class object and populating an array with random quartets sampled equally among the splits of a guide tree.", "response": "def store_equal(self):\n    \"\"\"\n    Takes a tetrad class object and populates array with random \n    quartets sampled equally among splits of the tree so that \n    deep splits are not overrepresented relative to rare splits, \n    like those near the tips. \n    \"\"\"\n\n    with h5py.File(self.database.input, 'a') as io5:\n        fillsets = io5[\"quartets\"]\n\n        ## require guidetree\n        if not os.path.exists(self.files.tree):\n            raise IPyradWarningExit(\n                \"To use sampling method 'equal' requires a guidetree\")\n        tre = ete3.Tree(self.files.tree)\n        tre.unroot()\n        tre.resolve_polytomy(recursive=True)\n\n        ## randomly sample internals splits\n        splits = [([self.samples.index(z.name) for z in i],\n                   [self.samples.index(z.name) for z in j]) \\\n                   for (i, j) in tre.get_edges()]\n\n        ## only keep internal splits, not single tip edges\n        splits = [i for i in splits if all([len(j) > 1 for j in i])]\n\n        ## how many min quartets should be equally sampled from each split\n        squarts = self.params.nquartets // len(splits)\n\n        ## keep track of how many iterators are saturable.\n        saturable = 0\n\n        ## turn each into an iterable split sampler\n        ## if the nquartets for that split is small, then sample all, \n        ## if it is big then make it a random sampler for that split.\n        qiters = []\n\n        ## iterate over splits sampling quartets evenly\n        for idx, split in enumerate(splits):\n            ## if small number at this split then sample all possible sets\n            ## we will exhaust this quickly and then switch to random for \n            ## the larger splits.\n            total = 
n_choose_k(len(split[0]), 2) * n_choose_k(len(split[1]), 2)\n            if total < squarts*2:\n                qiter = (i+j for (i, j) in itertools.product(\n                            itertools.combinations(split[0], 2), \n                            itertools.combinations(split[1], 2)))\n                saturable += 1\n\n            ## else create random sampler across that split, this is slower\n            ## because it can propose the same split repeatedly and so we \n            ## have to check it against the 'sampled' set.\n            else:\n                qiter = (random_product(split[0], split[1]) for _ \\\n                         in xrange(self.params.nquartets))\n\n            ## store all iterators into a list\n            qiters.append((idx, qiter))\n\n        ## create infinite cycler of qiters\n        qitercycle = itertools.cycle(qiters)\n\n        ## store visited quartets\n        sampled = set()\n\n        ## fill chunksize at a time\n        i = 0\n        empty = set()\n        edge_targeted = 0\n        random_targeted = 0\n\n        ## keep filling quartets until nquartets are sampled.\n        while i < self.params.nquartets:\n            ## grab the next iterator\n            cycle, qiter = qitercycle.next()\n\n            ## sample from iterators, store sorted set.\n            try:\n                qrtsamp = tuple(sorted(qiter.next()))\n                if qrtsamp not in sampled:\n                    sampled.add(qrtsamp)\n                    edge_targeted += 1\n                    i += 1\n                    ## print progress bar update to engine stdout\n                    if not i % self._chunksize:\n                        print(min(i, self.params.nquartets))                    \n\n            except StopIteration:\n                empty.add(cycle)\n                if len(empty) == saturable:\n                    break\n\n\n        ## if array is not full then add random samples\n        while i <= self.params.nquartets:\n         
   newset = tuple(sorted(np.random.choice(\n                range(len(self.samples)), 4, replace=False)))\n            if newset not in sampled:\n                sampled.add(newset)\n                random_targeted += 1\n                i += 1\n                ## print progress bar update to engine stdout\n                if not i % self._chunksize:\n                    print(min(i, self.params.nquartets))\n\n        ## store into database\n        print(self.params.nquartets)\n        fillsets[:] = np.array(tuple(sampled))\n        del sampled"}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nreturn nsets unique random quartet sets sampled from n - choose - k without replacement combinations.", "response": "def random_combination(nsets, n, k):\n    \"\"\"\n    Returns nsets unique random quartet sets sampled from\n    n-choose-k without replacement combinations.\n    \"\"\"\n    sets = set()\n    while len(sets) < nsets:\n        newset = tuple(sorted(np.random.choice(n, k, replace=False)))\n        sets.add(newset)\n    return tuple(sets)"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef set_mkl_thread_limit(cores):\n    if \"linux\" in sys.platform:\n        mkl_rt = ctypes.CDLL('libmkl_rt.so')\n    else:\n        mkl_rt = ctypes.CDLL('libmkl_rt.dylib')\n    oldlimit = mkl_rt.mkl_get_max_threads()\n    mkl_rt.mkl_set_num_threads(ctypes.byref(ctypes.c_int(cores)))\n    return oldlimit", "response": "Set the MKL thread limit and return the old limit so it can be restored when finished."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nget total number of quartets possible for a split", "response": "def get_total(tots, node):\n    \"\"\" get total number of quartets possible for a split\"\"\"\n    if (node.is_leaf() or node.is_root()):\n        return 0\n    else:\n        ## Get counts on down edges. \n        ## How to treat polytomies here?\n        if len(node.children) > 2:\n            down_r = node.children[0]\n            down_l = node.children[1]\n            for child in node.children[2:]:\n                down_l += child\n        else:\n            down_r, down_l = node.children\n        lendr = sum(1 for i in down_r.iter_leaves())\n        lendl = sum(1 for i in down_l.iter_leaves())\n\n        ## get count on up edge sister\n        up_r = node.get_sisters()[0]\n        lenur = sum(1 for i in up_r.iter_leaves())\n\n        ## everyone else\n        lenul = tots - (lendr + lendl + lenur)\n\n        ## return product\n        return lendr * lendl * lenur * lenul"}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\ngets total number of quartets sampled for a split", "response": "def get_sampled(data, totn, node):\n    \"\"\" get total number of quartets sampled for a split\"\"\"\n    ## convert tip names to ints\n    names = sorted(totn)\n    cdict = {name: idx for idx, name in enumerate(names)}\n    \n    ## skip some nodes\n    if (node.is_leaf() or node.is_root()):\n        return 0\n    else:\n        ## get counts on down edges\n        if len(node.children) > 2:\n            down_r = node.children[0]\n            down_l = node.children[1]\n            for child in node.children[2:]:\n                down_l += child\n        else:\n            down_r, down_l = node.children\n\n        lendr = set(cdict[i] for i in down_r.get_leaf_names())\n        lendl = set(cdict[i] for i in down_l.get_leaf_names())\n\n        ## get count on up edge sister\n        up_r = node.get_sisters()[0]\n        lenur = set(cdict[i] for i in up_r.get_leaf_names())\n\n        ## everyone else\n        lenul = set(cdict[i] for i in totn) - set.union(lendr, lendl, lenur)\n\n    idx = 0\n    sampled = 0\n    with h5py.File(data.database.output, 'r') as io5:\n        end = io5[\"quartets\"].shape[0]\n        while 1:\n            ## break condition\n            if idx >= end:\n                break\n\n            ## counts matches\n            qrts = io5[\"quartets\"][idx:idx+data._chunksize]\n            for qrt in qrts:\n                sqrt = set(qrt)\n                if all([sqrt.intersection(i) for i in [lendr, lendl, lenur, lenul]]):\n                    sampled += 1\n\n            ## increase span\n            idx += data._chunksize\n    return sampled"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef build_trees(fclade_counts, namedict):\n\n    ## storage\n    nodes = {}\n    idxarr = np.arange(len(fclade_counts[0][0]))\n    queue = []\n\n    ## create dict of clade counts and set keys\n    countdict = defaultdict(int)\n    for clade, count in fclade_counts:\n        mask = np.int_(list(clade)).astype(np.bool)\n        ccx = idxarr[mask]\n        queue.append((len(ccx), frozenset(ccx)))\n        countdict[frozenset(ccx)] = count\n\n    while queue:\n        queue.sort()\n        (clade_size, clade) = queue.pop(0)\n        new_queue = []\n    \n        # search for ancestors of clade\n        for (_, ancestor) in queue:\n            if clade.issubset(ancestor):\n                # update ancestor such that, in the following example:\n                # ancestor == {1, 2, 3, 4}\n                # clade == {2, 3}\n                # new_ancestor == {1, {2, 3}, 4}\n                new_ancestor = (ancestor - clade) | frozenset([clade])          \n                countdict[new_ancestor] = countdict.pop(ancestor)\n                ancestor = new_ancestor\n            \n            new_queue.append((len(ancestor), ancestor))\n   \n        # if the clade is a tip, then we have a name\n        if clade_size == 1:\n            name = list(clade)[0]\n            name = namedict[name]\n        else:\n            name = None \n        \n        # the clade will not be in nodes if it is a tip\n        children = [nodes.pop(c) for c in clade if c in nodes]\n        node = ete3.Tree(name=name)    \n        #node = toytree.tree(name=name).tree\n        for child in children:\n            node.add_child(child)\n        if not node.is_leaf():\n            node.dist = int(round(100*countdict[clade]))\n            node.support = int(round(100*countdict[clade]))\n        else:\n            node.dist = int(100) \n            node.support = int(100)\n        \n        
nodes[clade] = node\n        queue = new_queue\n    tre = nodes.values()[0]\n    tre.unroot()\n    ## return the tree and other trees if present\n    return tre, list(nodes.values())", "response": "A subfunc of consensus_tree. Builds a tree from a list of clades and a dictionary of names and counts."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef _refresh(self):\n\n        ## clear any existing results files\n        oldfiles = [self.files.qdump] + \\\n                    self.database.__dict__.values() + \\\n                    self.trees.__dict__.values()\n        for oldfile in oldfiles:\n            if oldfile:\n                if os.path.exists(oldfile):\n                    os.remove(oldfile)\n\n        ## store old ipcluster info\n        oldcluster = copy.deepcopy(self._ipcluster)\n\n        ## reinit the tetrad object data.\n        self.__init__(\n            name=self.name, \n            data=self.files.data, \n            mapfile=self.files.mapfile,\n            workdir=self.dirs,\n            method=self.params.method,\n            guidetree=self.files.tree,\n            resolve_ambigs=self.params.resolve_ambigs,\n            save_invariants=self.params.save_invariants,\n            nboots=self.params.nboots, \n            nquartets=self.params.nquartets, \n            initarr=True, \n            quiet=True,\n            cli=self.kwargs.get(\"cli\")\n            )\n\n        ## retain the same ipcluster info\n        self._ipcluster = oldcluster", "response": "Remove any existing result files and reinitialize the tetrad object with its current parameters, retaining the stored ipcluster info."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef _store_N_samples(self, start, ipyclient, quiet=False):\n\n        breaks = 2\n        if self.params.nquartets < 5000:\n            breaks = 1\n        if self.params.nquartets > 100000:\n            breaks = 8\n        if self.params.nquartets > 500000:\n            breaks = 16\n        if self.params.nquartets > 5000000:\n            breaks = 32\n\n        ## chunk up the data\n        ncpus = len(ipyclient)    \n        self._chunksize = (self.params.nquartets // (breaks * ncpus) \\\n                        + (self.params.nquartets % (breaks * ncpus)))\n\n        ## create h5 OUT empty arrays\n        ## 'quartets' stores the inferred quartet relationship (1 x 4)\n        ## This can get huge, so we need to choose the dtype wisely. \n        ## the values are simply the index of the taxa, so uint16 is good.\n        with h5py.File(self.database.output, 'w') as io5:\n            io5.create_dataset(\"quartets\", \n                               (self.params.nquartets, 4), \n                               dtype=np.uint16, \n                               chunks=(self._chunksize, 4))\n            ## group for bootstrap invariant matrices ((16, 16), uint32)\n            ## these store the actual matrix counts. dtype uint32 can store\n            ## up to 4294967295. More than enough. uint16 max is 65535.\n            ## the 0 boot is the original seqarray.\n            io5.create_group(\"invariants\")\n\n        ## append to h5 IN array (which has the seqarray, bootsarr, maparr)\n        ## and fill it with all of the quartet sets we will ever sample.\n        ## the samplign method will vary depending on whether this is random, \n        ## all, or equal splits (in a separate but similar function). 
\n        with h5py.File(self.database.input, 'a') as io5:\n            try:\n                io5.create_dataset(\"quartets\", \n                               (self.params.nquartets, 4), \n                               dtype=np.uint16, \n                               chunks=(self._chunksize, 4),\n                               compression='gzip')\n            except RuntimeError:\n                raise IPyradWarningExit(\n                    \"database file already exists for this analysis, \"\n                  + \"you must run with the force flag to overwrite\")\n            \n        ## submit store job to write into self.database.input\n        if self.params.method == \"all\":\n            async = ipyclient[0].apply(store_all, self)\n        elif self.params.method == \"random\":\n            async = ipyclient[0].apply(store_random, self)\n        elif self.params.method == \"equal\":\n            async = ipyclient[0].apply(store_equal, self) \n\n        ## progress bar \n        printstr = \"generating q-sets | {} | \"\n        prog = 0        \n        while 1:\n            elapsed = datetime.timedelta(seconds=int(time.time()-start))\n            if not quiet:\n                if async.stdout:\n                    prog = int(async.stdout.strip().split()[-1])\n                progressbar(self.params.nquartets, prog,\n                            printstr.format(elapsed), spacer=\"\")\n            if not async.ready():\n                time.sleep(0.1)\n            else:\n                break\n\n        if not async.successful():\n            raise IPyradWarningExit(async.result())\n        if not quiet:\n            print(\"\")", "response": "Create the 'quartets' datasets in the output and input HDF5 databases, then fill the input database with the quartet sets to sample using the chosen method (all, random, or equal), showing a progress bar."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef _dump_qmc(self):\n\n        ## open the h5 database\n        with h5py.File(self.database.output, 'r') as io5:\n\n            ## create an output file for writing\n            self.files.qdump = os.path.join(self.dirs, self.name+\".quartets.txt\")\n            with open(self.files.qdump, 'w') as qdump:\n\n                ## pull from db\n                for idx in xrange(0, self.params.nquartets, self._chunksize):\n                    qchunk = io5[\"quartets\"][idx:idx+self._chunksize, :]\n                    quarts = [tuple(j) for j in qchunk if np.any(j)]\n\n                    ## shuffle and format for qmc\n                    np.random.shuffle(quarts)\n                    chunk = [\"{},{}|{},{}\".format(*i) for i in quarts]\n                    qdump.write(\"\\n\".join(chunk)+\"\\n\")", "response": "Dump inferred quartet sets from the HDF5 output database to a text file formatted for QMC."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nrun quartet max-cut (QMC) on the quartets qdump file and save the resulting tree to self.trees.tree, or append it to self.trees.boots for bootstrap replicates.", "response": "def _run_qmc(self, boot):\n        \"\"\"\n        Runs quartet max-cut QMC on the quartets qdump file.\n        \"\"\"\n\n        ## build command\n        self._tmp = os.path.join(self.dirs, \".tmptre\")\n        cmd = [ip.bins.qmc, \"qrtt=\"+self.files.qdump, \"otre=\"+self._tmp]\n\n        ## run it (stderr is merged into stdout, so res[1] is always None)\n        proc = subprocess.Popen(cmd, stderr=subprocess.STDOUT, stdout=subprocess.PIPE)\n        res = proc.communicate()\n        if proc.returncode:\n            raise IPyradWarningExit(res[0])\n\n        ## parse tmp file written by qmc into a tree and rename it\n        with open(self._tmp, 'r') as intree:\n            tre = ete3.Tree(intree.read().strip())\n            names = tre.get_leaves()\n            for name in names:\n                name.name = self.samples[int(name.name)]\n            tmptre = tre.write(format=9)\n\n        ## save the tree to file\n        if boot:\n            self.trees.boots = os.path.join(self.dirs, self.name+\".boots\")\n            with open(self.trees.boots, 'a') as outboot:\n                outboot.write(tmptre+\"\\n\")\n        else:\n            self.trees.tree = os.path.join(self.dirs, self.name+\".tree\")\n            with open(self.trees.tree, 'w') as outtree:\n                outtree.write(tmptre)\n\n        ## save the file\n        self._save()"}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\ncomputes sampling stats and consens trees.", "response": "def _compute_stats(self, start, ipyclient, quiet=False):\n        \"\"\"\n        Compute sampling stats and consens trees. \n        \"\"\"\n        \n        ## get name indices\n        names = self.samples\n\n        ## make a consensus from bootstrap reps.\n        if self.checkpoint.boots:\n            tre = ete3.Tree(self.trees.tree, format=0)\n            tre.unroot()\n            with open(self.trees.boots, 'r') as inboots:\n                bb = [ete3.Tree(i.strip(), format=0) for i in inboots.readlines()]\n                bb = [tre] + bb\n\n            ## calculate consensus supports\n            ctre, counts = consensus_tree(bb, names=names)\n            self.trees.cons = os.path.join(self.dirs, self.name+\".cons\")\n            with open(self.trees.cons, 'w') as ocons:\n                ocons.write(ctre.write(format=0))\n\n        else:\n            ctre = ete3.Tree(self.trees.tree, format=0)\n            ctre.unroot()\n\n        ## build stats file and write trees\n        self.trees.nhx = os.path.join(self.dirs, self.name+\".nhx\")\n        lbview = ipyclient.load_balanced_view()\n        qtots = {}\n        qsamp = {}\n        tots = sum(1 for i in ctre.iter_leaves())\n        totn = set(ctre.get_leaf_names())\n\n        ## iterate over node traversal\n        for node in ctre.traverse():\n            ## this is slow, needs to look at every sampled quartet\n            ## so we send it to be processed on engines\n            qtots[node] = lbview.apply(get_total, *(tots, node))\n            qsamp[node] = lbview.apply(get_sampled, *(self, totn, node))\n\n        ## wait for jobs to finish (+1 to lenjob is for final progress printer)\n        alljobs = qtots.values() + qsamp.values()\n        lenjobs = len(alljobs) + 1\n        printstr = \"calculating stats | {} | \"\n        done = 0\n        while 1:\n    
        if not quiet:\n                done = sum([i.ready() for i in alljobs])\n                elapsed = datetime.timedelta(seconds=int(time.time()-start))\n                progressbar(lenjobs, done, \n                    printstr.format(elapsed), spacer=\"\")\n            if (lenjobs - 1) == done:\n                break\n            else:\n                time.sleep(0.1)\n        ## store results in the tree object\n        for node in ctre.traverse():\n            total = qtots[node].result()\n            sampled = qsamp[node].result()\n            node.add_feature(\"quartets_total\", total)\n            node.add_feature(\"quartets_sampled\", sampled)\n        features = [\"quartets_total\", \"quartets_sampled\"]\n\n        ## update final progress\n        elapsed = datetime.timedelta(seconds=int(time.time()-start))        \n        progressbar(1, 1, printstr.format(elapsed), spacer=\"\")\n        if not quiet:\n            print(\"\")\n\n        ## write tree in NHX format \n        with open(self.trees.nhx, 'w') as outtre:\n            outtre.write(ctre.write(format=0, features=features))"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef _load(self, name, workdir, quiet=False):\n\n        ## load the JSON string and try with name+.json\n        path = os.path.join(workdir, name)\n        if not path.endswith(\".tet.json\"):\n            path += \".tet.json\"\n\n        ## expand user\n        path = path.replace(\"~\", os.path.expanduser(\"~\"))\n\n        ## load the json file as a dictionary\n        try:\n            with open(path, 'r') as infile:\n                fullj = _byteify(json.loads(infile.read(),\n                                object_hook=_byteify), \n                            ignore_dicts=True)\n        except IOError:\n            raise IPyradWarningExit(\"\"\"\\\n        Cannot find checkpoint (.tet.json) file at: {}\"\"\".format(path))\n\n        ## set old attributes into new tetrad object\n        self.name = fullj[\"name\"]\n        self.files.data = fullj[\"files\"][\"data\"]\n        self.files.mapfile = fullj[\"files\"][\"mapfile\"]        \n        self.dirs = fullj[\"dirs\"]\n        self._init_seqarray(quiet=quiet)\n        self._parse_names()\n\n        ## fill in the same attributes\n        for key in fullj:\n            ## fill Params a little different\n            if key in [\"files\", \"params\", \"database\", \n                       \"trees\", \"stats\", \"checkpoint\"]:\n                filler = fullj[key]\n                for ikey in filler:\n                    self.__dict__[key].__setattr__(ikey, fullj[key][ikey])\n            else:\n                self.__setattr__(key, fullj[key])", "response": "Load a tetrad object from a checkpoint."}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\nsends slices of quartet sets to parallel engines for computing, enters results into output database, sends finished quartet sets to QMC for tree inference, and prints progress bars.", "response": "def _inference(self, start, ipyclient, quiet):\n        \"\"\"\n        Sends slices of quartet sets to parallel engines for computing, \n        enters results into output database, sends finished quartet sets\n        to QMC for tree inference, and prints progress bars.\n        \"\"\"\n\n        ## load-balancer for single-threaded execution jobs\n        lbview = ipyclient.load_balanced_view()\n\n        ## an iterator that grabs quartet chunk start positions\n        jobs = range(self.checkpoint.arr, self.params.nquartets, self._chunksize)\n\n        ## if this is a bootstrap then init a new boot array in the database\n        ## max val is 65535 in here if uint16\n        bootkey = \"boot{}\".format(self.checkpoint.boots)\n        with h5py.File(self.database.output, 'r+') as io5:\n            if bootkey not in io5[\"invariants\"].keys():\n                io5[\"invariants\"].create_dataset(\n                    bootkey, \n                    (self.params.nquartets, 16, 16),\n                    dtype=np.uint16,\n                    chunks=(self._chunksize, 16, 16))\n\n        ## start progress bar if new or skip if bootstrapping\n        elapsed = datetime.timedelta(seconds=int(time.time()-start))\n        if self.checkpoint.boots:\n            printstr = \"bootstrap trees   | {} | \"\n        else:\n            printstr = \"initial tree      | {} | \"\n            if not quiet:\n                progressbar(1, 0, printstr.format(elapsed), spacer=\"\")\n\n        ## submit jobs distriuted across the cluster.\n        asyncs = {}\n        for job in jobs:\n            asyncs[job] = lbview.apply(nworker, *(self, job))\n\n        ## wait for jobs to finish, catch results as they 
return and\n        ## enter them into the HDF5 database to keep memory low.\n        done = 0\n        while 1:\n            ## gather finished jobs\n            finished = [i for i,j in asyncs.iteritems() if j.ready()]\n\n            ## iterate over finished list\n            for key in finished:\n                async = asyncs[key]\n                if async.successful():\n                    ## store result\n                    done += 1\n                    results = async.result()\n                    self._insert_to_array(key, results)\n                    ## purge from memory\n                    del asyncs[key]\n                else:\n                    raise IPyradWarningExit(async.result())\n\n            ## progress bar is different if first vs boot tree\n            elapsed = datetime.timedelta(seconds=int(time.time()-start))\n            if not self.checkpoint.boots:\n                if not quiet:\n                    progressbar(len(jobs), done, printstr.format(elapsed), spacer=\"\")\n            else:\n                if not quiet:\n                    progressbar(self.params.nboots, self.checkpoint.boots,\n                                printstr.format(elapsed), spacer=\"\")\n\n            ## done is counted on finish, so this means we're done\n            if len(asyncs) == 0:\n                break\n            else:\n                time.sleep(0.1)\n\n        ## dump quartets into a text file for QMC\n        self._dump_qmc()\n\n        ## send to QMC\n        if not self.checkpoint.boots:\n            self._run_qmc(0)\n        else:\n            self._run_qmc(1)\n\n        ## reset the checkpoint arr\n        self.checkpoint.arr = 0\n\n        ## print spacer if finished first tree or last boot.\n        if (not self.checkpoint.boots) and (not quiet):\n            print(\"\")\n\n        elif (self.checkpoint.boots == self.params.nboots) and (not quiet):\n            print(\"\")"}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\ninsert the result array into the HDF5 database.", "response": "def _insert_to_array(self, chunk, results):\n        \"\"\"\n        Enters results arrays into the HDF5 database.\n        \"\"\"\n\n        ## two result arrs\n        chunksize = self._chunksize\n        qrts, invs = results\n\n        ## enter into db\n        with h5py.File(self.database.output, 'r+') as io5:\n            io5['quartets'][chunk:chunk+chunksize] = qrts\n\n            ## entered as 0-indexed !\n            if self.params.save_invariants:\n                if self.checkpoint.boots:\n                    key = \"invariants/boot{}\".format(self.checkpoint.boots)\n                    io5[key][chunk:chunk+chunksize] = invs\n                else:\n                    io5[\"invariants/boot0\"][chunk:chunk+chunksize] = invs"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef run(self, force=False, quiet=False, ipyclient=None):\n\n        ## force overwrite needs to clear out the HDF5 database\n        if force:\n            self._refresh()\n\n        ## print nquartet statement\n        if not quiet:\n            print(\"inferring {} quartet tree sets\".format(self.params.nquartets))\n\n        ## wrap the run in a try statement to ensure we properly shutdown\n        ## and cleanup on exit or interrupt. \n        inst = None\n        try:\n            ## find and connect to an ipcluster instance given the information\n            ## in the _ipcluster dictionary if a connected client was not given.\n            if not ipyclient:\n                args = self._ipcluster.items() + [(\"spacer\", \"\")]\n                ipyclient = ip.core.parallel.get_client(**dict(args))\n\n            ## print the cluster connection information\n            if not quiet:\n                ip.cluster_info(ipyclient)\n\n            ## store ipyclient engine pids to the dict so we can \n            ## hard-interrupt them later if assembly is interrupted. 
\n            ## Only stores pids of engines that aren't busy at this moment, \n            ## otherwise it would block here while waiting to find their pids.\n            self._ipcluster[\"pids\"] = {}\n            for eid in ipyclient.ids:\n                engine = ipyclient[eid]\n                if not engine.outstanding:\n                    pid = engine.apply(os.getpid).get()\n                    self._ipcluster[\"pids\"][eid] = pid            \n\n            ## fill the input array with quartets to sample --------------------\n            start = time.time()\n            if (not self.checkpoint.boots) and (not self.trees.tree):\n                self._store_N_samples(start, ipyclient, quiet=quiet)\n\n            ## calculate invariants for the full seqarray ----------------------\n            start = time.time()            \n            if not self.trees.tree:\n                self._inference(start, ipyclient, quiet=quiet)\n            else:\n                if not quiet:\n                    print(\"initial tree already inferred\")\n\n            ## calculate invariants for each bootstrap rep ----------------------\n            start = time.time()\n            if self.params.nboots:\n                if self.checkpoint.boots: #<= self.params.nboots:\n                    if not quiet:\n                        print(\"{} bootstrap trees already inferred\"\\\n                              .format(self.checkpoint.boots))\n\n                while self.checkpoint.boots < self.params.nboots:\n                    ## resample the bootseq array\n                    if self.files.mapfile:\n                        self._sample_bootseq_array_map()\n                    else:\n                        self._sample_bootseq_array()\n\n                    ## start boot inference \n                    self.checkpoint.boots += 1\n                    self._inference(start, ipyclient, quiet=quiet)\n\n            ## write output stats -----------------------------------------------\n 
           #self.files.stats = os.path.join(self.dirs, self.name+\"_stats.txt\")\n            start = time.time()\n            self._compute_stats(start, ipyclient, quiet=quiet)\n\n        ## handle exceptions so they will be raised after we clean up below\n        except KeyboardInterrupt as inst:\n            print(\"\\nKeyboard Interrupt by user. Cleaning up...\")\n\n        except IPyradWarningExit as inst:\n            print(\"\\nError encountered: {}\".format(inst))\n\n        except Exception as inst:\n            print(\"\\nUnknown exception encountered: {}\".format(inst))\n\n        ## close client when done or interrupted\n        finally:\n            try:\n                ## save the Assembly\n                self._save()                \n                \n                ## can't close client if it was never open\n                if ipyclient:\n\n                    ## send SIGINT (2) to all engines\n                    ipyclient.abort()\n                    time.sleep(1)\n                    for engine_id, pid in self._ipcluster[\"pids\"].items():\n                        if ipyclient.queue_status()[engine_id][\"tasks\"]:\n                            os.kill(pid, 2)\n                        time.sleep(0.25)\n                    \n                    ## if CLI, stop jobs and shutdown\n                    if 'ipyrad-cli' in self._ipcluster[\"cluster_id\"]:\n                        ipyclient.shutdown(hub=True, block=False)\n                        ipyclient.close()\n                    else:\n                        if not ipyclient.outstanding:\n                            ipyclient.purge_everything()\n                        else:\n                            ## nanny: kill everything, something bad happened\n                            ipyclient.shutdown(hub=True, block=False)\n                            ipyclient.close()\n                            print(\"\\nwarning: ipcluster shutdown and must be restarted\")\n                \n                ## 
reraise the error now that we're cleaned up\n                #if inst:\n                #    raise inst\n\n            ## if exception during shutdown then we really screwed up\n            except Exception as inst2:\n                print(\"warning: error during shutdown:\\n{}\".format(inst2))", "response": "Run the nquartets analysis."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef start_ipcluster(data):\n\n    ## if MPI argument then use --ip arg to view all sockets\n    iparg = \"\"\n    if \"MPI\" in data._ipcluster[\"engines\"]:\n        iparg = \"--ip=*\"\n\n    ## make ipcluster arg call\n    standard = \"\"\"\n        ipcluster start \n                  --daemonize \n                  --cluster-id={}\n                  --engines={} \n                  --profile={}\n                  --n={}\n                  {}\"\"\"\\\n        .format(data._ipcluster[\"cluster_id\"], \n                data._ipcluster[\"engines\"], \n                data._ipcluster[\"profile\"],\n                data._ipcluster[\"cores\"],\n                iparg)\n                   \n    ## wrap ipcluster start\n    try: \n        LOGGER.info(shlex.split(standard))\n        subprocess.check_call(shlex.split(standard), \n            stderr=subprocess.STDOUT,\n            stdout=subprocess.PIPE)\n\n    except subprocess.CalledProcessError as inst:\n        LOGGER.debug(\"  ipcontroller already running.\")\n        raise\n\n    except Exception as inst:\n        sys.exit(\"  Error launching ipcluster for parallelization:\\n({})\\n\".\\\n                 format(inst))", "response": "Start the ipcluster for parallelization."}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\nregisters a new cluster with ipyrad.", "response": "def register_ipcluster(data):\n    \"\"\"\n    The name is a unique id that keeps this __init__ of ipyrad distinct\n    from interfering with other ipcontrollers. Run statements are wrapped\n    so that ipcluster will be killed on exit.\n    \"\"\"\n    ## check if this pid already has a running cluster\n    data._ipcluster[\"cluster_id\"] = \"ipyrad-cli-\"+str(os.getpid())\n    start_ipcluster(data)\n    return data"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\ncreate a client to view ipcluster engines for a given profile and returns it with at least one engine spun up and ready to go.", "response": "def get_client(cluster_id, profile, engines, timeout, cores, quiet, spacer, **kwargs):\n    \"\"\" \n    Creates a client to view ipcluster engines for a given profile and \n    returns it with at least one engine spun up and ready to go. If no \n    engines are found after nwait amount of time then an error is raised.\n    If engines==MPI it waits a bit longer to find engines. If the number\n    of engines is set then it waits even longer to try to find that number\n    of engines.\n    \"\"\"\n\n    ## save stds for later, we're gonna hide them to prevent external printing \n    save_stdout = sys.stdout \n    save_stderr = sys.stderr\n    sys.stdout = cStringIO.StringIO()\n    sys.stderr = cStringIO.StringIO()\n\n    ## get cluster_info print string\n    connection_string = \"{}establishing parallel connection:\".format(spacer)\n\n    ## wrapped search for ipcluster\n    try: \n        ## are we looking for a running ipcluster instance?\n        if profile not in [None, \"default\"]:\n            args = {'profile': profile, \"timeout\": timeout}\n        else:\n            clusterargs = [cluster_id, profile, timeout]\n            argnames = [\"cluster_id\", \"profile\", \"timeout\"]\n            args = {key:value for key, value in zip(argnames, clusterargs)}\n\n        ## get connection within timeout window of wait time and hide messages\n        ipyclient = ipp.Client(**args)\n        sys.stdout = save_stdout\n        sys.stderr = save_stderr\n\n        ## check that all engines have connected            \n        if (engines == \"MPI\") or (\"ipyrad-cli-\" in cluster_id):\n            if not quiet:\n                print(connection_string)\n\n        for _ in range(6000):\n            initid = len(ipyclient)\n            time.sleep(0.01)\n            
## If MPI then wait for all engines to start so we can report\n            ## how many cores are on each host. If Local then only wait for\n            ## one engine to be ready and then just go.\n            if (engines == \"MPI\") or (\"ipyrad-cli-\" in cluster_id):\n                ## wait for cores to be connected\n                if cores:\n                    time.sleep(0.1)\n                    if initid == cores:\n                        break\n                if initid:\n                    time.sleep(3)\n                    if len(ipyclient) == initid:\n                        break\n            else:\n                if cores:\n                    if initid == cores:\n                        break\n                else:\n                    if initid:\n                        break\n\n\n    except KeyboardInterrupt as inst:\n        ## ensure stdout is reset even if Exception was raised            \n        sys.stdout = save_stdout\n        sys.stderr = save_stderr\n        raise inst\n\n    ## This is raised if ipcluster is not running ------------\n    except IOError as inst:\n        ## ensure stdout is reset even if Exception was raised\n        sys.stdout = save_stdout\n        sys.stderr = save_stderr\n        if \"ipyrad-cli-\" in cluster_id:\n            raise IPyradWarningExit(NO_IPCLUSTER_CLI)\n        else:\n            raise IPyradWarningExit(NO_IPCLUSTER_API)\n\n    except (ipp.TimeoutError, ipp.NoEnginesRegistered) as inst:\n        ## raised by ipp if no connection file is found for 'nwait' seconds\n        sys.stdout = save_stdout\n        sys.stderr = save_stderr\n        raise inst\n\n    except Exception as inst:\n        ## if any other exceptions were missed...\n        sys.stdout = save_stdout\n        sys.stderr = save_stderr\n        raise inst\n\n    finally:\n        ## ensure that no matter what we reset the stds\n        sys.stdout = save_stdout\n        sys.stderr = save_stderr\n\n    return ipyclient"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef memoize(func):\n    class Memodict(dict):\n        \"\"\" just a dict\"\"\"\n        def __getitem__(self, *key):\n            return dict.__getitem__(self, key)\n\n        def __missing__(self, key):\n            \"\"\" this makes it faster \"\"\"\n            ret = self[key] = func(*key)\n            return ret\n\n    return Memodict().__getitem__", "response": "Memoization decorator for a function taking one or more arguments."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef ambigcutters(seq):\n    resos = []\n    if any([i in list(\"RKSYWM\") for i in seq]):\n        for base in list(\"RKSYWM\"):\n            if base in seq:\n                resos.append(seq.replace(base, AMBIGS[base][0]))\n                resos.append(seq.replace(base, AMBIGS[base][1]))\n        return resos\n    else:\n        return [seq, \"\"]", "response": "Returns both resolutions of a cut site that contains an ambiguous base;\n    otherwise returns the single cut site followed by an empty string."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef splitalleles(consensus):\n\n    ## store two alleles, allele1 will start with bigbase\n    allele1 = list(consensus)\n    allele2 = list(consensus)\n    hidx = [i for (i, j) in enumerate(consensus) if j in \"RKSWYMrkswym\"]\n\n    ## do remaining h sites\n    for idx in hidx:\n        hsite = consensus[idx]\n        if hsite.isupper():\n            allele1[idx] = PRIORITY[hsite]\n            allele2[idx] = MINOR[hsite]\n        else:\n            allele1[idx] = MINOR[hsite.upper()]\n            allele2[idx] = PRIORITY[hsite.upper()]\n\n    ## convert back to strings\n    allele1 = \"\".join(allele1)\n    allele2 = \"\".join(allele2)\n\n    return allele1, allele2", "response": "takes diploid consensus alleles with phase data stored as a mixture\n    of upper and lower case characters and splits it into two alleles"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nreturn a seq with complement. Preserves little n s for splitters.", "response": "def comp(seq):\n    \"\"\" returns a seq with complement. Preserves little n's for splitters.\"\"\"\n    ## makes base to its small complement then makes upper\n    return seq.replace(\"A\", 't')\\\n              .replace('T', 'a')\\\n              .replace('C', 'g')\\\n              .replace('G', 'c')\\\n              .replace('n', 'Z')\\\n              .upper()\\\n              .replace(\"Z\", \"n\")"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef fullcomp(seq):\n    ## this is surely not the most efficient...\n    seq = seq.replace(\"A\", 'u')\\\n             .replace('T', 'v')\\\n             .replace('C', 'p')\\\n             .replace('G', 'z')\\\n             .replace('u', 'T')\\\n             .replace('v', 'A')\\\n             .replace('p', 'G')\\\n             .replace('z', 'C')\n\n    ## No complement for S & W b/c complements are S & W, respectively\n    seq = seq.replace('R', 'u')\\\n             .replace('K', 'v')\\\n             .replace('Y', 'b')\\\n             .replace('M', 'o')\\\n             .replace('u', 'Y')\\\n             .replace('v', 'M')\\\n             .replace('b', 'R')\\\n             .replace('o', 'K')\n\n    seq = seq.replace('r', 'u')\\\n             .replace('k', 'v')\\\n             .replace('y', 'b')\\\n             .replace('m', 'o')\\\n             .replace('u', 'y')\\\n             .replace('v', 'm')\\\n             .replace('b', 'r')\\\n             .replace('o', 'k')\n    return seq", "response": "returns complement of sequence including ambiguity characters and saves lower case info for multiple hetero sequences"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef fastq_touchup_for_vsearch_merge(read, outfile, reverse=False):\n    \n    counts = 0\n    with open(outfile, 'w') as out:\n        ## read in paired end read files 4 lines at a time\n        if read.endswith(\".gz\"):\n            fr1 = gzip.open(read, 'rb')\n        else:\n            fr1 = open(read, 'rb')\n        quarts = itertools.izip(*[iter(fr1)]*4)\n\n        ## a list to store until writing\n        writing = []\n\n        while 1:\n            try:\n                lines = quarts.next()\n            except StopIteration:\n                break\n            if reverse:\n                seq = lines[1].strip()[::-1]\n            else:\n                seq = lines[1].strip()\n            writing.append(\"\".join([\n                lines[0],\n                seq+\"\\n\",\n                lines[2],\n                \"B\"*len(seq)\n            ]))\n\n            ## write to disk\n            counts += 1\n            if not counts % 1000:\n                out.write(\"\\n\".join(writing)+\"\\n\")\n                writing = []\n        if writing:\n            out.write(\"\\n\".join(writing))\n            \n    out.close()\n    fr1.close()", "response": "Rewrites a fastq file for vsearch merging: sets every quality score to 'B' and optionally reverses each read sequence."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef merge_pairs_after_refmapping(data, two_files, merged_out):\n\n    ## create temp files \n    nonmerged1 = tempfile.NamedTemporaryFile(\n        mode='wb',\n        dir=data.dirs.edits,\n        suffix=\"_nonmerged_R1_.fastq\").name\n    nonmerged2 = tempfile.NamedTemporaryFile(\n        mode='wb',\n        dir=data.dirs.edits,\n        suffix=\"_nonmerged_R2_.fastq\").name \n\n    ## get the maxn and minlen values\n    minlen = str(max(32, data.paramsdict[\"filter_min_trim_len\"]))\n    try:\n        maxn = sum(data.paramsdict['max_low_qual_bases'])\n    except TypeError:\n        maxn = data.paramsdict['max_low_qual_bases']\n \n    ## set the quality scores abritrarily high and orient R2 correctly\n    tmp1 = two_files[0][0]\n    tmp2 = two_files[0][1]\n    fastq_touchup_for_vsearch_merge(tmp1, tmp1+\".tu\", False)\n    fastq_touchup_for_vsearch_merge(tmp2, tmp2+\".tu\", True)\n\n    ## command string to call vsearch\n    cmd = [ipyrad.bins.vsearch,\n           \"--fastq_mergepairs\", tmp1+\".tu\",\n           \"--reverse\", tmp2+\".tu\",\n           \"--fastqout\", merged_out,\n           \"--fastqout_notmerged_fwd\", nonmerged1,\n           \"--fastqout_notmerged_rev\", nonmerged2,\n           \"--fasta_width\", \"0\",\n           \"--fastq_minmergelen\", minlen,\n           \"--fastq_maxns\", str(maxn),\n           \"--fastq_minovlen\", \"10\",\n           \"--fastq_maxdiffs\", \"4\",\n           \"--label_suffix\", \"_m1\",\n           \"--fastq_qmax\", \"1000\",\n           \"--threads\", \"2\",\n           \"--fastq_allowmergestagger\"]\n    \n    ## run vsearch but allow kbd\n    proc = sps.Popen(cmd, stderr=sps.STDOUT, stdout=sps.PIPE)\n    try:\n        res = proc.communicate()[0]\n    except KeyboardInterrupt:\n        proc.kill()\n\n    ## cleanup tmp files if job failed or stopped    \n    if proc.returncode:\n        LOGGER.error(\"Error: 
%s %s\", cmd, res)\n        raise IPyradWarningExit(\"Error merge pairs:\\n %s\\n%s\", cmd, res)\n\n    ## record how many read pairs were merged\n    with open(merged_out, 'r') as tmpf:\n        nmerged = sum(1 for i in tmpf.readlines()) // 4\n\n    ## Concat unmerged pairs with a 'nnnn' separator\n    with open(merged_out, 'ab') as combout:\n        ## read in paired end read files 4 lines at a time\n        fr1 = open(nonmerged1, 'rb')\n        quart1 = itertools.izip(*[iter(fr1)]*4)\n        fr2 = open(nonmerged2, 'rb')\n        quart2 = itertools.izip(*[iter(fr2)]*4)\n        quarts = itertools.izip(quart1, quart2)\n\n        ## a list to store until writing\n        writing = []\n        counts = 0\n\n        ## iterate until done\n        while 1:\n            try:\n                read1s, read2s = quarts.next()\n            except StopIteration:\n                break\n\n            ## store the read\n            writing.append(\"\".join([\n                read1s[0],\n                read1s[1].strip() + \"nnnn\" + \\\n                read2s[1], \n                read1s[2],\n                read1s[3].strip() + \"nnnn\" + \\\n                read2s[3], \n                ]))\n\n            ## count up until time to write\n            counts += 1\n            if not counts % 10:\n                combout.write(\"\".join(writing))\n                writing = []\n        ## write the remaining\n        if writing:\n            combout.write(\"\".join(writing))\n\n        ## close handles\n        fr1.close()\n        fr2.close()\n        combout.close()\n\n    ## remove temp files (or do this later)\n    rmfiles = [nonmerged1, nonmerged2, tmp1, tmp2, tmp1+\".tu\", tmp2+\".tu\"]\n    for rmfile in rmfiles:\n        if os.path.exists(rmfile):\n            os.remove(rmfile)\n\n    return nmerged", "response": "A function to merge fastq files produced by bam2fq."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef merge_after_pysam(data, clust):\n    try:\n        r1file = tempfile.NamedTemporaryFile(mode='wb', delete=False,\n                                            dir=data.dirs.edits,\n                                            suffix=\"_R1_.fastq\")\n        r2file = tempfile.NamedTemporaryFile(mode='wb', delete=False,\n                                            dir=data.dirs.edits,\n                                            suffix=\"_R2_.fastq\")\n\n        r1dat = []\n        r2dat = []\n        for locus in clust:\n            sname, seq = locus.split(\"\\n\")\n            ## Have to munge the sname to make it look like fastq format\n            sname = \"@\" + sname[1:]\n            r1, r2 = seq.split(\"nnnn\")\n            r1dat.append(\"{}\\n{}\\n{}\\n{}\".format(sname, r1, \"+\", \"B\"*(len(r1))))\n            r2dat.append(\"{}\\n{}\\n{}\\n{}\".format(sname, r2, \"+\", \"B\"*(len(r2))))\n\n        r1file.write(\"\\n\".join(r1dat))\n        r2file.write(\"\\n\".join(r2dat))\n        r1file.close()\n        r2file.close()\n\n        ## Read in the merged data and format to return as a clust\n        merged_file = tempfile.NamedTemporaryFile(mode='wb',\n                                            dir=data.dirs.edits,\n                                            suffix=\"_merged.fastq\").name\n\n        clust = []\n        merge_pairs(data, [(r1file.name, r2file.name)], merged_file, 0, 1)\n\n        with open(merged_file) as infile:\n            quarts = itertools.izip(*[iter(infile)]*4)\n\n            while 1:\n                try:\n                    sname, seq, _, _ = quarts.next()\n                    ## Vsearch expects R2 oriented how it would be in a raw data file\n                    ## i.e. 
revcomp, and that's also how it returns it\n                    ## but we want to maintain the genomic orientation so R1 and\n                    ## R2 are both on the + strand and both in ascending positional order\n                    ## so here if the reads don't merge we have to revcomp R2\n                    if not \"_m1\" in sname.rsplit(\";\", 1)[1]:\n                        try:\n                            R1, R2 = seq.split(\"nnnn\")\n                            seq = R1 + \"nnnn\" + revcomp(R2)\n                        except ValueError as inst:\n                            LOGGER.error(\"Failed merge_after_pysam: {} {}\".format(sname, seq))\n                            raise\n\n                except StopIteration:\n                    break\n                ## put sname back\n                sname = \">\" + sname[1:]\n                clust.extend([sname.strip(), seq.strip()])\n    except:\n        LOGGER.info(\"Error in merge_pairs post-refmap.\")\n        raise\n    finally:\n        for i in [r1file.name, r2file.name, merged_file]:\n            if os.path.exists(i):\n                log_level = logging.getLevelName(LOGGER.getEffectiveLevel())\n                ## uncomment this to shit ALL over the filesystem. \n                ## if not log_level == \"DEBUG\":\n                os.remove(i)\n\n    return clust", "response": "Merges the paired reads of a cluster with vsearch after reference mapping and returns the reformatted cluster."}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nmerges PE reads. Takes in a list of unmerged files [r1, r2] and the filehandle to write merged data to, and it returns the number of reads that were merged (overlapping). If merge==0 then only concat pairs (nnnn), no merging in vsearch. Parameters ----------- two_files (tuple): A list or tuple of the [r1, r2] files to be merged. merged_out (str): A string file handle for the merged data to be written to. revcomp (bool): Whether or not to revcomp the R2s. merge (bool): Whether or not to perform vsearch merging. If not then reads are simply concatenated with a 'nnnn' separator. Returns -------- If merge is on then the func will return the number of pairs successfully merged, else it returns -1.", "response": "def merge_pairs(data, two_files, merged_out, revcomp, merge):\n    \"\"\"\n    Merge PE reads. Takes in a list of unmerged files [r1, r2] and the\n    filehandle to write merged data to, and it returns the number of reads\n    that were merged (overlapping). If merge==0 then only concat pairs (nnnn),\n    no merging in vsearch.\n\n    Parameters\n    -----------\n    two_files (tuple):\n        A list or tuple of the [r1, r2] files to be merged. \n\n    merged_out (str):\n        A string file handle for the merged data to be written to. \n\n    revcomp (bool):\n        Whether or not to revcomp the R2s. \n\n    merge (bool):\n        Whether or not to perform vsearch merging. If not then reads are simply\n        concatenated with a 'nnnn' separator. \n\n    Returns\n    --------\n    If merge is on then the func will return the number of pairs \n    successfully merged, else it returns -1. 
\n    \"\"\"\n\n    LOGGER.debug(\"Entering merge_pairs()\")\n\n    ## Return the number of merged pairs\n    nmerged = -1\n\n    ## Check input files from inside list-tuple [(r1, r2)]\n    for fhandle in two_files[0]:\n        if not os.path.exists(fhandle):\n            raise IPyradWarningExit(\"\"\"\n    Attempting to merge a file that doesn't exist - {}\"\"\".format(fhandle))\n\n    ## If it already exists, clean up the old merged file\n    if os.path.exists(merged_out):\n        os.remove(merged_out)\n\n    ## if merge then catch nonmerged in a separate file\n    if merge:\n        nonmerged1 = tempfile.NamedTemporaryFile(mode='wb',\n                                            dir=data.dirs.edits,\n                                            suffix=\"_nonmerged_R1_.fastq\").name\n        nonmerged2 = tempfile.NamedTemporaryFile(mode='wb',\n                                            dir=data.dirs.edits,\n                                            suffix=\"_nonmerged_R2_.fastq\").name\n\n    ## if not merging then the nonmerged reads will come from the normal edits\n    else:\n        nonmerged1 = two_files[0][0]\n        nonmerged2 = two_files[0][1]\n\n    ## get the maxn and minlen values\n    try:\n        maxn = sum(data.paramsdict['max_low_qual_bases'])\n    except TypeError:\n        maxn = data.paramsdict['max_low_qual_bases']\n    minlen = str(max(32, data.paramsdict[\"filter_min_trim_len\"]))\n\n    ## we need to gunzip the files if they are zipped (at least for now)\n    if merge and two_files[0][0].endswith(\".gz\"):\n        LOGGER.info(\"gunzipping pairs\")\n        tmp1 = os.path.splitext(two_files[0][0])[0]+\".tmp1\"\n        tmp2 = os.path.splitext(two_files[0][1])[0]+\".tmp2\"\n\n        out1 = open(tmp1, 'w')\n        out2 = open(tmp2, 'w')\n        gun1 = sps.Popen([\"gunzip\", \"-c\", two_files[0][0]],\n                          stderr=sps.STDOUT, stdout=out1, close_fds=True)\n        gun2 = sps.Popen([\"gunzip\", \"-c\", 
two_files[0][1]],\n                          stderr=sps.STDOUT, stdout=out2, close_fds=True)\n        _ = gun1.communicate()\n        _ = gun2.communicate()\n        out1.close()\n        out2.close()\n    else:\n        tmp1 = two_files[0][0]\n        tmp2 = two_files[0][1]\n    try:\n        ## If we are actually mergeing and not just joining then do vsearch\n        if merge:\n            ## create tmp files with high quality scores and with R2 oriented\n            cmd = [ipyrad.bins.vsearch,\n                   \"--fastq_mergepairs\", tmp1,\n                   \"--reverse\", tmp2,\n                   \"--fastqout\", merged_out,\n                   \"--fastqout_notmerged_fwd\", nonmerged1,\n                   \"--fastqout_notmerged_rev\", nonmerged2,\n                   \"--fasta_width\", \"0\",\n                   \"--fastq_minmergelen\", minlen,\n                   \"--fastq_maxns\", str(maxn),\n                   \"--fastq_minovlen\", \"20\",\n                   \"--fastq_maxdiffs\", \"4\",\n                   \"--label_suffix\", \"_m1\",\n                   \"--fastq_qmax\", \"1000\",\n                   \"--threads\", \"2\",\n                   \"--fastq_allowmergestagger\"]\n\n            LOGGER.debug(\"merge cmd: %s\", \" \".join(cmd))\n            proc = sps.Popen(cmd, stderr=sps.STDOUT, stdout=sps.PIPE)\n            try:\n                res = proc.communicate()[0]\n            except KeyboardInterrupt:\n                proc.kill()\n\n            if proc.returncode:\n                LOGGER.error(\"Error: %s %s\", cmd, res)\n                ## remove temp files\n                rmfiles = [os.path.splitext(two_files[0][0])[0]+\".tmp1\",\n                           os.path.splitext(two_files[0][1])[0]+\".tmp2\",\n                           nonmerged1, nonmerged2]\n                for rmfile in rmfiles:\n                    if os.path.exists(rmfile):\n                        os.remove(rmfile)\n                raise IPyradWarningExit(\"Error merge pairs:\\n 
%s\\n%s\", cmd, res)\n\n            ## record how many read pairs were merged\n            with open(merged_out, 'r') as tmpf:\n                #nmerged = len(tmpf.readlines()) // 4\n                nmerged = sum(1 for i in tmpf.readlines()) // 4\n\n        ## Combine the unmerged pairs and append to the merge file\n        with open(merged_out, 'ab') as combout:\n            ## read in paired end read files 4 lines at a time\n            if nonmerged1.endswith(\".gz\"):\n                fr1 = gzip.open(nonmerged1, 'rb')\n            else:\n                fr1 = open(nonmerged1, 'rb')\n            quart1 = itertools.izip(*[iter(fr1)]*4)\n            if nonmerged2.endswith(\".gz\"):\n                fr2 = gzip.open(nonmerged2, 'rb')\n            else:\n                fr2 = open(nonmerged2, 'rb')\n            quart2 = itertools.izip(*[iter(fr2)]*4)\n            quarts = itertools.izip(quart1, quart2)\n\n            ## a list to store until writing\n            writing = []\n            counts = 0\n\n            ## iterate until done\n            while 1:\n                try:\n                    read1s, read2s = quarts.next()\n                except StopIteration:\n                    break\n                if revcomp:\n                    writing.append(\"\".join([\n                        read1s[0],\n                        read1s[1].strip() + \"nnnn\" + \\\n                        comp(read2s[1].strip()[::-1]) + \"\\n\",\n                        read1s[2],\n                        read1s[3].strip() + \"nnnn\" + \\\n                        read2s[3].strip()[::-1] + \"\\n\",\n                        ]))\n                else:\n                    writing.append(\"\".join([\n                        read1s[0],\n                        read1s[1].strip() + \"nnnn\" + \\\n                        read2s[1],\n                        read1s[2],\n                        read1s[3].strip() + \"nnnn\" + \\\n                        read2s[3],\n                        ]))\n\n   
             counts += 1\n                if not counts % 10:\n                    combout.write(\"\".join(writing)) #+\"\\n\")\n                    writing = []\n\n            if writing:\n                combout.write(\"\".join(writing))\n\n        ## close handles\n        fr1.close()\n        fr2.close()\n        combout.close()\n\n    except Exception as inst:\n        LOGGER.error(\"Exception in merge_pairs - {}\".format(inst))\n        raise\n    ## No matter what happens please clean up the temp files.\n    finally:\n        ## if merged then delete the nonmerge tmp files\n        if merge:\n            ## remove temp files\n            rmfiles = [nonmerged1, nonmerged2,\n                       os.path.splitext(two_files[0][0])[0]+\".tmp1\",\n                       os.path.splitext(two_files[0][1])[0]+\".tmp2\"]\n            for rmfile in rmfiles:\n                if os.path.exists(rmfile):\n                    os.remove(rmfile)\n\n    return nmerged"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef revcomp(sequence):\n    \"returns reverse complement of a string\"\n    sequence = sequence[::-1].strip()\\\n                             .replace(\"A\", \"t\")\\\n                             .replace(\"T\", \"a\")\\\n                             .replace(\"C\", \"g\")\\\n                             .replace(\"G\", \"c\").upper()\n    return sequence", "response": "returns reverse complement of a string"}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nreturn optim clusters given iterators and whether it got all or not", "response": "def clustdealer(pairdealer, optim):\n    \"\"\" return optim clusters given iterators, and whether it got all or not\"\"\"\n    ccnt = 0\n    chunk = []\n    while ccnt < optim:\n        ## try refreshing taker, else quit\n        try:\n            taker = itertools.takewhile(lambda x: x[0] != \"//\\n\", pairdealer)\n            oneclust = [\"\".join(next(taker))]\n        except StopIteration:\n            #LOGGER.debug('last chunk %s', chunk)\n            return 1, chunk\n\n        ## load one cluster\n        while 1:\n            try:\n                oneclust.append(\"\".join(next(taker)))\n            except StopIteration:\n                break\n        chunk.append(\"\".join(oneclust))\n        ccnt += 1\n    return 0, chunk"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\nprinting a progress bar", "response": "def progressbar(njobs, finished, msg=\"\", spacer=\"  \"):\n    \"\"\" prints a progress bar \"\"\"\n    if njobs:\n        progress = 100*(finished / float(njobs))\n    else:\n        progress = 100\n        \n    hashes = '#'*int(progress/5.)\n    nohash = ' '*int(20-len(hashes))\n    if not ipyrad.__interactive__:\n        msg = msg.rsplit(\"|\", 2)[0]\n\n    args = [spacer, hashes+nohash, int(progress), msg]\n    print(\"\\r{}[{}] {:>3}% {} \".format(*args), end=\"\")\n    sys.stdout.flush()"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef get_threaded_view(ipyclient, split=True):\n    ## engine ids\n    ## e.g., [0, 1, 2, 3, 4, 5, 6, 7, 8]\n    eids = ipyclient.ids\n\n    ## get host names\n    ## e.g., ['a', 'a', 'b', 'b', 'a', 'c', 'c', 'c', 'c']\n    dview = ipyclient.direct_view()\n    hosts = dview.apply_sync(socket.gethostname)\n\n    ## group ids into a dict by their hostnames\n    ## e.g., {a: [0, 1, 4], b: [2, 3], c: [5, 6, 7, 8]}\n    hostdict = defaultdict(list)\n    for host, eid in zip(hosts, eids):\n        hostdict[host].append(eid)\n\n    ## Now split threads on the same host into separate proc if there are many\n    ## (copy the keys to a list because hostdict is modified inside the loop)\n    hostdictkeys = list(hostdict.keys())\n    for key in hostdictkeys:\n        gids = hostdict[key]\n        maxt = len(gids)\n        if len(gids) >= 4:\n            maxt = 2\n        ## if 4 nodes and 4 ppn, put one sample per host\n        if (len(gids) == 4) and (len(hosts) >= 4):\n            maxt = 4\n        if len(gids) >= 6:\n            maxt = 3\n        if len(gids) >= 8:\n            maxt = 4\n        if len(gids) >= 16:\n            maxt = 4\n        ## split ids into groups of maxt\n        threaded = [gids[i:i+maxt] for i in range(0, len(gids), maxt)]\n        lth = len(threaded)\n        ## if anything was split (lth>1) update hostdict with new proc\n        if lth > 1:\n            hostdict.pop(key)\n            for hostid in range(lth):\n                hostdict[str(key)+\"_\"+str(hostid)] = threaded[hostid]\n\n    ## make sure split numbering is correct\n    #threaded = hostdict.values()\n    #assert len(ipyclient.ids) <= len(list(itertools.chain(*threaded)))\n    LOGGER.info(\"threaded_view: %s\", dict(hostdict))\n    return hostdict", "response": "Gets the optimum threaded view of engine ids given the host setup."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef detect_cpus():\n    # Linux, Unix and MacOS:\n    if hasattr(os, \"sysconf\"):\n        if \"SC_NPROCESSORS_ONLN\" in os.sysconf_names:\n            # Linux & Unix:\n            ncpus = os.sysconf(\"SC_NPROCESSORS_ONLN\")\n            if isinstance(ncpus, int) and ncpus > 0:\n                return ncpus\n        else: # OSX:\n            return int(os.popen(\"sysctl -n hw.ncpu\").read())\n    # Windows:\n    if \"NUMBER_OF_PROCESSORS\" in os.environ:\n        ncpus = int(os.environ[\"NUMBER_OF_PROCESSORS\"])\n        if ncpus > 0:\n            return ncpus\n    return 1", "response": "Detects the number of CPUs on a system."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef _call_structure(mname, ename, sname, name, workdir, seed, ntaxa, nsites, kpop, rep):\n    ## create call string\n    outname = os.path.join(workdir, \"{}-K-{}-rep-{}\".format(name, kpop, rep))\n\n    cmd = [\"structure\", \n           \"-m\", mname, \n           \"-e\", ename, \n           \"-K\", str(kpop),\n           \"-D\", str(seed), \n           \"-N\", str(ntaxa), \n           \"-L\", str(nsites),\n           \"-i\", sname, \n           \"-o\", outname]\n\n    ## call the shell function\n    proc = subprocess.Popen(cmd,\n                            stdout=subprocess.PIPE, \n                            stderr=subprocess.STDOUT)\n    comm = proc.communicate()\n\n    ## cleanup\n    oldfiles = [mname, ename, sname]\n    for oldfile in oldfiles:\n        if os.path.exists(oldfile):\n            os.remove(oldfile)\n    return comm", "response": "make the subprocess call to structure"}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\ncombining structure replicates into a single indfile.", "response": "def _concat_reps(self, kpop, max_var_multiple, quiet, **kwargs):\n    \"\"\"\n    Combine structure replicates into a single indfile, \n    returns nreps, ninds. Excludes reps with too high of \n    variance (set with max_variance_multiplier) to exclude\n    runs that did not converge. \n    \"\"\"\n   \n    ## make an output handle\n    outf = os.path.join(self.workdir, \n        \"{}-K-{}.indfile\".format(self.name, kpop))\n    \n    ## combine replicates and write to indfile\n    excluded = 0\n    reps = []\n    with open(outf, 'w') as outfile:\n        repfiles = glob.glob(\n            os.path.join(self.workdir, \n                self.name+\"-K-{}-rep-*_f\".format(kpop)))\n\n        ## get result as a Rep object\n        for rep in repfiles:\n            result = Rep(rep, kpop=kpop)\n            reps.append(result)\n\n        ## exclude results with variance NX above (min) \n        newreps = []\n        if len(reps) > 1:\n            min_var_across_reps = np.min([i.var_lnlik for i in reps])\n        else:\n            min_var_across_reps = reps[0].var_lnlik\n\n        ## iterate over reps\n        for rep in reps:\n\n            ## store result w/o filtering\n            if not max_var_multiple:\n                newreps.append(rep)\n                outfile.write(rep.stable)\n\n            ## use max-var-multiple as a filter for convergence                \n            else:\n                #print(\n                #    rep.var_lnlik, \n                #    min_var_across_reps, \n                #    rep.var_lnlik / min_var_across_reps, \n                #    max_var_multiple)\n                ## e.g., repvar is 1.05X minvar. 
We keep it if maxvar <= 1.05\n                if (rep.var_lnlik / min_var_across_reps) <= max_var_multiple:\n                    newreps.append(rep)\n                    outfile.write(rep.stable)\n                else:\n                    excluded += 1\n\n    return newreps, excluded"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef _get_evanno_table(self, kpops, max_var_multiple, quiet):\n\n    ## iterate across k-vals\n    kpops = sorted(kpops)\n    replnliks = []\n\n    for kpop in kpops:\n        ## concat results for k=x\n        reps, excluded = _concat_reps(self, kpop, max_var_multiple, quiet)\n\n        ## report if some results were excluded\n        if excluded:\n            if not quiet:\n                sys.stderr.write(\n                \"[K{}] {} reps excluded (not converged) see 'max_var_multiple'.\\n\"\\\n                .format(kpop, excluded))\n\n        if reps:\n            ninds = reps[0].inds\n            nreps = len(reps)\n        else:\n            ninds = nreps = 0\n        if not reps:\n            print \"no result files found\"\n\n        ## all we really need is the lnlik\n        replnliks.append([i.est_lnlik for i in reps])\n\n    ## compare lnlik and var of results\n    if len(replnliks) > 1:\n        lnmean = [np.mean(i) for i in replnliks]\n        lnstds = [np.std(i, ddof=1) for i in replnliks]\n    else:\n        lnmean = replnliks\n        lnstds = np.nan\n\n    tab = pd.DataFrame(\n        index=kpops,\n        data={\n            \"Nreps\": [len(i) for i in replnliks],\n            \"lnPK\": [0] * len(kpops),\n            \"lnPPK\": [0] * len(kpops),\n            \"deltaK\": [0] * len(kpops),\n            \"estLnProbMean\": lnmean, \n            \"estLnProbStdev\": lnstds,\n        }\n        )\n\n    ## calculate Evanno's\n    for kpop in kpops[1:]:\n        tab.loc[kpop, \"lnPK\"] = tab.loc[kpop, \"estLnProbMean\"] \\\n                              - tab.loc[kpop-1, \"estLnProbMean\"]\n\n    for kpop in kpops[1:-1]:\n        tab.loc[kpop, \"lnPPK\"] = abs(tab.loc[kpop+1, \"lnPK\"] \n                                     - tab.loc[kpop, \"lnPK\"])\n        tab.loc[kpop, \"deltaK\"] = (abs(\n                                    tab.loc[kpop+1, 
\"estLnProbMean\"] - \\\n                                    2.0 * tab.loc[kpop, \"estLnProbMean\"] + \\\n                                    tab.loc[kpop-1, \"estLnProbMean\"]) / \\\n                                   tab.loc[kpop, \"estLnProbStdev\"])\n        \n    ## return table\n    return tab", "response": "Calculates Evanno method K value scores for a series of permuted clumpp results."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef result_files(self):\n        reps = OPJ(self.workdir, self.name+\"-K-*-rep-*_f\")\n        repfiles = glob.glob(reps)\n        return repfiles", "response": "Returns the list of result files (ending in _f) written to the workdir by finished structure runs."}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nsubmits a job to the cluster and returns a asynchronous result object.", "response": "def run(self,\n        kpop, \n        nreps, \n        ipyclient=None,\n        seed=12345, \n        force=False,\n        quiet=False, \n        ):\n\n        \"\"\" \n        submits a job to run on the cluster and returns an asynchronous result\n        object. K is the number of populations, randomseed if not set will be \n        randomly drawn, ipyclient if not entered will raise an error. If nreps\n        is set then multiple jobs will be started from new seeds, each labeled\n        by its replicate number. If force=True then replicates will be overwritten, \n        otherwise, new replicates will be created starting with the last file N \n        found in the workdir. \n\n        Parameters:\n        -----------\n        kpop: (int)\n            The MAXPOPS parameter in structure, i.e., the number of populations\n            assumed by the model (K). \n\n        nreps: (int):\n            Number of independent runs starting from distinct seeds.\n\n        ipyclient: (ipyparallel.Client Object)\n            An ipyparallel client connected to an ipcluster instance. This is \n            used to manage parallel jobs. If not present a single job will\n            run and block until finished (i.e., code is not parallel).\n\n        seed: (int):\n            Random number seed used for subsampling unlinked SNPs if a mapfile\n            is linked to the Structure Object. \n\n        force: (bool):\n            If force is true then old replicates are removed and new reps start\n            from rep-0. 
Otherwise, new reps start at end of existing rep numbers.\n\n        quiet: (bool)\n            Whether to print number of jobs submitted to stderr\n\n        Example: \n        ---------\n        import ipyparallel as ipp\n        import ipyrad.analysis as ipa\n\n        ## get parallel client\n        ipyclient = ipp.Client()\n\n        ## get structure object\n        s = ipa.structure(\n                name=\"test\",\n                data=\"mydata.str\", \n                mapfile=\"mydata.snps.map\",\n                workdir=\"structure-results\",\n                )\n\n        ## modify some basic params\n        s.mainparams.numreps = 100000\n        s.mainparams.burnin = 10000\n\n        ## submit many jobs\n        for kpop in [3, 4, 5]:\n            s.run(\n                kpop=kpop, \n                nreps=10, \n                ipyclient=ipyclient,\n                )\n\n        ## block until all jobs finish\n        ipyclient.wait()\n\n        \"\"\"\n        ## initiate starting seed\n        np.random.seed(seed)\n\n        ## check for stuructre here\n        proc = subprocess.Popen([\"which\", \"structure\"],\n                stdout=subprocess.PIPE, \n                stderr=subprocess.STDOUT).communicate()\n        if not proc:\n            raise Exception(\\\n                \"structure is not installed: run `conda install structure -c ipyrad`\")\n\n        ## start load balancer\n        if ipyclient:\n            lbview = ipyclient.load_balanced_view()\n\n        ## remove old jobs with this same name\n        handle = OPJ(self.workdir, self.name+\"-K-{}-*\".format(kpop))\n        oldjobs = glob.glob(handle)\n        if force or (not oldjobs):\n            for job in oldjobs:\n                os.remove(job)\n            repstart = 0\n            repend = nreps\n        else:\n            repstart = max([int(i.split(\"-\")[-1][:-2]) for i in oldjobs])\n            repend = repstart + nreps\n\n        ## check that there is a ipcluster instance 
running\n        for rep in range(repstart, repend):\n\n            ## sample random seed for this rep\n            self.extraparams.seed = np.random.randint(0, 1e9, 1)[0]\n\n            ## prepare files (randomly subsamples snps if mapfile)\n            mname, ename, sname = self.write_structure_files(kpop, rep)\n            args = [\n                mname, ename, sname,\n                self.name, \n                self.workdir,\n                self.extraparams.seed, \n                self.ntaxa, \n                self.nsites,\n                kpop, \n                rep]\n\n            if ipyclient:\n                ## call structure ('async' is a reserved word in Python 3.7+)\n                rasync = lbview.apply(_call_structure, *(args))\n                self.asyncs.append(rasync)\n\n            else:\n                if not quiet:\n                    sys.stderr.write(\"submitted 1 structure job [{}-K-{}]\\n\"\\\n                                 .format(self.name, kpop))\n                comm = _call_structure(*args)\n                return comm\n\n        if ipyclient:\n            if not quiet:\n                sys.stderr.write(\"submitted {} structure jobs [{}-K-{}]\\n\"\\\n                                .format(nreps, self.name, kpop))"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef write_structure_files(self, kpop, rep=1):\n\n        ## check params\n        self.mainparams.numreps = int(self.mainparams.numreps)\n        self.mainparams.burnin = int(self.mainparams.burnin)\n\n        ## write tmp files for the job. Rando avoids filename conflict.\n        mname = OPJ(self.workdir, \"tmp-{}-{}-{}.mainparams.txt\".format(self.name, kpop, rep))\n        ename = OPJ(self.workdir, \"tmp-{}-{}-{}.extraparams.txt\".format(self.name, kpop, rep))\n        sname = OPJ(self.workdir, \"tmp-{}-{}-{}.strfile.txt\".format(self.name, kpop, rep))\n        tmp_m = open(mname, 'w')\n        tmp_e = open(ename, 'w')\n        tmp_s = open(sname, 'w')\n\n        ## write params files\n        tmp_m.write(self.mainparams._asfile())\n        tmp_e.write(self.extraparams._asfile())\n\n        ## subsample SNPs as unlinked if a mapfile is present.\n        ## & write pop data to the tmp_s file if present\n        assert len(self.popdata) == len(self.labels), \\\n            \"popdata list must be the same length as the number of taxa\"\n\n        with open(self.data) as ifile:\n            _data = ifile.readlines()\n            ## header\n            header = np.array([i.strip().split(\"\\t\")[:5] for i in _data])\n            ## seqdata\n            seqdata = np.array([i.strip().split(\"\\t\")[5:] for i in _data])\n\n            ## enter popdata into seqfile if present in self\n            if any(self.popdata):\n                ## set popdata in header\n                header[::2, 1] = self.popdata\n                header[1::2, 1] = self.popdata\n                \n                ## set flag to all 1s if user entered popdata but no popflag\n                if not any(self.popflag):\n                    self.popflag = [1 for i in self.popdata]\n                    header[:, 2] = 1\n                else:\n                    header[::2, 2] = self.popflag\n                    
header[1::2, 2] = self.popflag\n\n            ## subsample SNPs if mapfile is present\n            if isinstance(self.maparr, np.ndarray):\n                seqdata = seqdata[:, self._subsample()]\n                \n            ## write fullstr\n            fullstr = np.concatenate([header, seqdata], axis=1)\n            np.savetxt(tmp_s, fullstr, delimiter=\"\\t\", fmt=\"%s\")\n\n        ## close tmp files\n        tmp_m.close()\n        tmp_e.close()\n        tmp_s.close()\n        return mname, ename, sname", "response": "Writes the temporary mainparams, extraparams, and str data files for a structure run and returns their paths."}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\nreturns a dictionary of results tables for making structure barplots.", "response": "def get_clumpp_table(self, kvalues, max_var_multiple=0, quiet=False):\n        \"\"\"\n        Returns a dictionary of results tables for making structure barplots.\n        This calls the same functions used in get_evanno_table() to call \n        CLUMPP to permute replicates.\n\n        Parameters:\n        -----------\n        kvalues : list or int\n            A kvalue or list of kvalues to run CLUMPP on and return a \n            results table. \n\n        max_var_multiple: int\n            A multiplier value to use as a filter for convergence of runs. \n            Default=0=no filtering. As an example, if 10 replicates \n            were run then the variance of the run with the minimum variance is\n            used as a benchmark. If other runs have a variance that is N times \n            greater then that run will be excluded. Remember, if replicate runs \n            sampled different distributions of SNPs then it is not unexpected that \n            they will have very different variances. However, you may still want \n            to exclude runs with very high variance since they likely have \n            not converged. \n\n        Returns:\n        --------\n        table : dict or pd.DataFrame\n            A dictionary of dataframes with admixture proportions.\n        \"\"\"\n        ## do not allow bad vals\n        if max_var_multiple:\n            if max_var_multiple < 1:\n                raise ValueError('max_var_multiple must be >1')\n\n        if isinstance(kvalues, int):\n            return _get_clumpp_table(self, kvalues, max_var_multiple, quiet)\n        else:\n            tabledict = {}\n            for kpop in kvalues:\n                table = _get_clumpp_table(self, kpop, max_var_multiple, quiet)\n                tabledict[kpop] = table\n            return tabledict"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef get_evanno_table(self, kvalues, max_var_multiple=0, quiet=False):\n        ## do not allow bad vals\n        if max_var_multiple:\n            if max_var_multiple < 1:\n                raise ValueError('max_var_multiple must be >= 1')\n\n        table = _get_evanno_table(self, kvalues, max_var_multiple, quiet)\n        return table", "response": "Calculates the Evanno table for tests with the given list of K-values."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef parse(self, psearch, dsearch):\n        stable = \"\"\n        with open(self.repfile) as orep:\n            dat = orep.readlines()\n            for line in dat:\n                ## stat lines\n                if \"Estimated Ln Prob of Data\" in line:\n                    self.est_lnlik = float(line.split()[-1])\n                if \"Mean value of ln likelihood\" in line:\n                    self.mean_lnlik = float(line.split()[-1])\n                if \"Variance of ln likelihood\" in line:\n                    self.var_lnlik = float(line.split()[-1])\n                if \"Mean value of alpha\" in line:\n                    self.alpha = float(line.split()[-1])\n\n                ## matrix lines\n                nonline = psearch.search(line)\n                popline = dsearch.search(line)\n\n                #if \")   :  \" in line:\n                if nonline:\n                    ## check if sample is supervised...\n                    abc = line.strip().split()\n                    outstr = \"{}{}{}\".format(\n                        \" \".join([abc[0], abc[0], abc[2], \n                                  abc[0].split('.')[0]]),\n                        \" :  \",\n                        \" \".join(abc[4:])\n                    )\n                    self.inds += 1\n                    stable += outstr+\"\\n\"\n\n                elif popline:\n                    ## check if sample is supervised...\n                    abc = line.strip().split()\n                    prop = [\"0.000\"] * self.kpop\n                    pidx = int(abc[3]) - 1\n                    prop[pidx] = \"1.000\"\n                    outstr = \"{}{}{}\".format(\n                        \" \".join([abc[0], abc[0], abc[2], \n                                  abc[0].split('.')[0]]),\n                        \" :  \",\n                        \" \".join(prop)\n                    )\n    
                self.inds += 1\n                    stable += outstr+\"\\n\"\n\n            stable += \"\\n\"\n        return stable", "response": "parse an _f structure output file"}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\ncalling the command as a subprocess", "response": "def _call_raxml(command_list):\n    \"\"\" call the command as a subprocess and return (stdout, stderr) \"\"\"\n    proc = subprocess.Popen(\n        command_list,\n        stderr=subprocess.STDOUT, \n        stdout=subprocess.PIPE\n        )\n    comm = proc.communicate()\n    return comm"}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nbuild the command list", "response": "def _command_list(self):\n        \"\"\" build the command list \"\"\"\n        cmd = [self.params.binary, \n                \"-f\", str(self.params.f), \n                \"-T\", str(self.params.T), \n                \"-m\", str(self.params.m),\n                \"-N\", str(self.params.N),\n                \"-x\", str(self.params.x),\n                \"-p\", str(self.params.p),\n                \"-n\", str(self.params.n),\n                \"-w\", str(self.params.w),\n                \"-s\", str(self.params.s),\n               ]\n        ## add outgroups\n        if self.params.o:\n            cmd += [\"-o\"]\n            cmd += [\",\".join(self.params.o)]\n        return cmd"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef run(self, \n        ipyclient=None, \n        quiet=False,\n        force=False,\n        block=False,\n        ):\n        \"\"\"\n        Submits raxml job to run. If no ipyclient object is provided then \n        the function will block until the raxml run is finished. If an ipyclient\n        is provided then the job is sent to a remote engine and an asynchronous \n        result object is returned which can be queried or awaited until it finishes.\n\n        Parameters\n        -----------\n        ipyclient:\n            Not yet supported... \n        quiet: \n            suppress print statements\n        force:\n            overwrite existing results files with this job name. \n        block:\n            will block progress in notebook until job finishes, even if job\n            is running on a remote ipyclient.\n        \"\"\"\n\n        ## stop before trying in raxml\n        if force:\n            for key, oldfile in self.trees:\n                if os.path.exists(oldfile):\n                    os.remove(oldfile)\n        if os.path.exists(self.trees.info):\n            print(\"Error: set a new name for this job or use Force flag.\\nFile exists: {}\"\\\n                  .format(self.trees.info))\n            return \n\n        ## TODO: add a progress bar tracker here. It could even read it from\n        ## the info file that is being written. 
\n        ## submit it\n        if not ipyclient:\n            proc = _call_raxml(self._command_list)\n            self.stdout = proc[0]\n            self.stderr = proc[1]\n        else:\n            ## find all hosts and submit job to the host with most available engines\n            lbview = ipyclient.load_balanced_view()\n            self.async = lbview.apply(_call_raxml, self._command_list)\n\n        ## initiate random seed\n        if not quiet:\n            if not ipyclient:\n                ## look for errors\n                if \"Overall execution time\" not in self.stdout:\n                    print(\"Error in raxml run\\n\" + self.stdout)\n                else:\n                    print(\"job {} finished successfully\".format(self.params.n))\n            else:\n                print(\"job {} submitted to cluster\".format(self.params.n))", "response": "This function runs the raxml job."}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nfinding the available binaries and updating self.params.binary", "response": "def _get_binary(self):\n        \"\"\" find binaries available\"\"\"\n\n        ## check for binary\n        backup_binaries = [\"raxmlHPC-PTHREADS\", \"raxmlHPC-PTHREADS-SSE3\"]\n\n        ## check user binary first, then backups\n        for binary in [self.params.binary] + backup_binaries:\n            proc = subprocess.Popen([\"which\", binary], \n                    stdout=subprocess.PIPE, \n                    stderr=subprocess.STDOUT).communicate()\n            ## update the binary and stop at the first one found\n            if proc[0]:\n                self.params.binary = binary\n                break\n\n        ## if none then raise error\n        if not proc[0]:\n            raise Exception(BINARY_ERROR.format(self.params.binary))"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef share_matrix(locifile, tree=None, nameorder=None):\n\n    ## load in the loci data\n    with open(locifile, 'r') as locidata:\n        loci = locidata.read().split(\"|\\n\")[:-1]\n\n    ## load in the tree from a string\n    if tree:\n        tree = ete.Tree(tree)\n        tree.ladderize()\n        snames = tree.get_leaf_names()\n        lxs, names = _getarray(loci, snames)\n    elif nameorder:\n        lxs, names = _getarray(loci, nameorder)\n    else:\n        raise IOError(\"must provide either tree or nameorder argument\")\n\n    ## get share matrix\n    share = _countmatrix(lxs)\n\n    return share", "response": "Returns a matrix of shared loci between all pairs of samples."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef _getarray(loci, snames):\n    ## make an empty matrix\n    lxs = np.zeros((len(snames), len(loci)), dtype=np.uint64)\n\n    ## fill the matrix\n    for loc in range(len(loci)):\n        for seq in loci[loc].split(\"\\n\"):\n            if \"//\" not in seq:\n                lxs[snames.index(seq.split()[0][:]), loc] += 1\n\n    return lxs, snames", "response": "Parses the loci list and returns a sample-by-locus presence/absence matrix along with the sample names."}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\ndistributing jobs to the parallel client", "response": "def batch(\n    baba,\n    ipyclient=None,\n    ):\n    \"\"\"\n    distributes jobs to the parallel client\n    \"\"\"\n\n    ## parse args\n    handle = baba.data\n    taxdicts = baba.tests\n    mindicts = baba.params.mincov\n    nboots = baba.params.nboots\n\n    ## if ms generator make into reusable list\n    sims = 0\n    if isinstance(handle, types.GeneratorType):\n        handle = list(handle)\n        sims = 1\n    else:\n        ## expand locifile path to full path\n        handle = os.path.realpath(handle)\n\n    ## parse taxdicts into names and lists if it a dictionary\n    #if isinstance(taxdicts, dict):\n    #    names, taxdicts = taxdicts.keys(), taxdicts.values()\n    #else:\n    #    names = []\n    names = []\n    if isinstance(taxdicts, dict):\n        taxdicts = [taxdicts]\n\n    ## an array to hold results (len(taxdicts), nboots)\n    tot = len(taxdicts)\n    resarr = np.zeros((tot, 7), dtype=np.float64)\n    bootsarr = np.zeros((tot, nboots), dtype=np.float64)\n    paneldict = {}\n\n    ## TODO: Setup a wrapper to find and cleanup ipyclient\n    ## define the function and parallelization to use, \n    ## if no ipyclient then drops back to using multiprocessing.\n    if not ipyclient:\n        # ipyclient = ip.core.parallel.get_client(**self._ipcluster)\n        raise IPyradError(\"you must enter an ipyparallel.Client() object\")\n    else:\n        lbview = ipyclient.load_balanced_view()\n\n    ## submit jobs to run on the cluster queue\n    start = time.time()\n    asyncs = {}\n    idx = 0\n\n    ## prepare data before sending to engines\n    ## if it's a str (locifile) then parse it here just once.\n    if isinstance(handle, str):\n        with open(handle, 'r') as infile:\n            loci = infile.read().strip().split(\"|\\n\")\n    if isinstance(handle, list):\n        pass #sims()\n\n    ## iterate over 
tests (repeats mindicts if fewer than taxdicts)\n    itests = iter(taxdicts)\n    imdict = itertools.cycle([mindicts])\n\n    #for test, mindict in zip(taxdicts, itertools.cycle([mindicts])):\n    for i in xrange(len(ipyclient)):\n\n        ## next entries unless fewer than len ipyclient, skip\n        try:\n            test = next(itests)\n            mindict = next(imdict)\n        except StopIteration:\n            continue\n\n        ## if it's sim data then convert to an array\n        if sims:\n            loci = _msp_to_arr(handle, test)\n            args = (loci, test, mindict, nboots)\n            print(\"not yet implemented\")\n            #asyncs[idx] = lbview.apply_async(dstat, *args)\n        else:\n            args = [loci, test, mindict, nboots]\n            asyncs[idx] = lbview.apply(dstat, *args)\n        idx += 1\n\n    ## block until finished, print progress if requested.\n    finished = 0\n    try:\n        while 1:\n            keys = [i for (i, j) in asyncs.items() if j.ready()]\n            ## check for failures\n            for job in keys:\n                if not asyncs[job].successful():\n                    raise IPyradWarningExit(\\\n                        \" error: {}: {}\".format(job, asyncs[job].exception()))\n                ## enter results for successful jobs\n                else:\n                    _res, _bot = asyncs[job].result()\n                    \n                    ## store D4 results\n                    if _res.shape[0] == 1:\n                        resarr[job] = _res.T.as_matrix()[:, 0]\n                        bootsarr[job] = _bot\n                    \n                    ## or store D5 results                        \n                    else:   \n                        paneldict[job] = _res.T\n\n                    ## remove old job\n                    del asyncs[job]\n                    finished += 1\n\n                    ## submit next job if there is one.\n                    try:\n                      
  test = next(itests)\n                        mindict = next(imdict)\n                        if sims:\n                            loci = _msp_to_arr(handle, test)\n                            args = (loci, test, mindict, nboots)\n                            print(\"not yet implemented\")\n                            #asyncs[idx] = lbview.apply_async(dstat, *args)\n                        else:\n                            args = [loci, test, mindict, nboots]\n                            asyncs[idx] = lbview.apply(dstat, *args)\n                        idx += 1\n                    except StopIteration:\n                        pass\n\n            ## count finished and break if all are done.\n            #fin = idx - len(asyncs)\n            elap = datetime.timedelta(seconds=int(time.time()-start))\n            printstr = \" calculating D-stats  | {} | \"\n            progressbar(tot, finished, printstr.format(elap), spacer=\"\")\n            time.sleep(0.1)\n            if not asyncs:\n                print(\"\")\n                break\n\n    except KeyboardInterrupt as inst:\n        ## cancel all jobs (ipy & multiproc modes) and then raise error\n        try:\n            ipyclient.abort()\n        except Exception:\n            pass\n        raise inst\n\n    ## dress up resarr as a Pandas DataFrame if 4-part test\n    if len(test) == 4:\n        if not names:\n            names = range(len(taxdicts))\n        #print(\"resarr\")\n        #print(resarr)\n        resarr = pd.DataFrame(resarr, \n            index=names,\n            columns=[\"dstat\", \"bootmean\", \"bootstd\", \"Z\", \"ABBA\", \"BABA\", \"nloci\"])\n\n        ## sort results and bootsarr to match if test names were supplied\n        resarr = resarr.sort_index()\n        order = [list(resarr.index).index(i) for i in names]\n        bootsarr = bootsarr[order]\n        return resarr, bootsarr\n    else:\n        ## order results dfs\n        listres = []\n        for key in 
range(len(paneldict)):\n            listres.append(paneldict[key])\n            \n        ## make into a multi-index dataframe\n        ntests = len(paneldict)\n        multi_index = [\n            np.array([[i] * 3 for i in range(ntests)]).flatten(),\n            np.array(['p3', 'p4', 'shared'] * ntests),\n        ]\n        resarr = pd.DataFrame(\n            data=pd.concat(listres).as_matrix(), \n            index=multi_index,\n            columns=listres[0].columns,\n            )\n        return resarr, None"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\nperforming a single D-stat test", "response": "def dstat(inarr, taxdict, mindict=1, nboots=1000, name=0):\n    \"\"\" private function to perform a single D-stat test\"\"\"\n\n    #if isinstance(inarr, str):\n    #    with open(inarr, 'r') as infile:\n    #        inarr = infile.read().strip().split(\"|\\n\")\n\n    # ## get data as an array from loci file\n    # ## if loci-list then parse arr from loci\n    if isinstance(inarr, list):\n        arr, _ = _loci_to_arr(inarr, taxdict, mindict)\n    \n    # ## if it's an array already then go ahead\n    # elif isinstance(inarr, np.ndarray):\n    #     arr = inarr\n    # ## if it's a simulation object get freqs from array\n    # elif isinstance(inarr, Sim):\n    #     arr = _msp_to_arr(inarr, taxdict)\n\n    #elif isinstance(inarr, types.GeneratorType):\n    #    arr = _msp_to_arr(inarr, taxdict)\n    #elif isinstance(inarr, list):\n    #    arr = _msp_to_arr(inarr, taxdict)\n    ## get data from Sim object, do not digest the ms generator\n    #else:\n    #    raise Exception(\"Must enter either a 'locifile' or 'arr'\")\n\n    ## run tests\n    #if len(taxdict) == 4:\n    if arr.shape[1] == 4:\n\n        ## get results\n        res, boots = _get_signif_4(arr, nboots)\n    \n        ## make res into a nice DataFrame\n        res = pd.DataFrame(res, \n            columns=[name],\n            index=[\"Dstat\", \"bootmean\", \"bootstd\", \"Z\", \"ABBA\", \"BABA\", \"nloci\"])\n\n    else:\n        ## get results\n        res, boots = _get_signif_5(arr, nboots)\n        ## make into a DataFrame\n        res = pd.DataFrame(res,\n            index=[\"p3\", \"p4\", \"shared\"], \n            columns=[\"Dstat\", \"bootmean\", \"bootstd\", \"Z\", \"ABxxA\", \"BAxxA\", \"nloci\"]\n            )\n\n    return res.T, boots"}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nreturn a frequency array from a loci file for all loci with taxa from taxdict and min coverage from mindict.", "response": "def _loci_to_arr(loci, taxdict, mindict):\n    \"\"\"\n    return a frequency array from a loci file for all loci with taxa from \n    taxdict and min coverage from mindict. \n    \"\"\"\n\n    ## make the array (4 or 5) and a mask array to remove loci without cov\n    nloci = len(loci)\n    maxlen = np.max(np.array([len(locus.split(\"\\n\")[0]) for locus in loci]))\n    keep = np.zeros(nloci, dtype=np.bool_)\n    arr = np.zeros((nloci, 4, maxlen), dtype=np.float64)\n\n    ## six rows b/c one for each p3, and for the fused p3 ancestor\n    if len(taxdict) == 5:\n        arr = np.zeros((nloci, 6, maxlen), dtype=np.float64)\n\n    ## if not mindict, make one that requires 1 in each taxon\n    if isinstance(mindict, int):\n        mindict = {i: mindict for i in taxdict}\n    elif isinstance(mindict, dict):\n        mindict = {i: mindict[i] for i in taxdict}\n    else:\n        mindict = {i: 1 for i in taxdict}\n\n    ## raise error if names are not 'p[int]' \n    allowed_names = ['p1', 'p2', 'p3', 'p4', 'p5']\n    if any([i not in allowed_names for i in taxdict]):\n        raise IPyradError(\\\n            \"keys in taxdict must be named 'p1' through 'p4' or 'p5'\")\n\n    ## parse key names\n    keys = sorted([i for i in taxdict.keys() if i[0] == 'p'])\n    outg = keys[-1]\n\n    ## grab seqs just for the good guys\n    for loc in xrange(nloci):\n\n        ## parse the locus\n        lines = loci[loc].split(\"\\n\")[:-1]\n        names = [i.split()[0] for i in lines]\n        seqs = np.array([list(i.split()[1]) for i in lines])\n\n        ## check that names cover the taxdict (still need to check by site)\n        covs = [sum([j in names for j in taxdict[tax]]) >= mindict[tax] \\\n                for tax in taxdict]\n\n        ## keep locus\n        if 
all(covs):\n            keep[loc] = True\n\n            ## get the refseq\n            refidx = np.where([i in taxdict[outg] for i in names])[0]\n            refseq = seqs[refidx].view(np.uint8)\n            ancestral = np.array([reftrick(refseq, GETCONS2)[:, 0]])\n\n            ## freq of ref in outgroup\n            iseq = _reffreq2(ancestral, refseq, GETCONS2)\n            arr[loc, -1, :iseq.shape[1]] = iseq \n\n            ## enter 4-taxon freqs\n            if len(taxdict) == 4:\n                for tidx, key in enumerate(keys[:-1]):\n\n                    ## get idx of names in test tax\n                    nidx = np.where([i in taxdict[key] for i in names])[0]\n                    sidx = seqs[nidx].view(np.uint8)\n                   \n                    ## get freq of sidx\n                    iseq = _reffreq2(ancestral, sidx, GETCONS2)\n                   \n                    ## fill it in \n                    arr[loc, tidx, :iseq.shape[1]] = iseq\n\n            else:\n\n                ## entere p5; and fill it in\n                iseq = _reffreq2(ancestral, refseq, GETCONS2) \n                arr[loc, -1, :iseq.shape[1]] = iseq \n                \n                ## enter p1\n                nidx = np.where([i in taxdict['p1'] for i in names])[0]\n                sidx = seqs[nidx].view(np.uint8)\n                iseq = _reffreq2(ancestral, sidx, GETCONS2)\n                arr[loc, 0, :iseq.shape[1]] = iseq\n                \n                ## enter p2\n                nidx = np.where([i in taxdict['p2'] for i in names])[0]\n                sidx = seqs[nidx].view(np.uint8)\n                iseq = _reffreq2(ancestral, sidx, GETCONS2)\n                arr[loc, 1, :iseq.shape[1]] = iseq\n                \n                ## enter p3 with p4 masked, and p4 with p3 masked\n                nidx = np.where([i in taxdict['p3'] for i in names])[0]\n                nidy = np.where([i in taxdict['p4'] for i in names])[0]\n                sidx = 
seqs[nidx].view(np.uint8)\n                sidy = seqs[nidy].view(np.uint8)\n                xseq = _reffreq2(ancestral, sidx, GETCONS2)\n                yseq = _reffreq2(ancestral, sidy, GETCONS2)\n                mask3 = xseq != 0\n                mask4 = yseq != 0\n                xseq[mask4] = 0\n                yseq[mask3] = 0\n                arr[loc, 2, :xseq.shape[1]] = xseq\n                arr[loc, 3, :yseq.shape[1]] = yseq\n                \n                ## enter p34 \n                nidx = nidx.tolist() + nidy.tolist()\n                sidx = seqs[nidx].view(np.uint8)\n                iseq = _reffreq2(ancestral, sidx, GETCONS2)\n                arr[loc, 4, :iseq.shape[1]] = iseq\n\n\n    ## size-down array to the number of loci that have taxa for the test\n    arr = arr[keep, :, :]\n\n    ## size-down sites to \n    arr = masknulls(arr)\n\n    return arr, keep"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef _get_boots(arr, nboots):\n    ## hold results (nboots, [dstat, ])\n    boots = np.zeros((nboots,))\n    \n    ## iterate to fill boots\n    for bidx in xrange(nboots):\n        ## sample with replacement\n        lidx = np.random.randint(0, arr.shape[0], arr.shape[0])\n        tarr = arr[lidx]\n        _, _, dst = _prop_dstat(tarr)\n        boots[bidx] = dst\n    \n    ## return bootarr\n    return boots", "response": "Return an array of nboots bootstrap D-statistics, each computed from loci resampled with replacement."}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\nreturns a list of stats and boots.", "response": "def _get_signif_4(arr, nboots):\n    \"\"\"\n    returns a list of stats and an array of dstat boots. Stats includes\n    z-score and two-sided P-value. \n    \"\"\"\n    abba, baba, dst = _prop_dstat(arr)\n    boots = _get_boots(arr, nboots)\n    estimate, stddev = (boots.mean(), boots.std())\n    zscore = 0.\n    if stddev:\n        zscore = np.abs(dst) / stddev\n    stats = [dst, estimate, stddev, zscore, abba, baba, arr.shape[0]]\n    return np.array(stats), boots"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\nreturning a list of stats and boots.", "response": "def _get_signif_5(arr, nboots):\n    \"\"\"\n    returns a list of stats and an array of dstat boots. Stats includes\n    z-score and two-sided P-value. \n    \"\"\"\n\n    statsarr = np.zeros((3, 7), dtype=np.float64)\n    bootsarr = np.zeros((3, nboots))\n\n    idx = 0\n    for acol in [2, 3, 4]:\n        rows = np.array([0, 1, acol, 5])\n        tarr = arr[:, rows, :]\n\n        abxa, baxa, dst = _prop_dstat(tarr)\n        boots = _get_boots(tarr, nboots)\n        estimate, stddev = (boots.mean(), boots.std())\n        if stddev:\n            zscore = np.abs(dst) / stddev\n        else:\n            zscore = np.NaN\n        stats = [dst, estimate, stddev, zscore, abxa, baxa, arr.shape[0]]\n\n        statsarr[idx] = stats\n        bootsarr[idx] = boots\n        idx += 1\n\n    return statsarr, bootsarr"}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nsimulates a newick .", "response": "def _simulate(self, nreps, admix=None, Ns=500000, gen=20):\n    \"\"\"\n    Enter a baba.Tree object in which the 'tree' attribute (newick \n    derived tree) has edge lengths in units of generations. You can \n    use the 'gen' parameter to multiply branch lengths by a constant. \n\n    Parameters:\n    -----------\n\n    nreps: (int)\n        Number of reps (loci) to simulate under the demographic scenario\n    tree: (baba.Tree object)\n        A baba.Tree object initialized by calling baba.Tree(*args). \n    admix: (list)\n        A list of admixture events to occur on the tree. Nodes must be \n        reference by their index number, and events must occur in time\n        intervals when edges exist. Use the .draw() function of the \n        baba.Tree object to see node index numbers and coalescent times.\n    Ns: (float)\n        Fixed effective population size for all lineages (may allow to vary\n        in the future). \n    gen: (int)\n        A multiplier applied to branch lengths to scale into units of \n        generations. 
Example, if all edges on a tree were 1 then you might\n        enter 50000 to multiply so that edges are 50K generations long.\n\n    \"\"\"\n\n    ## node ages\n    Taus = np.array(list(set(self.verts[:, 1]))) * 1e4 * gen\n\n    ## The tips samples, ordered alphanumerically\n    ## Population IDs correspond to their indexes in pop config\n    ntips = len(self.tree)\n    #names = {name: idx for idx, name in enumerate(sorted(self.tree.get_leaf_names()))}\n    ## rev ladderized leaf name order (left to right on downward facing tree)\n    names = {name: idx for idx, name in enumerate(self.tree.get_leaf_names()[::-1])}\n    pop_config = [\n        ms.PopulationConfiguration(sample_size=2, initial_size=Ns)\n        for i in range(ntips)\n    ]\n\n    ## migration matrix all zeros init\n    migmat = np.zeros((ntips, ntips)).tolist()\n\n    ## a list for storing demographic events\n    demog = []\n\n    ## coalescent times\n    coals = sorted(list(set(self.verts[:, 1])))[1:]\n    for ct in xrange(len(coals)):\n        ## check for admix event before next coalescence\n        ## ...\n        \n        ## print coals[ct], nidxs, time\n        nidxs = np.where(self.verts[:, 1] == coals[ct])[0]\n        time = Taus[ct+1]\n\n        ## add coalescence at each node\n        for nidx in nidxs:\n            node = self.tree.search_nodes(name=str(nidx))[0]\n\n            ## get destionation (lowest child idx number), and other\n            dest = sorted(node.get_leaves(), key=lambda x: x.idx)[0]\n            otherchild = [i for i in node.children if not \n                          i.get_leaves_by_name(dest.name)][0]\n\n            ## get source\n            if otherchild.is_leaf():\n                source = otherchild\n            else:\n                source = sorted(otherchild.get_leaves(), key=lambda x: x.idx)[0]\n            \n            ## add coal events\n            event = ms.MassMigration(\n                        time=int(time),\n                        
source=names[source.name], \n                        destination=names[dest.name], \n                        proportion=1.0)\n            #print(int(time), names[source.name], names[dest.name])\n        \n            ## ...\n            demog.append(event)\n            \n            \n    ## sim the data\n    replicates = ms.simulate(\n        population_configurations=pop_config,\n        migration_matrix=migmat,\n        demographic_events=demog,\n        num_replicates=nreps,\n        length=100, \n        mutation_rate=1e-8)\n    return replicates"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nreturning the .tests list of taxa as a pandas dataframe.", "response": "def taxon_table(self):\n        \"\"\"\n        Returns the .tests list of taxa as a pandas dataframe. \n        By auto-generating this table from tests it means that \n        the table itself cannot be modified unless it is returned \n        and saved. \n        \"\"\"\n        if self.tests:\n            keys = sorted(self.tests[0].keys())\n            if isinstance(self.tests, list):\n                ld = [[(key, i[key]) for key in keys] for i in self.tests]\n                dd = [dict(i) for i in ld]\n                df = pd.DataFrame(dd)\n                return df\n            else:\n                return pd.DataFrame(pd.Series(self.tests)).T\n        else:\n            return None"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\nrun a batch of dstat tests on a list of tests and store the results in self.results_table.", "response": "def run(self, \n        ipyclient=None,\n        ):\n        \"\"\"\n        Run a batch of dstat tests on a list of tests, where each test is \n        a dictionary mapping sample names to {p1 - p4} (and sometimes p5). \n        Parameters modifying the behavior of the run, such as the number\n        of bootstrap replicates (nboots) or the minimum coverage for \n        loci (mincov) can be set in {object}.params.\n\n        Parameters:\n        -----------\n        ipyclient (ipyparallel.Client object):\n            An ipyparallel client object to distribute jobs to a cluster. \n        \"\"\"\n        self.results_table, self.results_boots = batch(self, ipyclient)\n\n        ## skip this for 5-part test results\n        if not isinstance(self.results_table, list):\n            self.results_table.nloci = np.nan_to_num(self.results_table.nloci)\\\n                                                 .astype(int)"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\nplot a single - panel plot of the current treefile and the test sets.", "response": "def plot(self, \n        show_test_labels=True, \n        use_edge_lengths=True,         \n        collapse_outgroup=False, \n        pct_tree_x=0.5, \n        pct_tree_y=0.2,\n        subset_tests=None,\n        #toytree_kwargs=None,\n        *args, \n        **kwargs):\n\n        \"\"\" \n        Draw a multi-panel figure with tree, tests, and results \n        \n        Parameters:\n        -----------\n        height: int\n        ...\n\n        width: int\n        ...\n\n        show_test_labels: bool\n        ...\n\n        use_edge_lengths: bool\n        ...\n\n        collapse_outgroups: bool\n        ...\n\n        pct_tree_x: float\n        ...\n\n        pct_tree_y: float\n        ...\n\n        subset_tests: list\n        ...\n\n        ...\n\n        \"\"\"\n\n        ## check for attributes\n        if not self.newick:\n            raise IPyradError(\"baba plot requires a newick treefile\")\n        if not self.tests:\n            raise IPyradError(\"baba plot must have a .tests attribute\")\n\n        ## ensure tests is a list\n        if isinstance(self.tests, dict):\n            self.tests = [self.tests]\n\n        ## re-decompose the tree\n        ttree = toytree.tree(\n            self.newick, \n            orient='down', \n            use_edge_lengths=use_edge_lengths,\n            )\n\n        ## subset test to show fewer\n        if subset_tests != None:\n            #tests = self.tests[subset_tests]\n            tests = [self.tests[i] for i in subset_tests]\n            boots = self.results_boots[subset_tests]\n        else:\n            tests = self.tests\n            boots = self.results_boots\n\n        ## make the plot\n        canvas, axes, panel = baba_panel_plot(\n            ttree=ttree,\n            tests=tests, \n            boots=boots, \n            
show_test_labels=show_test_labels, \n            use_edge_lengths=use_edge_lengths, \n            collapse_outgroup=collapse_outgroup, \n            pct_tree_x=pct_tree_x,\n            pct_tree_y=pct_tree_y,\n            *args, \n            **kwargs)\n        return canvas, axes, panel"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef loci2multinex(name, \n                  locifile, \n                  subsamples=None,\n                  outdir=None,\n                  maxloci=None,\n                  minSNPs=1,\n                  seed=12345,\n                  mcmc_burnin=int(1e6),\n                  mcmc_ngen=int(2e6),\n                  mcmc_sample_freq=1000,\n                  ):\n\n    \"\"\"\n    Converts loci file format to multiple nexus formatted files, one for \n    each locus, and writes a mrbayes block in the nexus information. The \n    mrbayes block will be set to run 2 replicate chains, for [mcmc_ngen]\n    generations, skipping [burnin] steps, and sampling every \n    [mcmc_sample_freq] steps. \n\n\n    Parameters:\n    -----------\n    name: (str)\n        A prefix name for output files that will be produced\n    locifile: (str)\n        A .loci file produced by ipyrad.\n    maxloci: (int)\n        Limit the number of loci to the first N loci with sufficient sampling\n        to be included in the analysis. \n    minSNPs: (int)\n        Only include loci that have at least N parsimony informative SNPs.\n    seed: (int)\n        Random seed used for resolving ambiguities.\n    burnin: (int)\n        mrbayes nexus block burnin parameter used for 'sump burnin' and 'sumt burnin'.\n        The number of generations to skip before starting parameter and tree sampling. \n    mcmc_ngen: (int)\n        mrbayes nexus block 'mcmc ngen' and 'mcmc printfreq' parameters. We don't really\n        to have any info printed to screen, so these values are set equal. This is the \n        length of the chains that will be run. \n    mcmc_sample_freq: (int)\n        mrbayes nexus block 'mcmc samplefreq' parameter. The frequency of sampling from\n        the mcmc chain. 
\n\n    \"\"\"\n\n    ## workdir is the top level directory (e.g., analysis-bucky)\n    if outdir:\n        if not os.path.exists(outdir):\n            os.makedirs(outdir)\n    else:\n        outdir = os.path.curdir\n\n    ## enforce full path names\n    outdir = os.path.realpath(outdir)\n\n    ## outdir is a named directory within this (e.g., analysis-bucky/subs/)\n    outdir = os.path.join(outdir, \"bucky-{}\".format(name))\n    if not os.path.exists(outdir):\n        os.makedirs(outdir)\n    else:\n        ## remove {number}.nex files in this folder\n        ofiles = glob.glob(os.path.join(outdir, \"[0-9].nex*\"))\n        for ofile in ofiles:\n            os.remove(ofile)\n\n    ## parse the locifile to a generator\n    with open(locifile) as infile:\n        loci = (i for i in infile.read().strip().split(\"|\\n\"))\n\n    ## convert subsamples to a set\n    if not subsamples:\n        ## get all sample names from loci\n        with open(locifile) as infile:\n            subs = set((i.split()[0] for i in infile.readlines() if \"//\" not in i))\n    else:   \n        subs = set(subsamples)\n\n    ## keep track of how many loci pass\n    lens = len(subs)\n    nlocus = 0\n    \n    ## create subsampled data set\n    for loc in loci:\n        dat = loc.split(\"\\n\")[:-1]\n\n        ## get names and seq from locus\n        names = [i.split()[0] for i in dat]\n        seqs = np.array([list(i.split()[1]) for i in dat])\n\n        ## check that locus has required samples for each subtree\n        if len(set(names).intersection(set(subs))) == lens:\n            ## order the same way every time\n            seqsamp = seqs[[names.index(tax) for tax in subs]]\n            seqsamp = _resolveambig(seqsamp)\n            pis = _count_PIS(seqsamp, minSNPs)\n\n            if pis:\n                nlocus += 1\n                ## remove invariable columns given this subsampling\n                copied = seqsamp.copy()\n                copied[copied == \"-\"] = \"N\"\n             
   rmcol = np.all(copied == \"N\", axis=0)\n                seqsamp = seqsamp[:, ~rmcol]\n\n                ## write to a nexus file\n                mdict = dict(zip(subs, [i.tostring() for i in seqsamp]))\n                nexmake(mdict, nlocus, outdir, mcmc_burnin, mcmc_ngen, mcmc_sample_freq)\n\n    print \"wrote {} nexus files to {}\".format(nlocus, outdir)", "response": "Converts a .loci file into multiple NEXUS files, one for each locus that passes the sample and SNP filters, and writes a MrBayes block into each file."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef nexmake(mdict, nlocus, dirs, mcmc_burnin, mcmc_ngen, mcmc_sample_freq):\n    ## create matrix as a string\n    max_name_len = max([len(i) for i in mdict])\n    namestring = \"{:<\" + str(max_name_len+1) + \"} {}\\n\"\n    matrix = \"\"\n    for i in mdict.items():\n        matrix += namestring.format(i[0], i[1])\n    \n    ## write nexus block\n    handle = os.path.join(dirs, \"{}.nex\".format(nlocus))\n    with open(handle, 'w') as outnex:\n        outnex.write(NEXBLOCK.format(**{\n            \"ntax\": len(mdict), \n            \"nchar\": len(mdict.values()[0]), \n            \"matrix\": matrix,\n            \"ngen\": mcmc_ngen, \n            \"sfreq\": mcmc_sample_freq, \n            \"burnin\": mcmc_burnin, \n            }))", "response": "Writes a NEXUS file for a single locus, including the sequence matrix and a MrBayes analysis block."}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\ncalls fastq-dump on the given SRRs and writes them to the workdir.", "response": "def call_fastq_dump_on_SRRs(self, srr, outname, paired):\n    \"\"\"\n    calls fastq-dump on SRRs, relabels fastqs by their accession\n    names, and writes them to the workdir. Saves temp sra files\n    in the designated tmp folder and immediately removes them.\n    \"\"\"\n\n    ## build command for fastq-dumping\n    fd_cmd = [\n        \"fastq-dump\", srr,\n        \"--accession\", outname,\n        \"--outdir\", self.workdir, \n        \"--gzip\",\n        ]\n    if paired:\n        fd_cmd += [\"--split-files\"]\n\n    ## call fq dump command\n    proc = sps.Popen(fd_cmd, stderr=sps.STDOUT, stdout=sps.PIPE)\n    o, e = proc.communicate()\n\n    ## delete the stupid temp sra file from the place \n    ## that it is very hard-coded to be written to, and \n    ## LEFT IN, for some crazy reason.\n    srafile = os.path.join(self.workdir, \"sra\", srr+\".sra\")\n    if os.path.exists(srafile):\n        os.remove(srafile)"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef fields_checker(fields):\n    ## make sure fields will work\n    if isinstance(fields, int):\n        fields = str(fields)\n    if isinstance(fields, str):\n        if \",\" in fields:\n            fields = [str(i) for i in fields.split(\",\")]\n        else:\n            fields = [str(fields)]\n    elif isinstance(fields, (tuple, list)):\n        fields = [str(i) for i in fields]\n    else:\n        raise IPyradWarningExit(\"fields not properly formatted\")\n\n    ## do not allow zero in fields\n    fields = [i for i in fields if i != '0']\n\n    return fields", "response": "Normalizes fields (an int, str, tuple, or list) into a list of strings with '0' removed, raising IPyradWarningExit for any other type."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\ndownloading the accessions into the designated workdir.", "response": "def run(self, \n        force=False, \n        ipyclient=None, \n        name_fields=30, \n        name_separator=\"_\", \n        dry_run=False):\n        \"\"\"\n        Download the accessions into the designated workdir. \n\n        Parameters\n        ----------\n        force: (bool)\n            If force=True then existing files with the same name\n            will be overwritten. \n\n        ipyclient: (ipyparallel.Client)\n            If provided, work will be distributed across a parallel\n            client, otherwise download will be run on a single core.\n\n        name_fields: (int, str):\n            Provide the index of the name fields to be used as a prefix\n            for fastq output files. The default is 30, which is the \n            SampleName field. Use sra.fetch_fields to see all available\n            fields and their indices. A likely alternative is 1 (Run). \n            If multiple are listed then they will be joined by a \"_\" \n            character. 
For example (29,30) would yield something like:\n            latin-name_sample-name (e.g., mus_musculus-NR10123).\n\n        dry_run: (bool)\n            If True then a table of file names that _would_ be downloaded\n            will be shown, but the actual files will note be downloaded.\n        \"\"\"\n\n        ## temporarily set directory for tmpfiles used by fastq-dump\n        ## if this fails then just skip it.\n        try:\n            ## ensure output directory, also used as tmpdir\n            if not os.path.exists(self.workdir):\n                os.makedirs(self.workdir)\n\n            ## get original directory for sra files \n            ## probably /home/ncbi/public/sra by default.\n            self._set_vdbconfig_path()\n\n            ## register ipyclient for cleanup\n            if ipyclient:\n                self._ipcluster[\"pids\"] = {}\n                for eid in ipyclient.ids:\n                    engine = ipyclient[eid]\n                    if not engine.outstanding:\n                        pid = engine.apply(os.getpid).get()\n                        self._ipcluster[\"pids\"][eid] = pid               \n\n            ## submit jobs to engines or local \n            self._submit_jobs(\n                force=force, \n                ipyclient=ipyclient, \n                name_fields=name_fields, \n                name_separator=name_separator,\n                dry_run=dry_run,\n                )\n\n        except IPyradWarningExit as inst:\n            print(inst)\n        ## exceptions to catch, cleanup and handle ipyclient interrupts\n        except KeyboardInterrupt:\n            print(\"keyboard interrupt...\")\n        except Exception as inst:\n            print(\"Exception in run() - {}\".format(inst))\n        finally:\n            ## reset working sra path\n            self._restore_vdbconfig_path()\n\n            ## if it made a new sra directory then it should be empty when \n            ## we are finished if all .sra files were 
removed. If so, then\n            ## let's also remove the dir. if not empty, leave it.\n            sradir = os.path.join(self.workdir, \"sra\")\n            if os.path.exists(sradir) and (not os.listdir(sradir)):\n                shutil.rmtree(sradir)\n            else:\n                ## print warning\n                try:\n                    print(FAILED_DOWNLOAD.format(os.listdir(sradir)))\n                except OSError as inst:\n                    ## If sra dir doesn't even exist something very bad is broken.\n                    raise IPyradWarningExit(\"Download failed. Exiting.\")\n                ## remove fastq file matching to cached sra file\n                for srr in os.listdir(sradir):\n                    isrr = srr.split(\".\")[0]\n                    ipath = os.path.join(self.workdir, \"*_{}*.gz\".format(isrr))\n                    ifile = glob.glob(ipath)[0]\n                    if os.path.exists(ifile):\n                        os.remove(ifile)\n                ## remove cache of sra files\n                shutil.rmtree(sradir)\n\n            ## cleanup ipcluster shutdown\n            if ipyclient:\n                ## send SIGINT (2) to all engines still running tasks\n                try:\n                    ipyclient.abort()\n                    time.sleep(0.5)\n                    for engine_id, pid in self._ipcluster[\"pids\"].items():\n                        if ipyclient.queue_status()[engine_id][\"tasks\"]:\n                            os.kill(pid, 2)\n                        time.sleep(0.1)\n                except ipp.NoEnginesRegistered:\n                    pass\n                ## clean memory space\n                if not ipyclient.outstanding:\n                    ipyclient.purge_everything()\n                ## uh oh, kill everything, something bad happened\n                else:\n                    ipyclient.shutdown(hub=True, block=False)\n                    ipyclient.close()\n                    print(\"\\nwarning: 
ipcluster shutdown and must be restarted\")"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef _submit_jobs(self, \n        force, \n        ipyclient, \n        name_fields, \n        name_separator, \n        dry_run):\n        \"\"\"\n        Download the accessions into a the designated workdir. \n        If file already exists it will only be overwritten if \n        force=True. Temporary files are removed. \n        \"\"\"\n\n        ## get Run data with default fields (1,4,6,30)\n        df = self.fetch_runinfo(range(31), quiet=True)\n        sys.stdout.flush()\n\n        ## if not ipyclient then use multiprocessing\n        if ipyclient:\n            lb = ipyclient.load_balanced_view()\n\n        ## if Run has samples with same name (replicates) then \n        ## we need to include the accessions in the file names\n        if name_fields:\n            ## indexing requires -1 ints\n            fields = [int(i)-1 for i in fields_checker(name_fields)]\n            ## make accession names, no spaces allowed\n            df['Accession'] = pd.Series(df[df.columns[fields[0]]], index=df.index)\n            for field in fields[1:]:\n                df.Accession += name_separator + df[df.columns[field]]\n            df.Accession = [i.replace(\" \", \"_\") for i in df.Accession]    \n            ## check that names are unique\n            if not df.Accession.shape[0] == df.Accession.unique().shape[0]:\n                raise IPyradWarningExit(\"names are not unique:\\n{}\"\\\n                    .format(df.Accession))\n\n        ## backup default naming scheme\n        else:\n            if len(set(df.SampleName)) != len(df.SampleName):\n                accs = (i+\"-\"+j for i, j in zip(df.SampleName, df.Run))\n                df.Accession = accs\n            else:\n                df.Accession = df.SampleName\n\n        if dry_run:\n            print(\"\\rThe following files will be written to: {}\".format(self.workdir))\n            
print(\"{}\\n\".format(df.Accession))\n        else:\n            ## iterate over and download\n            asyncs = []\n            for idx in df.index:\n\n                ## get args for this run\n                srr = df.Run[idx]\n                outname = df.Accession[idx]\n                paired = df.spots_with_mates.values.astype(int).nonzero()[0].any()\n                fpath = os.path.join(self.workdir, outname+\".fastq.gz\")\n\n                ## skip if exists and not force\n                skip = False\n                if force:\n                    if os.path.exists(fpath):\n                        os.remove(fpath)\n                else:\n                    if os.path.exists(fpath):                \n                        skip = True\n                        sys.stdout.flush()\n                        print(\"[skip] file already exists: {}\".format(fpath))\n\n                ## single job progress bar\n                tidx = df.Accession.shape[0]\n                #if not ipyclient:\n                    \n\n                ## submit job to run\n                if not skip:\n                    args = (self, srr, outname, paired)\n                    if ipyclient:\n                        async = lb.apply_async(call_fastq_dump_on_SRRs, *args)\n                        asyncs.append(async)\n                    else:\n                        print(\"Downloading file {}/{}: {}\".format(idx+1, tidx, fpath))\n                        call_fastq_dump_on_SRRs(*args)\n                        sys.stdout.flush()\n\n            ## progress bar while blocking parallel\n            if ipyclient:\n                tots = df.Accession.shape[0]\n                printstr = \" Downloading fastq files | {} | \"\n                start = time.time()\n                while 1:\n                    elapsed = datetime.timedelta(seconds=int(time.time()-start))\n                    ready = sum([i.ready() for i in asyncs])\n                    progressbar(tots, ready, 
printstr.format(elapsed), spacer=\"\")\n                    time.sleep(0.1)\n                    if tots == ready:\n                        print(\"\")\n                        break\n                self._report(tots)\n\n                ## check for fails\n                for async in asyncs:\n                    if not async.successful():\n                        raise IPyradWarningExit(async.result())", "response": "Download the accessions into the designated workdir, optionally in parallel through an ipyparallel client. Existing files are overwritten only if force=True."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef fetch_runinfo(self, fields=None, quiet=False):\n        if not quiet:\n            print(\"\\rFetching project data...\", end=\"\")\n\n        ## if no entry then fetch (nearly) all fields.\n        if fields == None:  \n            fields = range(30)\n        fields = fields_checker(fields)\n\n        ## command strings\n        es_cmd = [\n            \"esearch\", \n            \"-db\", \"sra\", \n            \"-query\", self.accession,\n        ]\n\n        ef_cmd = [\n            \"efetch\", \n            \"--format\", \"runinfo\",\n        ]\n\n        cut_cmd = [\n            \"cut\", \n            \"-d\", \",\", \n            \"-f\", \",\".join(fields),\n        ]\n\n        ## pipe commands together\n        proc1 = sps.Popen(es_cmd, stderr=sps.STDOUT, stdout=sps.PIPE)\n        proc2 = sps.Popen(ef_cmd, stdin=proc1.stdout, stderr=sps.STDOUT, stdout=sps.PIPE)\n        proc3 = sps.Popen(cut_cmd, stdin=proc2.stdout, stderr=sps.STDOUT, stdout=sps.PIPE)\n        o, e = proc3.communicate()\n        proc2.stdout.close()\n        proc1.stdout.close()\n        \n        if o:\n            vals = o.strip().split(\"\\n\")\n            names = vals[0].split(\",\")\n            items = [i.split(\",\") for i in vals[1:] if i not in [\"\", vals[0]]]\n            return pd.DataFrame(items, columns=names)\n        else:\n            raise IPyradWarningExit(\"no samples found in {}\".format(self.accession))", "response": "Fetch the runinfo for the specified fields from the specified project."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef Async(cls, token, session=None, **options):\n        return cls(token, session=session, is_async=True, **options)", "response": "Returns the client in async mode."}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\ngets the CR Constants for a specific resource", "response": "def get_constants(self, **params: keys):\n        \"\"\"Get the CR Constants\n\n        Parameters\n        ----------\n        \\*\\*keys: Optional[list] = None\n            Filter which keys should be included in the\n            response\n        \\*\\*exclude: Optional[list] = None\n            Filter which keys should be excluded from the\n            response\n        \\*\\*timeout: Optional[int] = None\n            Custom timeout that overwrites Client.timeout\n        \"\"\"\n        url = self.api.CONSTANTS\n        return self._get_model(url, **params)"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef get_player(self, *tags: crtag, **params: keys):\n        url = self.api.PLAYER + '/' + ','.join(tags)\n        return self._get_model(url, FullPlayer, **params)", "response": "Get information about one or more players by tag."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef get_player_verify(self, tag: crtag, apikey: str, **params: keys):\n        url = self.api.PLAYER + '/' + tag + '/verify'\n        params.update({'token': apikey})\n        return self._get_model(url, FullPlayer, **params)", "response": "Check the API key of a player. This endpoint is restricted to certain members of the community."}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\ngets a player's battle log.", "response": "def get_player_battles(self, *tags: crtag, **params: keys):\n        \"\"\"Get a player's battle log\n\n        Parameters\n        ----------\n        \\*tags: str\n            Valid player tags. Minimum length: 3\n            Valid characters: 0289PYLQGRJCUV\n        \\*\\*keys: Optional[list] = None\n            Filter which keys should be included in the\n            response\n        \\*\\*exclude: Optional[list] = None\n            Filter which keys should be excluded from the\n            response\n        \\*\\*max: Optional[int] = None\n            Limit the number of items returned in the response\n        \\*\\*page: Optional[int] = None\n            Works with max, the zero-based page of the\n            items\n        \\*\\*timeout: Optional[int] = None\n            Custom timeout that overwrites Client.timeout\n        \"\"\"\n        url = self.api.PLAYER + '/' + ','.join(tags) + '/battles'\n        return self._get_model(url, **params)"}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nget a clan information for a set of tags", "response": "def get_clan(self, *tags: crtag, **params: keys):\n        \"\"\"Get a clan information\n\n        Parameters\n        ----------\n        \\*tags: str\n            Valid clan tags. Minimum length: 3\n            Valid characters: 0289PYLQGRJCUV\n        \\*\\*keys: Optional[list] = None\n            Filter which keys should be included in the\n            response\n        \\*\\*exclude: Optional[list] = None\n            Filter which keys should be excluded from the\n            response\n        \\*\\*timeout: Optional[int] = None\n            Custom timeout that overwrites Client.timeout\n        \"\"\"\n        url = self.api.CLAN + '/' + ','.join(tags)\n        return self._get_model(url, FullClan, **params)"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\ngetting a list of clans that are being tracked by having either cr-api.com or royaleapi.com in the description.", "response": "def get_tracking_clans(self, **params: keys):\n        \"\"\"Get a list of clans that are being\n        tracked by having either cr-api.com or\n        royaleapi.com in the description\n\n        Parameters\n        ----------\n        \\*\\*keys: Optional[list] = None\n            Filter which keys should be included in the\n            response\n        \\*\\*exclude: Optional[list] = None\n            Filter which keys should be excluded from the\n            response\n        \\*\\*max: Optional[int] = None\n            Limit the number of items returned in the response\n        \\*\\*page: Optional[int] = None\n            Works with max, the zero-based page of the\n            items\n        \\*\\*timeout: Optional[int] = None\n            Custom timeout that overwrites Client.timeout\n        \"\"\"\n        url = self.api.CLAN + '/tracking'\n        return self._get_model(url, **params)"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef get_clan_tracking(self, *tags: crtag, **params: keys):\n        url = self.api.CLAN + '/' + ','.join(tags) + '/tracking'\n        return self._get_model(url, **params)", "response": "Returns True if the clan is currently being tracked by the API by having either cr-api.com or royaleapi.com in the clan description."}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nget information about a clan's current clan war", "response": "def get_clan_war(self, tag: crtag, **params: keys):\n        \"\"\"Get information about a clan's current clan war\n\n        Parameters\n        ----------\n        tag: str\n            A valid clan tag. Minimum length: 3\n            Valid characters: 0289PYLQGRJCUV\n        \\*\\*keys: Optional[list] = None\n            Filter which keys should be included in the\n            response\n        \\*\\*exclude: Optional[list] = None\n            Filter which keys should be excluded from the\n            response\n        \\*\\*timeout: Optional[int] = None\n            Custom timeout that overwrites Client.timeout\n        \"\"\"\n        url = self.api.CLAN + '/' + tag + '/war'\n        return self._get_model(url, **params)"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef get_tournament(self, tag: crtag, **params: keys):\n        url = self.api.TOURNAMENT + '/' + tag\n        return self._get_model(url, **params)", "response": "Get information about a tournament by its tag."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef search_tournaments(self, **params: keys):\n        url = self.api.TOURNAMENT + '/search'\n        return self._get_model(url, PartialClan, **params)", "response": "Search for tournaments."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\ngetting a list of top clans by war", "response": "def get_top_war_clans(self, country_key='', **params: keys):\n        \"\"\"Get a list of top clans by war\n\n        location_id: Optional[str] = ''\n            A location ID or '' (global)\n            See https://github.com/RoyaleAPI/cr-api-data/blob/master/json/regions.json\n            for a list of acceptable location IDs\n        \\*\\*keys: Optional[list] = None\n            Filter which keys should be included in the\n            response\n        \\*\\*exclude: Optional[list] = None\n            Filter which keys should be excluded from the\n            response\n        \\*\\*max: Optional[int] = None\n            Limit the number of items returned in the response\n        \\*\\*page: Optional[int] = None\n            Works with max, the zero-based page of the\n            items\n        \\*\\*timeout: Optional[int] = None\n            Custom timeout that overwrites Client.timeout\n        \"\"\"\n        url = self.api.TOP + '/war/' + str(country_key)\n        return self._get_model(url, PartialClan, **params)"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef get_top_players(self, country_key='', **params: keys):\n        url = self.api.TOP + '/players/' + str(country_key)\n        return self._get_model(url, PartialPlayerClan, **params)", "response": "Get a list of top players in a given country"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\nget a list of popular clans for a specific key.", "response": "def get_popular_clans(self, **params: keys):\n        \"\"\"Get a list of most queried clans\n\n        \\*\\*keys: Optional[list] = None\n            Filter which keys should be included in the\n            response\n        \\*\\*exclude: Optional[list] = None\n            Filter which keys should be excluded from the\n            response\n        \\*\\*max: Optional[int] = None\n            Limit the number of items returned in the response\n        \\*\\*page: Optional[int] = None\n            Works with max, the zero-based page of the\n            items\n        \\*\\*timeout: Optional[int] = None\n            Custom timeout that overwrites Client.timeout\n        \"\"\"\n        url = self.api.POPULAR + '/clans'\n        return self._get_model(url, PartialClan, **params)"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef get_popular_players(self, **params: keys):\n        url = self.api.POPULAR + '/players'\n        return self._get_model(url, PartialPlayerClan, **params)", "response": "Get a list of the most queried players."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\ngetting a list of popular tournaments for a specific tournament.", "response": "def get_popular_tournaments(self, **params: keys):\n        \"\"\"Get a list of most queried tournaments\n\n        \\*\\*keys: Optional[list] = None\n            Filter which keys should be included in the\n            response\n        \\*\\*exclude: Optional[list] = None\n            Filter which keys should be excluded from the\n            response\n        \\*\\*max: Optional[int] = None\n            Limit the number of items returned in the response\n        \\*\\*page: Optional[int] = None\n            Works with max, the zero-based page of the\n            items\n        \\*\\*timeout: Optional[int] = None\n            Custom timeout that overwrites Client.timeout\n        \"\"\"\n        url = self.api.POPULAR + '/tournament'\n        return self._get_model(url, PartialTournament, **params)"}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\ngets a list of most queried decks for a specific locale", "response": "def get_popular_decks(self, **params: keys):\n        \"\"\"Get a list of most queried decks\n\n        \\*\\*keys: Optional[list] = None\n            Filter which keys should be included in the\n            response\n        \\*\\*exclude: Optional[list] = None\n            Filter which keys should be excluded from the\n            response\n        \\*\\*max: Optional[int] = None\n            Limit the number of items returned in the response\n        \\*\\*page: Optional[int] = None\n            Works with max, the zero-based page of the\n            items\n        \\*\\*timeout: Optional[int] = None\n            Custom timeout that overwrites Client.timeout\n        \"\"\"\n        url = self.api.POPULAR + '/decks'\n        return self._get_model(url, **params)"}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\ngetting a list of known tournaments for a given tournament.", "response": "def get_known_tournaments(self, **params: tournamentfilter):\n        \"\"\"Get a list of queried tournaments\n\n        \\*\\*keys: Optional[list] = None\n            Filter which keys should be included in the\n            response\n        \\*\\*exclude: Optional[list] = None\n            Filter which keys should be excluded from the\n            response\n        \\*\\*max: Optional[int] = None\n            Limit the number of items returned in the response\n        \\*\\*page: Optional[int] = None\n            Works with max, the zero-based page of the\n            items\n        \\*\\*timeout: Optional[int] = None\n            Custom timeout that overwrites Client.timeout\n        \"\"\"\n        url = self.api.TOURNAMENT + '/known'\n        return self._get_model(url, PartialTournament, **params)"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\ngetting information about a player with the specified player tag.", "response": "def get_player(self, tag: crtag, timeout=None):\n        \"\"\"Get information about a player\n\n        Parameters\n        ----------\n        tag: str\n            A valid player tag. Minimum length: 3\n            Valid characters: 0289PYLQGRJCUV\n        timeout: Optional[int] = None\n            Custom timeout that overwrites Client.timeout\n        \"\"\"\n        url = self.api.PLAYER + '/' + tag\n        return self._get_model(url, FullPlayer, timeout=timeout)"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\ncheck the API Key of a player. This endpoint has been **restricted** to certain members of the community, and BadRequest is raised if the apikey is invalid. Parameters: tag (str), a valid player tag (minimum length 3, valid characters 0289PYLQGRJCUV); apikey (str), the API Key in the player's settings; timeout (Optional[int], default None), a custom timeout that overwrites Client.timeout.", "response": "def get_player_verify(self, tag: crtag, apikey: str, timeout=None):\n        \"\"\"Check the API Key of a player.\n        This endpoint has been **restricted** to\n        certain members of the community\n\n        Raises BadRequest if the apikey is invalid\n\n        Parameters\n        ----------\n        tag: str\n            A valid player tag. Minimum length: 3\n            Valid characters: 0289PYLQGRJCUV\n        apikey: str\n            The API Key in the player's settings\n        timeout: Optional[int] = None\n            Custom timeout that overwrites Client.timeout\n        \"\"\"\n        url = self.api.PLAYER + '/' + tag + '/verifytoken'\n        return self._get_model(url, FullPlayer, timeout=timeout, method='POST', json={'token': apikey})"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\ngetting a player's battle log.", "response": "def get_player_battles(self, tag: crtag, **params: keys):\n        \"\"\"Get a player's battle log\n\n        Parameters\n        ----------\n        tag: str\n            A valid player tag. Minimum length: 3\n            Valid characters: 0289PYLQGRJCUV\n        \\*\\*limit: Optional[int] = None\n            Limit the number of items returned in the response\n        \\*\\*timeout: Optional[int] = None\n            Custom timeout that overwrites Client.timeout\n        \"\"\"\n        url = self.api.PLAYER + '/' + tag + '/battlelog'\n        return self._get_model(url, **params)"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef get_player_chests(self, tag: crtag, timeout: int=None):\n        url = self.api.PLAYER + '/' + tag + '/upcomingchests'\n        return self._get_model(url, timeout=timeout)", "response": "Get information about a player's upcoming chests."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef get_clan(self, tag: crtag, timeout: int=None):\n        url = self.api.CLAN + '/' + tag\n        return self._get_model(url, FullClan, timeout=timeout)", "response": "Get information about a clan."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef search_clans(self, **params: clansearch):\n        url = self.api.CLAN\n        return self._get_model(url, PartialClan, **params)", "response": "Search for a clan."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\ngetting information about a clan's current clan war", "response": "def get_clan_war(self, tag: crtag, timeout: int=None):\n        \"\"\"Get information about a clan's current clan war\n\n        Parameters\n        ----------\n        tag: str\n            A valid clan tag. Minimum length: 3\n            Valid characters: 0289PYLQGRJCUV\n        timeout: Optional[int] = None\n            Custom timeout that overwrites Client.timeout\n        \"\"\"\n        url = self.api.CLAN + '/' + tag + '/currentwar'\n        return self._get_model(url, timeout=timeout)"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef get_tournament(self, tag: crtag, timeout=0):\n        url = self.api.TOURNAMENT + '/' + tag\n        return self._get_model(url, PartialTournament, timeout=timeout)", "response": "Get information about a tournament by its tag."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef search_tournaments(self, name: str, **params: keys):\n        url = self.api.TOURNAMENT\n        params['name'] = name\n        return self._get_model(url, PartialTournament, **params)", "response": "Search for a tournament by its name."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef get_all_cards(self, timeout: int=None):\n        url = self.api.CARDS\n        return self._get_model(url, timeout=timeout)", "response": "Get a list of all the cards in the game."}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\ngets a list of all locations.", "response": "def get_all_locations(self, timeout: int=None):\n        \"\"\"Get a list of all locations\n\n        Parameters\n        ----------\n        timeout: Optional[int] = None\n            Custom timeout that overwrites Client.timeout\n        \"\"\"\n        url = self.api.LOCATIONS\n        return self._get_model(url, timeout=timeout)"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef get_location(self, location_id: int, timeout: int=None):\n        url = self.api.LOCATIONS + '/' + str(location_id)\n        return self._get_model(url, timeout=timeout)", "response": "Get information about a location by its location ID."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef get_top_clans(self, location_id='global', **params: keys):\n        url = self.api.LOCATIONS + '/' + str(location_id) + '/rankings/clans'\n        return self._get_model(url, PartialClan, **params)", "response": "Get a list of top clans by trophy for a location (global by default)."}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\ngets a list of top players in a location.", "response": "def get_top_players(self, location_id='global', **params: keys):\n        \"\"\"Get a list of top players\n\n        Parameters\n        ----------\n        location_id: Optional[str] = 'global'\n            A location ID or global\n            See https://github.com/RoyaleAPI/cr-api-data/blob/master/json/regions.json\n            for a list of acceptable location IDs\n        \\*\\*limit: Optional[int] = None\n            Limit the number of items returned in the response\n        \\*\\*timeout: Optional[int] = None\n            Custom timeout that overwrites Client.timeout\n        \"\"\"\n        url = self.api.LOCATIONS + '/' + str(location_id) + '/rankings/players'\n        return self._get_model(url, PartialPlayerClan, **params)"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef get_clan_image(self, obj: BaseAttrDict):\n\n        try:\n            badge_id = obj.clan.badge_id\n        except AttributeError:\n            try:\n                badge_id = obj.badge_id\n            except AttributeError:\n                return 'https://i.imgur.com/Y3uXsgj.png'\n\n        if badge_id is None:\n            return 'https://i.imgur.com/Y3uXsgj.png'\n\n        for i in self.constants.alliance_badges:\n            if i.id == badge_id:\n                return 'https://royaleapi.github.io/cr-api-assets/badges/' + i.name + '.png'", "response": "Get the clan badge image URL"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef get_arena_image(self, obj: BaseAttrDict):\n        badge_id = obj.arena.id\n        for i in self.constants.arenas:\n            if i.id == badge_id:\n                return 'https://royaleapi.github.io/cr-api-assets/arenas/arena{}.png'.format(i.arena_id)", "response": "Get the arena image URL for the given object."}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nreturns the card info from constants. cards or None if no card info is found.", "response": "def get_card_info(self, card_name: str):\n        \"\"\"Returns card info from constants\n\n        Parameters\n        ---------\n        card_name: str\n            A card name\n\n        Returns None or Constants\n        \"\"\"\n        for c in self.constants.cards:\n            if c.name == card_name:\n                return c"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef get_rarity_info(self, rarity: str):\n        for c in self.constants.rarities:\n            if c.name == rarity:\n                return c", "response": "Returns rarity info from constants.rarities, or None if no rarity matches."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef get_deck_link(self, deck: BaseAttrDict):\n        deck_link = 'https://link.clashroyale.com/deck/en?deck='\n\n        for i in deck:\n            card = self.get_card_info(i.name)\n            deck_link += '{0.id};'.format(card)\n\n        return deck_link", "response": "Form a deck link"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef get_datetime(self, timestamp: str, unix=True):\n        time = datetime.strptime(timestamp, '%Y%m%dT%H%M%S.%fZ')\n        if unix:\n            return int(time.timestamp())\n        else:\n            return time", "response": "Converts a timestamp in %Y%m%dT%H%M%S.%fZ format to a UNIX timestamp or a datetime.datetime object."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef get_clan(self):\n        try:\n            return self.client.get_clan(self.clan.tag)\n        except AttributeError:\n            try:\n                return self.client.get_clan(self.tag)\n            except AttributeError:\n                raise ValueError('This player does not have a clan.')", "response": "Return the clan for this object using the client, looking up self.clan.tag first and falling back to self.tag. Raises ValueError if the player does not have a clan."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef coerce_annotation(ann, namespace):\n    '''Validate that the annotation has the correct namespace,\n    and is well-formed.\n\n    If the annotation is not of the correct namespace, automatic conversion\n    is attempted.\n\n    Parameters\n    ----------\n    ann : jams.Annotation\n        The annotation object in question\n\n    namespace : str\n        The namespace pattern to match `ann` against\n\n    Returns\n    -------\n    ann_coerced: jams.Annotation\n        The annotation coerced to the target namespace\n\n    Raises\n    ------\n    NamespaceError\n        If `ann` does not match the proper namespace\n\n    SchemaError\n        If `ann` fails schema validation\n\n    See Also\n    --------\n    jams.nsconvert.convert\n    '''\n\n    ann = convert(ann, namespace)\n    ann.validate(strict=True)\n\n    return ann", "response": "Validate that the annotation is of the correct namespace and is well-formed."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef beat(ref, est, **kwargs):\n    r'''Beat tracking evaluation\n\n    Parameters\n    ----------\n    ref : jams.Annotation\n        Reference annotation object\n    est : jams.Annotation\n        Estimated annotation object\n    kwargs\n        Additional keyword arguments\n\n    Returns\n    -------\n    scores : dict\n        Dictionary of scores, where the key is the metric name (str) and\n        the value is the (float) score achieved.\n\n    See Also\n    --------\n    mir_eval.beat.evaluate\n\n    Examples\n    --------\n    >>> # Load in the JAMS objects\n    >>> ref_jam = jams.load('reference.jams')\n    >>> est_jam = jams.load('estimated.jams')\n    >>> # Select the first relevant annotations\n    >>> ref_ann = ref_jam.search(namespace='beat')[0]\n    >>> est_ann = est_jam.search(namespace='beat')[0]\n    >>> scores = jams.eval.beat(ref_ann, est_ann)\n    '''\n\n    namespace = 'beat'\n    ref = coerce_annotation(ref, namespace)\n    est = coerce_annotation(est, namespace)\n\n    ref_times, _ = ref.to_event_values()\n    est_times, _ = est.to_event_values()\n\n    return mir_eval.beat.evaluate(ref_times, est_times, **kwargs)", "response": "Beat tracking evaluation of an estimated annotation against a reference annotation."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef chord(ref, est, **kwargs):\n    r'''Chord evaluation\n\n    Parameters\n    ----------\n    ref : jams.Annotation\n        Reference annotation object\n    est : jams.Annotation\n        Estimated annotation object\n    kwargs\n        Additional keyword arguments\n\n    Returns\n    -------\n    scores : dict\n        Dictionary of scores, where the key is the metric name (str) and\n        the value is the (float) score achieved.\n\n    See Also\n    --------\n    mir_eval.chord.evaluate\n\n    Examples\n    --------\n    >>> # Load in the JAMS objects\n    >>> ref_jam = jams.load('reference.jams')\n    >>> est_jam = jams.load('estimated.jams')\n    >>> # Select the first relevant annotations\n    >>> ref_ann = ref_jam.search(namespace='chord')[0]\n    >>> est_ann = est_jam.search(namespace='chord')[0]\n    >>> scores = jams.eval.chord(ref_ann, est_ann)\n    '''\n\n    namespace = 'chord'\n    ref = coerce_annotation(ref, namespace)\n    est = coerce_annotation(est, namespace)\n    ref_interval, ref_value = ref.to_interval_values()\n    est_interval, est_value = est.to_interval_values()\n\n    return mir_eval.chord.evaluate(ref_interval, ref_value,\n                                   est_interval, est_value, **kwargs)", "response": "Chord evaluation of an estimated annotation against a reference annotation."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef hierarchy_flatten(annotation):\n    '''Flatten a multi_segment annotation into mir_eval style.\n\n    Parameters\n    ----------\n    annotation : jams.Annotation\n        An annotation in the `multi_segment` namespace\n\n    Returns\n    -------\n    hier_intervalss : list\n        A list of lists of intervals, ordered by increasing specificity.\n\n    hier_labels : list\n        A list of lists of labels, ordered by increasing specificity.\n    '''\n\n    intervals, values = annotation.to_interval_values()\n\n    ordering = dict()\n\n    for interval, value in zip(intervals, values):\n        level = value['level']\n        if level not in ordering:\n            ordering[level] = dict(intervals=list(), labels=list())\n\n        ordering[level]['intervals'].append(interval)\n        ordering[level]['labels'].append(value['label'])\n\n    levels = sorted(list(ordering.keys()))\n    hier_intervals = [ordering[level]['intervals'] for level in levels]\n    hier_labels = [ordering[level]['labels'] for level in levels]\n\n    return hier_intervals, hier_labels", "response": "Flatten a multi_segment annotation into mir_eval style."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef hierarchy(ref, est, **kwargs):\n    r'''Multi-level segmentation evaluation\n\n    Parameters\n    ----------\n    ref : jams.Annotation\n        Reference annotation object\n    est : jams.Annotation\n        Estimated annotation object\n    kwargs\n        Additional keyword arguments\n\n    Returns\n    -------\n    scores : dict\n        Dictionary of scores, where the key is the metric name (str) and\n        the value is the (float) score achieved.\n\n    See Also\n    --------\n    mir_eval.hierarchy.evaluate\n\n    Examples\n    --------\n    >>> # Load in the JAMS objects\n    >>> ref_jam = jams.load('reference.jams')\n    >>> est_jam = jams.load('estimated.jams')\n    >>> # Select the first relevant annotations\n    >>> ref_ann = ref_jam.search(namespace='multi_segment')[0]\n    >>> est_ann = est_jam.search(namespace='multi_segment')[0]\n    >>> scores = jams.eval.hierarchy(ref_ann, est_ann)\n    '''\n    namespace = 'multi_segment'\n    ref = coerce_annotation(ref, namespace)\n    est = coerce_annotation(est, namespace)\n    ref_hier, ref_hier_lab = hierarchy_flatten(ref)\n    est_hier, est_hier_lab = hierarchy_flatten(est)\n\n    return mir_eval.hierarchy.evaluate(ref_hier, ref_hier_lab,\n                                       est_hier, est_hier_lab,\n                                       **kwargs)", "response": "Multi-level segmentation evaluation"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef melody(ref, est, **kwargs):\n    r'''Melody extraction evaluation\n\n    Parameters\n    ----------\n    ref : jams.Annotation\n        Reference annotation object\n    est : jams.Annotation\n        Estimated annotation object\n    kwargs\n        Additional keyword arguments\n\n    Returns\n    -------\n    scores : dict\n        Dictionary of scores, where the key is the metric name (str) and\n        the value is the (float) score achieved.\n\n    See Also\n    --------\n    mir_eval.melody.evaluate\n\n    Examples\n    --------\n    >>> # Load in the JAMS objects\n    >>> ref_jam = jams.load('reference.jams')\n    >>> est_jam = jams.load('estimated.jams')\n    >>> # Select the first relevant annotations\n    >>> ref_ann = ref_jam.search(namespace='pitch_contour')[0]\n    >>> est_ann = est_jam.search(namespace='pitch_contour')[0]\n    >>> scores = jams.eval.melody(ref_ann, est_ann)\n    '''\n\n    namespace = 'pitch_contour'\n    ref = coerce_annotation(ref, namespace)\n    est = coerce_annotation(est, namespace)\n\n    ref_times, ref_p = ref.to_event_values()\n    est_times, est_p = est.to_event_values()\n\n    ref_freq = np.asarray([p['frequency'] * (-1)**(~p['voiced']) for p in ref_p])\n    est_freq = np.asarray([p['frequency'] * (-1)**(~p['voiced']) for p in est_p])\n\n    return mir_eval.melody.evaluate(ref_times, ref_freq,\n                                    est_times, est_freq,\n                                    **kwargs)", "response": "Melody extraction evaluation of an estimated annotation against a reference annotation."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef pattern_to_mireval(ann):\n    '''Convert a pattern_jku annotation object to mir_eval format.\n\n    Parameters\n    ----------\n    ann : jams.Annotation\n        Must have `namespace='pattern_jku'`\n\n    Returns\n    -------\n    patterns : list of list of tuples\n        - `patterns[x]` is a list containing all occurrences of pattern x\n\n        - `patterns[x][y]` is a list containing all notes for\n           occurrence y of pattern x\n\n        - `patterns[x][y][z]` contains a time-note tuple\n          `(time, midi note)`\n    '''\n\n    # It's easier to work with dictionaries, since we can't assume\n    # sequential pattern or occurrence identifiers\n\n    patterns = defaultdict(lambda: defaultdict(list))\n\n    # Iterate over the data in interval-value format\n\n    for time, observation in zip(*ann.to_event_values()):\n\n        pattern_id = observation['pattern_id']\n        occurrence_id = observation['occurrence_id']\n        obs = (time, observation['midi_pitch'])\n\n        # Push this note observation into the correct pattern/occurrence\n        patterns[pattern_id][occurrence_id].append(obs)\n\n    # Convert to list-list-tuple format for mir_eval\n    return [list(_.values()) for _ in six.itervalues(patterns)]", "response": "Convert a pattern_jku annotation object to mir_eval format."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef pattern(ref, est, **kwargs):\n    r'''Pattern detection evaluation\n\n    Parameters\n    ----------\n    ref : jams.Annotation\n        Reference annotation object\n    est : jams.Annotation\n        Estimated annotation object\n    kwargs\n        Additional keyword arguments\n\n    Returns\n    -------\n    scores : dict\n        Dictionary of scores, where the key is the metric name (str) and\n        the value is the (float) score achieved.\n\n    See Also\n    --------\n    mir_eval.pattern.evaluate\n\n    Examples\n    --------\n    >>> # Load in the JAMS objects\n    >>> ref_jam = jams.load('reference.jams')\n    >>> est_jam = jams.load('estimated.jams')\n    >>> # Select the first relevant annotations\n    >>> ref_ann = ref_jam.search(namespace='pattern_jku')[0]\n    >>> est_ann = est_jam.search(namespace='pattern_jku')[0]\n    >>> scores = jams.eval.pattern(ref_ann, est_ann)\n    '''\n\n    namespace = 'pattern_jku'\n    ref = coerce_annotation(ref, namespace)\n    est = coerce_annotation(est, namespace)\n\n    ref_patterns = pattern_to_mireval(ref)\n    est_patterns = pattern_to_mireval(est)\n\n    return mir_eval.pattern.evaluate(ref_patterns, est_patterns, **kwargs)", "response": "r'''Pattern detection evaluation\n\n    Parameters\n    ----------\n    ref : jams.Annotation\n        Reference annotation object\n    est : jams.Annotation\n        Estimated annotation object\n    kwargs\n        Additional keyword arguments\n\n    Returns\n    -------\n    scores : dict\n        Dictionary of scores, where the key is the metric name (str) and\n        the value is the (float) score achieved.\n\n    See Also\n    --------\n    mir_eval.pattern.evaluate\n\n    Examples\n    --------\n    >>> # Load in the JAMS objects\n    >>> ref_jam = jams.load('reference.jams')\n    >>> est_jam = jams.load('estimated.jams')\n    >>> # Select the first relevant 
annotations\n    >>> ref_ann = ref_jam.search(namespace='pattern_jku')[0]\n    >>> est_ann = est_jam.search(namespace='pattern_jku')[0]\n    >>> scores = jams.eval.pattern(ref_ann, est_ann)"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef transcription(ref, est, **kwargs):\n    r'''Note transcription evaluation\n\n    Parameters\n    ----------\n    ref : jams.Annotation\n        Reference annotation object\n    est : jams.Annotation\n        Estimated annotation object\n    kwargs\n        Additional keyword arguments\n\n    Returns\n    -------\n    scores : dict\n        Dictionary of scores, where the key is the metric name (str) and\n        the value is the (float) score achieved.\n\n    See Also\n    --------\n    mir_eval.transcription.evaluate\n\n    Examples\n    --------\n    >>> # Load in the JAMS objects\n    >>> ref_jam = jams.load('reference.jams')\n    >>> est_jam = jams.load('estimated.jams')\n    >>> # Select the first relevant annotations. You can use any annotation\n    >>> # type that can be converted to pitch_contour (such as pitch_midi)\n    >>> ref_ann = ref_jam.search(namespace='pitch_contour')[0]\n    >>> est_ann = est_jam.search(namespace='note_hz')[0]\n    >>> scores = jams.eval.transcription(ref_ann, est_ann)\n    '''\n\n    namespace = 'pitch_contour'\n    ref = coerce_annotation(ref, namespace)\n    est = coerce_annotation(est, namespace)\n    ref_intervals, ref_p = ref.to_interval_values()\n    est_intervals, est_p = est.to_interval_values()\n\n    ref_pitches = np.asarray([p['frequency'] * (-1)**(~p['voiced']) for p in ref_p])\n    est_pitches = np.asarray([p['frequency'] * (-1)**(~p['voiced']) for p in est_p])\n\n    return mir_eval.transcription.evaluate(\n        ref_intervals, ref_pitches, est_intervals, est_pitches, **kwargs)", "response": "Evaluate note transcription by comparing reference and estimated annotations with mir_eval.transcription.evaluate."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\nadding a namespace definition to our working set.", "response": "def add_namespace(filename):\n    '''Add a namespace definition to our working set.\n\n    Namespace files consist of partial JSON schemas defining the behavior\n    of the `value` and `confidence` fields of an Annotation.\n\n    Parameters\n    ----------\n    filename : str\n        Path to json file defining the namespace object\n    '''\n    with open(filename, mode='r') as fileobj:\n        __NAMESPACE__.update(json.load(fileobj))"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef namespace(ns_key):\n    '''Construct a validation schema for a given namespace.\n\n    Parameters\n    ----------\n    ns_key : str\n        Namespace key identifier (eg, 'beat' or 'segment_tut')\n\n    Returns\n    -------\n    schema : dict\n        JSON schema of `namespace`\n    '''\n\n    if ns_key not in __NAMESPACE__:\n        raise NamespaceError('Unknown namespace: {:s}'.format(ns_key))\n\n    sch = copy.deepcopy(JAMS_SCHEMA['definitions']['SparseObservation'])\n\n    for key in ['value', 'confidence']:\n        try:\n            sch['properties'][key] = __NAMESPACE__[ns_key][key]\n        except KeyError:\n            pass\n\n    return sch", "response": "Construct a validation schema for a given namespace."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef namespace_array(ns_key):\n    '''Construct a validation schema for arrays of a given namespace.\n\n    Parameters\n    ----------\n    ns_key : str\n        Namespace key identifier\n\n    Returns\n    -------\n    schema : dict\n        JSON schema of `namespace` observation arrays\n    '''\n\n    obs_sch = namespace(ns_key)\n    obs_sch['title'] = 'Observation'\n\n    sch = copy.deepcopy(JAMS_SCHEMA['definitions']['SparseObservationList'])\n    sch['items'] = obs_sch\n    return sch", "response": "Construct a validation schema for arrays of a given namespace."}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nreturns the allowed values for an enumerated namespace.", "response": "def values(ns_key):\n    '''Return the allowed values for an enumerated namespace.\n\n    Parameters\n    ----------\n    ns_key : str\n        Namespace key identifier\n\n    Returns\n    -------\n    values : list\n\n    Raises\n    ------\n    NamespaceError\n        If `ns_key` is not found, or does not have enumerated values\n\n    Examples\n    --------\n    >>> jams.schema.values('tag_gtzan')\n    ['blues', 'classical', 'country', 'disco', 'hip-hop', 'jazz',\n     'metal', 'pop', 'reggae', 'rock']\n    '''\n\n    if ns_key not in __NAMESPACE__:\n        raise NamespaceError('Unknown namespace: {:s}'.format(ns_key))\n\n    if 'enum' not in __NAMESPACE__[ns_key]['value']:\n        raise NamespaceError('Namespace {:s} is not enumerated'.format(ns_key))\n\n    return copy.copy(__NAMESPACE__[ns_key]['value']['enum'])"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef get_dtypes(ns_key):\n    '''Get the dtypes associated with the value and confidence fields\n    for a given namespace.\n\n    Parameters\n    ----------\n    ns_key : str\n        The namespace key in question\n\n    Returns\n    -------\n    value_dtype, confidence_dtype : numpy.dtype\n        Type identifiers for value and confidence fields.\n    '''\n\n    # First, get the schema\n    if ns_key not in __NAMESPACE__:\n        raise NamespaceError('Unknown namespace: {:s}'.format(ns_key))\n\n    value_dtype = __get_dtype(__NAMESPACE__[ns_key].get('value', {}))\n    confidence_dtype = __get_dtype(__NAMESPACE__[ns_key].get('confidence', {}))\n\n    return value_dtype, confidence_dtype", "response": "Get the dtypes associated with the value and confidence fields for a given namespace."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef list_namespaces():\n    '''Print out a listing of available namespaces'''\n    print('{:30s}\\t{:40s}'.format('NAME', 'DESCRIPTION'))\n    print('-' * 78)\n    for sch in sorted(__NAMESPACE__):\n        desc = __NAMESPACE__[sch]['description']\n        desc = (desc[:44] + '..') if len(desc) > 46 else desc\n        print('{:30s}\\t{:40s}'.format(sch, desc))", "response": "Print out a listing of available namespaces"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef __get_dtype(typespec):\n    '''Get the dtype associated with a jsonschema type definition\n\n    Parameters\n    ----------\n    typespec : dict\n        The schema definition\n\n    Returns\n    -------\n    dtype : numpy.dtype\n        The associated dtype\n    '''\n\n    if 'type' in typespec:\n        return __TYPE_MAP__.get(typespec['type'], np.object_)\n\n    elif 'enum' in typespec:\n        # Enums map to objects\n        return np.object_\n\n    elif 'oneOf' in typespec:\n        # Recurse\n        types = [__get_dtype(v) for v in typespec['oneOf']]\n\n        # If they're not all equal, return object\n        if all([t == types[0] for t in types]):\n            return types[0]\n\n    return np.object_", "response": "Get the numpy dtype associated with a jsonschema type definition."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nloading the schema file from the package.", "response": "def __load_jams_schema():\n    '''Load the schema file from the package.'''\n\n    schema_file = os.path.join(SCHEMA_DIR, 'jams_schema.json')\n\n    jams_schema = None\n    with open(resource_filename(__name__, schema_file), mode='r') as fdesc:\n        jams_schema = json.load(fdesc)\n\n    if jams_schema is None:\n        raise JamsError('Unable to load JAMS schema')\n\n    return jams_schema"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef import_lab(namespace, filename, infer_duration=True, **parse_options):\n    r'''Load a .lab file as an Annotation object.\n\n    .lab files are assumed to have the following format:\n\n        ``TIME_START\\tTIME_END\\tANNOTATION``\n\n    By default, .lab files are assumed to have columns separated by one\n    or more white-space characters, and have no header or index column\n    information.\n\n    If the .lab file contains only two columns, then an empty duration\n    field is inferred.\n\n    If the .lab file contains more than three columns, each row's\n    annotation value is assigned the contents of last non-empty column.\n\n\n    Parameters\n    ----------\n    namespace : str\n        The namespace for the new annotation\n\n    filename : str\n        Path to the .lab file\n\n    infer_duration : bool\n        If `True`, interval durations are inferred from `(start, end)` columns,\n        or difference between successive times.\n\n        If `False`, interval durations are assumed to be explicitly coded as\n        `(start, duration)` columns.  
If only one time column is given, then\n        durations are set to 0.\n\n        For instantaneous event annotations (e.g., beats or onsets), this\n        should be set to `False`.\n\n    parse_options : additional keyword arguments\n        Passed to ``pandas.DataFrame.read_csv``\n\n    Returns\n    -------\n    annotation : Annotation\n        The newly constructed annotation object\n\n    See Also\n    --------\n    pandas.DataFrame.read_csv\n    '''\n\n    # Create a new annotation object\n    annotation = core.Annotation(namespace)\n\n    parse_options.setdefault('sep', r'\\s+')\n    parse_options.setdefault('engine', 'python')\n    parse_options.setdefault('header', None)\n    parse_options.setdefault('index_col', False)\n\n    # This is a hack to handle potentially ragged .lab data\n    parse_options.setdefault('names', range(20))\n\n    data = pd.read_csv(filename, **parse_options)\n\n    # Drop all-nan columns\n    data = data.dropna(how='all', axis=1)\n\n    # Do we need to add a duration column?\n    # This only applies to event annotations\n    if len(data.columns) == 2:\n        # Insert a column of zeros after the timing\n        data.insert(1, 'duration', 0)\n        if infer_duration:\n            data['duration'][:-1] = data.loc[:, 0].diff()[1:].values\n\n    else:\n        # Convert from time to duration\n        if infer_duration:\n            data.loc[:, 1] -= data[0]\n\n    for row in data.itertuples():\n        time, duration = row[1:3]\n\n        value = [x for x in row[3:] if x is not None][-1]\n\n        annotation.append(time=time,\n                          duration=duration,\n                          confidence=1.0,\n                          value=value)\n\n    return annotation", "response": "Import a .lab file into a new Annotation object."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef expand_filepaths(base_dir, rel_paths):\n    return [os.path.join(base_dir, os.path.normpath(rp)) for rp in rel_paths]", "response": "Expand a list of relative paths against a given base directory."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef find_with_extension(in_dir, ext, depth=3, sort=True):\n    assert depth >= 1\n    ext = ext.strip(os.extsep)\n    match = list()\n    for n in range(1, depth+1):\n        wildcard = os.path.sep.join([\"*\"]*n)\n        search_path = os.path.join(in_dir, os.extsep.join([wildcard, ext]))\n        match += glob.glob(search_path)\n\n    if sort:\n        match.sort()\n    return match", "response": "Naive depth-limited search of a directory for files with a given extension."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef get_comments(jam, ann):\n    '''Get the metadata from a jam and an annotation, combined as a string.\n\n    Parameters\n    ----------\n    jam : JAMS\n        The jams object\n\n    ann : Annotation\n        An annotation object\n\n    Returns\n    -------\n    comments : str\n        The jam.file_metadata and ann.annotation_metadata, combined and serialized\n    '''\n    jam_comments = jam.file_metadata.__json__\n    ann_comments = ann.annotation_metadata.__json__\n    return json.dumps({'metadata': jam_comments,\n                       'annotation metadata': ann_comments},\n                      indent=2)", "response": "Get the metadata from a jam and an annotation as a string."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef lab_dump(ann, comment, filename, sep, comment_char):\n    '''Save an annotation as a lab/csv.\n\n    Parameters\n    ----------\n    ann : Annotation\n        The annotation object\n\n    comment : str\n        The comment string header\n\n    filename : str\n        The output filename\n\n    sep : str\n        The separator string for output\n\n    comment_char : str\n        The character used to denote comments\n    '''\n\n    intervals, values = ann.to_interval_values()\n\n    frame = pd.DataFrame(columns=['Time', 'End Time', 'Label'],\n                         data={'Time': intervals[:, 0],\n                               'End Time': intervals[:, 1],\n                               'Label': values})\n\n    with open(filename, 'w') as fdesc:\n        for line in comment.split('\\n'):\n            fdesc.write('{:s}  {:s}\\n'.format(comment_char, line))\n\n        frame.to_csv(path_or_buf=fdesc, index=False, sep=sep)", "response": "Save an annotation as a lab/csv file."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef convert_jams(jams_file, output_prefix, csv=False, comment_char='#', namespaces=None):\n    '''Convert jams to labs.\n\n    Parameters\n    ----------\n    jams_file : str\n        The path on disk to the jams file in question\n\n    output_prefix : str\n        The file path prefix of the outputs\n\n    csv : bool\n        Whether to output in csv (True) or lab (False) format\n\n    comment_char : str\n        The character used to denote comments\n\n    namespaces : list-like\n        The set of namespace patterns to match for output\n    '''\n\n    if namespaces is None:\n        raise ValueError('No namespaces provided. Try \".*\" for all namespaces.')\n\n    jam = jams.load(jams_file)\n\n    # Get all the annotations\n    # Filter down to the unique ones\n    # For each annotation\n    #     generate the comment string\n    #     generate the output filename\n    #     dump to csv\n\n    # Make a counter object for each namespace type\n    counter = collections.Counter()\n\n    annotations = []\n\n    for query in namespaces:\n        annotations.extend(jam.search(namespace=query))\n\n    if csv:\n        suffix = 'csv'\n        sep = ','\n    else:\n        suffix = 'lab'\n        sep = '\\t'\n\n    for ann in annotations:\n        index = counter[ann.namespace]\n        counter[ann.namespace] += 1\n        filename = os.path.extsep.join([get_output_name(output_prefix,\n                                                        ann.namespace,\n                                                        index),\n                                        suffix])\n\n        comment = get_comments(jam, ann)\n\n        # Dump to disk\n        lab_dump(ann, comment, filename, sep, comment_char)", "response": "Convert jams file to labs."}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nparses command line arguments and returns a dictionary of arguments.", "response": "def parse_arguments(args):\n    '''Parse arguments from the command line'''\n    parser = argparse.ArgumentParser(description='Convert JAMS to .lab files')\n\n    parser.add_argument('-c',\n                        '--comma-separated',\n                        dest='csv',\n                        action='store_true',\n                        default=False,\n                        help='Output in .csv instead of .lab')\n\n    parser.add_argument('--comment', dest='comment_char', type=str, default='#',\n                        help='Comment character')\n\n    parser.add_argument('-n',\n                        '--namespace',\n                        dest='namespaces',\n                        nargs='+',\n                        default=['.*'],\n                        help='One or more namespaces to output.  Default is all.')\n\n    parser.add_argument('jams_file',\n                        help='Path to the input jams file')\n\n    parser.add_argument('output_prefix', help='Prefix for output files')\n\n    return vars(parser.parse_args(args))"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\nconvert a given annotation to the target namespace.", "response": "def convert(annotation, target_namespace):\n    '''Convert a given annotation to the target namespace.\n\n    Parameters\n    ----------\n    annotation : jams.Annotation\n        An annotation object\n\n    target_namespace : str\n        The target namespace\n\n    Returns\n    -------\n    mapped_annotation : jams.Annotation\n        if `annotation` already belongs to `target_namespace`, then\n        it is returned directly.\n\n        otherwise, `annotation` is copied and automatically converted\n        to the target namespace.\n\n    Raises\n    ------\n    SchemaError\n        if the input annotation fails to validate\n\n    NamespaceError\n        if no conversion is possible\n\n    Examples\n    --------\n    Convert frequency measurements in Hz to MIDI\n\n    >>> ann_midi = jams.convert(ann_hz, 'note_midi')\n\n    And back to Hz\n\n    >>> ann_hz2 = jams.convert(ann_midi, 'note_hz')\n    '''\n\n    # First, validate the input. If this fails, we can't auto-convert.\n    annotation.validate(strict=True)\n\n    # If we're already in the target namespace, do nothing\n    if annotation.namespace == target_namespace:\n        return annotation\n\n    if target_namespace in __CONVERSION__:\n        # Otherwise, make a copy to mangle\n        annotation = deepcopy(annotation)\n\n        # Look for a way to map this namespace to the target\n        for source in __CONVERSION__[target_namespace]:\n            if annotation.search(namespace=source):\n                return __CONVERSION__[target_namespace][source](annotation)\n\n    # No conversion possible\n    raise NamespaceError('Unable to convert annotation from namespace='\n                         '\"{0}\" to \"{1}\"'.format(annotation.namespace,\n                                                 target_namespace))"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef can_convert(annotation, target_namespace):\n    '''Test if an annotation can be mapped to a target namespace\n\n    Parameters\n    ----------\n    annotation : jams.Annotation\n        An annotation object\n\n    target_namespace : str\n        The target namespace\n\n    Returns\n    -------\n    True\n        if `annotation` can be automatically converted to\n        `target_namespace`\n\n    False\n        otherwise\n    '''\n\n    # If we're already in the target namespace, do nothing\n    if annotation.namespace == target_namespace:\n        return True\n\n    if target_namespace in __CONVERSION__:\n        # Look for a way to map this namespace to the target\n        for source in __CONVERSION__[target_namespace]:\n            if annotation.search(namespace=source):\n                return True\n    return False", "response": "Test if an annotation can be mapped to a target namespace."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef pitch_hz_to_contour(annotation):\n    '''Convert a pitch_hz annotation to a contour'''\n    annotation.namespace = 'pitch_contour'\n    data = annotation.pop_data()\n\n    for obs in data:\n        annotation.append(time=obs.time, duration=obs.duration,\n                          confidence=obs.confidence,\n                          value=dict(index=0,\n                                     frequency=np.abs(obs.value),\n                                     voiced=obs.value > 0))\n    return annotation", "response": "Convert a pitch_hz annotation to a contour"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef note_hz_to_midi(annotation):\n    '''Convert a pitch_hz annotation to pitch_midi'''\n\n    annotation.namespace = 'note_midi'\n\n    data = annotation.pop_data()\n\n    for obs in data:\n        annotation.append(time=obs.time, duration=obs.duration,\n                          confidence=obs.confidence,\n                          value=12 * (np.log2(obs.value) - np.log2(440.0)) + 69)\n\n    return annotation", "response": "Convert a note_hz annotation to note_midi, mapping each frequency in Hz to a MIDI note number."}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nconvert scaper annotations to tag_open", "response": "def scaper_to_tag(annotation):\n    '''Convert scaper annotations to tag_open'''\n\n    annotation.namespace = 'tag_open'\n\n    data = annotation.pop_data()\n    for obs in data:\n        annotation.append(time=obs.time, duration=obs.duration,\n                          confidence=obs.confidence, value=obs.value['label'])\n\n    return annotation"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef load(path_or_file, validate=True, strict=True, fmt='auto'):\n\n    with _open(path_or_file, mode='r', fmt=fmt) as fdesc:\n        jam = JAMS(**json.load(fdesc))\n\n    if validate:\n        jam.validate(strict=strict)\n\n    return jam", "response": "Load a JAMS object from a path or file object, optionally validating it against the schema."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\npopping a prefix from a query string.", "response": "def query_pop(query, prefix, sep='.'):\n    '''Pop a prefix from a query string.\n\n\n    Parameters\n    ----------\n    query : str\n        The query string\n\n    prefix : str\n        The prefix string to pop, if it exists\n\n    sep : str\n        The string to separate fields\n\n    Returns\n    -------\n    popped : str\n        `query` with a `prefix` removed from the front (if found)\n        or `query` if the prefix was not found\n\n    Examples\n    --------\n    >>> query_pop('Annotation.namespace', 'Annotation')\n    'namespace'\n    >>> query_pop('namespace', 'Annotation')\n    'namespace'\n\n    '''\n\n    terms = query.split(sep)\n\n    if terms[0] == prefix:\n        terms = terms[1:]\n\n    return sep.join(terms)"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef match_query(string, query):\n    '''Test if a string matches a query.\n\n    Parameters\n    ----------\n    string : str\n        The string to test\n\n    query : string, callable, or object\n        Either a regular expression, callable function, or object.\n\n    Returns\n    -------\n    match : bool\n        `True` if:\n        - `query` is a callable and `query(string) == True`\n        - `query` is a regular expression and `re.match(query, string)`\n        - or `string == query` for any other query\n\n        `False` otherwise\n\n    '''\n\n    if six.callable(query):\n        return query(string)\n\n    elif (isinstance(query, six.string_types) and\n          isinstance(string, six.string_types)):\n        return re.match(query, string) is not None\n\n    else:\n        return query == string", "response": "Test if a string matches a query."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef summary(obj, indent=0):\n    '''Helper function to format repr strings for JObjects and friends.\n\n    Parameters\n    ----------\n    obj\n        The object to repr\n\n    indent : int >= 0\n        indent each new line by `indent` spaces\n\n    Returns\n    -------\n    r : str\n        If `obj` has a `__summary__` method, it is used.\n\n        If `obj` is a `SortedKeyList`, then it returns a description\n        of the length of the list.\n\n        Otherwise, `repr(obj)`.\n    '''\n    if hasattr(obj, '__summary__'):\n        rep = obj.__summary__()\n    elif isinstance(obj, SortedKeyList):\n        rep = '<{:d} observations>'.format(len(obj))\n    else:\n        rep = repr(obj)\n\n    return rep.replace('\\n', '\\n' + ' ' * indent)", "response": "Helper function to format repr strings for JObjects and friends."}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nupdating the attributes of a JObject.", "response": "def update(self, **kwargs):\n        '''Update the attributes of a JObject.\n\n        Parameters\n        ----------\n        kwargs\n            Keyword arguments of the form `attribute=new_value`\n\n        Examples\n        --------\n        >>> J = jams.JObject(foo=5)\n        >>> J.dumps()\n        '{\"foo\": 5}'\n        >>> J.update(bar='baz')\n        >>> J.dumps()\n        '{\"foo\": 5, \"bar\": \"baz\"}'\n        '''\n        for name, value in six.iteritems(kwargs):\n            setattr(self, name, value)"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\nsearch this object and its descendants.", "response": "def search(self, **kwargs):\n        '''Query this object (and its descendants).\n\n        Parameters\n        ----------\n        kwargs\n            Each `(key, value)` pair encodes a search field in `key`\n            and a target value in `value`.\n\n            `key` must be a string, and should correspond to a property in\n            the JAMS object hierarchy, e.g., 'Annotation.namespace` or `email`\n\n            `value` must be either an object (tested for equality), a\n            string describing a search pattern (regular expression), or a\n            lambda function which evaluates to `True` if the candidate\n            object matches the search criteria and `False` otherwise.\n\n        Returns\n        -------\n        match : bool\n            `True` if any of the search keys match the specified value,\n            `False` otherwise, or if the search keys do not exist\n            within the object.\n\n        Examples\n        --------\n        >>> J = jams.JObject(foo=5, needle='quick brown fox')\n        >>> J.search(needle='.*brown.*')\n        True\n        >>> J.search(needle='.*orange.*')\n        False\n        >>> J.search(badger='.*brown.*')\n        False\n        >>> J.search(foo=5)\n        True\n        >>> J.search(foo=10)\n        False\n        >>> J.search(foo=lambda x: x < 10)\n        True\n        >>> J.search(foo=lambda x: x > 10)\n        False\n        '''\n\n        match = False\n\n        r_query = {}\n        myself = self.__class__.__name__\n\n        # Pop this object name off the query\n        for k, value in six.iteritems(kwargs):\n            k_pop = query_pop(k, myself)\n\n            if k_pop:\n                r_query[k_pop] = value\n\n        if not r_query:\n            return False\n\n        for key in r_query:\n            if hasattr(self, key):\n                match |= 
match_query(getattr(self, key), r_query[key])\n\n        if not match:\n            for attr in dir(self):\n                obj = getattr(self, attr)\n\n                if isinstance(obj, JObject):\n                    match |= obj.search(**r_query)\n\n        return match"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef validate(self, strict=True):\n        '''Validate a JObject against its schema\n\n        Parameters\n        ----------\n        strict : bool\n            Enforce strict schema validation\n\n        Returns\n        -------\n        valid : bool\n            True if the jam validates\n            False if not, and `strict==False`\n\n        Raises\n        ------\n        SchemaError\n            If `strict==True` and `jam` fails validation\n        '''\n\n        valid = True\n\n        try:\n            jsonschema.validate(self.__json__, self.__schema__)\n\n        except jsonschema.ValidationError as invalid:\n            if strict:\n                raise SchemaError(str(invalid))\n            else:\n                warnings.warn(str(invalid))\n\n            valid = False\n\n        return valid", "response": "Validate a JObject against its schema."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef append(self, time=None, duration=None, value=None, confidence=None):\n        '''Append an observation to the data field\n\n        Parameters\n        ----------\n        time : float >= 0\n        duration : float >= 0\n            The time and duration of the new observation, in seconds\n        value\n        confidence\n            The value and confidence of the new observations.\n\n            Types and values should conform to the namespace of the\n            Annotation object.\n\n        Examples\n        --------\n        >>> ann = jams.Annotation(namespace='chord')\n        >>> ann.append(time=3, duration=2, value='E#')\n        '''\n\n        self.data.add(Observation(time=float(time),\n                                  duration=float(duration),\n                                  value=value,\n                                  confidence=confidence))", "response": "Append an observation to the data field."}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nadds observations from row-major storage.", "response": "def append_records(self, records):\n        '''Add observations from row-major storage.\n\n        This is primarily useful for deserializing sparsely packed data.\n\n        Parameters\n        ----------\n        records : iterable of dicts or Observations\n            Each element of `records` corresponds to one observation.\n        '''\n        for obs in records:\n            if isinstance(obs, Observation):\n                self.append(**obs._asdict())\n            else:\n                self.append(**obs)"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\nadding observations from column - major storage.", "response": "def append_columns(self, columns):\n        '''Add observations from column-major storage.\n\n        This is primarily used for deserializing densely packed data.\n\n        Parameters\n        ----------\n        columns : dict of lists\n            Keys must be `time, duration, value, confidence`,\n            and each must be a list of equal length.\n\n        '''\n        self.append_records([dict(time=t, duration=d, value=v, confidence=c)\n                             for (t, d, v, c)\n                             in six.moves.zip(columns['time'],\n                                              columns['duration'],\n                                              columns['value'],\n                                              columns['confidence'])])"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef validate(self, strict=True):\n        '''Validate this annotation object against the JAMS schema,\n        and its data against the namespace schema.\n\n        Parameters\n        ----------\n        strict : bool\n            If `True`, then schema violations will cause an Exception.\n            If `False`, then schema violations will issue a warning.\n\n        Returns\n        -------\n        valid : bool\n            `True` if the object conforms to schema.\n            `False` if the object fails to conform to schema,\n            but `strict == False`.\n\n        Raises\n        ------\n        SchemaError\n            If `strict == True` and the object fails validation\n\n        See Also\n        --------\n        JObject.validate\n        '''\n\n        # Get the schema for this annotation\n        ann_schema = schema.namespace_array(self.namespace)\n\n        valid = True\n\n        try:\n            jsonschema.validate(self.__json_light__(data=False),\n                                schema.JAMS_SCHEMA)\n\n            # validate each record in the frame\n            data_ser = [serialize_obj(obs) for obs in self.data]\n            jsonschema.validate(data_ser, ann_schema)\n\n        except jsonschema.ValidationError as invalid:\n            if strict:\n                raise SchemaError(str(invalid))\n            else:\n                warnings.warn(str(invalid))\n            valid = False\n\n        return valid", "response": "Validate this object against the JAMS schema and its data against the namespace schema."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef trim(self, start_time, end_time, strict=False):\n        '''\n        Trim the annotation and return as a new `Annotation` object.\n\n        Trimming will result in the new annotation only containing observations\n        that occur in the intersection of the time range spanned by the\n        annotation and the time range specified by the user. The new annotation\n        will span the time range ``[trim_start, trim_end]`` where\n        ``trim_start = max(self.time, start_time)`` and ``trim_end =\n        min(self.time + self.duration, end_time)``.\n\n        If ``strict=False`` (default) observations that start before\n        ``trim_start`` and end after it will be trimmed such that they start at\n        ``trim_start``, and similarly observations that start before\n        ``trim_end`` and end after it will be trimmed to end at ``trim_end``.\n        If ``strict=True`` such borderline observations will be discarded.\n\n        The new duration of the annotation will be ``trim_end - trim_start``.\n\n        Note that if the range defined by ``[start_time, end_time]``\n        doesn't intersect with the original time range spanned by the\n        annotation the resulting annotation will contain no observations, will\n        have the same start time as the original annotation and have duration\n        0.\n\n        This function also copies over all the annotation metadata from the\n        original annotation and documents the trim operation by adding a list\n        of tuples to the annotation's sandbox keyed by\n        ``Annotation.sandbox.trim`` which documents each trim operation with a\n        tuple ``(start_time, end_time, trim_start, trim_end)``.\n\n        Parameters\n        ----------\n        start_time : float\n            The desired start time for the trimmed annotation in seconds.\n        end_time\n            The desired 
end time for the trimmed annotation in seconds. Must be\n            greater than ``start_time``.\n        strict : bool\n            When ``False`` (default) observations that lie at the boundaries of\n            the trimming range (given by ``[trim_start, trim_end]`` as\n            described above), i.e. observations that start before and end after\n            either the trim start or end time, will have their time and/or\n            duration adjusted such that only the part of the observation that\n            lies within the trim range is kept. When ``True`` such observations\n            are discarded and not included in the trimmed annotation.\n\n        Returns\n        -------\n        ann_trimmed : Annotation\n            The trimmed annotation, returned as a new jams.Annotation object.\n            If the trim range specified by ``[start_time, end_time]`` does not\n            intersect at all with the original time range of the annotation a\n            warning will be issued and the returned annotation will be empty.\n\n        Raises\n        ------\n        ParameterError\n            If ``end_time`` is not greater than ``start_time``.\n\n        Examples\n        --------\n        >>> ann = jams.Annotation(namespace='tag_open', time=2, duration=8)\n        >>> ann.append(time=2, duration=2, value='one')\n        >>> ann.append(time=4, duration=2, value='two')\n        >>> ann.append(time=6, duration=2, value='three')\n        >>> ann.append(time=7, duration=2, value='four')\n        >>> ann.append(time=8, duration=2, value='five')\n        >>> ann_trim = ann.trim(5, 8, strict=False)\n        >>> print(ann_trim.time, ann_trim.duration)\n        (5, 3)\n        >>> ann_trim.to_dataframe()\n           time  duration  value confidence\n        0     5         1    two       None\n        1     6         2  three       None\n        2     7         1   four       None\n        >>> ann_trim_strict = ann.trim(5, 8, strict=True)\n        >>> 
print(ann_trim_strict.time, ann_trim_strict.duration)\n        (5, 3)\n        >>> ann_trim_strict.to_dataframe()\n           time  duration  value confidence\n        0     6         2  three       None\n        '''\n        # Check for basic start_time and end_time validity\n        if end_time <= start_time:\n            raise ParameterError(\n                'end_time must be greater than start_time.')\n\n        # If the annotation does not have a set duration value, we'll assume\n        # trimming is possible (up to the user to ensure this is valid).\n        if self.duration is None:\n            orig_time = start_time\n            orig_duration = end_time - start_time\n            warnings.warn(\n                \"Annotation.duration is not defined, cannot check \"\n                \"for temporal intersection, assuming the annotation \"\n                \"is valid between start_time and end_time.\")\n        else:\n            orig_time = self.time\n            orig_duration = self.duration\n\n        # Check whether there is intersection between the trim range and\n        # annotation: if not raise a warning and set trim_start and trim_end\n        # appropriately.\n        if start_time > (orig_time + orig_duration) or (end_time < orig_time):\n            warnings.warn(\n                'Time range defined by [start_time,end_time] does not '\n                'intersect with the time range spanned by this annotation, '\n                'the trimmed annotation will be empty.')\n            trim_start = self.time\n            trim_end = trim_start\n        else:\n            # Determine new range\n            trim_start = max(orig_time, start_time)\n            trim_end = min(orig_time + orig_duration, end_time)\n\n        # Create new annotation with same namespace/metadata\n        ann_trimmed = Annotation(\n            self.namespace,\n            data=None,\n            annotation_metadata=self.annotation_metadata,\n            sandbox=self.sandbox,\n  
          time=trim_start,\n            duration=trim_end - trim_start)\n\n        # Selectively add observations based on their start time / duration\n        # We do this rather than copying and directly manipulating the\n        # annotation' data frame (which might be faster) since this way trim is\n        # independent of the internal data representation.\n        for obs in self.data:\n\n            obs_start = obs.time\n            obs_end = obs_start + obs.duration\n\n            if obs_start < trim_end and obs_end > trim_start:\n\n                new_start = max(obs_start, trim_start)\n                new_end = min(obs_end, trim_end)\n                new_duration = new_end - new_start\n\n                if ((not strict) or\n                        (new_start == obs_start and new_end == obs_end)):\n                    ann_trimmed.append(time=new_start,\n                                       duration=new_duration,\n                                       value=obs.value,\n                                       confidence=obs.confidence)\n\n        if 'trim' not in ann_trimmed.sandbox.keys():\n            ann_trimmed.sandbox.update(\n                trim=[{'start_time': start_time, 'end_time': end_time,\n                       'trim_start': trim_start, 'trim_end': trim_end}])\n        else:\n            ann_trimmed.sandbox.trim.append(\n                {'start_time': start_time, 'end_time': end_time,\n                 'trim_start': trim_start, 'trim_end': trim_end})\n\n        return ann_trimmed", "response": "Trim the annotation and return as a new Annotation object."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nslicing the annotation and return as a new Annotation object.", "response": "def slice(self, start_time, end_time, strict=False):\n        '''\n        Slice the annotation and return as a new `Annotation` object.\n\n        Slicing has the same effect as trimming (see `Annotation.trim`) except\n        that while trimming does not modify the start time of the annotation or\n        the observations it contains, slicing will set the new annotation's\n        start time to ``max(0, trimmed_annotation.time - start_time)`` and the\n        start time of its observations will be set with respect to this new\n        reference start time.\n\n        This function documents the slice operation by adding a list of tuples\n        to the annotation's sandbox keyed by ``Annotation.sandbox.slice`` which\n        documents each slice operation with a tuple\n        ``(start_time, end_time, slice_start, slice_end)``, where\n        ``slice_start`` and ``slice_end`` are given by ``trim_start`` and\n        ``trim_end`` (see `Annotation.trim`).\n\n        Since slicing is implemented  using trimming, the trimming operation\n        will also be documented in ``Annotation.sandbox.trim`` as described in\n        `Annotation.trim`.\n\n        This function is useful for example when trimming an audio file,\n        allowing the user to trim the annotation while ensuring all time\n        information matches the new trimmed audio file.\n\n        Parameters\n        ----------\n        start_time : float\n            The desired start time for slicing in seconds.\n        end_time\n            The desired end time for slicing in seconds. 
Must be greater than\n            ``start_time``.\n        strict : bool\n            When ``False`` (default) observations that lie at the boundaries of\n            the slice (see `Annotation.trim` for details) will have their time\n            and/or duration adjusted such that only the part of the observation\n            that lies within the slice range is kept. When ``True`` such\n            observations are discarded and not included in the sliced\n            annotation.\n\n        Returns\n        -------\n        sliced_ann : Annotation\n            The sliced annotation.\n\n        See Also\n        --------\n        Annotation.trim\n\n        Examples\n        --------\n        >>> ann = jams.Annotation(namespace='tag_open', time=2, duration=8)\n        >>> ann.append(time=2, duration=2, value='one')\n        >>> ann.append(time=4, duration=2, value='two')\n        >>> ann.append(time=6, duration=2, value='three')\n        >>> ann.append(time=7, duration=2, value='four')\n        >>> ann.append(time=8, duration=2, value='five')\n        >>> ann_slice = ann.slice(5, 8, strict=False)\n        >>> print(ann_slice.time, ann_slice.duration)\n        (0, 3)\n        >>> ann_slice.to_dataframe()\n           time  duration  value confidence\n        0   0.0       1.0    two       None\n        1   1.0       2.0  three       None\n        2   2.0       1.0   four       None\n        >>> ann_slice_strict = ann.slice(5, 8, strict=True)\n        >>> print(ann_slice_strict.time, ann_slice_strict.duration)\n        (0, 3)\n        >>> ann_slice_strict.to_dataframe()\n           time  duration  value confidence\n        0   1.0       2.0  three       None\n        '''\n        # start by trimming the annotation\n        sliced_ann = self.trim(start_time, end_time, strict=strict)\n        raw_data = sliced_ann.pop_data()\n\n        # now adjust the start time of the annotation and the observations it\n        # contains.\n\n        for obs in raw_data:\n            
new_time = max(0, obs.time - start_time)\n            # if obs.time > start_time,\n            #   duration doesn't change\n            # if obs.time < start_time,\n            #   duration shrinks by start_time - obs.time\n            sliced_ann.append(time=new_time,\n                              duration=obs.duration,\n                              value=obs.value,\n                              confidence=obs.confidence)\n\n        ref_time = sliced_ann.time\n        slice_start = ref_time\n        slice_end = ref_time + sliced_ann.duration\n\n        if 'slice' not in sliced_ann.sandbox.keys():\n            sliced_ann.sandbox.update(\n                slice=[{'start_time': start_time, 'end_time': end_time,\n                        'slice_start': slice_start, 'slice_end': slice_end}])\n        else:\n            sliced_ann.sandbox.slice.append(\n                {'start_time': start_time, 'end_time': end_time,\n                 'slice_start': slice_start, 'slice_end': slice_end})\n\n        # Update the timing for the sliced annotation\n        sliced_ann.time = max(0, ref_time - start_time)\n\n        return sliced_ann"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef pop_data(self):\n        '''Replace this annotation's data with a fresh container.\n\n        Returns\n        -------\n        annotation_data : SortedKeyList\n            The original annotation data container\n        '''\n\n        data = self.data\n        self.data = SortedKeyList(key=self._key)\n        return data", "response": "Replace this annotation's data with a fresh container and return the original data container."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\nextracting observation data in a mir_eval - friendly format.", "response": "def to_interval_values(self):\n        '''Extract observation data in a `mir_eval`-friendly format.\n\n        Returns\n        -------\n        intervals : np.ndarray [shape=(n, 2), dtype=float]\n            Start- and end-times of all valued intervals\n\n            `intervals[i, :] = [time[i], time[i] + duration[i]]`\n\n        labels : list\n            List view of value field.\n        '''\n\n        ints, vals = [], []\n        for obs in self.data:\n            ints.append([obs.time, obs.time + obs.duration])\n            vals.append(obs.value)\n\n        if not ints:\n            return np.empty(shape=(0, 2), dtype=float), []\n\n        return np.array(ints), vals"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nextract observation data in a mir_eval - friendly format.", "response": "def to_event_values(self):\n        '''Extract observation data in a `mir_eval`-friendly format.\n\n        Returns\n        -------\n        times : np.ndarray [shape=(n,), dtype=float]\n            Start-time of all observations\n\n        labels : list\n            List view of value field.\n        '''\n        ints, vals = [], []\n        for obs in self.data:\n            ints.append(obs.time)\n            vals.append(obs.value)\n\n        return np.array(ints), vals"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef to_samples(self, times, confidence=False):\n        '''Sample the annotation at specified times.\n\n        Parameters\n        ----------\n        times : np.ndarray, non-negative, ndim=1\n            The times (in seconds) to sample the annotation\n\n        confidence : bool\n            If `True`, return both values and confidences.\n            If `False` (default) only return values.\n\n        Returns\n        -------\n        values : list\n            `values[i]` is a list of observation values for intervals\n            that cover `times[i]`.\n\n        confidence : list (optional)\n            `confidence` values corresponding to `values`\n        '''\n        times = np.asarray(times)\n        if times.ndim != 1 or np.any(times < 0):\n            raise ParameterError('times must be 1-dimensional and non-negative')\n\n        idx = np.argsort(times)\n        samples = times[idx]\n\n        values = [list() for _ in samples]\n        confidences = [list() for _ in samples]\n\n        for obs in self.data:\n            start = np.searchsorted(samples, obs.time)\n            end = np.searchsorted(samples, obs.time + obs.duration, side='right')\n            for i in range(start, end):\n                values[idx[i]].append(obs.value)\n                confidences[idx[i]].append(obs.confidence)\n\n        if confidence:\n            return values, confidences\n        else:\n            return values", "response": "Sample the annotation at specified times."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef to_html(self, max_rows=None):\n        '''Render this annotation list in HTML\n\n        Returns\n        -------\n        rendered : str\n            An HTML table containing this annotation's data.\n        '''\n        n = len(self.data)\n\n        div_id = _get_divid(self)\n\n        out = r'''  <div class=\"panel panel-default\">\n                        <div class=\"panel-heading\" role=\"tab\" id=\"heading-{0}\">\n                            <button\n                                type=\"button\"\n                                data-toggle=\"collapse\"\n                                data-parent=\"#accordion\"\n                                href=\"#{0}\"\n                                aria-expanded=\"false\"\n                                class=\"collapsed btn btn-info btn-block\"\n                                aria-controls=\"{0}\">\n                                {1:s}\n                                <span class=\"badge pull-right\">{2:d}</span>\n                            </button>\n                        </div>'''.format(div_id, self.namespace, n)\n\n        out += r'''     <div id=\"{0}\" class=\"panel-collapse collapse\"\n                             role=\"tabpanel\" aria-labelledby=\"heading-{0}\">\n                            <div class=\"panel-body\">'''.format(div_id)\n\n        out += r'''<div class=\"pull-right\">\n                        {}\n                    </div>'''.format(self.annotation_metadata._repr_html_())\n        out += r'''<div class=\"pull-right clearfix\">\n                        {}\n                    </div>'''.format(self.sandbox._repr_html_())\n\n        # -- Annotation content starts here\n        out += r'''<div><table border=\"1\" class=\"dataframe\">\n                    <thead>\n                        <tr style=\"text-align: right;\">\n                            <th></th>\n                            
<th>time</th>\n                            <th>duration</th>\n                            <th>value</th>\n                            <th>confidence</th>\n                        </tr>\n                    </thead>'''.format(self.namespace, n)\n\n        out += r'''<tbody>'''\n\n        if max_rows is None or n <= max_rows:\n            out += self._fmt_rows(0, n)\n        else:\n            out += self._fmt_rows(0, max_rows//2)\n            out += r'''<tr>\n                            <th>...</th>\n                            <td>...</td>\n                            <td>...</td>\n                            <td>...</td>\n                            <td>...</td>\n                        </tr>'''\n            out += self._fmt_rows(n-max_rows//2, n)\n\n        out += r'''</tbody>'''\n\n        out += r'''</table></div>'''\n\n        out += r'''</div></div></div>'''\n        return out", "response": "Render this annotation list in HTML."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef _key(cls, obs):\n        '''Provides sorting index for Observation objects'''\n        if not isinstance(obs, Observation):\n            raise JamsError('{} must be of type jams.Observation'.format(obs))\n\n        return obs.time", "response": "Provides sorting index for Observation objects"}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nfilters the annotation array down to only those Annotation objects matching the query.", "response": "def search(self, **kwargs):\n        '''Filter the annotation array down to only those Annotation\n        objects matching the query.\n\n\n        Parameters\n        ----------\n        kwargs : search parameters\n            See JObject.search\n\n        Returns\n        -------\n        results : AnnotationArray\n            An annotation array of the objects matching the query\n\n        See Also\n        --------\n        JObject.search\n        '''\n\n        results = AnnotationArray()\n\n        for annotation in self:\n            if annotation.search(**kwargs):\n                results.append(annotation)\n\n        return results"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef trim(self, start_time, end_time, strict=False):\n        '''\n        Trim every annotation contained in the annotation array using\n        `Annotation.trim` and return as a new `AnnotationArray`.\n\n        See `Annotation.trim` for details about trimming. This function does\n        not modify the annotations in the original annotation array.\n\n\n        Parameters\n        ----------\n        start_time : float\n            The desired start time for the trimmed annotations in seconds.\n        end_time\n            The desired end time for trimmed annotations in seconds. Must be\n            greater than ``start_time``.\n        strict : bool\n            When ``False`` (default) observations that lie at the boundaries of\n            the trimming range (see `Annotation.trim` for details) will have\n            their time and/or duration adjusted such that only the part of the\n            observation that lies within the trim range is kept. When ``True``\n            such observations are discarded and not included in the trimmed\n            annotation.\n\n        Returns\n        -------\n        trimmed_array : AnnotationArray\n            An annotation array where every annotation has been trimmed.\n        '''\n        trimmed_array = AnnotationArray()\n        for ann in self:\n            trimmed_array.append(ann.trim(start_time, end_time, strict=strict))\n\n        return trimmed_array", "response": "Trim every annotation in the array using Annotation.trim and return the results as a new AnnotationArray, leaving the original annotations unmodified."}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\nslices every annotation in the array using the slice method and return as a new array.", "response": "def slice(self, start_time, end_time, strict=False):\n        '''\n        Slice every annotation contained in the annotation array using\n        `Annotation.slice`\n        and return as a new AnnotationArray\n\n        See `Annotation.slice` for details about slicing. This function does\n        not modify the annotations in the original annotation array.\n\n        Parameters\n        ----------\n        start_time : float\n            The desired start time for slicing in seconds.\n        end_time\n            The desired end time for slicing in seconds. Must be greater than\n            ``start_time``.\n        strict : bool\n            When ``False`` (default) observations that lie at the boundaries of\n            the slicing range (see `Annotation.slice` for details) will have\n            their time and/or duration adjusted such that only the part of the\n            observation that lies within the trim range is kept. When ``True``\n            such observations are discarded and not included in the sliced\n            annotation.\n\n        Returns\n        -------\n        sliced_array : AnnotationArray\n            An annotation array where every annotation has been sliced.\n        '''\n        sliced_array = AnnotationArray()\n        for ann in self:\n            sliced_array.append(ann.slice(start_time, end_time, strict=strict))\n\n        return sliced_array"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef add(self, jam, on_conflict='fail'):\n\n        if on_conflict not in ['overwrite', 'fail', 'ignore']:\n            raise ParameterError(\"on_conflict='{}' is not in ['fail', \"\n                                 \"'overwrite', 'ignore'].\".format(on_conflict))\n\n        if not self.file_metadata == jam.file_metadata:\n            if on_conflict == 'overwrite':\n                self.file_metadata = jam.file_metadata\n            elif on_conflict == 'fail':\n                raise JamsError(\"Metadata conflict! \"\n                                \"Resolve manually or force-overwrite it.\")\n\n        self.annotations.extend(jam.annotations)\n        self.sandbox.update(**jam.sandbox)", "response": "Add the contents of another jam to this object."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef save(self, path_or_file, strict=True, fmt='auto'):\n\n        self.validate(strict=strict)\n\n        with _open(path_or_file, mode='w', fmt=fmt) as fdesc:\n            json.dump(self.__json__, fdesc, indent=2)", "response": "Serialize the JAMS object to a JSON formatted stream to file."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef validate(self, strict=True):\n        '''Validate a JAMS object against the schema.\n\n        Parameters\n        ----------\n        strict : bool\n            If `True`, an exception will be raised on validation failure.\n            If `False`, a warning will be raised on validation failure.\n\n        Returns\n        -------\n        valid : bool\n            `True` if the object passes schema validation.\n            `False` otherwise.\n\n        Raises\n        ------\n        SchemaError\n            If `strict==True` and the JAMS object does not match the schema\n\n        See Also\n        --------\n        jsonschema.validate\n\n        '''\n        valid = True\n        try:\n            jsonschema.validate(self.__json_light__, schema.JAMS_SCHEMA)\n\n            for ann in self.annotations:\n                if isinstance(ann, Annotation):\n                    valid &= ann.validate(strict=strict)\n                else:\n                    msg = '{} is not a well-formed JAMS Annotation'.format(ann)\n                    valid = False\n                    if strict:\n                        raise SchemaError(msg)\n                    else:\n                        warnings.warn(str(msg))\n\n        except jsonschema.ValidationError as invalid:\n            if strict:\n                raise SchemaError(str(invalid))\n            else:\n                warnings.warn(str(invalid))\n\n            valid = False\n\n        return valid", "response": "Validate a JAMS object against the schema."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nreturning a new JAMS object with all the annotations in the jam that are in the specified time range.", "response": "def trim(self, start_time, end_time, strict=False):\n        '''\n        Trim all the annotations inside the jam and return as a new `JAMS`\n        object.\n\n        See `Annotation.trim` for details about how the annotations\n        are trimmed.\n\n        This operation is also documented in the jam-level sandbox\n        with a list keyed by ``JAMS.sandbox.trim`` containing a tuple for each\n        jam-level trim of the form ``(start_time, end_time)``.\n\n        This function also copies over all of the file metadata from the\n        original jam.\n\n        Note: trimming does not affect the duration of the jam, i.e. the value\n        of ``JAMS.file_metadata.duration`` will be the same for the original\n        and trimmed jams.\n\n        Parameters\n        ----------\n        start_time : float\n            The desired start time for the trimmed annotations in seconds.\n        end_time\n            The desired end time for trimmed annotations in seconds. Must be\n            greater than ``start_time``.\n        strict : bool\n            When ``False`` (default) observations that lie at the boundaries of\n            the trimming range (see `Annotation.trim` for details), will have\n            their time and/or duration adjusted such that only the part of the\n            observation that lies within the trim range is kept. 
When ``True``\n            such observations are discarded and not included in the trimmed\n            annotation.\n\n        Returns\n        -------\n        jam_trimmed : JAMS\n            The trimmed jam with trimmed annotations, returned as a new JAMS\n            object.\n\n        '''\n        # Make sure duration is set in file metadata\n        if self.file_metadata.duration is None:\n            raise JamsError(\n                'Duration must be set (jam.file_metadata.duration) before '\n                'trimming can be performed.')\n\n        # Make sure start and end times are within the file start/end times\n        if not (0 <= start_time <= end_time <= float(\n                self.file_metadata.duration)):\n            raise ParameterError(\n                'start_time and end_time must be within the original file '\n                'duration ({:f}) and end_time cannot be smaller than '\n                'start_time.'.format(float(self.file_metadata.duration)))\n\n        # Create a new jams\n        jam_trimmed = JAMS(annotations=None,\n                           file_metadata=self.file_metadata,\n                           sandbox=self.sandbox)\n\n        # trim annotations\n        jam_trimmed.annotations = self.annotations.trim(\n            start_time, end_time, strict=strict)\n\n        # Document jam-level trim in top level sandbox\n        if 'trim' not in jam_trimmed.sandbox.keys():\n            jam_trimmed.sandbox.update(\n                trim=[{'start_time': start_time, 'end_time': end_time}])\n        else:\n            jam_trimmed.sandbox.trim.append(\n                {'start_time': start_time, 'end_time': end_time})\n\n        return jam_trimmed"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef slice(self, start_time, end_time, strict=False):\n        '''\n        Slice all the annotations inside the jam and return as a new `JAMS`\n        object.\n\n        See `Annotation.slice` for details about how the annotations\n        are sliced.\n\n        This operation is also documented in the jam-level sandbox\n        with a list keyed by ``JAMS.sandbox.slice`` containing a tuple for each\n        jam-level slice of the form ``(start_time, end_time)``.\n\n        Since slicing is implemented using trimming, the operation will also be\n        documented in ``JAMS.sandbox.trim`` as described in `JAMS.trim`.\n\n        This function also copies over all of the file metadata from the\n        original jam.\n\n        Note: slicing will affect the duration of the jam, i.e. the new value\n        of ``JAMS.file_metadata.duration`` will be ``end_time - start_time``.\n\n        Parameters\n        ----------\n        start_time : float\n            The desired start time for slicing in seconds.\n        end_time\n            The desired end time for slicing in seconds. Must be greater than\n            ``start_time``.\n        strict : bool\n            When ``False`` (default) observations that lie at the boundaries of\n            the slicing range (see `Annotation.slice` for details), will have\n            their time and/or duration adjusted such that only the part of the\n            observation that lies within the slice range is kept. 
When ``True``\n            such observations are discarded and not included in the sliced\n            annotation.\n\n        Returns\n        -------\n        jam_sliced: JAMS\n            The sliced jam with sliced annotations, returned as a new\n            JAMS object.\n\n        '''\n        # Make sure duration is set in file metadata\n        if self.file_metadata.duration is None:\n            raise JamsError(\n                'Duration must be set (jam.file_metadata.duration) before '\n                'slicing can be performed.')\n\n        # Make sure start and end times are within the file start/end times\n        if (start_time < 0 or\n                start_time > float(self.file_metadata.duration) or\n                end_time < start_time or\n                end_time > float(self.file_metadata.duration)):\n            raise ParameterError(\n                'start_time and end_time must be within the original file '\n                'duration ({:f}) and end_time cannot be smaller than '\n                'start_time.'.format(float(self.file_metadata.duration)))\n\n        # Create a new jams\n        jam_sliced = JAMS(annotations=None,\n                          file_metadata=self.file_metadata,\n                          sandbox=self.sandbox)\n\n        # slice annotations\n        jam_sliced.annotations = self.annotations.slice(\n            start_time, end_time, strict=strict)\n\n        # adjust duration\n        jam_sliced.file_metadata.duration = end_time - start_time\n\n        # Document jam-level slice in top level sandbox\n        if 'slice' not in jam_sliced.sandbox.keys():\n            jam_sliced.sandbox.update(\n                slice=[{'start_time': start_time, 'end_time': end_time}])\n        else:\n            jam_sliced.sandbox.slice.append(\n                {'start_time': start_time, 'end_time': end_time})\n\n        return jam_sliced", "response": "Slice all the annotations inside the jam to the given time range and return them as a new JAMS object whose duration is `end_time - start_time`."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef pprint_jobject(obj, **kwargs):\n    '''Pretty-print a jobject.\n\n    Parameters\n    ----------\n    obj : jams.JObject\n\n    kwargs\n        additional parameters to `json.dumps`\n\n    Returns\n    -------\n    string\n        A simplified display of `obj` contents.\n    '''\n\n    obj_simple = {k: v for k, v in six.iteritems(obj.__json__) if v}\n\n    string = json.dumps(obj_simple, **kwargs)\n\n    # Suppress braces and quotes\n    string = re.sub(r'[{}\"]', '', string)\n\n    # Kill trailing commas\n    string = re.sub(r',\\n', '\\n', string)\n\n    # Kill blank lines\n    string = re.sub(r'^\\s*$', '', string)\n\n    return string", "response": "Pretty-print a jobject, returning a simplified string display of its contents."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef intervals(annotation, **kwargs):\n    '''Plotting wrapper for labeled intervals'''\n    times, labels = annotation.to_interval_values()\n\n    return mir_eval.display.labeled_intervals(times, labels, **kwargs)", "response": "Plotting wrapper for labeled intervals."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that\nserves as a plotting wrapper for hierarchical segmentations", "response": "def hierarchy(annotation, **kwargs):\n    '''Plotting wrapper for hierarchical segmentations'''\n    htimes, hlabels = hierarchy_flatten(annotation)\n\n    htimes = [np.asarray(_) for _ in htimes]\n    return mir_eval.display.hierarchy(htimes, hlabels, **kwargs)"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\nplot wrapper for pitch contours", "response": "def pitch_contour(annotation, **kwargs):\n    '''Plotting wrapper for pitch contours'''\n    ax = kwargs.pop('ax', None)\n\n    # If the annotation is empty, we need to construct a new axes\n    ax = mir_eval.display.__get_axes(ax=ax)[0]\n\n    times, values = annotation.to_interval_values()\n\n    indices = np.unique([v['index'] for v in values])\n\n    for idx in indices:\n        rows = [i for (i, v) in enumerate(values) if v['index'] == idx]\n        freqs = np.asarray([values[r]['frequency'] for r in rows])\n        unvoiced = ~np.asarray([values[r]['voiced'] for r in rows])\n        freqs[unvoiced] *= -1\n\n        ax = mir_eval.display.pitch(times[rows, 0], freqs, unvoiced=True,\n                                    ax=ax,\n                                    **kwargs)\n    return ax"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef event(annotation, **kwargs):\n    '''Plotting wrapper for events'''\n\n    times, values = annotation.to_interval_values()\n\n    if any(values):\n        labels = values\n    else:\n        labels = None\n\n    return mir_eval.display.events(times, labels=labels, **kwargs)", "response": "Plotting wrapper for events"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef beat_position(annotation, **kwargs):\n    '''Plotting wrapper for beat-position data'''\n\n    times, values = annotation.to_interval_values()\n\n    labels = [_['position'] for _ in values]\n\n    # TODO: plot time signature, measure number\n    return mir_eval.display.events(times, labels=labels, **kwargs)", "response": "Plotting wrapper for beat-position data"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef piano_roll(annotation, **kwargs):\n    '''Plotting wrapper for piano rolls'''\n    times, midi = annotation.to_interval_values()\n\n    return mir_eval.display.piano_roll(times, midi=midi, **kwargs)", "response": "Plotting wrapper for piano rolls"}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\nvisualizes a jams.Annotation object through mir_eval.display", "response": "def display(annotation, meta=True, **kwargs):\n    '''Visualize a jams annotation through mir_eval\n\n    Parameters\n    ----------\n    annotation : jams.Annotation\n        The annotation to display\n\n    meta : bool\n        If `True`, include annotation metadata in the figure\n\n    kwargs\n        Additional keyword arguments to mir_eval.display functions\n\n    Returns\n    -------\n    ax\n        Axis handles for the new display\n\n    Raises\n    ------\n    NamespaceError\n        If the annotation cannot be visualized\n    '''\n\n    for namespace, func in six.iteritems(VIZ_MAPPING):\n        try:\n            ann = coerce_annotation(annotation, namespace)\n\n            axes = func(ann, **kwargs)\n\n            # Title should correspond to original namespace, not the coerced version\n            axes.set_title(annotation.namespace)\n            if meta:\n                description = pprint_jobject(annotation.annotation_metadata, indent=2)\n\n                anchored_box = AnchoredText(description.strip('\\n'),\n                                            loc=2,\n                                            frameon=True,\n                                            bbox_to_anchor=(1.02, 1.0),\n                                            bbox_transform=axes.transAxes,\n                                            borderpad=0.0)\n                axes.add_artist(anchored_box)\n\n                axes.figure.subplots_adjust(right=0.8)\n\n            return axes\n        except NamespaceError:\n            pass\n\n    raise NamespaceError('Unable to visualize annotation of namespace=\"{:s}\"'\n                         .format(annotation.namespace))"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef display_multi(annotations, fig_kw=None, meta=True, **kwargs):\n    '''Display multiple annotations with shared axes\n\n    Parameters\n    ----------\n    annotations : jams.AnnotationArray\n        A collection of annotations to display\n\n    fig_kw : dict\n        Keyword arguments to `plt.figure`\n\n    meta : bool\n        If `True`, display annotation metadata for each annotation\n\n    kwargs\n        Additional keyword arguments to the `mir_eval.display` routines\n\n    Returns\n    -------\n    fig\n        The created figure\n    axs\n        List of subplot axes corresponding to each displayed annotation\n    '''\n    if fig_kw is None:\n        fig_kw = dict()\n\n    fig_kw.setdefault('sharex', True)\n    fig_kw.setdefault('squeeze', True)\n\n    # Filter down to coercable annotations first\n    display_annotations = []\n    for ann in annotations:\n        for namespace in VIZ_MAPPING:\n            if can_convert(ann, namespace):\n                display_annotations.append(ann)\n                break\n\n    # If there are no displayable annotations, fail here\n    if not len(display_annotations):\n        raise ParameterError('No displayable annotations found')\n\n    fig, axs = plt.subplots(nrows=len(display_annotations), ncols=1, **fig_kw)\n\n    # MPL is stupid when making singleton subplots.\n    # We catch this and make it always iterable.\n    if len(display_annotations) == 1:\n        axs = [axs]\n\n    for ann, ax in zip(display_annotations, axs):\n        kwargs['ax'] = ax\n        display(ann, meta=meta, **kwargs)\n\n    return fig, axs", "response": "Display multiple annotations with shared axes."}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\ngenerates a click sample.", "response": "def mkclick(freq, sr=22050, duration=0.1):\n    '''Generate a click sample.\n\n    This replicates functionality from mir_eval.sonify.clicks,\n    but exposes the target frequency and duration.\n    '''\n\n    times = np.arange(int(sr * duration))\n    click = np.sin(2 * np.pi * times * freq / float(sr))\n    click *= np.exp(- times / (1e-2 * sr))\n\n    return click"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef clicks(annotation, sr=22050, length=None, **kwargs):\n    '''Sonify events with clicks.\n\n    This uses mir_eval.sonify.clicks, and is appropriate for instantaneous\n    events such as beats or segment boundaries.\n    '''\n\n    interval, _ = annotation.to_interval_values()\n\n    return filter_kwargs(mir_eval.sonify.clicks, interval[:, 0],\n                         fs=sr, length=length, **kwargs)", "response": "Sonify events with clicks."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef downbeat(annotation, sr=22050, length=None, **kwargs):\n    '''Sonify beats and downbeats together.\n    '''\n\n    beat_click = mkclick(440 * 2, sr=sr)\n    downbeat_click = mkclick(440 * 3, sr=sr)\n\n    intervals, values = annotation.to_interval_values()\n\n    beats, downbeats = [], []\n\n    for time, value in zip(intervals[:, 0], values):\n        if value['position'] == 1:\n            downbeats.append(time)\n        else:\n            beats.append(time)\n\n    if length is None:\n        length = int(sr * np.max(intervals)) + len(beat_click) + 1\n\n    y = filter_kwargs(mir_eval.sonify.clicks,\n                      np.asarray(beats),\n                      fs=sr, length=length, click=beat_click)\n\n    y += filter_kwargs(mir_eval.sonify.clicks,\n                       np.asarray(downbeats),\n                       fs=sr, length=length, click=downbeat_click)\n\n    return y", "response": "Sonify beats and downbeats together."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef chord(annotation, sr=22050, length=None, **kwargs):\n    '''Sonify chords\n\n    This uses mir_eval.sonify.chords.\n    '''\n\n    intervals, chords = annotation.to_interval_values()\n\n    return filter_kwargs(mir_eval.sonify.chords,\n                         chords, intervals,\n                         fs=sr, length=length,\n                         **kwargs)", "response": "Sonify chords\n\n    This uses mir_eval.sonify.chords."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef piano_roll(annotation, sr=22050, length=None, **kwargs):\n    '''Sonify a piano-roll\n\n    This uses mir_eval.sonify.time_frequency, and is appropriate\n    for sparse transcription data, e.g., annotations in the `note_midi`\n    namespace.\n    '''\n\n    intervals, pitches = annotation.to_interval_values()\n\n    # Construct the pitchogram\n    pitch_map = {f: idx for idx, f in enumerate(np.unique(pitches))}\n\n    gram = np.zeros((len(pitch_map), len(intervals)))\n\n    for col, f in enumerate(pitches):\n        gram[pitch_map[f], col] = 1\n\n    return filter_kwargs(mir_eval.sonify.time_frequency,\n                         gram, pitches, intervals,\n                         sr, length=length, **kwargs)", "response": "Sonify a piano-roll.\n\nThis uses mir_eval.sonify.time_frequency, and is appropriate for sparse transcription data, e.g., annotations in the `note_midi` namespace."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef sonify(annotation, sr=22050, duration=None, **kwargs):\n    '''Sonify a jams annotation through mir_eval\n\n    Parameters\n    ----------\n    annotation : jams.Annotation\n        The annotation to sonify\n\n    sr : positive number\n        The sampling rate of the output waveform\n\n    duration : float (optional)\n        Optional length (in seconds) of the output waveform\n\n    kwargs\n        Additional keyword arguments to mir_eval.sonify functions\n\n    Returns\n    -------\n    y_sonified : np.ndarray\n        The waveform of the sonified annotation\n\n    Raises\n    ------\n    NamespaceError\n        If the annotation has an un-sonifiable namespace\n    '''\n\n    length = None\n\n    if duration is None:\n        duration = annotation.duration\n\n    if duration is not None:\n        length = int(duration * sr)\n\n    # If the annotation can be directly sonified, try that first\n    if annotation.namespace in SONIFY_MAPPING:\n        ann = coerce_annotation(annotation, annotation.namespace)\n        return SONIFY_MAPPING[annotation.namespace](ann,\n                                                    sr=sr,\n                                                    length=length,\n                                                    **kwargs)\n\n    for namespace, func in six.iteritems(SONIFY_MAPPING):\n        try:\n            ann = coerce_annotation(annotation, namespace)\n            return func(ann, sr=sr, length=length, **kwargs)\n        except NamespaceError:\n            pass\n\n    raise NamespaceError('Unable to sonify annotation of namespace=\"{:s}\"'\n                         .format(annotation.namespace))", "response": "Sonify a jams annotation through mir_eval.sonify."}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nvalidate a jams file against a schema", "response": "def validate(schema_file=None, jams_files=None):\n    '''Validate a jams file against a schema'''\n\n    schema = load_json(schema_file)\n\n    for jams_file in jams_files:\n        try:\n            jams = load_json(jams_file)\n            jsonschema.validate(jams, schema)\n            print('{:s} was successfully validated'.format(jams_file))\n        except jsonschema.ValidationError as exc:\n            print('{:s} was NOT successfully validated'.format(jams_file))\n\n            print(exc)"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef make_stream_features(self, stream, features):\n        mechs = self.settings['sasl_mechanisms']\n        if mechs and not stream.authenticated:\n            sub = ElementTree.SubElement(features, MECHANISMS_TAG)\n            for mech in mechs:\n                if mech in sasl.SERVER_MECHANISMS:\n                    ElementTree.SubElement(sub, MECHANISM_TAG).text = mech\n        return features", "response": "Add SASL features to the <features/> element of the stream."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef handle_stream_features(self, stream, features):\n        element = features.find(MECHANISMS_TAG)\n        self.peer_sasl_mechanisms = []\n        if element is None:\n            return None\n        for sub in element:\n            if sub.tag != MECHANISM_TAG:\n                continue\n            self.peer_sasl_mechanisms.append(sub.text)\n\n        if stream.authenticated or not self.peer_sasl_mechanisms:\n            return StreamFeatureNotHandled(\"SASL\", mandatory = True)\n\n        username = self.settings.get(\"username\")\n        if not username:\n            # TODO: other rules for s2s\n            if stream.me.local:\n                username = stream.me.local\n            else:\n                username = None\n        self._sasl_authenticate(stream, username, self.settings.get(\"authzid\"))\n        return StreamFeatureHandled(\"SASL\", mandatory = True)", "response": "Process incoming <stream:features/> element.\n\n[initiating entity only]"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef process_sasl_auth(self, stream, element):\n        if self.authenticator:\n            logger.debug(\"Authentication already started\")\n            return False\n\n        password_db = self.settings[\"password_database\"]\n        mechanism = element.get(\"mechanism\")\n        if not mechanism:\n            stream.send_stream_error(\"bad-format\")\n            raise FatalStreamError(\"<sasl:auth/> with no mechanism\")\n\n        stream.auth_method_used = mechanism\n        self.authenticator = sasl.server_authenticator_factory(mechanism,\n                                                                password_db)\n\n        content = element.text.encode(\"us-ascii\")\n        ret = self.authenticator.start(stream.auth_properties,\n                                                a2b_base64(content))\n\n        if isinstance(ret, sasl.Success):\n            element = ElementTree.Element(SUCCESS_TAG)\n            element.text = ret.encode()\n        elif isinstance(ret, sasl.Challenge):\n            element = ElementTree.Element(CHALLENGE_TAG)\n            element.text = ret.encode()\n        else:\n            element = ElementTree.Element(FAILURE_TAG)\n            ElementTree.SubElement(element, SASL_QNP + ret.reason)\n\n        stream.write_element(element)\n\n        if isinstance(ret, sasl.Success):\n            self._handle_auth_success(stream, ret)\n        elif isinstance(ret, sasl.Failure):\n            raise SASLAuthenticationFailed(\"SASL authentication failed: {0}\"\n                                                            .format(ret.reason))\n        return True", "response": "Process incoming SASL authentication element."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef _handle_auth_success(self, stream, success):\n        if not self._check_authorization(success.properties, stream):\n            element = ElementTree.Element(FAILURE_TAG)\n            ElementTree.SubElement(element, SASL_QNP + \"invalid-authzid\")\n            return True\n        authzid = success.properties.get(\"authzid\")\n        if authzid:\n            peer = JID(success.authzid)\n        elif \"username\" in success.properties:\n            peer = JID(success.properties[\"username\"], stream.me.domain)\n        else:\n            # anonymous\n            peer = None\n        stream.set_peer_authenticated(peer, True)", "response": "Handle successful authentication.\n\n        Send <success/> and mark the stream peer authenticated.\n\n        [receiver only]"}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\nprocesses incoming SASL challenge element.", "response": "def _process_sasl_challenge(self, stream, element):\n        \"\"\"Process incoming <sasl:challenge/> element.\n\n        [initiating entity only]\n        \"\"\"\n        if not self.authenticator:\n            logger.debug(\"Unexpected SASL challenge\")\n            return False\n\n        content = element.text.encode(\"us-ascii\")\n        ret = self.authenticator.challenge(a2b_base64(content))\n        if isinstance(ret, sasl.Response):\n            element = ElementTree.Element(RESPONSE_TAG)\n            element.text = ret.encode()\n        else:\n            element = ElementTree.Element(ABORT_TAG)\n\n        stream.write_element(element)\n\n        if isinstance(ret, sasl.Failure):\n            stream.disconnect()\n            raise SASLAuthenticationFailed(\"SASL authentication failed\")\n\n        return True"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef _process_sasl_response(self, stream, element):\n        if not self.authenticator:\n            logger.debug(\"Unexpected SASL response\")\n            return False\n\n        content = element.text.encode(\"us-ascii\")\n        ret = self.authenticator.response(a2b_base64(content))\n        if isinstance(ret, sasl.Success):\n            element = ElementTree.Element(SUCCESS_TAG)\n            element.text = ret.encode()\n        elif isinstance(ret, sasl.Challenge):\n            element = ElementTree.Element(CHALLENGE_TAG)\n            element.text = ret.encode()\n        else:\n            element = ElementTree.Element(FAILURE_TAG)\n            ElementTree.SubElement(element, SASL_QNP + ret.reason)\n\n        stream.write_element(element)\n\n        if isinstance(ret, sasl.Success):\n            self._handle_auth_success(stream, ret)\n        elif isinstance(ret, sasl.Failure):\n            raise SASLAuthenticationFailed(\"SASL authentication failed: {0!r}\"\n                                                            .format(ret.reason))\n        return True", "response": "Process incoming SASL response element."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef _check_authorization(self, properties, stream):\n        authzid = properties.get(\"authzid\")\n        if not authzid:\n            return True\n        try:\n            jid = JID(authzid)\n        except ValueError:\n            return False\n\n        if \"username\" not in properties:\n            result = False\n        elif jid.local != properties[\"username\"]:\n            result = False\n        elif jid.domain != stream.me.domain:\n            result = False\n        elif jid.resource:\n            result = False\n        else:\n            result = True\n        return result", "response": "Check if the user is authorized to use a provided authzid."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef _process_sasl_success(self, stream, element):\n        if not self.authenticator:\n            logger.debug(\"Unexpected SASL response\")\n            return False\n\n        content = element.text\n\n        if content:\n            data = a2b_base64(content.encode(\"us-ascii\"))\n        else:\n            data = None\n        ret = self.authenticator.finish(data)\n        if isinstance(ret, sasl.Success):\n            logger.debug(\"SASL authentication succeeded\")\n            authzid = ret.properties.get(\"authzid\")\n            if authzid:\n                me = JID(authzid)\n            elif \"username\" in ret.properties:\n                # FIXME: other rules for server\n                me = JID(ret.properties[\"username\"], stream.peer.domain)\n            else:\n                me = None\n            stream.set_authenticated(me, True)\n        else:\n            logger.debug(\"SASL authentication failed\")\n            raise SASLAuthenticationFailed(\"Additional success data\"\n                                                        \" processing failed\")\n        return True", "response": "Process incoming SASL success element."}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\nprocesses incoming SASL failure element.", "response": "def _process_sasl_failure(self, stream, element):\n        \"\"\"Process incoming <sasl:failure/> element.\n\n        [initiating entity only]\n        \"\"\"\n        _unused = stream\n        if not self.authenticator:\n            logger.debug(\"Unexpected SASL response\")\n            return False\n\n        logger.debug(\"SASL authentication failed: {0!r}\".format(\n                                                element_to_unicode(element)))\n        raise SASLAuthenticationFailed(\"SASL authentication failed\")"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef _process_sasl_abort(self, stream, element):\n        _unused, _unused = stream, element\n        if not self.authenticator:\n            logger.debug(\"Unexpected SASL response\")\n            return False\n\n        self.authenticator = None\n        logger.debug(\"SASL authentication aborted\")\n        return True", "response": "Process incoming SASL abort element."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef _sasl_authenticate(self, stream, username, authzid):\n        if not stream.initiator:\n            raise SASLAuthenticationFailed(\"Only initiating entity start\"\n                                                        \" SASL authentication\")\n        if stream.features is None or not self.peer_sasl_mechanisms:\n            raise SASLNotAvailable(\"Peer doesn't support SASL\")\n\n        props = dict(stream.auth_properties)\n        if not props.get(\"service-domain\") and (\n                                        stream.peer and stream.peer.domain):\n            props[\"service-domain\"] = stream.peer.domain\n        if username is not None:\n            props[\"username\"] = username\n        if authzid is not None:\n            props[\"authzid\"] = authzid\n        if \"password\" in self.settings:\n            props[\"password\"] = self.settings[\"password\"]\n        props[\"available_mechanisms\"] = self.peer_sasl_mechanisms\n        enabled = sasl.filter_mechanism_list(\n                            self.settings['sasl_mechanisms'], props,\n                                            self.settings['insecure_auth'])\n        if not enabled:\n            raise SASLNotAvailable(\n                                \"None of SASL mechanism selected can be used\")\n        props[\"enabled_mechanisms\"] = enabled\n\n        mechanism = None\n        for mech in enabled:\n            if mech in self.peer_sasl_mechanisms:\n                mechanism = mech\n                break\n        if not mechanism:\n            raise SASLMechanismNotAvailable(\"Peer doesn't support any of\"\n                                                    \" our SASL mechanisms\")\n        logger.debug(\"Our mechanism: {0!r}\".format(mechanism))\n\n        stream.auth_method_used = mechanism\n        self.authenticator = sasl.client_authenticator_factory(mechanism)\n        
initial_response = self.authenticator.start(props)\n        if not isinstance(initial_response, sasl.Response):\n            raise SASLAuthenticationFailed(\"SASL initiation failed\")\n\n        element = ElementTree.Element(AUTH_TAG)\n        element.set(\"mechanism\", mechanism)\n        if initial_response.data:\n            if initial_response.encode:\n                element.text = initial_response.encode()\n            else:\n                element.text = initial_response.data\n        stream.write_element(element)", "response": "Start SASL authentication process."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef delayed_call(self, delay, function):\n        main_loop = self\n        handler = []\n        class DelayedCallHandler(TimeoutHandler):\n            \"\"\"Wrapper timeout handler class for the delayed call.\"\"\"\n            # pylint: disable=R0903\n            @timeout_handler(delay, False)\n            def callback(self):\n                \"\"\"Wrapper timeout handler method for the delayed call.\"\"\"\n                try:\n                    function()\n                finally:\n                    main_loop.remove_handler(handler[0])\n        handler.append(DelayedCallHandler())\n        self.add_handler(handler[0])", "response": "Schedule function to be called from the main loop after delay seconds."}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\nmakes a freshly created roster item from an XML element.", "response": "def from_xml(cls, element):\n        \"\"\"Make a RosterItem from an XML element.\n\n        :Parameters:\n            - `element`: the XML element\n        :Types:\n            - `element`: :etree:`ElementTree.Element`\n\n        :return: a freshly created roster item\n        :returntype: `cls`\n        \"\"\"\n        if element.tag != ITEM_TAG:\n            raise ValueError(\"{0!r} is not a roster item\".format(element))\n        try:\n            jid = JID(element.get(\"jid\"))\n        except ValueError:\n            raise BadRequestProtocolError(u\"Bad item JID\")\n        subscription = element.get(\"subscription\")\n        ask = element.get(\"ask\")\n        name = element.get(\"name\")\n        duplicate_group = False\n        groups = set()\n        for child in element:\n            if child.tag != GROUP_TAG:\n                continue\n            group = child.text\n            if group is None:\n                group = u\"\"\n            if group in groups:\n                duplicate_group = True\n            else:\n                groups.add(group)\n        approved = element.get(\"approved\")\n        if approved == \"true\":\n            approved = True\n        elif approved in (\"false\", None):\n            approved = False\n        else:\n            logger.debug(\"RosterItem.from_xml: got unknown 'approved':\"\n                            \" {0!r}, changing to False\".format(approved))\n            approved = False\n        result = cls(jid, name, groups, subscription, ask, approved)\n        result._duplicate_group = duplicate_group\n        return result"}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nmake an XML element from self.", "response": "def as_xml(self, parent = None):\n        \"\"\"Make an XML element from self.\n\n        :Parameters:\n            - `parent`: Parent element\n        :Types:\n            - `parent`: :etree:`ElementTree.Element`\n        \"\"\"\n        if parent is not None:\n            element = ElementTree.SubElement(parent, ITEM_TAG)\n        else:\n            element = ElementTree.Element(ITEM_TAG)\n        element.set(\"jid\", str(self.jid))\n        if self.name is not None:\n            element.set(\"name\", self.name)\n        if self.subscription is not None:\n            element.set(\"subscription\", self.subscription)\n        if self.ask:\n            element.set(\"ask\", self.ask)\n        if self.approved:\n            element.set(\"approved\", \"true\")\n        for group in self.groups:\n            ElementTree.SubElement(element, GROUP_TAG).text = group\n        return element"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef _verify(self, valid_subscriptions, fix):\n        if self.subscription not in valid_subscriptions:\n            if fix:\n                logger.debug(\"RosterItem.from_xml: got unknown 'subscription':\"\n                        \" {0!r}, changing to None\".format(self.subscription))\n                self.subscription = None\n            else:\n                raise ValueError(\"Bad 'subscription'\")\n        if self.ask not in (None, u\"subscribe\"):\n            if fix:\n                logger.debug(\"RosterItem.from_xml: got unknown 'ask':\"\n                                \" {0!r}, changing to None\".format(self.ask))\n                self.ask = None\n            else:\n                raise ValueError(\"Bad 'ask'\")", "response": "Check that the item's 'subscription' is one of `valid_subscriptions` and that 'ask' is None or 'subscribe'. If `fix` is true, invalid values are reset to None; otherwise ValueError is raised."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef verify_roster_result(self, fix = False):\n        self._verify((None, u\"from\", u\"to\", u\"both\"), fix)", "response": "Verify that the item is valid in a roster result: 'subscription' must be None, 'from', 'to' or 'both'. The `fix` flag is passed through to `_verify`."}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\nverifies that the roster push item is valid.", "response": "def verify_roster_push(self, fix = False):\n        \"\"\"Check if `self` is a valid roster push item.\n\n        A valid item must have a proper `subscription` value and a valid value\n        for 'ask'.\n\n        :Parameters:\n            - `fix`: if `True` then replace invalid 'subscription' and 'ask'\n              values with the defaults\n        :Types:\n            - `fix`: `bool`\n\n        :Raise: `ValueError` if the item is invalid.\n        \"\"\"\n        self._verify((None, u\"from\", u\"to\", u\"both\", u\"remove\"), fix)"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef verify_roster_set(self, fix = False, settings = None):\n        # pylint: disable=R0912\n        try:\n            self._verify((None, u\"remove\"), fix)\n        except ValueError, err:\n            raise BadRequestProtocolError(unicode(err))\n        if self.ask:\n            if fix:\n                self.ask = None\n            else:\n                raise BadRequestProtocolError(\"'ask' in roster set\")\n        if self.approved:\n            if fix:\n                self.approved = False\n            else:\n                raise BadRequestProtocolError(\"'approved' in roster set\")\n        if settings is None:\n            settings = XMPPSettings()\n        name_length_limit = settings[\"roster_name_length_limit\"]\n        if self.name and len(self.name) > name_length_limit:\n            raise NotAcceptableProtocolError(u\"Roster item name too long\")\n        group_length_limit = settings[\"roster_group_name_length_limit\"]\n        for group in self.groups:\n            if not group:\n                raise NotAcceptableProtocolError(u\"Roster group name empty\")\n            if len(group) > group_length_limit:\n                raise NotAcceptableProtocolError(u\"Roster group name too long\")\n        if self._duplicate_group:\n            raise BadRequestProtocolError(u\"Item group duplicated\")", "response": "Verify that this item is valid for a roster set."}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nreturns the set of groups defined in the roster.", "response": "def groups(self):\n        \"\"\"Set of groups defined in the roster.\n\n        :Return: the groups\n        :ReturnType: `set` of `unicode`\n        \"\"\"\n        groups = set()\n        for item in self._items:\n            groups |= item.groups\n        return groups"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nreturn a list of items with given name.", "response": "def get_items_by_name(self, name, case_sensitive = True):\n        \"\"\"\n        Return a list of items with given name.\n\n        :Parameters:\n            - `name`: name to look-up\n            - `case_sensitive`: if `False` the matching will be case\n              insensitive.\n        :Types:\n            - `name`: `unicode`\n            - `case_sensitive`: `bool`\n\n        :Returntype: `list` of `RosterItem`\n        \"\"\"\n        if not case_sensitive and name:\n            name = name.lower()\n        result = []\n        for item in self._items:\n            if item.name == name:\n                result.append(item)\n            elif item.name is None:\n                continue\n            elif not case_sensitive and item.name.lower() == name:\n                result.append(item)\n        return result"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef get_items_by_group(self, group, case_sensitive = True):\n        result = []\n        if not group:\n            for item in self._items:\n                if not item.groups:\n                    result.append(item)\n            return result\n        if not case_sensitive:\n            group = group.lower()\n        for item in self._items:\n            if group in item.groups:\n                result.append(item)\n            elif not case_sensitive and group in [g.lower() for g\n                                                            in item.groups]:\n                result.append(item)\n        return result", "response": "Return a list of items within a given group."}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nadd an item to the roster.", "response": "def add_item(self, item, replace = False):\n        \"\"\"\n        Add an item to the roster.\n\n        This will not automatically update the roster on the server.\n\n        :Parameters:\n            - `item`: the item to add\n            - `replace`: if `True` then existing item will be replaced,\n              otherwise a `ValueError` will be raised on conflict\n        :Types:\n            - `item`: `RosterItem`\n            - `replace`: `bool`\n        \"\"\"\n        if item.jid in self._jids:\n            if replace:\n                self.remove_item(item.jid)\n            else:\n                raise ValueError(\"JID already in the roster\")\n        index = len(self._items)\n        self._items.append(item)\n        self._jids[item.jid] = index"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\nremoving an item from the roster.", "response": "def remove_item(self, jid):\n        \"\"\"Remove item from the roster.\n\n        :Parameters:\n            - `jid`: JID of the item to remove\n        :Types:\n            - `jid`: `JID`\n        \"\"\"\n        if jid not in self._jids:\n            raise KeyError(jid)\n        index = self._jids[jid]\n        for i in range(index, len(self._jids)):\n            self._jids[self._items[i].jid] -= 1\n        del self._jids[jid]\n        del self._items[index]"}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\nloads a roster from an XML file.", "response": "def load_roster(self, source):\n        \"\"\"Load roster from an XML file.\n\n        Can be used before the connection is started to load saved\n        roster copy, for efficient retrieval of versioned roster.\n\n        :Parameters:\n            - `source`: file name or a file object\n        :Types:\n            - `source`: `str` or file-like object\n        \"\"\"\n        try:\n            tree = ElementTree.parse(source)\n        except ElementTree.ParseError as err:\n            raise ValueError(\"Invalid roster format: {0}\".format(err))\n        roster = Roster.from_xml(tree.getroot())\n        for item in roster:\n            item.verify_roster_result(True)\n        self.roster = roster"}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\nsaves the roster to a file.", "response": "def save_roster(self, dest, pretty = True):\n        \"\"\"Save the roster to an XML file.\n\n        Can be used to save the last known roster copy for faster loading\n        of a versioned roster (if server supports that).\n\n        :Parameters:\n            - `dest`: file name or a file object\n            - `pretty`: pretty-format the roster XML\n        :Types:\n            - `dest`: `str` or file-like object\n            - `pretty`: `bool`\n        \"\"\"\n        if self.roster is None:\n            raise ValueError(\"No roster\")\n        element = self.roster.as_xml()\n        if pretty:\n            if len(element):\n                element.text = u'\\n  '\n            p_child = None\n            for child in element:\n                if p_child is not None:\n                    p_child.tail = u'\\n  '\n                if len(child):\n                    child.text = u'\\n    '\n                p_grand = None\n                for grand in child:\n                    if p_grand is not None:\n                        p_grand.tail = u'\\n    '\n                    p_grand = grand\n                if p_grand is not None:\n                    p_grand.tail = u'\\n  '\n                p_child = child\n            if p_child is not None:\n                p_child.tail = u\"\\n\"\n        tree = ElementTree.ElementTree(element)\n        tree.write(dest, \"utf-8\")"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\ncheck for roster-related features in the stream features received and set self.server_features accordingly.", "response": "def handle_got_features_event(self, event):\n        \"\"\"Check for roster related features in the stream features received\n        and set `server_features` accordingly.\n        \"\"\"\n        server_features = set()\n        logger.debug(\"Checking roster-related features\")\n        if event.features.find(FEATURE_ROSTERVER) is not None:\n            logger.debug(\"  Roster versioning available\")\n            server_features.add(\"versioning\")\n        if event.features.find(FEATURE_APPROVALS) is not None:\n            logger.debug(\"  Subscription pre-approvals available\")\n            server_features.add(\"pre-approvals\")\n        self.server_features = server_features"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nrequest roster upon login.", "response": "def handle_authorized_event(self, event):\n        \"\"\"Request roster upon login.\"\"\"\n        self.server = event.authorized_jid.bare()\n        if \"versioning\" in self.server_features:\n            if self.roster is not None and self.roster.version is not None:\n                version = self.roster.version\n            else:\n                version = u\"\"\n        else:\n            version = None\n        self.request_roster(version)"}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nrequest roster from server.", "response": "def request_roster(self, version = None):\n        \"\"\"Request roster from server.\n\n        :Parameters:\n            - `version`: if not `None` versioned roster will be requested\n              for given local version. Use \"\" to request full roster.\n        :Types:\n            - `version`: `unicode`\n        \"\"\"\n        processor = self.stanza_processor\n        request = Iq(stanza_type = \"get\")\n        request.set_payload(RosterPayload(version = version))\n        processor.set_response_handlers(request,\n                                    self._get_success, self._get_error)\n        processor.send(request)"}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\nhandles successful response to the roster request.", "response": "def _get_success(self, stanza):\n        \"\"\"Handle successful response to the roster request.\n        \"\"\"\n        payload = stanza.get_payload(RosterPayload)\n        if payload is None:\n            if \"versioning\" in self.server_features and self.roster:\n                logger.debug(\"Server will send roster delta in pushes\")\n            else:\n                logger.warning(\"Bad roster response (no payload)\")\n                self._event_queue.put(RosterNotReceivedEvent(self, stanza))\n                return\n        else:\n            items = list(payload)\n            for item in items:\n                item.verify_roster_result(True)\n            self.roster = Roster(items, payload.version)\n        self._event_queue.put(RosterReceivedEvent(self, self.roster))"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef _get_error(self, stanza):\n        if stanza:\n            logger.debug(u\"Roster request failed: {0}\".format(\n                                                stanza.error.condition_name))\n        else:\n            logger.debug(u\"Roster request failed: timeout\")\n        self._event_queue.put(RosterNotReceivedEvent(self, stanza))", "response": "Handle failure of the roster request."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef handle_roster_push(self, stanza):\n        if self.server is None and stanza.from_jid:\n            logger.debug(u\"Server address not known, cannot verify roster push\"\n                                \" from {0}\".format(stanza.from_jid))\n            return stanza.make_error_response(u\"service-unavailable\")\n        if self.server and stanza.from_jid and stanza.from_jid != self.server:\n            logger.debug(u\"Roster push from invalid source: {0}\".format(\n                                                            stanza.from_jid))\n            return stanza.make_error_response(u\"service-unavailable\")\n        payload = stanza.get_payload(RosterPayload)\n        if len(payload) != 1:\n            logger.warning(\"Bad roster push received ({0} items)\"\n                                                    .format(len(payload)))\n            return stanza.make_error_response(u\"bad-request\")\n        if self.roster is None:\n            logger.debug(\"Dropping roster push - no roster here\")\n            return True\n        item = payload[0]\n        item.verify_roster_push(True)\n        old_item = self.roster.get(item.jid)\n        if item.subscription == \"remove\":\n            if old_item:\n                self.roster.remove_item(item.jid)\n        else:\n            self.roster.add_item(item, replace = True)\n        self._event_queue.put(RosterUpdatedEvent(self, old_item, item))\n        return stanza.make_result_response()", "response": "Handle a roster push stanza."}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nadds a contact to the roster.", "response": "def add_item(self, jid, name = None, groups = None,\n                                callback = None, error_callback = None):\n        \"\"\"Add a contact to the roster.\n\n        :Parameters:\n            - `jid`: contact's jid\n            - `name`: name for the contact\n            - `groups`: sequence of group names the contact should belong to\n            - `callback`: function to call when the request succeeds. It should\n              accept a single argument - a `RosterItem` describing the\n              requested change\n            - `error_callback`: function to call when the request fails. It\n              should accept a single argument - an error stanza received\n              (`None` in case of timeout)\n        :Types:\n            - `jid`: `JID`\n            - `name`: `unicode`\n            - `groups`: sequence of `unicode`\n        \"\"\"\n        # pylint: disable=R0913\n        if jid in self.roster:\n            raise ValueError(\"{0!r} already in the roster\".format(jid))\n        item = RosterItem(jid, name, groups)\n        self._roster_set(item, callback, error_callback)"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\nmodify a roster item.", "response": "def update_item(self, jid, name = NO_CHANGE, groups = NO_CHANGE,\n                                callback = None, error_callback = None):\n        \"\"\"Modify a contact in the roster.\n\n        :Parameters:\n            - `jid`: contact's jid\n            - `name`: a new name for the contact\n            - `groups`: a sequence of group names the contact should belong to\n            - `callback`: function to call when the request succeeds. It should\n              accept a single argument - a `RosterItem` describing the\n              requested change\n            - `error_callback`: function to call when the request fails. It\n              should accept a single argument - an error stanza received\n              (`None` in case of timeout)\n        :Types:\n            - `jid`: `JID`\n            - `name`: `unicode`\n            - `groups`: sequence of `unicode`\n        \"\"\"\n        # pylint: disable=R0913\n        item = self.roster[jid]\n        if name is NO_CHANGE and groups is NO_CHANGE:\n            return\n        if name is NO_CHANGE:\n            name = item.name\n        if groups is NO_CHANGE:\n            groups = item.groups\n        item = RosterItem(jid, name, groups)\n        self._roster_set(item, callback, error_callback)"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\nremove a contact from the roster.", "response": "def remove_item(self, jid, callback = None, error_callback = None):\n        \"\"\"Remove a contact from the roster.\n\n        :Parameters:\n            - `jid`: contact's jid\n            - `callback`: function to call when the request succeeds. It should\n              accept a single argument - a `RosterItem` describing the\n              requested change\n            - `error_callback`: function to call when the request fails. It\n              should accept a single argument - an error stanza received\n              (`None` in case of timeout)\n        :Types:\n            - `jid`: `JID`\n        \"\"\"\n        if jid not in self.roster:\n            raise KeyError(jid)\n        item = RosterItem(jid, subscription = \"remove\")\n        self._roster_set(item, callback, error_callback)"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef _roster_set(self, item, callback, error_callback):\n        stanza = Iq(to_jid = self.server, stanza_type = \"set\")\n        payload = RosterPayload([item])\n        stanza.set_payload(payload)\n        def success_cb(result_stanza):\n            \"\"\"Success callback for roster set.\"\"\"\n            if callback:\n                callback(item)\n        def error_cb(error_stanza):\n            \"\"\"Error callback for roster set.\"\"\"\n            if error_callback:\n                error_callback(error_stanza)\n            else:\n                logger.error(\"Roster change of '{0}' failed\".format(item.jid))\n        processor = self.stanza_processor\n        processor.set_response_handlers(stanza,\n                                    success_cb, error_cb)\n        processor.send(stanza)", "response": "Send a roster set stanza to the server."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef free(self):\n        if not self.borrowed:\n            self.xmlnode.unlinkNode()\n            self.xmlnode.freeNode()\n        self.xmlnode=None", "response": "Free the XML node owned by this instance."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\nevaluating the XPath expression in the context of self.xmlnode.", "response": "def xpath_eval(self,expr):\n        \"\"\"\n        Evaluate XPath expression in context of `self.xmlnode`.\n\n        :Parameters:\n            - `expr`: the XPath expression\n        :Types:\n            - `expr`: `unicode`\n\n        :return: the result of the expression evaluation.\n        :returntype: list of `libxml2.xmlNode`\n        \"\"\"\n        ctxt = common_doc.xpathNewContext()\n        ctxt.setContextNode(self.xmlnode)\n        ctxt.xpathRegisterNs(\"muc\",self.ns.getContent())\n        ret=ctxt.xpathEval(to_utf8(expr))\n        ctxt.xpathFreeContext()\n        return ret"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef set_history(self, parameters):\n        for child in xml_element_iter(self.xmlnode.children):\n            if get_node_ns_uri(child) == MUC_NS and child.name == \"history\":\n                child.unlinkNode()\n                child.freeNode()\n                break\n\n        if parameters.maxchars and parameters.maxchars < 0:\n            raise ValueError(\"History parameter maxchars must be positive\")\n        if parameters.maxstanzas and parameters.maxstanzas < 0:\n            raise ValueError(\"History parameter maxstanzas must be positive\")\n        if parameters.maxseconds and parameters.maxseconds < 0:\n            raise ValueError(\"History parameter maxseconds must be positive\")\n\n        hnode=self.xmlnode.newChild(self.xmlnode.ns(), \"history\", None)\n\n        if parameters.maxchars is not None:\n            hnode.setProp(\"maxchars\", str(parameters.maxchars))\n        if parameters.maxstanzas is not None:\n            hnode.setProp(\"maxstanzas\", str(parameters.maxstanzas))\n        if parameters.maxseconds is not None:\n            hnode.setProp(\"maxseconds\", str(parameters.maxseconds))\n        if parameters.since is not None:\n            hnode.setProp(\"since\", parameters.since.strftime(\"%Y-%m-%dT%H:%M:%SZ\"))", "response": "Sets the history parameters."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\nreturning history parameters carried by the stanza.", "response": "def get_history(self):\n        \"\"\"Return history parameters carried by the stanza.\n\n        :returntype: `HistoryParameters`\"\"\"\n        for child in xml_element_iter(self.xmlnode.children):\n            if get_node_ns_uri(child) == MUC_NS and child.name == \"history\":\n                maxchars = from_utf8(child.prop(\"maxchars\"))\n                if maxchars is not None:\n                    maxchars = int(maxchars)\n                maxstanzas = from_utf8(child.prop(\"maxstanzas\"))\n                if maxstanzas is not None:\n                    maxstanzas = int(maxstanzas)\n                maxseconds = from_utf8(child.prop(\"maxseconds\"))\n                if maxseconds is not None:\n                    maxseconds = int(maxseconds)\n                # TODO: since -- requires parsing of Jabber dateTime profile\n                since = None\n                return HistoryParameters(maxchars, maxstanzas, maxseconds, since)"}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\nsets the password for the MUC request.", "response": "def set_password(self, password):\n        \"\"\"Set password for the MUC request.\n\n        :Parameters:\n            - `password`: password\n        :Types:\n            - `password`: `unicode`\"\"\"\n        for child in xml_element_iter(self.xmlnode.children):\n            if get_node_ns_uri(child) == MUC_NS and child.name == \"password\":\n                child.unlinkNode()\n                child.freeNode()\n                break\n\n        if password is not None:\n            self.xmlnode.newTextChild(self.xmlnode.ns(), \"password\", to_utf8(password))"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef get_password(self):\n        for child in xml_element_iter(self.xmlnode.children):\n            if get_node_ns_uri(child) == MUC_NS and child.name == \"password\":\n                return from_utf8(child.getContent())\n        return None", "response": "Get password from the MUC request."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef __init(self,affiliation,role,jid=None,nick=None,actor=None,reason=None):\n        if not affiliation:\n            affiliation=None\n        elif affiliation not in affiliations:\n            raise ValueError(\"Bad affiliation\")\n        self.affiliation=affiliation\n        if not role:\n            role=None\n        elif role not in roles:\n            raise ValueError(\"Bad role\")\n        self.role=role\n        if jid:\n            self.jid=JID(jid)\n        else:\n            self.jid=None\n        if actor:\n            self.actor=JID(actor)\n        else:\n            self.actor=None\n        self.nick=nick\n        self.reason=reason", "response": "Initialize a MucItem object from a set of attributes."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef __from_xmlnode(self, xmlnode):\n        actor=None\n        reason=None\n        n=xmlnode.children\n        while n:\n            ns=n.ns()\n            if ns and ns.getContent()!=MUC_USER_NS:\n                n=n.next\n                continue\n            if n.name==\"actor\":\n                actor=n.getContent()\n            if n.name==\"reason\":\n                reason=n.getContent()\n            n=n.next\n        self.__init(\n            from_utf8(xmlnode.prop(\"affiliation\")),\n            from_utf8(xmlnode.prop(\"role\")),\n            from_utf8(xmlnode.prop(\"jid\")),\n            from_utf8(xmlnode.prop(\"nick\")),\n            from_utf8(actor),\n            from_utf8(reason),\n            );", "response": "Initialize a MucItem object from an XML node."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\nreturn an XML representation of the object.", "response": "def as_xml(self,parent):\n        \"\"\"\n        Create XML representation of `self`.\n\n        :Parameters:\n            - `parent`: the element to which the created node should be linked to.\n        :Types:\n            - `parent`: `libxml2.xmlNode`\n\n        :return: an XML node.\n        :returntype: `libxml2.xmlNode`\n        \"\"\"\n        n=parent.newChild(None,\"item\",None)\n        if self.actor:\n            n.newTextChild(None,\"actor\",to_utf8(self.actor))\n        if self.reason:\n            n.newTextChild(None,\"reason\",to_utf8(self.reason))\n        n.setProp(\"affiliation\",to_utf8(self.affiliation))\n        if self.role:\n            n.setProp(\"role\",to_utf8(self.role))\n        if self.jid:\n            n.setProp(\"jid\",to_utf8(self.jid.as_unicode()))\n        if self.nick:\n            n.setProp(\"nick\",to_utf8(self.nick))\n        return n"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\ninitialize a MucStatus element from a status code.", "response": "def __init(self,code):\n        \"\"\"Initialize a `MucStatus` element from a status code.\n\n        :Parameters:\n            - `code`: the status code.\n        :Types:\n            - `code`: `int`\n        \"\"\"\n        code=int(code)\n        if code<0 or code>999:\n            raise ValueError(\"Bad status code\")\n        self.code=code"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef as_xml(self,parent):\n        n=parent.newChild(None,\"status\",None)\n        n.setProp(\"code\",\"%03i\" % (self.code,))\n        return n", "response": "Create XML representation of the object."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef get_items(self):\n        if not self.xmlnode.children:\n            return []\n        ret=[]\n        n=self.xmlnode.children\n        while n:\n            ns=n.ns()\n            if ns and ns.getContent()!=self.ns:\n                pass\n            elif n.name==\"item\":\n                ret.append(MucItem(n))\n            elif n.name==\"status\":\n                ret.append(MucStatus(n))\n            # FIXME: alt,decline,invite,password\n            n=n.next\n        return ret", "response": "Get a list of objects describing the content of self."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nclear the content of self.xmlnode, removing all items, statuses, etc.", "response": "def clear(self):\n        \"\"\"\n        Clear the content of `self.xmlnode` removing all <item/>, <status/>, etc.\n        \"\"\"\n        if not self.xmlnode.children:\n            return\n        n=self.xmlnode.children\n        while n:\n            # remember the sibling before the node is unlinked and freed\n            next_node=n.next\n            ns=n.ns()\n            if ns and ns.getContent()!=MUC_USER_NS:\n                pass\n            else:\n                n.unlinkNode()\n                n.freeNode()\n            n=next_node"}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\nadds an item to the muc.", "response": "def add_item(self,item):\n        \"\"\"Add an item to `self`.\n\n        :Parameters:\n            - `item`: the item to add.\n        :Types:\n            - `item`: `MucItemBase`\n        \"\"\"\n        if not isinstance(item,MucItemBase):\n            raise TypeError(\"Bad item type for muc#user\")\n        item.as_xml(self.xmlnode)"}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nget the MUC specific payload element.", "response": "def get_muc_child(self):\n        \"\"\"\n        Get the MUC specific payload element.\n\n        :return: the object describing the stanza payload in MUC namespace.\n        :returntype: `MucX` or `MucUserX` or `MucAdminQuery` or `MucOwnerX`\n        \"\"\"\n        if self.muc_child:\n            return self.muc_child\n        if not self.xmlnode.children:\n            return None\n        n=self.xmlnode.children\n        while n:\n            if n.name not in (\"x\",\"query\"):\n                n=n.next\n                continue\n            ns=n.ns()\n            if not ns:\n                n=n.next\n                continue\n            ns_uri=ns.getContent()\n            if (n.name,ns_uri)==(\"x\",MUC_NS):\n                self.muc_child=MucX(n)\n                return self.muc_child\n            if (n.name,ns_uri)==(\"x\",MUC_USER_NS):\n                self.muc_child=MucUserX(n)\n                return self.muc_child\n            if (n.name,ns_uri)==(\"query\",MUC_ADMIN_NS):\n                self.muc_child=MucAdminQuery(n)\n                return self.muc_child\n            if (n.name,ns_uri)==(\"query\",MUC_OWNER_NS):\n                self.muc_child=MucOwnerX(n)\n                return self.muc_child\n            n=n.next"}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nremoves the MUC specific stanza payload element.", "response": "def clear_muc_child(self):\n        \"\"\"\n        Remove the MUC specific stanza payload element.\n        \"\"\"\n        if self.muc_child:\n            self.muc_child.free_borrowed()\n            self.muc_child=None\n        if not self.xmlnode.children:\n            return\n        n=self.xmlnode.children\n        while n:\n            if n.name not in (\"x\",\"query\"):\n                n=n.next\n                continue\n            ns=n.ns()\n            if not ns:\n                n=n.next\n                continue\n            ns_uri=ns.getContent()\n            # remember the sibling before the node is unlinked and freed\n            next_node=n.next\n            if ns_uri in (MUC_NS,MUC_USER_NS,MUC_ADMIN_NS,MUC_OWNER_NS):\n                n.unlinkNode()\n                n.freeNode()\n            n=next_node"}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\ncreates the MucUserX element in the stanza.", "response": "def make_muc_userinfo(self):\n        \"\"\"\n        Create <x xmlns=\"...muc#user\"/> element in the stanza.\n\n        :return: the element created.\n        :returntype: `MucUserX`\n        \"\"\"\n        self.clear_muc_child()\n        self.muc_child=MucUserX(parent=self.xmlnode)\n        return self.muc_child"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef make_muc_admin_quey(self):\n        self.clear_muc_child()\n        self.muc_child=MucAdminQuery(parent=self.xmlnode)\n        return self.muc_child", "response": "Create the MucAdminQuery element in the stanza."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\nmake the presence stanza a MUC room join request.", "response": "def make_join_request(self, password = None, history_maxchars = None,\n            history_maxstanzas = None, history_seconds = None,\n            history_since = None):\n        \"\"\"\n        Make the presence stanza a MUC room join request.\n\n        :Parameters:\n            - `password`: password to the room.\n            - `history_maxchars`: limit of the total number of characters in\n              history.\n            - `history_maxstanzas`: limit of the total number of messages in\n              history.\n            - `history_seconds`: send only messages received in the last\n              `seconds` seconds.\n            - `history_since`: Send only the messages received since the\n              dateTime specified (UTC).\n        :Types:\n            - `password`: `unicode`\n            - `history_maxchars`: `int`\n            - `history_maxstanzas`: `int`\n            - `history_seconds`: `int`\n            - `history_since`: `datetime.datetime`\n        \"\"\"\n        self.clear_muc_child()\n        self.muc_child=MucX(parent=self.xmlnode)\n        if (history_maxchars is not None or history_maxstanzas is not None\n                or history_seconds is not None or history_since is not None):\n            history = HistoryParameters(history_maxchars, history_maxstanzas,\n                    history_seconds, history_since)\n            self.muc_child.set_history(history)\n        if password is not None:\n            self.muc_child.set_password(password)"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\nreturning the information contained in the MUC room join request or None.", "response": "def get_join_info(self):\n        \"\"\"If `self` is a MUC room join request return the information contained.\n\n        :return: the join request details or `None`.\n        :returntype: `MucX`\n        \"\"\"\n        x=self.get_muc_child()\n        if not x:\n            return None\n        if not isinstance(x,MucX):\n            return None\n        return x"}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nmake the iq stanza a MUC room participant kick request.", "response": "def make_kick_request(self,nick,reason):\n        \"\"\"\n        Make the iq stanza a MUC room participant kick request.\n\n        :Parameters:\n            - `nick`: nickname of user to kick.\n            - `reason`: reason of the kick.\n        :Types:\n            - `nick`: `unicode`\n            - `reason`: `unicode`\n\n        :return: object describing the kick request details.\n        :returntype: `MucItem`\n        \"\"\"\n        self.clear_muc_child()\n        self.muc_child=MucAdminQuery(parent=self.xmlnode)\n        item=MucItem(\"none\",\"none\",nick=nick,reason=reason)\n        self.muc_child.add_item(item)\n        return self.muc_child"}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\ndoing NFKC normalization of Unicode data.", "response": "def nfkc(data):\n    \"\"\"Do NFKC normalization of Unicode data.\n\n    :Parameters:\n        - `data`: list of Unicode characters or Unicode string.\n\n    :return: normalized Unicode string.\"\"\"\n    if isinstance(data, list):\n        data = u\"\".join(data)\n    return unicodedata.normalize(\"NFKC\", data)"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\nmodify stringprep cache size.", "response": "def set_stringprep_cache_size(size):\n    \"\"\"Modify stringprep cache size.\n\n    :Parameters:\n        - `size`: new cache size\n    \"\"\"\n    # pylint: disable-msg=W0603\n    global _stringprep_cache_size\n    _stringprep_cache_size = size\n    if len(Profile.cache_items) > size:\n        remove = Profile.cache_items[:-size]\n        for profile, key in remove:\n            try:\n                del profile.cache[key]\n            except KeyError:\n                pass\n        Profile.cache_items = Profile.cache_items[-size:]"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef prepare(self, data):\n        ret = self.cache.get(data)\n        if ret is not None:\n            return ret\n        result = self.map(data)\n        if self.normalization:\n            result = self.normalization(result)\n        result = self.prohibit(result)\n        result = self.check_unassigned(result)\n        if self.bidi:\n            result = self.check_bidi(result)\n        if isinstance(result, list):\n            result = u\"\".join(result)\n        if len(self.cache_items) >= _stringprep_cache_size:\n            remove = self.cache_items[: -_stringprep_cache_size // 2]\n            for profile, key in remove:\n                try:\n                    del profile.cache[key]\n                except KeyError:\n                    pass\n            self.cache_items[:] = self.cache_items[\n                                                -_stringprep_cache_size // 2 :]\n        self.cache_items.append((self, data))\n        self.cache[data] = result\n        return result", "response": "Run the complete string preparation procedure for 'stored' strings (mapping, normalization, prohibited, unassigned and bidi checks) and cache the result."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef prepare_query(self, data):\n        data = self.map(data)\n        if self.normalization:\n            data = self.normalization(data)\n        data = self.prohibit(data)\n        if self.bidi:\n            data = self.check_bidi(data)\n        if isinstance(data, list):\n            data = u\"\".join(data)\n        return data", "response": "Complete string preparation procedure for query strings."}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nmapping a string of characters into a list of strings.", "response": "def map(self, data):\n        \"\"\"Mapping part of string preparation.\"\"\"\n        result = []\n        for char in data:\n            ret = None\n            for lookup in self.mapping:\n                ret = lookup(char)\n                if ret is not None:\n                    break\n            if ret is not None:\n                result.append(ret)\n            else:\n                result.append(char)\n        return result"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef prohibit(self, data):\n        for char in data:\n            for lookup in self.prohibited:\n                if lookup(char):\n                    raise StringprepError(\"Prohibited character: {0!r}\"\n                                                                .format(char))\n        return data", "response": "Checks for prohibited characters."}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nchecks for unassigned characters.", "response": "def check_unassigned(self, data):\n        \"\"\"Checks for unassigned character codes.\"\"\"\n        for char in data:\n            for lookup in self.unassigned:\n                if lookup(char):\n                    raise StringprepError(\"Unassigned character: {0!r}\"\n                                                                .format(char))\n        return data"}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\nchecks if a string is valid for bidirectional printing.", "response": "def check_bidi(data):\n        \"\"\"Checks if a string is valid for bidirectional printing.\"\"\"\n        has_l = False\n        has_ral = False\n        for char in data:\n            if stringprep.in_table_d1(char):\n                has_ral = True\n            elif stringprep.in_table_d2(char):\n                has_l = True\n        if has_l and has_ral:\n            raise StringprepError(\"Both RandALCat and LCat characters present\")\n        if has_ral and (not stringprep.in_table_d1(data[0])\n                                    or not stringprep.in_table_d1(data[-1])):\n            raise StringprepError(\"The first and the last character must\"\n                                                                \" be RandALCat\")\n        return data"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef hold_exception(method):\n    @functools.wraps(method)\n    def wrapper(self, *args, **kwargs):\n        \"\"\"Wrapper for methods decorated with `hold_exception`.\"\"\"\n        # pylint: disable=W0703,W0212\n        try:\n            return method(self, *args, **kwargs)\n        except Exception:\n            if self.exc_info:\n                raise\n            if not self._stack:\n                logger.debug('@hold_exception wrapped method {0!r} called'\n                            ' from outside of the main loop'.format(method))\n                raise\n            self.exc_info = sys.exc_info()\n            logger.debug(u\"exception in glib main loop callback:\",\n                                                exc_info = self.exc_info)\n            # pylint: disable=W0212\n            main_loop = self._stack[-1]\n            if main_loop is not None:\n                main_loop.quit()\n            return False\n    return wrapper", "response": "Decorator for glib main loop callback methods. An exception raised by the wrapped method is stored in `self.exc_info`, the innermost main loop is stopped and `False` is returned; if an exception is already held or no main loop is running, the exception is re-raised."}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nregisters an io - handler with the glib main loop.", "response": "def _configure_io_handler(self, handler):\n        \"\"\"Register an io-handler with the glib main loop.\"\"\"\n        if self.check_events():\n            return\n        if handler in self._unprepared_handlers:\n            old_fileno = self._unprepared_handlers[handler]\n            prepared = self._prepare_io_handler(handler)\n        else:\n            old_fileno = None\n            prepared = True\n        fileno = handler.fileno()\n        if old_fileno is not None and fileno != old_fileno:\n            tag = self._io_sources.pop(handler, None)\n            if tag is not None:\n                glib.source_remove(tag)\n        if not prepared:\n            self._unprepared_handlers[handler] = fileno\n        if fileno is None:\n            logger.debug(\" {0!r}.fileno() is None, not polling\"\n                                                    .format(handler))\n            return\n        events = 0\n        if handler.is_readable():\n            logger.debug(\" {0!r} readable\".format(handler))\n            events |= glib.IO_IN | glib.IO_ERR\n        if handler.is_writable():\n            logger.debug(\" {0!r} writable\".format(handler))\n            events |= glib.IO_OUT | glib.IO_HUP | glib.IO_ERR\n        if events:\n            logger.debug(\" registering {0!r} handler fileno {1} for\"\n                            \" events {2}\".format(handler, fileno, events))\n            glib.io_add_watch(fileno, events, self._io_callback, handler)"}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nis called by glib when an I/O event occurs.", "response": "def _io_callback(self, fileno, condition, handler):\n        \"\"\"Called by glib on I/O event.\"\"\"\n        # pylint: disable=W0613\n        self._anything_done = True\n        logger.debug(\"_io_callback called for {0!r}, cond: {1}\".format(handler,\n                                                                    condition))\n        try:\n            if condition & glib.IO_HUP:\n                handler.handle_hup()\n            if condition & glib.IO_IN:\n                handler.handle_read()\n            elif condition & glib.IO_ERR:\n                handler.handle_err()\n            if condition & glib.IO_OUT:\n                handler.handle_write()\n            if self.check_events():\n                return False\n        finally:\n            self._io_sources.pop(handler, None)\n            self._configure_io_handler(handler)\n            self._prepare_pending()\n        return False"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef _prepare_io_handler(self, handler):\n        logger.debug(\" preparing handler: {0!r}\".format(handler))\n        self._unprepared_pending.discard(handler)\n        ret = handler.prepare()\n        logger.debug(\"   prepare result: {0!r}\".format(ret))\n        if isinstance(ret, HandlerReady):\n            del self._unprepared_handlers[handler]\n            prepared = True\n        elif isinstance(ret, PrepareAgain):\n            if ret.timeout == 0:\n                tag = glib.idle_add(self._prepare_io_handler_cb, handler)\n                self._prepare_sources[handler] = tag\n            elif ret.timeout is not None:\n                timeout = ret.timeout\n                timeout = int(timeout * 1000)\n                if not timeout:\n                    timeout = 1\n                tag = glib.timeout_add(timeout, self._prepare_io_handler_cb,\n                                                                    handler)\n                self._prepare_sources[handler] = tag\n            else:\n                self._unprepared_pending.add(handler)\n            prepared = False\n        else:\n            raise TypeError(\"Unexpected result type from prepare()\")\n        return prepared", "response": "Call the `interfaces.IOHandler.prepare` method and remove the handler from the unprepared handler list once it reports that it is ready."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nremove an I/O handler.", "response": "def _remove_io_handler(self, handler):\n        \"\"\"Remove an i/o-handler.\"\"\"\n        if handler in self._unprepared_handlers:\n            del self._unprepared_handlers[handler]\n        tag = self._prepare_sources.pop(handler, None)\n        if tag is not None:\n            glib.source_remove(tag)\n        tag = self._io_sources.pop(handler, None)\n        if tag is not None:\n            glib.source_remove(tag)"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef _add_timeout_handler(self, handler):\n        # pylint: disable=W0212\n        for dummy, method in inspect.getmembers(handler, callable):\n            if not hasattr(method, \"_pyxmpp_timeout\"):\n                continue\n            tag = glib.timeout_add(int(method._pyxmpp_timeout * 1000),\n                                                self._timeout_cb, method)\n            self._timer_sources[method] = tag", "response": "Add a TimeoutHandler to the main loop."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef _remove_timeout_handler(self, handler):\n        for dummy, method in inspect.getmembers(handler, callable):\n            if not hasattr(method, \"_pyxmpp_timeout\"):\n                continue\n            tag = self._timer_sources.pop(method, None)\n            if tag is not None:\n                glib.source_remove(tag)", "response": "Remove a TimeoutHandler from the main loop."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\ncall the timeout handler due.", "response": "def _timeout_cb(self, method):\n        \"\"\"Call the timeout handler due.\n        \"\"\"\n        self._anything_done = True\n        logger.debug(\"_timeout_cb() called for: {0!r}\".format(method))\n        result = method()\n        # pylint: disable=W0212\n        rec = method._pyxmpp_recurring\n        if rec:\n            self._prepare_pending()\n            return True\n\n        if rec is None and result is not None:\n            logger.debug(\" auto-recurring, restarting in {0} s\"\n                                                            .format(result))\n            tag = glib.timeout_add(int(result * 1000), self._timeout_cb, method)\n            self._timer_sources[method] = tag\n        else:\n            self._timer_sources.pop(method, None)\n        self._prepare_pending()\n        return False"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\nstopping the loop after the time specified in the loop call.", "response": "def _loop_timeout_cb(self, main_loop):\n        \"\"\"Stops the loop after the time specified in the `loop` call.\n        \"\"\"\n        self._anything_done = True\n        logger.debug(\"_loop_timeout_cb() called\")\n        main_loop.quit()"}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\ncreates Iq Message or Presence object for XML stanza element", "response": "def stanza_factory(element, return_path = None, language = None):\n    \"\"\"Creates Iq, Message or Presence object for XML stanza `element`\n\n    :Parameters:\n        - `element`: the stanza XML element\n        - `return_path`: object through which responses to this stanza should\n          be sent (will be weakly referenced by the stanza object).\n        - `language`: default language for the stanza\n    :Types:\n        - `element`: :etree:`ElementTree.Element`\n        - `return_path`: `StanzaRoute`\n        - `language`: `unicode`\n    \"\"\"\n    tag = element.tag\n    if tag.endswith(\"}iq\") or tag == \"iq\":\n        return Iq(element, return_path = return_path, language = language)\n    if tag.endswith(\"}message\") or tag == \"message\":\n        return Message(element, return_path = return_path, language = language)\n    if tag.endswith(\"}presence\") or tag == \"presence\":\n        return Presence(element, return_path = return_path, language = language)\n    else:\n        return Stanza(element, return_path = return_path, language = language)"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\nexamining the response returned by a stanza handler and sending all the stanzas provided.", "response": "def _process_handler_result(self, response):\n        \"\"\"Examines the response returned by a stanza handler and sends all\n        stanzas provided.\n\n        :Parameters:\n            - `response`: the response to process. `None` or `False` means 'not\n              handled'. `True` means 'handled'. Stanza or stanza list means\n              handled with the stanzas to send back\n        :Types:\n            - `response`: `bool` or `Stanza` or iterable of `Stanza`\n\n        :Returns:\n            - `True`: if `response` is `Stanza`, iterable or `True` (meaning\n              the stanza was processed).\n            - `False`: when `response` is `False` or `None`\n\n        :returntype: `bool`\n        \"\"\"\n\n        if response is None or response is False:\n            return False\n\n        if isinstance(response, Stanza):\n            self.send(response)\n            return True\n\n        try:\n            response = iter(response)\n        except TypeError:\n            return bool(response)\n\n        for stanza in response:\n            if isinstance(stanza, Stanza):\n                self.send(stanza)\n            else:\n                logger.warning(u\"Unexpected object in stanza handler result:\"\n                                                    u\" {0!r}\".format(stanza))\n        return True"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nprocessing an IQ stanza of type response or error.", "response": "def _process_iq_response(self, stanza):\n        \"\"\"Process IQ stanza of type 'response' or 'error'.\n\n        :Parameters:\n            - `stanza`: the stanza received\n        :Types:\n            - `stanza`: `Iq`\n\n        If a matching handler is available pass the stanza to it.  Otherwise\n        ignore it if it is \"error\" or \"result\" stanza or return\n        \"feature-not-implemented\" error if it is \"get\" or \"set\".\n        \"\"\"\n        stanza_id = stanza.stanza_id\n        from_jid = stanza.from_jid\n        if from_jid:\n            ufrom = from_jid.as_unicode()\n        else:\n            ufrom = None\n        res_handler = err_handler = None\n        try:\n            res_handler, err_handler = self._iq_response_handlers.pop(\n                                                    (stanza_id, ufrom))\n        except KeyError:\n            logger.debug(\"No response handler for id={0!r} from={1!r}\"\n                                                .format(stanza_id, ufrom))\n            logger.debug(\" from_jid: {0!r} peer: {1!r}  me: {2!r}\"\n                                        .format(from_jid, self.peer, self.me))\n            if ( (from_jid == self.peer or from_jid == self.me\n                            or self.me and from_jid == self.me.bare()) ):\n                try:\n                    logger.debug(\"  trying id={0!r} from=None\"\n                                                        .format(stanza_id))\n                    res_handler, err_handler = \\\n                            self._iq_response_handlers.pop(\n                                                    (stanza_id, None))\n                except KeyError:\n                    pass\n        if stanza.stanza_type == \"result\":\n            if res_handler:\n                response = res_handler(stanza)\n            else:\n    
            return False\n        else:\n            if err_handler:\n                response = err_handler(stanza)\n            else:\n                return False\n        self._process_handler_result(response)\n        return True"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\nprocess IQ stanza received.", "response": "def process_iq(self, stanza):\n        \"\"\"Process IQ stanza received.\n\n        :Parameters:\n            - `stanza`: the stanza received\n        :Types:\n            - `stanza`: `Iq`\n\n        If a matching handler is available pass the stanza to it.  Otherwise\n        ignore it if it is \"error\" or \"result\" stanza or return\n        \"feature-not-implemented\" error if it is \"get\" or \"set\".\"\"\"\n\n        typ = stanza.stanza_type\n        if typ in (\"result\", \"error\"):\n            return self._process_iq_response(stanza)\n        if typ not in (\"get\", \"set\"):\n            raise BadRequestProtocolError(\"Bad <iq/> type\")\n        logger.debug(\"Handling <iq type='{0}'> stanza: {1!r}\".format(\n                                                            typ, stanza))\n        payload = stanza.get_payload(None)\n        logger.debug(\"  payload: {0!r}\".format(payload))\n        if not payload:\n            raise BadRequestProtocolError(\"<iq/> stanza with no child element\")\n        handler = self._get_iq_handler(typ, payload)\n        if not handler:\n            payload = stanza.get_payload(None, specialize = True)\n            logger.debug(\"  specialized payload: {0!r}\".format(payload))\n            if not isinstance(payload, XMLPayload):\n                handler = self._get_iq_handler(typ, payload)\n        if handler:\n            response = handler(stanza)\n            self._process_handler_result(response)\n            return True\n        else:\n            raise ServiceUnavailableProtocolError(\"Not implemented\")"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef _get_iq_handler(self, iq_type, payload):\n        key = (payload.__class__, payload.handler_key)\n        logger.debug(\"looking up iq {0} handler for {1!r}, key: {2!r}\"\n                            .format(iq_type, payload, key))\n        logger.debug(\"handlers: {0!r}\".format(self._iq_handlers))\n        handler = self._iq_handlers[iq_type].get(key)\n        return handler", "response": "Get an IQ handler for given iq type and payload."}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\nsearches the handler list for handlers matching the given stanza type and payload namespace.", "response": "def __try_handlers(self, handler_list, stanza, stanza_type = None):\n        \"\"\" Search the handler list for handlers matching\n        given stanza type and payload namespace. Run the\n        handlers found ordering them by priority until\n        the first one which returns `True`.\n\n        :Parameters:\n            - `handler_list`: list of available handlers\n            - `stanza`: the stanza to handle\n            - `stanza_type`: stanza type override (value of its \"type\"\n              attribute)\n\n        :return: result of the last handler or `False` if no\n            handler was found.\n        \"\"\"\n        # pylint: disable=W0212\n        if stanza_type is None:\n            stanza_type = stanza.stanza_type\n        payload = stanza.get_all_payload()\n        classes = [p.__class__ for p in payload]\n        keys = [(p.__class__, p.handler_key) for p in payload]\n        for handler in handler_list:\n            type_filter = handler._pyxmpp_stanza_handled[1]\n            class_filter = handler._pyxmpp_payload_class_handled\n            extra_filter = handler._pyxmpp_payload_key\n            if type_filter != stanza_type:\n                continue\n            if class_filter:\n                if extra_filter is None and class_filter not in classes:\n                    continue\n                if extra_filter and (class_filter, extra_filter) not in keys:\n                    continue\n            response = handler(stanza)\n            if self._process_handler_result(response):\n                return True\n        return False"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef process_message(self, stanza):\n\n        stanza_type = stanza.stanza_type\n        if stanza_type is None:\n            stanza_type = \"normal\"\n\n        if self.__try_handlers(self._message_handlers, stanza,\n                                                stanza_type = stanza_type):\n            return True\n\n        if stanza_type not in (\"error\", \"normal\"):\n            # try the 'normal' handler in addition to the regular handler\n            return self.__try_handlers(self._message_handlers, stanza,\n                                                    stanza_type = \"normal\")\n        return False", "response": "Process a message stanza: pass it to a handler matching its type and payload namespace, falling back to the 'normal' handlers for any other non-error type."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef process_presence(self, stanza):\n\n        stanza_type = stanza.stanza_type\n        return self.__try_handlers(self._presence_handlers, stanza, stanza_type)", "response": "Process a presence stanza: pass it to a handler matching the stanza's type and payload namespace."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef route_stanza(self, stanza):\n        if stanza.stanza_type not in (\"error\", \"result\"):\n            response = stanza.make_error_response(u\"recipient-unavailable\")\n            self.send(response)\n        return True", "response": "Process a stanza not addressed to us: send a 'recipient-unavailable' error response for every stanza except those of type 'error' or 'result'."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef process_stanza(self, stanza):\n\n        self.fix_in_stanza(stanza)\n        to_jid = stanza.to_jid\n\n        if not self.process_all_stanzas and to_jid and (\n                to_jid != self.me and to_jid.bare() != self.me.bare()):\n            return self.route_stanza(stanza)\n\n        try:\n            if isinstance(stanza, Iq):\n                if self.process_iq(stanza):\n                    return True\n            elif isinstance(stanza, Message):\n                if self.process_message(stanza):\n                    return True\n            elif isinstance(stanza, Presence):\n                if self.process_presence(stanza):\n                    return True\n        except ProtocolError as err:\n            typ = stanza.stanza_type\n            if typ != 'error' and (typ != 'result'\n                                                or stanza.stanza_type != 'iq'):\n                response = stanza.make_error_response(err.xmpp_name)\n                self.send(response)\n                err.log_reported()\n            else:\n                err.log_ignored()\n            return\n        logger.debug(\"Unhandled %r stanza: %r\" % (stanza.stanza_type,\n                                                        stanza.serialize()))\n        return False", "response": "Process a stanza received from the stream."}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\nsets response handlers for an IQ get or set stanza.", "response": "def set_response_handlers(self, stanza, res_handler, err_handler,\n                                    timeout_handler = None, timeout = None):\n        \"\"\"Set response handler for an IQ \"get\" or \"set\" stanza.\n\n        This should be called before the stanza is sent.\n\n        :Parameters:\n            - `stanza`: an IQ stanza\n            - `res_handler`: result handler for the stanza. Will be called\n              when matching <iq type=\"result\"/> is received. Its only\n              argument will be the stanza received. The handler may return\n              a stanza or list of stanzas which should be sent in response.\n            - `err_handler`: error handler for the stanza. Will be called\n              when matching <iq type=\"error\"/> is received. Its only\n              argument will be the stanza received. The handler may return\n              a stanza or list of stanzas which should be sent in response\n              but this feature should rather not be used (it is better not to\n              respond to 'error' stanzas).\n            - `timeout_handler`: timeout handler for the stanza. Will be called\n              (with no arguments) when no matching <iq type=\"result\"/> or <iq\n              type=\"error\"/> is received in next `timeout` seconds.\n            - `timeout`: timeout value for the stanza. After that time if no\n              matching <iq type=\"result\"/> nor <iq type=\"error\"/> stanza is\n              received, then timeout_handler (if given) will be called.\n        \"\"\"\n        # pylint: disable-msg=R0913\n        self.lock.acquire()\n        try:\n            self._set_response_handlers(stanza, res_handler, err_handler,\n                                                    timeout_handler, timeout)\n        finally:\n            self.lock.release()"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef _set_response_handlers(self, stanza, res_handler, err_handler,\n                                timeout_handler = None, timeout = None):\n        \"\"\"Same as `set_response_handlers` but assume `self.lock` is\n        acquired.\"\"\"\n        # pylint: disable-msg=R0913\n        self.fix_out_stanza(stanza)\n        to_jid = stanza.to_jid\n        if to_jid:\n            to_jid = unicode(to_jid)\n        if timeout_handler:\n            def callback(dummy1, dummy2):\n                \"\"\"Wrapper for the timeout handler to make it compatible\n                with the `ExpiringDictionary` \"\"\"\n                timeout_handler()\n            self._iq_response_handlers.set_item(\n                                    (stanza.stanza_id, to_jid),\n                                    (res_handler,err_handler),\n                                    timeout, callback)\n        else:\n            self._iq_response_handlers.set_item(\n                                    (stanza.stanza_id, to_jid),\n                                    (res_handler, err_handler),\n                                    timeout)", "response": "Set the response handlers for the given stanza."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\ninstall stanza handlers provided by handler_objects.", "response": "def setup_stanza_handlers(self, handler_objects, usage_restriction):\n        \"\"\"Install stanza handlers provided by `handler_objects`\"\"\"\n        # pylint: disable=W0212\n        iq_handlers = {\"get\": {}, \"set\": {}}\n        message_handlers = []\n        presence_handlers = []\n        for obj in handler_objects:\n            if not isinstance(obj, XMPPFeatureHandler):\n                continue\n            obj.stanza_processor = self\n            for dummy, handler in inspect.getmembers(obj, callable):\n                if not hasattr(handler, \"_pyxmpp_stanza_handled\"):\n                    continue\n                element_name, stanza_type = handler._pyxmpp_stanza_handled\n                restr = handler._pyxmpp_usage_restriction\n                if restr and restr != usage_restriction:\n                    continue\n                if element_name == \"iq\":\n                    payload_class = handler._pyxmpp_payload_class_handled\n                    payload_key = handler._pyxmpp_payload_key\n                    if (payload_class, payload_key) in iq_handlers[stanza_type]:\n                        continue\n                    iq_handlers[stanza_type][(payload_class, payload_key)] = \\\n                            handler\n                    continue\n                elif element_name == \"message\":\n                    handler_list = message_handlers\n                elif element_name == \"presence\":\n                    handler_list = presence_handlers\n                else:\n                    raise ValueError(\"Bad handler decoration\")\n                handler_list.append(handler)\n        with self.lock:\n            self._iq_handlers = iq_handlers\n            self._presence_handlers = presence_handlers\n            self._message_handlers = message_handlers"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef send(self, stanza):\n        if self.uplink:\n            self.uplink.send(stanza)\n        else:\n            raise NoRouteError(\"No route for stanza\")", "response": "Send a stanza to the uplink if it is defined; otherwise raise NoRouteError."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef check_events(self):\n        if self.event_dispatcher.flush() is QUIT:\n            self._quit = True\n            return True\n        return False", "response": "Call the event dispatcher.\n\n        Quit the main loop when the QUIT event is reached."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef _add_timeout_handler(self, handler):\n        # pylint: disable-msg=W0212\n        now = time.time()\n        for dummy, method in inspect.getmembers(handler, callable):\n            if not hasattr(method, \"_pyxmpp_timeout\"):\n                continue\n            self._timeout_handlers.append((now + method._pyxmpp_timeout,\n                                                                    method))\n        self._timeout_handlers.sort(key = lambda x: x[0])", "response": "Add a TimeoutHandler to the main loop."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef _remove_timeout_handler(self, handler):\n        self._timeout_handlers = [(t, h) for (t, h)\n                                            in self._timeout_handlers\n                                            if h.im_self != handler]", "response": "Remove the given TimeoutHandler from the main loop."}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\ncalls the timeout handlers due.", "response": "def _call_timeout_handlers(self):\n        \"\"\"Call the timeout handlers due.\n\n        :Return: (next_event_timeout, sources_handled) tuple.\n            next_event_timeout is number of seconds until the next timeout\n            event, sources_handled is number of handlers called.\n        \"\"\"\n        sources_handled = 0\n        now = time.time()\n        schedule = None\n        while self._timeout_handlers:\n            schedule, handler = self._timeout_handlers[0]\n            if schedule <= now:\n                # pylint: disable-msg=W0212\n                logger.debug(\"About to call a timeout handler: {0!r}\"\n                                                        .format(handler))\n                self._timeout_handlers = self._timeout_handlers[1:]\n                result = handler()\n                logger.debug(\" handler result: {0!r}\".format(result))\n                rec = handler._pyxmpp_recurring\n                if rec:\n                    logger.debug(\" recurring, restarting in {0} s\"\n                                        .format(handler._pyxmpp_timeout))\n                    self._timeout_handlers.append(\n                                    (now + handler._pyxmpp_timeout, handler))\n                    self._timeout_handlers.sort(key = lambda x: x[0])\n                elif rec is None and result is not None:\n                    logger.debug(\" auto-recurring, restarting in {0} s\"\n                                                            .format(result))\n                    self._timeout_handlers.append((now + result, handler))\n                    self._timeout_handlers.sort(key = lambda x: x[0])\n                sources_handled += 1\n            else:\n                break\n            if self.check_events():\n                return 0, sources_handled\n        if self._timeout_handlers and schedule:\n            timeout = schedule - now\n        else:\n            timeout = None\n        return timeout, sources_handled"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef xml_elements_equal(element1, element2, ignore_level1_cdata = False):\n    # pylint: disable-msg=R0911\n    if None in (element1, element2) or element1.tag != element2.tag:\n        return False\n    attrs1 = element1.items()\n    attrs1.sort()\n    attrs2 = element2.items()\n    attrs2.sort()\n\n    if not ignore_level1_cdata:\n        if element1.text != element2.text:\n            return False\n\n    if attrs1 != attrs2:\n        return False\n\n    if len(element1) != len(element2):\n        return False\n    for child1, child2 in zip(element1, element2):\n        if child1.tag != child2.tag:\n            return False\n        if not ignore_level1_cdata:\n            if element1.text != element2.text:\n                return False\n        if not xml_elements_equal(child1, child2):\n            return False\n    return True", "response": "Checks if two XML elements are equal."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\nconvert a naive datetime object containing UTC time to a naive datetime object with local time.", "response": "def datetime_utc_to_local(utc):\n    \"\"\"\n    An ugly hack to convert naive :std:`datetime.datetime` object containing\n    UTC time to a naive :std:`datetime.datetime` object with local time.\n    It seems standard Python 2.3 library doesn't provide any better way to\n    do that.\n    \"\"\"\n    # pylint: disable-msg=C0103\n    ts = time.time()\n    cur = datetime.datetime.fromtimestamp(ts)\n    cur_utc = datetime.datetime.utcfromtimestamp(ts)\n\n    offset = cur - cur_utc\n    t = utc\n\n    d = datetime.timedelta(hours = 2)\n    while d > _MINUTE:\n        local = t + offset\n        tm = local.timetuple()\n        tm = tm[0:8] + (0, )\n        ts = time.mktime(tm)\n        u = datetime.datetime.utcfromtimestamp(ts)\n        diff = u - utc\n        if diff < _MINUTE and diff > -_MINUTE:\n            break\n        if diff > _NULLDELTA:\n            offset -= d\n        else:\n            offset += d\n        d //= 2\n    return local"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef _decode_subelements(self):\n        for child in self._element:\n            if child.tag == self._subject_tag:\n                self._subject = child.text\n            elif child.tag == self._body_tag:\n                self._body = child.text\n            elif child.tag == self._thread_tag:\n                self._thread = child.text", "response": "Decode the stanza subelements."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\nreturning the XML stanza representation.", "response": "def as_xml(self):\n        \"\"\"Return the XML stanza representation.\n\n        Always return an independent copy of the stanza XML representation,\n        which can be freely modified without affecting the stanza.\n\n        :returntype: :etree:`ElementTree.Element`\"\"\"\n        result = Stanza.as_xml(self)\n        if self._subject:\n            child = ElementTree.SubElement(result, self._subject_tag)\n            child.text = self._subject\n        if self._body:\n            child = ElementTree.SubElement(result, self._body_tag)\n            child.text = self._body\n        if self._thread:\n            child = ElementTree.SubElement(result, self._thread_tag)\n            child.text = self._thread\n        return result"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\ncreating a deep copy of the stanza.", "response": "def copy(self):\n        \"\"\"Create a deep copy of the stanza.\n\n        :returntype: `Message`\"\"\"\n        result = Message(None, self.from_jid, self.to_jid,\n                        self.stanza_type, self.stanza_id, self.error,\n                        self._return_path(), self._subject, self._body,\n                                                            self._thread)\n        for payload in self._payload:\n            result.add_payload(payload.copy())\n        return result"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef make_error_response(self, cond):\n\n        if self.stanza_type == \"error\":\n            raise ValueError(\"Errors may not be generated in response\"\n                                                                \" to errors\")\n\n        msg = Message(stanza_type = \"error\", from_jid = self.to_jid,\n                        to_jid = self.from_jid, stanza_id = self.stanza_id,\n                        error_cond = cond,\n                        subject = self._subject, body = self._body,\n                        thread = self._thread)\n\n        if self._payload is None:\n            self.decode_payload()\n        for payload in self._payload:\n            msg.add_payload(payload.copy())\n\n        return msg", "response": "Create an error response for any non-error message stanza."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\nmoving a SessionHandler instance to the beginning of the list.", "response": "def _move_session_handler(handlers):\n    \"\"\"Find a SessionHandler instance in the list and move it to the beginning.\n    \"\"\"\n    index = 0\n    for i, handler in enumerate(handlers):\n        if isinstance(handler, SessionHandler):\n            index = i\n            break\n    if index:\n        handlers[:index + 1] = [handlers[index]] + handlers[:index]"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nschedule a new XMPP c2s connection.", "response": "def connect(self):\n        \"\"\"Schedule a new XMPP c2s connection.\n        \"\"\"\n        with self.lock:\n            if self.stream:\n                logger.debug(\"Closing the previously used stream.\")\n                self._close_stream()\n\n            transport = TCPTransport(self.settings)\n\n            addr = self.settings[\"server\"]\n            if addr:\n                service = None\n            else:\n                addr = self.jid.domain\n                service = self.settings[\"c2s_service\"]\n\n            transport.connect(addr, self.settings[\"c2s_port\"], service)\n            handlers = self._base_handlers[:]\n            handlers += self.handlers + [self]\n            self.clear_response_handlers()\n            self.setup_stanza_handlers(handlers, \"pre-auth\")\n            stream = ClientStream(self.jid, self, handlers, self.settings)\n            stream.initiate(transport)\n            self.main_loop.add_handler(transport)\n            self.main_loop.add_handler(stream)\n            self._ml_handlers += [transport, stream]\n            self.stream = stream\n            self.uplink = stream"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef disconnect(self):\n        with self.lock:\n            if self.stream:\n                if self.settings[u\"initial_presence\"]:\n                    self.send(Presence(stanza_type = \"unavailable\"))\n                self.stream.disconnect()", "response": "Gracefully disconnect from the server."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef _close_stream(self):\n        self.stream.close()\n        if self.stream.transport in self._ml_handlers:\n            self._ml_handlers.remove(self.stream.transport)\n            self.main_loop.remove_handler(self.stream.transport)\n        self.stream = None\n        self.uplink = None", "response": "Same as `close_stream` but with the `lock` acquired."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef _stream_disconnected(self, event):\n        with self.lock:\n            if event.stream != self.stream:\n                return\n            if self.stream is not None and event.stream == self.stream:\n                if self.stream.transport in self._ml_handlers:\n                    self._ml_handlers.remove(self.stream.transport)\n                    self.main_loop.remove_handler(self.stream.transport)\n                self.stream = None\n                self.uplink = None", "response": "Handle stream disconnection event."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef regular_tasks(self):\n        with self.lock:\n            ret = self._iq_response_handlers.expire()\n            if ret is None:\n                return 1\n            else:\n                return min(1, ret)", "response": "Do some housekeeping (cache expiration, timeout handling).\n\n        This method should be called periodically from the application's\n        main loop.\n\n        :Return: suggested delay (in seconds) before the next call to this\n                                                                    method.\n        :Returntype: `int`"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef base_handlers_factory(self):\n        tls_handler = StreamTLSHandler(self.settings)\n        sasl_handler = StreamSASLHandler(self.settings)\n        session_handler = SessionHandler()\n        binding_handler = ResourceBindingHandler(self.settings)\n        return [tls_handler, sasl_handler, binding_handler, session_handler]", "response": "Default base client handlers factory."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\nreturn a payload class for given element name.", "response": "def payload_class_for_element_name(element_name):\n    \"\"\"Return a payload class for given element name.\"\"\"\n    logger.debug(\" looking up payload class for element: {0!r}\".format(\n                                                                element_name))\n    logger.debug(\"  known: {0!r}\".format(STANZA_PAYLOAD_CLASSES))\n    if element_name in STANZA_PAYLOAD_CLASSES:\n        return STANZA_PAYLOAD_CLASSES[element_name]\n    else:\n        return XMLPayload"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef _unquote(data):\n    if not data.startswith(b'\"') or not data.endswith(b'\"'):\n        return data\n    return QUOTE_RE.sub(b\"\\\\1\", data[1:-1])", "response": "Unquote a quoted string from DIGEST-MD5 challenge or response."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef _quote(data):\n    data = data.replace(b'\\\\', b'\\\\\\\\')\n    data = data.replace(b'\"', b'\\\\\"')\n    return data", "response": "Prepare a string for quoting for DIGEST-MD5 challenge or response."}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\ncomputes the MD5 sum of username:realm:password.", "response": "def _make_urp_hash(username, realm, passwd):\n    \"\"\"Compute MD5 sum of username:realm:password.\n\n    :Parameters:\n        - `username`: a username.\n        - `realm`: a realm.\n        - `passwd`: a password.\n    :Types:\n        - `username`: `bytes`\n        - `realm`: `bytes`\n        - `passwd`: `bytes`\n\n    :return: the MD5 sum of the parameters joined with ':'.\n    :returntype: `bytes`\"\"\"\n    if realm is None:\n        realm = b\"\"\n    return _h_value(b\":\".join((username, realm, passwd)))"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef _compute_response(urp_hash, nonce, cnonce, nonce_count, authzid,\n                                                                    digest_uri):\n    \"\"\"Compute DIGEST-MD5 response value.\n\n    :Parameters:\n        - `urp_hash`: MD5 sum of username:realm:password.\n        - `nonce`: nonce value from a server challenge.\n        - `cnonce`: cnonce value from the client response.\n        - `nonce_count`: nonce count value.\n        - `authzid`: authorization id.\n        - `digest_uri`: digest-uri value.\n    :Types:\n        - `urp_hash`: `bytes`\n        - `nonce`: `bytes`\n        - `nonce_count`: `int`\n        - `authzid`: `bytes`\n        - `digest_uri`: `bytes`\n\n    :return: the computed response value.\n    :returntype: `bytes`\"\"\"\n    # pylint: disable-msg=C0103,R0913\n    logger.debug(\"_compute_response{0!r}\".format((urp_hash, nonce, cnonce,\n                                            nonce_count, authzid,digest_uri)))\n    if authzid:\n        a1 = b\":\".join((urp_hash, nonce, cnonce, authzid))\n    else:\n        a1 = b\":\".join((urp_hash, nonce, cnonce))\n    a2 = b\"AUTHENTICATE:\" + digest_uri\n    return b2a_hex(_kd_value(b2a_hex(_h_value(a1)), b\":\".join((\n            nonce, nonce_count, cnonce, b\"auth\", b2a_hex(_h_value(a2))))))", "response": "Compute DIGEST-MD5 response value."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef rfc2425encode(name,value,parameters=None,charset=\"utf-8\"):\n    if not parameters:\n        parameters={}\n    if type(value) is unicode:\n        value=value.replace(u\"\\r\\n\",u\"\\\\n\")\n        value=value.replace(u\"\\n\",u\"\\\\n\")\n        value=value.replace(u\"\\r\",u\"\\\\n\")\n        value=value.encode(charset,\"replace\")\n    elif type(value) is not str:\n        raise TypeError(\"Bad type for rfc2425 value\")\n    elif not valid_string_re.match(value):\n        parameters[\"encoding\"]=\"b\"\n        value=binascii.b2a_base64(value)\n\n    ret=str(name).lower()\n    for k,v in parameters.items():\n        ret+=\";%s=%s\" % (str(k),str(v))\n    ret+=\":\"\n    while(len(value)>70):\n        ret+=value[:70]+\"\\r\\n \"\n        value=value[70:]\n    ret+=value+\"\\r\\n\"\n    return ret", "response": "Encodes a vCard field into an RFC2425 line."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef as_xml(self,parent):\n        return parent.newTextChild(None, to_utf8(self.name.upper()), to_utf8(self.value))", "response": "Create vcard-tmp XML representation of the field."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef as_xml(self,parent):\n        name=to_utf8(self.name.upper())\n        content=self.value.as_utf8()\n        return parent.newTextChild(None, name, content)", "response": "Create vcard-tmp XML representation of the field."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef rfc2426(self):\n        return rfc2425encode(\"n\",u';'.join(quote_semicolon(val) for val in\n                (self.family,self.given,self.middle,self.prefix,self.suffix)))", "response": "Return the field content in the RFC 2426 format."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef as_xml(self,parent):\n        n=parent.newChild(None,\"N\",None)\n        n.newTextChild(None,\"FAMILY\",to_utf8(self.family))\n        n.newTextChild(None,\"GIVEN\",to_utf8(self.given))\n        n.newTextChild(None,\"MIDDLE\",to_utf8(self.middle))\n        n.newTextChild(None,\"PREFIX\",to_utf8(self.prefix))\n        n.newTextChild(None,\"SUFFIX\",to_utf8(self.suffix))\n        return n", "response": "Create vcard-tmp XML representation of the field."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef rfc2426(self):\n        if self.uri:\n            return rfc2425encode(self.name,self.uri,{\"value\":\"uri\"})\n        elif self.image:\n            if self.type:\n                p={\"type\":self.type}\n            else:\n                p={}\n            return rfc2425encode(self.name,self.image,p)", "response": "Return the field in the RFC 2426 format."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef as_xml(self,parent):\n        n=parent.newChild(None,self.name.upper(),None)\n        if self.uri:\n            n.newTextChild(None,\"EXTVAL\",to_utf8(self.uri))\n        else:\n            if self.type:\n                n.newTextChild(None,\"TYPE\",self.type)\n            n.newTextChild(None,\"BINVAL\",binascii.b2a_base64(self.image))\n        return n", "response": "Create vcard-tmp XML representation of the field."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef __from_xml(self,value):\n        n=value.children\n        vns=get_node_ns(value)\n        while n:\n            if n.type!='element':\n                n=n.next\n                continue\n            ns=get_node_ns(n)\n            if (ns and vns and ns.getContent()!=vns.getContent()):\n                n=n.next\n                continue\n            if n.name=='POBOX':\n                self.pobox=unicode(n.getContent(),\"utf-8\",\"replace\")\n            elif n.name in ('EXTADR', 'EXTADD'):\n                self.extadr=unicode(n.getContent(),\"utf-8\",\"replace\")\n            elif n.name=='STREET':\n                self.street=unicode(n.getContent(),\"utf-8\",\"replace\")\n            elif n.name=='LOCALITY':\n                self.locality=unicode(n.getContent(),\"utf-8\",\"replace\")\n            elif n.name=='REGION':\n                self.region=unicode(n.getContent(),\"utf-8\",\"replace\")\n            elif n.name=='PCODE':\n                self.pcode=unicode(n.getContent(),\"utf-8\",\"replace\")\n            elif n.name=='CTRY':\n                self.ctry=unicode(n.getContent(),\"utf-8\",\"replace\")\n            elif n.name in (\"HOME\",\"WORK\",\"POSTAL\",\"PARCEL\",\"DOM\",\"INTL\",\n                    \"PREF\"):\n                self.type.append(n.name.lower())\n            n=n.next\n        if self.type==[]:\n            self.type=[\"intl\",\"postal\",\"parcel\",\"work\"]\n        elif \"dom\" in self.type and \"intl\" in self.type:\n            raise ValueError(\"Both 'dom' and 'intl' specified in vcard ADR\")", "response": "Initialize a VCardAdr object from an XML element."}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\nreturns the field in the RFC 2426 format.", "response": "def rfc2426(self):\n        \"\"\"RFC2426-encode the field content.\n\n        :return: the field in the RFC 2426 format.\n        :returntype: `str`\"\"\"\n        return rfc2425encode(\"adr\",u';'.join(quote_semicolon(val) for val in\n                (self.pobox,self.extadr,self.street,self.locality,\n                        self.region,self.pcode,self.ctry)),\n                {\"type\":\",\".join(self.type)})"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\ncreate vcard-tmp XML representation of the field.", "response": "def as_xml(self,parent):\n        \"\"\"Create vcard-tmp XML representation of the field.\n\n        :Parameters:\n            - `parent`: parent node for the element\n        :Types:\n            - `parent`: `libxml2.xmlNode`\n\n        :return: xml node with the field data.\n        :returntype: `libxml2.xmlNode`\"\"\"\n        n=parent.newChild(None,\"ADR\",None)\n        for t in (\"home\",\"work\",\"postal\",\"parcel\",\"dom\",\"intl\",\"pref\"):\n            if t in self.type:\n                n.newChild(None,t.upper(),None)\n        n.newTextChild(None,\"POBOX\",to_utf8(self.pobox))\n        n.newTextChild(None,\"EXTADD\",to_utf8(self.extadr))\n        n.newTextChild(None,\"STREET\",to_utf8(self.street))\n        n.newTextChild(None,\"LOCALITY\",to_utf8(self.locality))\n        n.newTextChild(None,\"REGION\",to_utf8(self.region))\n        n.newTextChild(None,\"PCODE\",to_utf8(self.pcode))\n        n.newTextChild(None,\"CTRY\",to_utf8(self.ctry))\n        return n"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef as_xml(self,parent):\n        n=parent.newChild(None,\"ADR\",None)\n        for t in (\"home\",\"work\",\"postal\",\"parcel\",\"dom\",\"intl\",\"pref\"):\n            if t in self.type:\n                n.newChild(None,t.upper(),None)\n        for l in self.lines:\n            n.newTextChild(None,\"LINE\",l)\n        return n", "response": "Create vcard-tmp XML representation of the field."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\ncreating vcard-tmp XML representation of the field.", "response": "def as_xml(self,parent):\n        \"\"\"Create vcard-tmp XML representation of the field.\n\n        :Parameters:\n            - `parent`: parent node for the element\n        :Types:\n            - `parent`: `libxml2.xmlNode`\n\n        :return: xml node with the field data.\n        :returntype: `libxml2.xmlNode`\"\"\"\n        n=parent.newChild(None,\"TEL\",None)\n        for t in (\"home\",\"work\",\"voice\",\"fax\",\"pager\",\"msg\",\"cell\",\"video\",\n                \"bbs\",\"modem\",\"isdn\",\"pcs\",\"pref\"):\n            if t in self.type:\n                n.newChild(None,t.upper(),None)\n        n.newTextChild(None,\"NUMBER\",to_utf8(self.number))\n        return n"}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nreturns the field in the RFC 2426 format.", "response": "def rfc2426(self):\n        \"\"\"RFC2426-encode the field content.\n\n        :return: the field in the RFC 2426 format.\n        :returntype: `str`\"\"\"\n        return rfc2425encode(\"geo\",u';'.join(quote_semicolon(val) for val in\n                (self.lat,self.lon)))"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef as_xml(self,parent):\n        n=parent.newChild(None,\"GEO\",None)\n        n.newTextChild(None,\"LAT\",to_utf8(self.lat))\n        n.newTextChild(None,\"LON\",to_utf8(self.lon))\n        return n", "response": "Create vcard-tmp XML representation of the field."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef rfc2426(self):\n        if self.unit:\n            return rfc2425encode(\"org\",u';'.join(quote_semicolon(val) for val in\n                                      (self.name,self.unit)))\n        else:\n            return rfc2425encode(\"org\",unicode(quote_semicolon(self.name)))", "response": "RFC2426-encode the field content."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\ncreate vcard-tmp XML representation of the field.", "response": "def as_xml(self,parent):\n        \"\"\"Create vcard-tmp XML representation of the field.\n\n        :Parameters:\n            - `parent`: parent node for the element\n        :Types:\n            - `parent`: `libxml2.xmlNode`\n\n        :return: xml node with the field data.\n        :returntype: `libxml2.xmlNode`\"\"\"\n        n=parent.newChild(None,\"ORG\",None)\n        n.newTextChild(None,\"ORGNAME\",to_utf8(self.name))\n        n.newTextChild(None,\"ORGUNIT\",to_utf8(self.unit))\n        return n"}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\ncreates vcard-tmp XML representation of the field.", "response": "def as_xml(self,parent):\n        \"\"\"Create vcard-tmp XML representation of the field.\n\n        :Parameters:\n            - `parent`: parent node for the element\n        :Types:\n            - `parent`: `libxml2.xmlNode`\n\n        :return: xml node with the field data.\n        :returntype: `libxml2.xmlNode`\"\"\"\n        n=parent.newChild(None,\"CATEGORIES\",None)\n        for k in self.keywords:\n            n.newTextChild(None,\"KEYWORD\",to_utf8(k))\n        return n"}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\nreturns the field in the RFC 2426 format.", "response": "def rfc2426(self):\n        \"\"\"RFC2426-encode the field content.\n\n        :return: the field in the RFC 2426 format.\n        :returntype: `str`\"\"\"\n        if self.uri:\n            return rfc2425encode(self.name,self.uri,{\"value\":\"uri\"})\n        elif self.sound:\n            return rfc2425encode(self.name,self.sound)"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef as_xml(self,parent):\n        n=parent.newChild(None,self.name.upper(),None)\n        if self.uri:\n            n.newTextChild(None,\"EXTVAL\",to_utf8(self.uri))\n        elif self.phonetic:\n            n.newTextChild(None,\"PHONETIC\",to_utf8(self.phonetic))\n        else:\n            n.newTextChild(None,\"BINVAL\",binascii.b2a_base64(self.sound))\n        return n", "response": "Create vcard-tmp XML representation of the field."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef as_xml(self,parent):\n        if self.value in (\"public\",\"private\",\"confidental\"):\n            n=parent.newChild(None,self.name.upper(),None)\n            n.newChild(None,self.value.upper(),None)\n            return n\n        return None", "response": "Create vcard-tmp XML representation of the field."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\nreturning the field in the RFC 2426 format.", "response": "def rfc2426(self):\n        \"\"\"RFC2426-encode the field content.\n\n        :return: the field in the RFC 2426 format.\n        :returntype: `str`\"\"\"\n        if self.type:\n            p={\"type\":self.type}\n        else:\n            p={}\n        return rfc2425encode(self.name,self.cred,p)"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef as_xml(self,parent):\n        n=parent.newChild(None,self.name.upper(),None)\n        if self.type:\n            n.newTextChild(None,\"TYPE\",self.type)\n        n.newTextChild(None,\"CRED\",binascii.b2a_base64(self.cred))\n        return n", "response": "Create vcard-tmp XML representation of the field."}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\ninitialize the mandatory self.fn from self.n.", "response": "def __make_fn(self):\n        \"\"\"Initialize the mandatory `self.fn` from `self.n`.\n\n        This is a workaround for buggy clients which set only one of them.\"\"\"\n        s=[]\n        if self.n.prefix:\n            s.append(self.n.prefix)\n        if self.n.given:\n            s.append(self.n.given)\n        if self.n.middle:\n            s.append(self.n.middle)\n        if self.n.family:\n            s.append(self.n.family)\n        if self.n.suffix:\n            s.append(self.n.suffix)\n        s=u\" \".join(s)\n        self.content[\"FN\"]=VCardString(\"FN\", s, empty_ok = True)"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef __from_xml(self,data):\n        ns=get_node_ns(data)\n        if ns and ns.getContent()!=VCARD_NS:\n            raise ValueError(\"Not in the %r namespace\" % (VCARD_NS,))\n        if data.name!=\"vCard\":\n            raise ValueError(\"Bad root element name: %r\" % (data.name,))\n        n=data.children\n        dns=get_node_ns(data)\n        while n:\n            if n.type!='element':\n                n=n.next\n                continue\n            ns=get_node_ns(n)\n            if (ns and dns and ns.getContent()!=dns.getContent()):\n                n=n.next\n                continue\n            if not self.components.has_key(n.name):\n                n=n.next\n                continue\n            cl,tp=self.components[n.name]\n            if tp in (\"required\",\"optional\"):\n                if self.content.has_key(n.name):\n                    raise ValueError(\"Duplicate %s\" % (n.name,))\n                try:\n                    self.content[n.name]=cl(n.name,n)\n                except Empty:\n                    pass\n            elif tp==\"multi\":\n                if not self.content.has_key(n.name):\n                    self.content[n.name]=[]\n                try:\n                    self.content[n.name].append(cl(n.name,n))\n                except Empty:\n                    pass\n            n=n.next", "response": "Initialize a VCard object from an XML node."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef __from_rfc2426(self,data):\n        data=from_utf8(data)\n        lines=data.split(\"\\n\")\n        started=0\n        current=None\n        for l in lines:\n            if not l:\n                continue\n            if l[-1]==\"\\r\":\n                l=l[:-1]\n            if not l:\n                continue\n            if l[0] in \" \\t\":\n                if current is None:\n                    continue\n                current+=l[1:]\n                continue\n            if not started and current and current.upper().strip()==\"BEGIN:VCARD\":\n                started=1\n            elif started and current.upper().strip()==\"END:VCARD\":\n                current=None\n                break\n            elif current and started:\n                self._process_rfc2425_record(current)\n            current=l\n        if started and current:\n            self._process_rfc2425_record(current)", "response": "Initialize a VCard object from an RFC2426 string."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef _process_rfc2425_record(self,data):\n        label,value=data.split(\":\",1)\n        value=value.replace(\"\\\\n\",\"\\n\").replace(\"\\\\N\",\"\\n\")\n        psplit=label.lower().split(\";\")\n        name=psplit[0]\n        params=psplit[1:]\n        if u\".\" in name:\n            name=name.split(\".\",1)[1]\n        name=name.upper()\n        if name in (u\"X-DESC\",u\"X-JABBERID\"):\n            name=name[2:]\n        if not self.components.has_key(name):\n            return\n        if params:\n            params=dict([p.split(\"=\",1) for p in params])\n        cl,tp=self.components[name]\n        if tp in (\"required\",\"optional\"):\n            if self.content.has_key(name):\n                raise ValueError(\"Duplicate %s\" % (name,))\n            try:\n                self.content[name]=cl(name,value,params)\n            except Empty:\n                pass\n        elif tp==\"multi\":\n            if not self.content.has_key(name):\n                self.content[name]=[]\n            try:\n                self.content[name].append(cl(name,value,params))\n            except Empty:\n                pass\n        else:\n            return", "response": "Parse a single RFC2425 record and update the internal dictionary with the attributes."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef rfc2426(self):\n        ret=\"begin:VCARD\\r\\n\"\n        ret+=\"version:3.0\\r\\n\"\n        for _unused, value in self.content.items():\n            if value is None:\n                continue\n            if type(value) is list:\n                for v in value:\n                    ret+=v.rfc2426()\n            else:\n                v=value.rfc2426()\n                ret+=v\n        return ret+\"end:VCARD\\r\\n\"", "response": "Return the UTF-8 encoded RFC2426 representation of the object."}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\ncompletes the XML node with self content.", "response": "def complete_xml_element(self, xmlnode, _unused):\n        \"\"\"Complete the XML node with `self` content.\n\n        Should be overridden in classes derived from `StanzaPayloadObject`.\n\n        :Parameters:\n            - `xmlnode`: XML node with the element being built. It already has\n              the right name and namespace, but no attributes or content.\n            - `_unused`: document to which the element belongs.\n        :Types:\n            - `xmlnode`: `libxml2.xmlNode`\n            - `_unused`: `libxml2.xmlDoc`\"\"\"\n        for _unused1, value in self.content.items():\n            if value is None:\n                continue\n            if type(value) is list:\n                for v in value:\n                    v.as_xml(xmlnode)\n            else:\n                value.as_xml(xmlnode)"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef update_state(self):\n        self._lock.acquire()\n        try:\n            now = datetime.utcnow()\n            if self.state == 'new':\n                self.state = 'fresh'\n            if self.state == 'fresh':\n                if now > self.freshness_time:\n                    self.state = 'old'\n            if self.state == 'old':\n                if now > self.expire_time:\n                    self.state = 'stale'\n            if self.state == 'stale':\n                if now > self.purge_time:\n                    self.state = 'purged'\n            self.state_value = _state_values[self.state]\n            return self.state\n        finally:\n            self._lock.release()", "response": "Update the state of the item based on the current time and return the new state."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef _deactivate(self):\n        self.cache.remove_fetcher(self)\n        if self.active:\n            self._deactivated()", "response": "Remove the fetcher from cache and mark it not active."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef got_it(self, value, state = \"new\"):\n        if not self.active:\n            return\n        item = CacheItem(self.address, value, self._item_freshness_period,\n                self._item_expiration_period, self._item_purge_period, state)\n        self._object_handler(item.address, item.value, item.state)\n        self.cache.add_item(item)\n        self._deactivate()", "response": "Handle a successful retrieval and call the appropriate handler."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef error(self, error_data):\n        if not self.active:\n            return\n        if not self._try_backup_item():\n            self._error_handler(self.address, error_data)\n        self.cache.invalidate_object(self.address)\n        self._deactivate()", "response": "Handle an error and call the appropriate handler."}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nhandle fetcher timeout and call the appropriate handler.", "response": "def timeout(self):\n        \"\"\"Handle fetcher timeout and call the appropriate handler.\n\n        Is called by the cache object and should _not_ be called by fetcher or\n        application.\n\n        Do nothing when the fetcher is not active any more (after\n        one of the handlers was already called).\"\"\"\n        if not self.active:\n            return\n        if not self._try_backup_item():\n            if self._timeout_handler:\n                self._timeout_handler(self.address)\n            else:\n                self._error_handler(self.address, None)\n        self.cache.invalidate_object(self.address)\n        self._deactivate()"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\nchecking if a backup item is available in cache and call the item handler if it is.", "response": "def _try_backup_item(self):\n        \"\"\"Check if a backup item is available in cache and call\n        the item handler if it is.\n\n        :return: `True` if backup item was found.\n        :returntype: `bool`\"\"\"\n        if not self._backup_state:\n            return False\n        item = self.cache.get_item(self.address, self._backup_state)\n        if item:\n            self._object_handler(item.address, item.value, item.state)\n            return True\n        else:\n            return False"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef request_object(self, address, state, object_handler,\n            error_handler = None, timeout_handler = None,\n            backup_state = None, timeout = timedelta(minutes=60),\n            freshness_period = None, expiration_period = None,\n            purge_period = None):\n        \"\"\"Request an object with given address and state not worse than\n        `state`. The object will be taken from cache if available, and\n        created/fetched otherwise. The request is asynchronous -- this metod\n        doesn't return the object directly, but the `object_handler` is called\n        as soon as the object is available (this may be before `request_object`\n        returns and may happen in other thread). On error the `error_handler`\n        will be called, and on timeout -- the `timeout_handler`.\n\n        :Parameters:\n            - `address`: address of the object requested.\n            - `state`: the worst acceptable object state. When 'new' then always\n              a new object will be created/fetched. 'stale' will select any\n              item available in cache.\n            - `object_handler`: function to be called when object is available.\n              It will be called with the following arguments: address, object\n              and its state.\n            - `error_handler`: function to be called on object retrieval error.\n              It will be called with two arguments: requested address and\n              additional error information (fetcher-specific, may be\n              StanzaError for XMPP objects).  If not given, then the object\n              handler will be called with object set to `None` and state\n              \"error\".\n            - `timeout_handler`: function to be called on object retrieval\n              timeout.  It will be called with only one argument: the requested\n              address. 
If not given, then the `error_handler` will be called\n              instead, with error details set to `None`.\n            - `backup_state`: when set and object in state `state` is not\n              available in the cache and object retrieval failed then object\n              with this state will also be looked-up in the cache and provided\n              if available.\n            - `timeout`: time interval after which retrieval of the object\n              should be given up.\n            - `freshness_period`: time interval after which the item created\n              should become 'old'.\n            - `expiration_period`: time interval after which the item created\n              should become 'stale'.\n            - `purge_period`: time interval after which the item created\n              shuld be removed from the cache.\n        :Types:\n            - `address`: any hashable\n            - `state`: \"new\", \"fresh\", \"old\" or \"stale\"\n            - `object_handler`: callable(address, value, state)\n            - `error_handler`: callable(address, error_data)\n            - `timeout_handler`: callable(address)\n            - `backup_state`: \"new\", \"fresh\", \"old\" or \"stale\"\n            - `timeout`: `timedelta`\n            - `freshness_period`: `timedelta`\n            - `expiration_period`: `timedelta`\n            - `purge_period`: `timedelta`\n        \"\"\"\n        self._lock.acquire()\n        try:\n            if state == 'stale':\n                state = 'purged'\n            item = self.get_item(address, state)\n            if item:\n                object_handler(item.address, item.value, item.state)\n                return\n            if not self._fetcher:\n                raise TypeError(\"No cache fetcher defined\")\n            if not error_handler:\n                def default_error_handler(address, _unused):\n                    \"Default error handler.\"\n                    return object_handler(address, None, 'error')\n         
       error_handler = default_error_handler\n            if not timeout_handler:\n                def default_timeout_handler(address):\n                    \"Default timeout handler.\"\n                    return error_handler(address, None)\n                timeout_handler = default_timeout_handler\n            if freshness_period is None:\n                freshness_period = self.default_freshness_period\n            if expiration_period is None:\n                expiration_period = self.default_expiration_period\n            if purge_period is None:\n                purge_period = self.default_purge_period\n\n            fetcher = self._fetcher(self, address, freshness_period,\n                    expiration_period, purge_period, object_handler, error_handler,\n                    timeout_handler, timeout, backup_state)\n            fetcher.fetch()\n            self._active_fetchers.append((fetcher.timeout_time,fetcher))\n            self._active_fetchers.sort()\n        finally:\n            self._lock.release()", "response": "Request an object from the cache."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef invalidate_object(self, address, state = 'stale'):\n        self._lock.acquire()\n        try:\n            item = self.get_item(address)\n            if item and item.state_value<_state_values[state]:\n                item.state=state\n                item.update_state()\n                self._items_list.sort()\n        finally:\n            self._lock.release()", "response": "Invalidates the object with the given address."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\nadd an item to the cache.", "response": "def add_item(self, item):\n        \"\"\"Add an item to the cache.\n\n        Item state is updated before adding it (it will not be 'new' any more).\n\n        :Parameters:\n            - `item`: the item to add.\n        :Types:\n            - `item`: `CacheItem`\n\n        :return: state of the item after addition.\n        :returntype: `str`\n        \"\"\"\n        self._lock.acquire()\n        try:\n            state = item.update_state()\n            if state != 'purged':\n                if len(self._items_list) >= self.max_items:\n                    self.purge_items()\n                self._items[item.address] = item\n                self._items_list.append(item)\n                self._items_list.sort()\n            return item.state\n        finally:\n            self._lock.release()"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef get_item(self, address, state = 'fresh'):\n        self._lock.acquire()\n        try:\n            item = self._items.get(address)\n            if not item:\n                return None\n            self.update_item(item)\n            if _state_values[state] >= item.state_value:\n                return item\n            return None\n        finally:\n            self._lock.release()", "response": "Get an item from the cache."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef update_item(self, item):\n\n        self._lock.acquire()\n        try:\n            state = item.update_state()\n            self._items_list.sort()\n            if item.state == 'purged':\n                self._purged += 1\n                if self._purged > 0.25*self.max_items:\n                    self.purge_items()\n            return state\n        finally:\n            self._lock.release()", "response": "Update the state of an item in the cache."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nremoving purged and overlimit items from the cache.", "response": "def purge_items(self):\n        \"\"\"Remove purged and overlimit items from the cache.\n\n        TODO: optimize somehow.\n\n        Leave no more than 75% of `self.max_items` items in the cache.\"\"\"\n        self._lock.acquire()\n        try:\n            il=self._items_list\n            num_items = len(il)\n            need_remove = num_items - int(0.75 * self.max_items)\n\n            for _unused in range(need_remove):\n                item=il.pop(0)\n                try:\n                    del self._items[item.address]\n                except KeyError:\n                    pass\n\n            while il and il[0].update_state()==\"purged\":\n                item=il.pop(0)\n                try:\n                    del self._items[item.address]\n                except KeyError:\n                    pass\n        finally:\n            self._lock.release()"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef tick(self):\n        self._lock.acquire()\n        try:\n            now = datetime.utcnow()\n            for t,f in list(self._active_fetchers):\n                if t > now:\n                    break\n                f.timeout()\n            self.purge_items()\n        finally:\n            self._lock.release()", "response": "Do the regular cache maintenance."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef remove_fetcher(self, fetcher):\n        self._lock.acquire()\n        try:\n            for t, f in list(self._active_fetchers):\n                if f is fetcher:\n                    self._active_fetchers.remove((t, f))\n                    f._deactivated()\n                    return\n        finally:\n            self._lock.release()", "response": "Removes a running fetcher from the list of active fetchers."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\nsetting the fetcher class.", "response": "def set_fetcher(self, fetcher_class):\n        \"\"\"Set the fetcher class.\n\n        :Parameters:\n            - `fetcher_class`: the fetcher class.\n        :Types:\n            - `fetcher_class`: `CacheFetcher` based class\n        \"\"\"\n        self._lock.acquire()\n        try:\n            self._fetcher = fetcher_class\n        finally:\n            self._lock.release()"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nrequesting an object from the cache.", "response": "def request_object(self, object_class, address, state, object_handler,\n            error_handler = None, timeout_handler = None,\n            backup_state = None, timeout = None,\n            freshness_period = None, expiration_period = None, purge_period = None):\n        \"\"\"Request an object of given class, with given address and state not\n        worse than `state`. The object will be taken from cache if available,\n        and created/fetched otherwise. The request is asynchronous -- this\n        metod doesn't return the object directly, but the `object_handler` is\n        called as soon as the object is available (this may be before\n        `request_object` returns and may happen in other thread). On error the\n        `error_handler` will be called, and on timeout -- the\n        `timeout_handler`.\n\n        :Parameters:\n            - `object_class`: class (type) of the object requested.\n            - `address`: address of the object requested.\n            - `state`: the worst acceptable object state. When 'new' then always\n              a new object will be created/fetched. 'stale' will select any\n              item available in cache.\n            - `object_handler`: function to be called when object is available.\n              It will be called with the following arguments: address, object\n              and its state.\n            - `error_handler`: function to be called on object retrieval error.\n              It will be called with two arguments: requested address and\n              additional error information (fetcher-specific, may be\n              StanzaError for XMPP objects).  If not given, then the object\n              handler will be called with object set to `None` and state\n              \"error\".\n            - `timeout_handler`: function to be called on object retrieval\n              timeout.  
It will be called with only one argument: the requested\n              address. If not given, then the `error_handler` will be called\n              instead, with error details set to `None`.\n            - `backup_state`: when set and object in state `state` is not\n              available in the cache and object retrieval failed then object\n              with this state will also be looked-up in the cache and provided\n              if available.\n            - `timeout`: time interval after which retrieval of the object\n              should be given up.\n            - `freshness_period`: time interval after which the item created\n              should become 'old'.\n            - `expiration_period`: time interval after which the item created\n              should become 'stale'.\n            - `purge_period`: time interval after which the item created\n              shuld be removed from the cache.\n        :Types:\n            - `object_class`: `classobj`\n            - `address`: any hashable\n            - `state`: \"new\", \"fresh\", \"old\" or \"stale\"\n            - `object_handler`: callable(address, value, state)\n            - `error_handler`: callable(address, error_data)\n            - `timeout_handler`: callable(address)\n            - `backup_state`: \"new\", \"fresh\", \"old\" or \"stale\"\n            - `timeout`: `timedelta`\n            - `freshness_period`: `timedelta`\n            - `expiration_period`: `timedelta`\n            - `purge_period`: `timedelta`\n        \"\"\"\n\n        self._lock.acquire()\n        try:\n            if object_class not in self._caches:\n                raise TypeError(\"No cache for %r\" % (object_class,))\n\n            self._caches[object_class].request_object(address, state, object_handler,\n                    error_handler, timeout_handler, backup_state, timeout,\n                    freshness_period, expiration_period, purge_period)\n        finally:\n            self._lock.release()"}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\ndoes the regular cache maintenance.", "response": "def tick(self):\n        \"\"\"Do the regular cache maintenance.\n\n        Must be called from time to time for timeouts and cache old items\n        purging to work.\"\"\"\n        self._lock.acquire()\n        try:\n            for cache in self._caches.values():\n                cache.tick()\n        finally:\n            self._lock.release()"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\nregister a fetcher class for an object class.", "response": "def register_fetcher(self, object_class, fetcher_class):\n        \"\"\"Register a fetcher class for an object class.\n\n        :Parameters:\n            - `object_class`: class to be retrieved by the fetcher.\n            - `fetcher_class`: the fetcher class.\n        :Types:\n            - `object_class`: `classobj`\n            - `fetcher_class`: `CacheFetcher` based class\n        \"\"\"\n        self._lock.acquire()\n        try:\n            cache = self._caches.get(object_class)\n            if not cache:\n                cache = Cache(self.max_items, self.default_freshness_period,\n                        self.default_expiration_period, self.default_purge_period)\n                self._caches[object_class] = cache\n            cache.set_fetcher(fetcher_class)\n        finally:\n            self._lock.release()"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef unregister_fetcher(self, object_class):\n        self._lock.acquire()\n        try:\n            cache = self._caches.get(object_class)\n            if not cache:\n                return\n            cache.set_fetcher(None)\n        finally:\n            self._lock.release()", "response": "Unregister a fetcher class for an object class."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef _register_client_authenticator(klass, name):\n    # pylint: disable-msg=W0212\n    CLIENT_MECHANISMS_D[name] = klass\n    items = sorted(CLIENT_MECHANISMS_D.items(), key = _key_func, reverse = True)\n    CLIENT_MECHANISMS[:] = [k for (k, v) in items ]\n    SECURE_CLIENT_MECHANISMS[:] = [k for (k, v) in items\n                                                    if v._pyxmpp_sasl_secure]", "response": "Register a client authenticator class to the given name."}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\nregisters a server authenticator class in the server authenticator list.", "response": "def _register_server_authenticator(klass, name):\n    \"\"\"Add a server authenticator class to `SERVER_MECHANISMS_D`,\n    `SERVER_MECHANISMS` and, optionally, to `SECURE_SERVER_MECHANISMS`\n    \"\"\"\n    # pylint: disable-msg=W0212\n    SERVER_MECHANISMS_D[name] = klass\n    items = sorted(SERVER_MECHANISMS_D.items(), key = _key_func, reverse = True)\n    SERVER_MECHANISMS[:] = [k for (k, v) in items ]\n    SECURE_SERVER_MECHANISMS[:] = [k for (k, v) in items\n                                                    if v._pyxmpp_sasl_secure]"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\ndefine a class decorator for SASL mechanisms.", "response": "def sasl_mechanism(name, secure, preference = 50):\n    \"\"\"Class decorator generator for `ClientAuthenticator` or\n    `ServerAuthenticator` subclasses. Adds the class to the pyxmpp.sasl\n    mechanism registry.\n\n    :Parameters:\n        - `name`: SASL mechanism name\n        - `secure`: if the mechanism can be considered secure - `True`\n          if it can be used over plain-text channel\n        - `preference`: mechanism preference level (the higher the better)\n    :Types:\n        - `name`: `unicode`\n        - `secure`: `bool`\n        - `preference`: `int`\n    \"\"\"\n    # pylint: disable-msg=W0212\n    def decorator(klass):\n        \"\"\"The decorator.\"\"\"\n        klass._pyxmpp_sasl_secure = secure\n        klass._pyxmpp_sasl_preference = preference\n        if issubclass(klass, ClientAuthenticator):\n            _register_client_authenticator(klass, name)\n        elif issubclass(klass, ServerAuthenticator):\n            _register_server_authenticator(klass, name)\n        else:\n            raise TypeError(\"Not a ClientAuthenticator\"\n                                            \" or ServerAuthenticator class\")\n        return klass\n    return decorator"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef check_password(self, username, password, properties):\n        logger.debug(\"check_password{0!r}\".format(\n                                            (username, password, properties)))\n        pwd, pwd_format = self.get_password(username,\n                    (u\"plain\", u\"md5:user:realm:password\"), properties)\n        if pwd_format == u\"plain\":\n            logger.debug(\"got plain password: {0!r}\".format(pwd))\n            return pwd is not None and password == pwd\n        elif pwd_format == u\"md5:user:realm:password\":\n            logger.debug(\"got md5:user:realm:password password: {0!r}\"\n                                                            .format(pwd))\n            realm = properties.get(\"realm\")\n            if realm is None:\n                realm = b\"\"\n            else:\n                realm = realm.encode(\"utf-8\")\n            username = username.encode(\"utf-8\")\n            password = password.encode(\"utf-8\")\n\n            # pylint: disable-msg=E1101\n            urp_hash = hashlib.md5(b\":\".join((username, realm, password))).hexdigest()\n            return urp_hash == pwd\n        logger.debug(\"got password in unknown format: {0!r}\".format(pwd_format))\n        return False", "response": "Check a user's password against the stored one, which may be plain text or an MD5 hash of user:realm:password."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef encode(self):\n        if self.data is None:\n            return \"\"\n        elif not self.data:\n            return \"=\"\n        else:\n            ret = standard_b64encode(self.data)\n            return ret.decode(\"us-ascii\")", "response": "Base64-encode the data contained in the reply when appropriate."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nsend session establishment request if the feature was advertised by the server.", "response": "def handle_authorized(self, event):\n        \"\"\"Send session establishment request if the feature was advertised by\n        the server.\n        \"\"\"\n        stream = event.stream\n        if not stream:\n            return\n        if not stream.initiator:\n            return\n        if stream.features is None:\n            return\n        element = stream.features.find(SESSION_TAG)\n        if element is None:\n            return\n        logger.debug(\"Establishing IM session\")\n        stanza = Iq(stanza_type = \"set\")\n        payload = XMLPayload(ElementTree.Element(SESSION_TAG))\n        stanza.set_payload(payload)\n        self.stanza_processor.set_response_handlers(stanza,\n                                        self._session_success,\n                                        self._session_error)\n        stream.send(stanza)"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\nconverting an ASN.1 string to a Unicode string.", "response": "def _decode_asn1_string(data):\n    \"\"\"Convert ASN.1 string to a Unicode string.\n    \"\"\"\n    if isinstance(data, BMPString):\n        return bytes(data).decode(\"utf-16-be\")\n    else:\n        return bytes(data).decode(\"utf-8\")"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\nreturning the human-readable subject name derived from the SubjectName or SubjectAltName field.", "response": "def display_name(self):\n        \"\"\"Get human-readable subject name derived from the SubjectName\n        or SubjectAltName field.\n        \"\"\"\n        if self.subject_name:\n            return u\", \".join( [ u\", \".join(\n                        [ u\"{0}={1}\".format(k,v) for k, v in dn_tuple ] )\n                            for dn_tuple in self.subject_name ])\n        for name_type in (\"XmppAddr\", \"DNS\", \"SRV\"):\n            names = self.alt_names.get(name_type)\n            if names:\n                return names[0]\n        return u\"<unknown>\""}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef get_jids(self):\n        result = []\n        if (\"XmppAddr\" in self.alt_names or \"DNS\" in self.alt_names\n                                            or \"SRVName\" in self.alt_names):\n            addrs =  self.alt_names.get(\"XmppAddr\", [])\n            addrs += [ addr for addr in self.alt_names.get(\"DNS\", [])\n                                            if not addr.startswith(\"*.\") ]\n            addrs += [ addr.split(\".\", 1)[1] for addr\n                                        in self.alt_names.get(\"SRVName\", [])\n                            if (addr.startswith(\"_xmpp-server.\")\n                                    or addr.startswith(\"_xmpp-client.\"))]\n            warn_bad = True\n        elif self.common_names:\n            addrs = [addr for addr in self.common_names\n                                if \"@\" not in addr and \"/\" not in addr]\n            warn_bad = False\n        else:\n            return []\n        for addr in addrs:\n            try:\n                jid = JID(addr)\n                if jid not in result:\n                    result.append(jid)\n            except JIDError as err:\n                if warn_bad:\n                    logger.warning(u\"Bad JID in the certificate: {0!r}: {1}\"\n                                                            .format(addr, err))\n        return result", "response": "Return a list of JIDs for which this certificate is valid, excluding domain wildcards."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef verify_server(self, server_name, srv_type = 'xmpp-client'):\n        server_jid = JID(server_name)\n        if \"XmppAddr\" not in self.alt_names and \"DNS\" not in self.alt_names \\\n                                and \"SRV\" not in self.alt_names:\n            return self.verify_jid_against_common_name(server_jid)\n        names = [name for name in self.alt_names.get(\"DNS\", [])\n                                        if not name.startswith(u\"*.\")]\n        names += self.alt_names.get(\"XmppAddr\", [])\n        for name in names:\n            logger.debug(\"checking {0!r} against {1!r}\".format(server_jid,\n                                                                name))\n            try:\n                jid = JID(name)\n            except ValueError:\n                logger.debug(\"Not a valid JID: {0!r}\".format(name))\n                continue\n            if jid == server_jid:\n                logger.debug(\"Match!\")\n                return True\n        if srv_type and self.verify_jid_against_srv_name(server_jid, srv_type):\n            return True\n        wildcards = [name[2:] for name in self.alt_names.get(\"DNS\", [])\n                                                if name.startswith(\"*.\")]\n        if not wildcards or not \".\" in server_jid.domain:\n            return False\n        logger.debug(\"checking {0!r} against wildcard domains: {1!r}\"\n                                                .format(server_jid, wildcards))\n        server_domain = JID(domain = server_jid.domain.split(\".\", 1)[1])\n        for domain in wildcards:\n            logger.debug(\"checking {0!r} against {1!r}\".format(server_domain,\n                                                                domain))\n            try:\n                jid = JID(domain)\n            except ValueError:\n                logger.debug(\"Not a valid JID: 
{0!r}\".format(domain))\n                continue\n            if jid == server_domain:\n                logger.debug(\"Match!\")\n                return True\n        return False", "response": "Verify certificate for a given server."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef verify_jid_against_common_name(self, jid):\n        if not self.common_names:\n            return False\n        for name in self.common_names:\n            try:\n                cn_jid = JID(name)\n            except ValueError:\n                continue\n            if jid == cn_jid:\n                return True\n        return False", "response": "Verify that the given jid is listed in the certificate commonName."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\ncheck if the certificate is valid for a given JID and a service type.", "response": "def verify_jid_against_srv_name(self, jid, srv_type):\n        \"\"\"Check if the certificate is valid for a given domain-only JID\n        and a service type.\n\n        :Parameters:\n            - `jid`: JID requested (domain part only)\n            - `srv_type`: service type, e.g. 'xmpp-client'\n        :Types:\n            - `jid`: `JID`\n            - `srv_type`: `unicode`\n        :Returntype: `bool`\n        \"\"\"\n        srv_prefix = u\"_\" + srv_type + u\".\"\n        srv_prefix_l = len(srv_prefix)\n        for srv in self.alt_names.get(\"SRVName\", []):\n            logger.debug(\"checking {0!r} against {1!r}\".format(jid,\n                                                                srv))\n            if not srv.startswith(srv_prefix):\n                logger.debug(\"{0!r} does not start with {1!r}\"\n                                                .format(srv, srv_prefix))\n                continue\n            try:\n                srv_jid = JID(srv[srv_prefix_l:])\n            except ValueError:\n                continue\n            if srv_jid == jid:\n                logger.debug(\"Match!\")\n                return True\n        return False"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nverify certificate for a client.", "response": "def verify_client(self, client_jid = None, domains = None):\n        \"\"\"Verify certificate for a client.\n\n        Please note that `client_jid` is only a hint to choose from the names,\n        other JID may be returned if `client_jid` is not included in the\n        certificate.\n\n        :Parameters:\n            - `client_jid`: client name requested. May be `None` to allow\n              any name in one of the `domains`.\n            - `domains`: list of domains we can handle.\n        :Types:\n            - `client_jid`: `JID`\n            - `domains`: `list` of `unicode`\n\n        :Return: one of the jids in the certificate or `None` is no authorized\n        name is found.\n        \"\"\"\n        jids = [jid for jid in self.get_jids() if jid.local]\n        if not jids:\n            return None\n        if client_jid is not None and client_jid in jids:\n            return client_jid\n        if domains is None:\n            return jids[0]\n        for jid in jids:\n            for domain in domains:\n                if are_domains_equal(jid.domain, domain):\n                    return jid\n        return None"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef from_ssl_socket(cls, ssl_socket):\n        cert = cls()\n        try:\n            data = ssl_socket.getpeercert()\n        except AttributeError:\n            # PyPy doesn't have .getpeercert\n            return cert\n        logger.debug(\"Certificate data from ssl module: {0!r}\".format(data))\n        if not data:\n            return cert\n        cert.validated = True\n        cert.subject_name = data.get('subject')\n        cert.alt_names = defaultdict(list)\n        if 'subjectAltName' in data:\n            for name, value in data['subjectAltName']:\n                cert.alt_names[name].append(value)\n        if 'notAfter' in data:\n            tstamp = ssl.cert_time_to_seconds(data['notAfter'])\n            cert.not_after = datetime.utcfromtimestamp(tstamp)\n        if sys.version_info.major < 3:\n            cert._decode_names() # pylint: disable=W0212\n        cert.common_names = []\n        if cert.subject_name:\n            for part in cert.subject_name:\n                for name, value in part:\n                    if name == 'commonName':\n                        cert.common_names.append(value)\n        return cert", "response": "Load certificate data from an SSL socket."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef _decode_names(self):\n        if self.subject_name is not None:\n            subject_name = []\n            for part in self.subject_name:\n                new_part = []\n                for name, value in part:\n                    try:\n                        name = name.decode(\"utf-8\")\n                        value = value.decode(\"utf-8\")\n                    except UnicodeError:\n                        continue\n                    new_part.append((name, value))\n                subject_name.append(tuple(new_part))\n            self.subject_name = tuple(subject_name)\n        for key, old in self.alt_names.items():\n            new = []\n            for name in old:\n                try:\n                    name = name.decode(\"utf-8\")\n                except UnicodeError:\n                    continue\n                new.append(name)\n            self.alt_names[key] = new", "response": "Decode names into Unicode."}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nget certificate data from an SSL socket.", "response": "def from_ssl_socket(cls, ssl_socket):\n        \"\"\"Get certificate data from an SSL socket.\n        \"\"\"\n        try:\n            data = ssl_socket.getpeercert(True)\n        except AttributeError:\n            # PyPy doesn't have .getpeercert\n            data = None\n        if not data:\n            logger.debug(\"No certificate information\")\n            return cls()\n        result = cls.from_der_data(data)\n        result.validated = bool(ssl_socket.getpeercert())\n        return result"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\ndecoding DER - encoded certificate.", "response": "def from_der_data(cls, data):\n        \"\"\"Decode DER-encoded certificate.\n\n        :Parameters:\n            - `data`: the encoded certificate\n        :Types:\n            - `data`: `bytes`\n\n        :Return: decoded certificate data\n        :Returntype: ASN1CertificateData\n        \"\"\"\n        # pylint: disable=W0212\n        logger.debug(\"Decoding DER certificate: {0!r}\".format(data))\n        if cls._cert_asn1_type is None:\n            cls._cert_asn1_type = Certificate()\n        cert = der_decoder.decode(data, asn1Spec = cls._cert_asn1_type)[0]\n        result = cls()\n        tbs_cert = cert.getComponentByName('tbsCertificate')\n        subject = tbs_cert.getComponentByName('subject')\n        logger.debug(\"Subject: {0!r}\".format(subject))\n        result._decode_subject(subject)\n        validity = tbs_cert.getComponentByName('validity')\n        result._decode_validity(validity)\n        extensions = tbs_cert.getComponentByName('extensions')\n        if extensions:\n            for extension in extensions:\n                logger.debug(\"Extension: {0!r}\".format(extension))\n                oid = extension.getComponentByName('extnID')\n                logger.debug(\"OID: {0!r}\".format(oid))\n                if oid != SUBJECT_ALT_NAME_OID:\n                    continue\n                value = extension.getComponentByName('extnValue')\n                logger.debug(\"Value: {0!r}\".format(value))\n                if isinstance(value, Any):\n                    # should be OctetString, but is Any\n                    # in pyasn1_modules-0.0.1a\n                    value = der_decoder.decode(value,\n                                                asn1Spec = OctetString())[0]\n                alt_names = der_decoder.decode(value,\n                                            asn1Spec = GeneralNames())[0]\n                
logger.debug(\"SubjectAltName: {0!r}\".format(alt_names))\n                result._decode_alt_names(alt_names)\n        return result"}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nloading data from a ASN. 1 subject.", "response": "def _decode_subject(self, subject):\n        \"\"\"Load data from a ASN.1 subject.\n        \"\"\"\n        self.common_names = []\n        subject_name = []\n        for rdnss in subject:\n            for rdns in rdnss:\n                rdnss_list = []\n                for nameval in rdns:\n                    val_type = nameval.getComponentByName('type')\n                    value = nameval.getComponentByName('value')\n                    if val_type not in DN_OIDS:\n                        logger.debug(\"OID {0} not supported\".format(val_type))\n                        continue\n                    val_type = DN_OIDS[val_type]\n                    value = der_decoder.decode(value,\n                                            asn1Spec = DirectoryString())[0]\n                    value = value.getComponent()\n                    try:\n                        value = _decode_asn1_string(value)\n                    except UnicodeError:\n                        logger.debug(\"Cannot decode value: {0!r}\".format(value))\n                        continue\n                    if val_type == u\"commonName\":\n                        self.common_names.append(value)\n                    rdnss_list.append((val_type, value))\n                subject_name.append(tuple(rdnss_list))\n        self.subject_name = tuple(subject_name)"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef _decode_validity(self, validity):\n        not_after = validity.getComponentByName('notAfter')\n        # check the ASN.1 type before converting to str, otherwise the\n        # isinstance() test can never succeed\n        not_after = not_after.getComponent()\n        if isinstance(not_after, GeneralizedTime):\n            self.not_after = datetime.strptime(str(not_after), \"%Y%m%d%H%M%SZ\")\n        else:\n            self.not_after = datetime.strptime(str(not_after), \"%y%m%d%H%M%SZ\")\n        self.alt_names = defaultdict(list)", "response": "Load data from an ASN.1 validity value."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\nloading SubjectAltName from an ASN.1 GeneralNames value.", "response": "def _decode_alt_names(self, alt_names):\n        \"\"\"Load SubjectAltName from an ASN.1 GeneralNames value.\n\n        :Parameters:\n            - `alt_names`: the SubjectAltName extension value\n        :Types:\n            - `alt_names`: `GeneralNames`\n        \"\"\"\n        for alt_name in alt_names:\n            tname = alt_name.getName()\n            comp = alt_name.getComponent()\n            if tname == \"dNSName\":\n                key = \"DNS\"\n                value = _decode_asn1_string(comp)\n            elif tname == \"uniformResourceIdentifier\":\n                key = \"URI\"\n                value = _decode_asn1_string(comp)\n            elif tname == \"otherName\":\n                oid = comp.getComponentByName(\"type-id\")\n                value = comp.getComponentByName(\"value\")\n                if oid == XMPPADDR_OID:\n                    key = \"XmppAddr\"\n                    value = der_decoder.decode(value,\n                                            asn1Spec = UTF8String())[0]\n                    value = _decode_asn1_string(value)\n                elif oid == SRVNAME_OID:\n                    key = \"SRVName\"\n                    value = der_decoder.decode(value,\n                                            asn1Spec = IA5String())[0]\n                    value = _decode_asn1_string(value)\n                else:\n                    logger.debug(\"Unknown other name: {0}\".format(oid))\n                    continue\n            else:\n                logger.debug(\"Unsupported general name: {0}\"\n                                                        .format(tname))\n                continue\n            self.alt_names[key].append(value)"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\nload a certificate from a file.", "response": "def from_file(cls, filename):\n        \"\"\"Load certificate from a file.\n        \"\"\"\n        with open(filename, \"r\") as pem_file:\n            data = pem.readPemFromFile(pem_file)\n        return cls.from_der_data(data)"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef main():\n    parser = argparse.ArgumentParser(description = 'XMPP echo bot',\n                                    parents = [XMPPSettings.get_arg_parser()])\n    parser.add_argument('--debug',\n                        action = 'store_const', dest = 'log_level',\n                        const = logging.DEBUG, default = logging.INFO,\n                        help = 'Print debug messages')\n    parser.add_argument('--quiet', const = logging.ERROR,\n                        action = 'store_const', dest = 'log_level',\n                        help = 'Print only error messages')\n    parser.add_argument('--trace', action = 'store_true',\n                        help = 'Print XML data sent and received')\n    parser.add_argument('--roster-cache', \n                        help = 'Store roster in this file')\n    parser.add_argument('jid', metavar = 'JID', \n                                        help = 'The bot JID')\n    subparsers = parser.add_subparsers(help = 'Action', dest = \"action\")\n    show_p = subparsers.add_parser('show', help = 'Show roster and exit')\n    show_p.add_argument('--presence', action = 'store_true',\n                        help = 'Wait 5 s for contact presence information'\n                                ' and display it with the roster')\n    mon_p = subparsers.add_parser('monitor', help = \n                                        'Show roster and subsequent changes')\n    mon_p.add_argument('--presence', action = 'store_true',\n                        help = 'Show contact presence changes too')\n    add_p = subparsers.add_parser('add', help = 'Add an item to the roster')\n    add_p.add_argument('--subscribe', action = 'store_true', dest = 'subscribe',\n                        help = 'Request a presence subscription too')\n    add_p.add_argument('--approve', action = 'store_true', dest = 'approve',\n                        help = 'Pre-approve 
subscription from the contact'\n                                                ' (requires server support)')\n    add_p.add_argument('contact', metavar = 'CONTACT', help = 'The JID to add')\n    add_p.add_argument('name', metavar = 'NAME', nargs = '?',\n                                            help = 'Contact name')\n    add_p.add_argument('groups', metavar = 'GROUP', nargs = '*',\n                                            help = 'Group names')\n    rm_p = subparsers.add_parser('remove',\n                                    help = 'Remove an item from the roster')\n    rm_p.add_argument('contact', metavar = 'CONTACT',\n                                    help = 'The JID to remove')\n    upd_p = subparsers.add_parser('update', \n                                    help = 'Update an item in the roster')\n    upd_p.add_argument('contact', metavar = 'CONTACT',\n                                    help = 'The JID to update')\n    upd_p.add_argument('name', metavar = 'NAME', nargs = '?',\n                                            help = 'Contact name')\n    upd_p.add_argument('groups', metavar = 'GROUP', nargs = '*',\n                                            help = 'Group names')\n\n    args = parser.parse_args()\n    settings = XMPPSettings()\n    settings.load_arguments(args)\n\n    if settings.get(\"password\") is None:\n        password = getpass(\"{0!r} password: \".format(args.jid))\n        if sys.version_info.major < 3:\n            password = password.decode(\"utf-8\")\n        settings[\"password\"] = password\n\n    if sys.version_info.major < 3:\n        args.jid = args.jid.decode(\"utf-8\")\n        if getattr(args, \"contact\", None):\n            args.contact = args.contact.decode(\"utf-8\")\n        if getattr(args, \"name\", None):\n            args.name = args.name.decode(\"utf-8\")\n        if getattr(args, \"groups\", None):\n            args.groups = [g.decode(\"utf-8\") for g in args.groups]\n\n    logging.basicConfig(level = 
args.log_level)\n    if args.trace:\n        print(\"enabling trace\")\n        handler = logging.StreamHandler()\n        handler.setLevel(logging.DEBUG)\n        for logger in (\"pyxmpp2.IN\", \"pyxmpp2.OUT\"):\n            logger = logging.getLogger(logger)\n            logger.setLevel(logging.DEBUG)\n            logger.addHandler(handler)\n            logger.propagate = False\n           \n    if args.action == \"monitor\" or args.action == \"show\" and args.presence:\n        # According to RFC6121 it could be None for 'monitor' (no need to send\n        # initial presence to request roster), but Google seems to require that\n        # to send roster pushes\n        settings[\"initial_presence\"] = Presence(priority = -1)\n    else:\n        settings[\"initial_presence\"] = None\n\n    tool = RosterTool(JID(args.jid), args, settings)\n    try:\n        tool.run()\n    except KeyboardInterrupt:\n        tool.disconnect()", "response": "Parse the command-line arguments and run the roster tool."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\nrequest client connection and start the main loop.", "response": "def run(self):\n        \"\"\"Request client connection and start the main loop.\"\"\"\n        if self.args.roster_cache and os.path.exists(self.args.roster_cache):\n            logging.info(u\"Loading roster from {0!r}\"\n                                            .format(self.args.roster_cache))\n            try:\n                self.client.roster_client.load_roster(self.args.roster_cache)\n            except (IOError, ValueError) as err:\n                logging.error(u\"Could not load the roster: {0!r}\".format(err))\n        self.client.connect()\n        self.client.run()"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef add_handler(self, handler):\n        if not isinstance(handler, EventHandler):\n            raise TypeError(\"Not an EventHandler\")\n        with self.lock:\n            if handler in self.handlers:\n                return\n            self.handlers.append(handler)\n            self._update_handlers()", "response": "Add a handler object to the list of handlers, ignoring duplicates."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nremoving a handler object from the list of handlers.", "response": "def remove_handler(self, handler):\n        \"\"\"Remove a handler object.\n\n        :Parameters:\n            - `handler`: the object to remove\n        \"\"\"\n        with self.lock:\n            if handler in self.handlers:\n                self.handlers.remove(handler)\n                self._update_handlers()"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\nupdating _handler_map after handlers have been modified.", "response": "def _update_handlers(self):\n        \"\"\"Update `_handler_map` after `handlers` have been\n        modified.\"\"\"\n        handler_map = defaultdict(list)\n        for i, obj in enumerate(self.handlers):\n            for dummy, handler in inspect.getmembers(obj, callable):\n                if not hasattr(handler, \"_pyxmpp_event_handled\"):\n                    continue\n                # pylint: disable-msg=W0212\n                event_class = handler._pyxmpp_event_handled\n                handler_map[event_class].append( (i, handler) )\n        self._handler_map = handler_map"}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nget the next event from the queue and pass it to the appropriate handlers.", "response": "def dispatch(self, block = False, timeout = None):\n        \"\"\"Get the next event from the queue and pass it to\n        the appropriate handlers.\n\n        :Parameters:\n            - `block`: wait for event if the queue is empty\n            - `timeout`: maximum time, in seconds, to wait if `block` is `True`\n        :Types:\n            - `block`: `bool`\n            - `timeout`: `float`\n\n        :Return: the event handled (may be `QUIT`) or `None`\n        \"\"\"\n        logger.debug(\" dispatching...\")\n        try:\n            event = self.queue.get(block, timeout)\n        except Queue.Empty:\n            logger.debug(\"    queue empty\")\n            return None\n        try:\n            logger.debug(\"    event: {0!r}\".format(event))\n            if event is QUIT:\n                return QUIT\n            handlers = list(self._handler_map[None])\n            klass = event.__class__\n            if klass in self._handler_map:\n                handlers += self._handler_map[klass]\n            logger.debug(\"    handlers: {0!r}\".format(handlers))\n            # to restore the original order of handler objects\n            handlers.sort(key = lambda x: x[0])\n            for dummy, handler in handlers:\n                logger.debug(u\"  passing the event to: {0!r}\".format(handler))\n                result = handler(event)\n                if isinstance(result, Event):\n                    self.queue.put(result)\n                elif result and event is not QUIT:\n                    return event\n            return event\n        finally:\n            self.queue.task_done()"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef flush(self, dispatch = True):\n        if dispatch:\n            while True:\n                event = self.dispatch(False)\n                if event in (None, QUIT):\n                    return event\n        else:\n            while True:\n                try:\n                    self.queue.get(False)\n                except Queue.Empty:\n                    return None", "response": "Flushes all events currently in the queue and dispatches them to the event handlers unless dispatch is False."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef _configure_io_handler(self, handler):\n        if self.check_events():\n            return\n        if handler in self._unprepared_handlers:\n            old_fileno = self._unprepared_handlers[handler]\n            prepared = self._prepare_io_handler(handler)\n        else:\n            old_fileno = None\n            prepared = True\n        fileno = handler.fileno()\n        if old_fileno is not None and fileno != old_fileno:\n            del self._handlers[old_fileno]\n            try:\n                self.poll.unregister(old_fileno)\n            except KeyError:\n                # The socket has changed, but the old one isn't registered,\n                # e.g. ``prepare`` wants to connect again\n                pass\n        if not prepared:\n            self._unprepared_handlers[handler] = fileno\n        if not fileno:\n            return\n        self._handlers[fileno] = handler\n        events = 0\n        if handler.is_readable():\n            logger.debug(\" {0!r} readable\".format(handler))\n            events |= select.POLLIN\n        if handler.is_writable():\n            logger.debug(\" {0!r} writable\".format(handler))\n            events |= select.POLLOUT\n        if events:\n            logger.debug(\" registering {0!r} handler fileno {1} for\"\n                            \" events {2}\".format(handler, fileno, events))\n            self.poll.register(fileno, events)", "response": "Register an io - handler at the polling object."}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\ncalls the interfaces. IOHandler. prepare method and remove the handler from unprepared handler list when done.", "response": "def _prepare_io_handler(self, handler):\n        \"\"\"Call the `interfaces.IOHandler.prepare` method and\n        remove the handler from unprepared handler list when done.\n        \"\"\"\n        logger.debug(\" preparing handler: {0!r}\".format(handler))\n        ret = handler.prepare()\n        logger.debug(\"   prepare result: {0!r}\".format(ret))\n        if isinstance(ret, HandlerReady):\n            del self._unprepared_handlers[handler]\n            prepared = True\n        elif isinstance(ret, PrepareAgain):\n            if ret.timeout is not None:\n                if self._timeout is not None:\n                    self._timeout = min(self._timeout, ret.timeout)\n                else:\n                    self._timeout = ret.timeout\n            prepared = False\n        else:\n            raise TypeError(\"Unexpected result type from prepare()\")\n        return prepared"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\nremove an i - o - handler.", "response": "def _remove_io_handler(self, handler):\n        \"\"\"Remove an i/o-handler.\"\"\"\n        if handler in self._unprepared_handlers:\n            old_fileno = self._unprepared_handlers[handler]\n            del self._unprepared_handlers[handler]\n        else:\n            old_fileno = handler.fileno()\n        if old_fileno is not None:\n            try:\n                del self._handlers[old_fileno]\n                self.poll.unregister(old_fileno)\n            except KeyError:\n                pass"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef loop_iteration(self, timeout = 60):\n        next_timeout, sources_handled = self._call_timeout_handlers()\n        if self._quit:\n            return sources_handled\n        if self._timeout is not None:\n            timeout = min(timeout, self._timeout)\n        if next_timeout is not None:\n            timeout = min(next_timeout, timeout)\n        for handler in list(self._unprepared_handlers):\n            self._configure_io_handler(handler)\n        events = self.poll.poll(timeout * 1000)\n        self._timeout = None\n        for (fileno, event) in events:\n            if event & select.POLLHUP:\n                self._handlers[fileno].handle_hup()\n            if event & select.POLLNVAL:\n                self._handlers[fileno].handle_nval()\n            if event & select.POLLIN:\n                self._handlers[fileno].handle_read()\n            elif event & select.POLLERR:\n                # if POLLIN was set this condition should be already handled\n                self._handlers[fileno].handle_err()\n            if event & select.POLLOUT:\n                self._handlers[fileno].handle_write()\n            sources_handled += 1\n            self._configure_io_handler(self._handlers[fileno])\n        return sources_handled", "response": "A loop iteration - check any scheduled events\n            and I/O available and run the handlers."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef HMAC(self, key, str_):\n        # pylint: disable=C0103\n        return hmac.new(key, str_, self.hash_factory).digest()", "response": "The HMAC(key, str) function."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef Hi(self, str_, salt, i):\n        # pylint: disable=C0103\n        Uj = self.HMAC(str_, salt + b\"\\000\\000\\000\\001\") # U1\n        result = Uj\n        for _ in range(2, i + 1):\n            Uj = self.HMAC(str_, Uj)               # Uj = HMAC(str, Uj-1)\n            result = self.XOR(result,  Uj)         # ... XOR Uj-1 XOR Uj\n        return result", "response": "The Hi(str, salt, i) function."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\nprocess a challenge and return the response or a failure indicator.", "response": "def challenge(self, challenge):\n        \"\"\"Process a challenge and return the response.\n\n        :Parameters:\n            - `challenge`: the challenge from server.\n        :Types:\n            - `challenge`: `bytes`\n\n        :return: the response or a failure indicator.\n        :returntype: `sasl.Response` or `sasl.Failure`\n        \"\"\"\n        # pylint: disable=R0911\n        if not challenge:\n            logger.debug(\"Empty challenge\")\n            return Failure(\"bad-challenge\")\n\n        if self._server_first_message:\n            return self._final_challenge(challenge)\n\n        match = SERVER_FIRST_MESSAGE_RE.match(challenge)\n        if not match:\n            logger.debug(\"Bad challenge syntax: {0!r}\".format(challenge))\n            return Failure(\"bad-challenge\")\n\n        self._server_first_message = challenge\n\n        mext = match.group(\"mext\")\n        if mext:\n            logger.debug(\"Unsupported extension received: {0!r}\".format(mext))\n            return Failure(\"bad-challenge\")\n\n        nonce = match.group(\"nonce\")\n        if not nonce.startswith(self._c_nonce):\n            logger.debug(\"Nonce does not start with our nonce\")\n            return Failure(\"bad-challenge\")\n\n        salt = match.group(\"salt\")\n        try:\n            salt = a2b_base64(salt)\n        except ValueError:\n            logger.debug(\"Bad base64 encoding for salt: {0!r}\".format(salt))\n            return Failure(\"bad-challenge\")\n\n        iteration_count = match.group(\"iteration_count\")\n        try:\n            iteration_count = int(iteration_count)\n        except ValueError:\n            logger.debug(\"Bad iteration_count: {0!r}\".format(iteration_count))\n            return Failure(\"bad-challenge\")\n\n        return self._make_response(nonce, salt, 
iteration_count)"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef _make_response(self, nonce, salt, iteration_count):\n        self._salted_password = self.Hi(self.Normalize(self.password), salt,\n                                                            iteration_count)\n        self.password = None # not needed any more\n        if self.channel_binding:\n            channel_binding = b\"c=\" + standard_b64encode(self._gs2_header +\n                                                                self._cb_data)\n        else:\n            channel_binding = b\"c=\" + standard_b64encode(self._gs2_header)\n\n        # pylint: disable=C0103\n        client_final_message_without_proof = (channel_binding + b\",r=\" + nonce)\n\n        client_key = self.HMAC(self._salted_password, b\"Client Key\")\n        stored_key = self.H(client_key)\n        auth_message = ( self._client_first_message_bare + b\",\" +\n                                    self._server_first_message + b\",\" +\n                                        client_final_message_without_proof )\n        self._auth_message = auth_message\n        client_signature = self.HMAC(stored_key, auth_message)\n        client_proof = self.XOR(client_key, client_signature)\n        proof = b\"p=\" + standard_b64encode(client_proof)\n        client_final_message = (client_final_message_without_proof + b\",\" +\n                                                                    proof)\n        return Response(client_final_message)", "response": "Make a response for the first challenge from the server."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\nprocessing the second challenge from the server and return the response or a failure indicator.", "response": "def _final_challenge(self, challenge):\n        \"\"\"Process the second challenge from the server and return the\n        response.\n\n        :Parameters:\n            - `challenge`: the challenge from server.\n        :Types:\n            - `challenge`: `bytes`\n\n        :return: the response or a failure indicator.\n        :returntype: `sasl.Response` or `sasl.Failure`\n        \"\"\"\n        if self._finished:\n            return Failure(\"extra-challenge\")\n\n        match = SERVER_FINAL_MESSAGE_RE.match(challenge)\n        if not match:\n            logger.debug(\"Bad final message syntax: {0!r}\".format(challenge))\n            return Failure(\"bad-challenge\")\n\n        error = match.group(\"error\")\n        if error:\n            logger.debug(\"Server returned SCRAM error: {0!r}\".format(error))\n            return Failure(u\"scram-\" + error.decode(\"utf-8\"))\n\n        verifier = match.group(\"verifier\")\n        if not verifier:\n            logger.debug(\"No verifier value in the final message\")\n            return Failure(\"bad-success\")\n\n        server_key = self.HMAC(self._salted_password, b\"Server Key\")\n        server_signature = self.HMAC(server_key, self._auth_message)\n        if server_signature != a2b_base64(verifier):\n            logger.debug(\"Server verifier does not match\")\n            return Failure(\"bad-success\")\n\n        self._finished = True\n        return Response(None)"}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nprocess the success indicator from the server.", "response": "def finish(self, data):\n        \"\"\"Process success indicator from the server.\n\n        Process any additional data passed with the success.\n        Fail if the server was not authenticated.\n\n        :Parameters:\n            - `data`: optional additional data sent with the success.\n        :Types:\n            - `data`: `bytes`\n\n        :return: success or failure indicator.\n        :returntype: `sasl.Success` or `sasl.Failure`\"\"\"\n        if not self._server_first_message:\n            logger.debug(\"Got success too early\")\n            return Failure(\"bad-success\")\n        if self._finished:\n            return Success({\"username\": self.username, \"authzid\": self.authzid})\n        else:\n            ret = self._final_challenge(data)\n            if isinstance(ret, Failure):\n                return ret\n            if self._finished:\n                return Success({\"username\": self.username,\n                                                    \"authzid\": self.authzid})\n            else:\n                logger.debug(\"Something went wrong when processing additional\"\n                                                        \" data with success?\")\n                return Failure(\"bad-success\")"}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nattach a feature URI to an XMPPFeatureHandler class via a class decorator.", "response": "def feature_uri(uri):\n    \"\"\"Decorator attaching a feature URI (for Service Discovery or Capability)\n    to an XMPPFeatureHandler class.\"\"\"\n    def decorator(class_):\n        \"\"\"Returns a decorated class\"\"\"\n        if \"_pyxmpp_feature_uris\" not in class_.__dict__:\n            class_._pyxmpp_feature_uris = set()\n        class_._pyxmpp_feature_uris.add(uri)\n        return class_\n    return decorator"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef _iq_handler(iq_type, payload_class, payload_key, usage_restriction):\n    def decorator(func):\n        \"\"\"The decorator\"\"\"\n        func._pyxmpp_stanza_handled = (\"iq\", iq_type)\n        func._pyxmpp_payload_class_handled = payload_class\n        func._pyxmpp_payload_key = payload_key\n        func._pyxmpp_usage_restriction = usage_restriction\n        return func\n    return decorator", "response": "Method decorator for handling IQ types in XMPPFeatureHandler subclasses."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef _stanza_handler(element_name, stanza_type, payload_class, payload_key,\n                                                            usage_restriction):\n    \"\"\"Method decorator generator for decorating <message/> or <presence/>\n    stanza handler methods in `XMPPFeatureHandler` subclasses.\n\n    :Parameters:\n        - `element_name`: \"message\" or \"presence\"\n        - `stanza_type`: expected value of the 'type' attribute of the stanza\n        - `payload_class`: payload class expected\n        - `payload_key`: payload class specific filtering key\n        - `usage_restriction`: optional usage restriction: \"pre-auth\" or\n          \"post-auth\"\n    :Types:\n        - `element_name`: `unicode`\n        - `stanza_type`: `unicode`\n        - `payload_class`: subclass of `StanzaPayload`\n        - `usage_restriction`: `unicode`\n    \"\"\"\n    def decorator(func):\n        \"\"\"The decorator\"\"\"\n        func._pyxmpp_stanza_handled = (element_name, stanza_type)\n        func._pyxmpp_payload_class_handled = payload_class\n        func._pyxmpp_payload_key = payload_key\n        func._pyxmpp_usage_restriction = usage_restriction\n        return func\n    return decorator", "response": "Method decorator for handling stanza types in XMPPFeatureHandler subclasses."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef presence_stanza_handler(stanza_type = None, payload_class = None,\n                        payload_key = None, usage_restriction = \"post-auth\"):\n    \"\"\"Method decorator generator for decorating <presence/>\n    stanza handler methods in `XMPPFeatureHandler` subclasses.\n\n    :Parameters:\n        - `payload_class`: payload class expected\n        - `stanza_type`: expected value of the 'type' attribute of the stanza.\n        - `payload_key`: payload class specific filtering key\n        - `usage_restriction`: optional usage restriction: \"pre-auth\" or\n          \"post-auth\"\n    :Types:\n        - `payload_class`: subclass of `StanzaPayload`\n        - `stanza_type`: `unicode`\n        - `usage_restriction`: `unicode`\n    \"\"\"\n    return _stanza_handler(\"presence\", stanza_type, payload_class, payload_key,\n                                                            usage_restriction)", "response": "Method decorator for presence stanza handler methods in XMPPFeatureHandler subclasses."}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nclass decorator for decorating StanzaPayload classes.", "response": "def payload_element_name(element_name):\n    \"\"\"Class decorator generator for decorating\n    `StanzaPayload` subclasses.\n\n    :Parameters:\n        - `element_name`: XML element qname handled by the class\n    :Types:\n        - `element_name`: `unicode`\n    \"\"\"\n    def decorator(klass):\n        \"\"\"The payload_element_name decorator.\"\"\"\n        # pylint: disable-msg=W0212,W0404\n        from .stanzapayload import STANZA_PAYLOAD_CLASSES\n        from .stanzapayload import STANZA_PAYLOAD_ELEMENTS\n        if hasattr(klass, \"_pyxmpp_payload_element_name\"):\n            klass._pyxmpp_payload_element_name.append(element_name)\n        else:\n            klass._pyxmpp_payload_element_name = [element_name]\n        if element_name in STANZA_PAYLOAD_CLASSES:\n            logger = logging.getLogger('pyxmpp.payload_element_name')\n            logger.warning(\"Overriding payload class for {0!r}\".format(\n                                                                element_name))\n        STANZA_PAYLOAD_CLASSES[element_name] = klass\n        STANZA_PAYLOAD_ELEMENTS[klass].append(element_name)\n        return klass\n    return decorator"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef stream_element_handler(element_name, usage_restriction = None):\n    def decorator(func):\n        \"\"\"The decorator\"\"\"\n        func._pyxmpp_stream_element_handled = element_name\n        func._pyxmpp_usage_restriction = usage_restriction\n        return func\n    return decorator", "response": "Method decorator for decorating stream element handler methods in StreamFeatureHandler subclasses."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\ncomplete the XML node with the self content.", "response": "def complete_xml_element(self, xmlnode, doc):\n        \"\"\"Complete the XML node with `self` content.\n\n        :Parameters:\n            - `xmlnode`: XML node with the element being built. It has already\n              right name and namespace, but no attributes or content.\n            - `doc`: document to which the element belongs.\n        :Types:\n            - `xmlnode`: `libxml2.xmlNode`\n            - `doc`: `libxml2.xmlDoc`\"\"\"\n        _unused = doc\n        if self.label is not None:\n            xmlnode.setProp(\"label\", self.label.encode(\"utf-8\"))\n        xmlnode.newTextChild(xmlnode.ns(), \"value\", self.value.encode(\"utf-8\"))\n        return xmlnode"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef _new_from_xml(cls, xmlnode):\n        label = from_utf8(xmlnode.prop(\"label\"))\n        child = xmlnode.children\n        value = None\n        for child in xml_element_ns_iter(xmlnode.children, DATAFORM_NS):\n            if child.name == \"value\":\n                value = from_utf8(child.getContent())\n                break\n        if value is None:\n            raise BadRequestProtocolError(\"No value in <option/> element\")\n        return cls(value, label)", "response": "Create a new Option object from an XML element."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\nadding an option for the field.", "response": "def add_option(self, value, label):\n        \"\"\"Add an option for the field.\n\n        :Parameters:\n            - `value`: option values.\n            - `label`: option label (human-readable description).\n        :Types:\n            - `value`: `list` of `unicode`\n            - `label`: `unicode`\n        \"\"\"\n        if type(value) is list:\n            warnings.warn(\".add_option() accepts single value now.\", DeprecationWarning, stacklevel=1)\n            value = value[0]\n        if self.type not in (\"list-multi\", \"list-single\"):\n            raise ValueError(\"Options are allowed only for list types.\")\n        option = Option(value, label)\n        self.options.append(option)\n        return option"}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\ncompletes the XML node with the content of this field.", "response": "def complete_xml_element(self, xmlnode, doc):\n        \"\"\"Complete the XML node with `self` content.\n\n        :Parameters:\n            - `xmlnode`: XML node with the element being built. It has already\n              right name and namespace, but no attributes or content.\n            - `doc`: document to which the element belongs.\n        :Types:\n            - `xmlnode`: `libxml2.xmlNode`\n            - `doc`: `libxml2.xmlDoc`\"\"\"\n        if self.type is not None and self.type not in self.allowed_types:\n            raise ValueError(\"Invalid form field type: %r\" % (self.type,))\n        if self.type is not None:\n            xmlnode.setProp(\"type\", self.type)\n        if not self.label is None:\n            xmlnode.setProp(\"label\", to_utf8(self.label))\n        if not self.name is None:\n            xmlnode.setProp(\"var\", to_utf8(self.name))\n        if self.values:\n            if self.type and len(self.values) > 1 and not self.type.endswith(u\"-multi\"):\n                raise ValueError(\"Multiple values not allowed for %r field\" % (self.type,))\n            for value in self.values:\n                xmlnode.newTextChild(xmlnode.ns(), \"value\", to_utf8(value))\n        for option in self.options:\n            option.as_xml(xmlnode, doc)\n        if self.required:\n            xmlnode.newChild(xmlnode.ns(), \"required\", None)\n        if self.desc:\n            xmlnode.newTextChild(xmlnode.ns(), \"desc\", to_utf8(self.desc))\n        return xmlnode"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef _new_from_xml(cls, xmlnode):\n        field_type = xmlnode.prop(\"type\")\n        label = from_utf8(xmlnode.prop(\"label\"))\n        name = from_utf8(xmlnode.prop(\"var\"))\n        child = xmlnode.children\n        values = []\n        options = []\n        required = False\n        desc = None\n        while child:\n            if child.type != \"element\" or child.ns().content != DATAFORM_NS:\n                pass\n            elif child.name == \"required\":\n                required = True\n            elif child.name == \"desc\":\n                desc = from_utf8(child.getContent())\n            elif child.name == \"value\":\n                values.append(from_utf8(child.getContent()))\n            elif child.name == \"option\":\n                options.append(Option._new_from_xml(child))\n            child = child.next\n        if field_type and not field_type.endswith(\"-multi\") and len(values) > 1:\n            raise BadRequestProtocolError(\"Multiple values for a single-value field\")\n        return cls(name, values, field_type, label, options, required, desc)", "response": "Create a new Field object from an XML element."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\nadding a field to the item.", "response": "def add_field(self, name = None, values = None, field_type = None,\n            label = None, options = None, required = False, desc = None, value = None):\n        \"\"\"Add a field to the item.\n\n        :Parameters:\n            - `name`: field name.\n            - `values`: raw field values. Not to be used together with `value`.\n            - `field_type`: field type.\n            - `label`: field label.\n            - `options`: optional values for the field.\n            - `required`: `True` if the field is required.\n            - `desc`: natural-language description of the field.\n            - `value`: field value or values in a field_type-specific type. May be used only\n              if `values` parameter is not provided.\n        :Types:\n            - `name`: `unicode`\n            - `values`: `list` of `unicode`\n            - `field_type`: `str`\n            - `label`: `unicode`\n            - `options`: `list` of `Option`\n            - `required`: `bool`\n            - `desc`: `unicode`\n            - `value`: `bool` for \"boolean\" field, `JID` for \"jid-single\", `list` of `JID`\n              for \"jid-multi\", `list` of `unicode` for \"list-multi\" and \"text-multi\"\n              and `unicode` for other field types.\n\n        :return: the field added.\n        :returntype: `Field`\n        \"\"\"\n        field = Field(name, values, field_type, label, options, required, desc, value)\n        self.fields.append(field)\n        return field"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\ncomplete the XML node with self content.", "response": "def complete_xml_element(self, xmlnode, doc):\n        \"\"\"Complete the XML node with `self` content.\n\n        :Parameters:\n            - `xmlnode`: XML node with the element being built. It has already\n              right name and namespace, but no attributes or content.\n            - `doc`: document to which the element belongs.\n        :Types:\n            - `xmlnode`: `libxml2.xmlNode`\n            - `doc`: `libxml2.xmlDoc`\"\"\"\n        for field in self.fields:\n            field.as_xml(xmlnode, doc)"}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\ncreates a new Item object from an XML element.", "response": "def _new_from_xml(cls, xmlnode):\n        \"\"\"Create a new `Item` object from an XML element.\n\n        :Parameters:\n            - `xmlnode`: the XML element.\n        :Types:\n            - `xmlnode`: `libxml2.xmlNode`\n\n        :return: the object created.\n        :returntype: `Item`\n        \"\"\"\n        child = xmlnode.children\n        fields = []\n        while child:\n            if child.type != \"element\" or child.ns().content != DATAFORM_NS:\n                pass\n            elif child.name == \"field\":\n                fields.append(Field._new_from_xml(child))\n            child = child.next\n        return cls(fields)"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef add_item(self, fields = None):\n        item = Item(fields)\n        self.items.append(item)\n        return item", "response": "Add an item to the form and return it."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef make_submit(self, keep_types = False):\n        result = Form(\"submit\")\n        for field in self.fields:\n            if field.type == \"fixed\":\n                continue\n            if not field.values:\n                if field.required:\n                    raise ValueError(\"Required field with no value!\")\n                continue\n            if keep_types:\n                result.add_field(field.name, field.values, field.type)\n            else:\n                result.add_field(field.name, field.values)\n        return result", "response": "Make a submit form using data in self."}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\ncompleting the XML node with the content of the form.", "response": "def complete_xml_element(self, xmlnode, doc):\n        \"\"\"Complete the XML node with `self` content.\n\n        :Parameters:\n            - `xmlnode`: XML node with the element being built. It has already\n              right name and namespace, but no attributes or content.\n            - `doc`: document to which the element belongs.\n        :Types:\n            - `xmlnode`: `libxml2.xmlNode`\n            - `doc`: `libxml2.xmlDoc`\"\"\"\n        if self.type not in self.allowed_types:\n            raise ValueError(\"Form type %r not allowed.\" % (self.type,))\n        xmlnode.setProp(\"type\", self.type)\n        if self.type == \"cancel\":\n            return\n        ns = xmlnode.ns()\n        if self.title is not None:\n            xmlnode.newTextChild(ns, \"title\", to_utf8(self.title))\n        if self.instructions is not None:\n            xmlnode.newTextChild(ns, \"instructions\", to_utf8(self.instructions))\n        for field in self.fields:\n            field.as_xml(xmlnode, doc)\n        if self.type != \"result\":\n            return\n        if self.reported_fields:\n            reported = xmlnode.newChild(ns, \"reported\", None)\n            for field in self.reported_fields:\n                field.as_xml(reported, doc)\n        for item in self.items:\n            item.as_xml(xmlnode, doc)"}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\ninitializes a Form object from an XML element.", "response": "def __from_xml(self, xmlnode):\n        \"\"\"Initialize a `Form` object from an XML element.\n\n        :Parameters:\n            - `xmlnode`: the XML element.\n        :Types:\n            - `xmlnode`: `libxml2.xmlNode`\n        \"\"\"\n        self.fields = []\n        self.reported_fields = []\n        self.items = []\n        self.title = None\n        self.instructions = None\n        if (xmlnode.type != \"element\" or xmlnode.name != \"x\"\n                or xmlnode.ns().content != DATAFORM_NS):\n            raise ValueError(\"Not a form: \" + xmlnode.serialize())\n        self.type = xmlnode.prop(\"type\")\n        if not self.type in self.allowed_types:\n            raise BadRequestProtocolError(\"Bad form type: %r\" % (self.type,))\n        child = xmlnode.children\n        while child:\n            if child.type != \"element\" or child.ns().content != DATAFORM_NS:\n                pass\n            elif child.name == \"title\":\n                self.title = from_utf8(child.getContent())\n            elif child.name == \"instructions\":\n                self.instructions = from_utf8(child.getContent())\n            elif child.name == \"field\":\n                self.fields.append(Field._new_from_xml(child))\n            elif child.name == \"item\":\n                self.items.append(Item._new_from_xml(child))\n            elif child.name == \"reported\":\n                self.__get_reported(child)\n            child = child.next"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef __get_reported(self, xmlnode):\n        child = xmlnode.children\n        while child:\n            if child.type != \"element\" or child.ns().content != DATAFORM_NS:\n                pass\n            elif child.name == \"field\":\n                self.reported_fields.append(Field._new_from_xml(child))\n            child = child.next", "response": "Parse the <reported/> element of the form."}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nregisters the Service Discovery cache fetchers into the given cache suite.", "response": "def register_disco_cache_fetchers(cache_suite,stream):\n    \"\"\"Register Service Discovery cache fetchers into given\n    cache suite and using the stream provided.\n\n    :Parameters:\n        - `cache_suite`: the cache suite where the fetchers are to be\n          registered.\n        - `stream`: the stream to be used by the fetchers.\n    :Types:\n        - `cache_suite`: `cache.CacheSuite`\n        - `stream`: `pyxmpp.stream.Stream`\n    \"\"\"\n    tmp=stream\n    class DiscoInfoCacheFetcher(DiscoCacheFetcherBase):\n        \"\"\"Cache fetcher for DiscoInfo.\"\"\"\n        stream=tmp\n        disco_class=DiscoInfo\n    class DiscoItemsCacheFetcher(DiscoCacheFetcherBase):\n        \"\"\"Cache fetcher for DiscoItems.\"\"\"\n        stream=tmp\n        disco_class=DiscoItems\n    cache_suite.register_fetcher(DiscoInfo,DiscoInfoCacheFetcher)\n    cache_suite.register_fetcher(DiscoItems,DiscoItemsCacheFetcher)"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef remove(self):\n        if self.disco is None:\n            return\n        self.xmlnode.unlinkNode()\n        oldns=self.xmlnode.ns()\n        ns=self.xmlnode.newNs(oldns.getContent(),None)\n        self.xmlnode.replaceNs(oldns,ns)\n        common_root.addChild(self.xmlnode)\n        self.disco=None", "response": "Remove self from the containing DiscoItems object."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef set_name(self, name):\n        if name is None:\n            if self.xmlnode.hasProp(\"name\"):\n                self.xmlnode.unsetProp(\"name\")\n            return\n        name = unicode(name)\n        self.xmlnode.setProp(\"name\", name.encode(\"utf-8\"))", "response": "Set the name of the item."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef set_node(self,node):\n        if node is None:\n            if self.xmlnode.hasProp(\"node\"):\n                self.xmlnode.unsetProp(\"node\")\n            return\n        node = unicode(node)\n        self.xmlnode.setProp(\"node\", node.encode(\"utf-8\"))", "response": "Set the node of the item."}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\nsets the action of the item.", "response": "def set_action(self,action):\n        \"\"\"Set the action of the item.\n\n        :Parameters:\n            - `action`: the new action or `None`.\n        :Types:\n            - `action`: `unicode`\n        \"\"\"\n        if action is None:\n            if self.xmlnode.hasProp(\"action\"):\n                self.xmlnode.unsetProp(\"action\")\n            return\n        if action not in (\"remove\",\"update\"):\n            raise ValueError(\"Action must be 'update' or 'remove'\")\n        action = unicode(action)\n        self.xmlnode.setProp(\"action\", action.encode(\"utf-8\"))"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef get_name(self):\n        var = self.xmlnode.prop(\"name\")\n        if not var:\n            var = \"\"\n        return var.decode(\"utf-8\")", "response": "Get the name of the item, or an empty string if the name is not set."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nset the name of the item.", "response": "def set_name(self,name):\n        \"\"\"Set the name of the item.\n\n        :Parameters:\n            - `name`: the new name or `None`.\n        :Types:\n            - `name`: `unicode` \"\"\"\n        if not name:\n            raise ValueError(\"name is required in DiscoIdentity\")\n        name = unicode(name)\n        self.xmlnode.setProp(\"name\", name.encode(\"utf-8\"))"}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nget the category of the item.", "response": "def get_category(self):\n        \"\"\"Get the category of the item.\n\n        :return: the category of the item.\n        :returntype: `unicode`\"\"\"\n        var = self.xmlnode.prop(\"category\")\n        if not var:\n            var = \"?\"\n        return var.decode(\"utf-8\")"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef set_category(self, category):\n        if not category:\n            raise ValueError(\"Category is required in DiscoIdentity\")\n        category = unicode(category)\n        self.xmlnode.setProp(\"category\", category.encode(\"utf-8\"))", "response": "Set the category of the item."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef get_type(self):\n        item_type = self.xmlnode.prop(\"type\")\n        if not item_type:\n            item_type = \"?\"\n        return item_type.decode(\"utf-8\")", "response": "Get the type of the item."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef set_type(self, item_type):\n        if not item_type:\n            raise ValueError(\"Type is required in DiscoIdentity\")\n        item_type = unicode(item_type)\n        self.xmlnode.setProp(\"type\", item_type.encode(\"utf-8\"))", "response": "Set the type of the item."}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\ngetting the items contained in self.", "response": "def get_items(self):\n        \"\"\"Get the items contained in `self`.\n\n        :return: the items contained.\n        :returntype: `list` of `DiscoItem`\"\"\"\n        ret=[]\n        l=self.xpath_ctxt.xpathEval(\"d:item\")\n        if l is not None:\n            for i in l:\n                ret.append(DiscoItem(self, i))\n        return ret"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nset items in the disco#items object.", "response": "def set_items(self, item_list):\n        \"\"\"Set items in the disco#items object.\n\n        All previous items are removed.\n\n        :Parameters:\n            - `item_list`: list of items or item properties\n              (jid,node,name,action).\n        :Types:\n            - `item_list`: sequence of `DiscoItem` or sequence of sequences\n        \"\"\"\n        for item in self.items:\n            item.remove()\n        for item in item_list:\n            try:\n                self.add_item(item.jid,item.node,item.name,item.action)\n            except AttributeError:\n                self.add_item(*item)"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef add_item(self,jid,node=None,name=None,action=None):\n        return DiscoItem(self,jid,node,name,action)", "response": "Adds a new item to the DiscoItems object."}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\ncheck if the item with the given JID and node name exists in the disco.", "response": "def has_item(self,jid,node=None):\n        \"\"\"Check if `self` contains an item.\n\n        :Parameters:\n            - `jid`: JID of the item.\n            - `node`: node name of the item.\n        :Types:\n            - `jid`: `JID`\n            - `node`: `libxml2.xmlNode`\n\n        :return: `True` if the item is found in `self`.\n        :returntype: `bool`\"\"\"\n        l=self.xpath_ctxt.xpathEval(\"d:item\")\n        if l is None:\n            return False\n        for it in l:\n            di=DiscoItem(self,it)\n            if di.jid==jid and di.node==node:\n                return True\n        return False"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nget the features contained in self.", "response": "def get_features(self):\n        \"\"\"Get the features contained in `self`.\n\n        :return: the list of features.\n        :returntype: `list` of `unicode`\"\"\"\n        l = self.xpath_ctxt.xpathEval(\"d:feature\")\n        ret = []\n        for f in l:\n            if f.hasProp(\"var\"):\n                ret.append( f.prop(\"var\").decode(\"utf-8\") )\n        return ret"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef set_features(self, features):\n        for var in self.features:\n            self.remove_feature(var)\n\n        for var in features:\n            self.add_feature(var)", "response": "Sets the features in the disco#info object."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef has_feature(self,var):\n        if not var:\n            raise ValueError(\"var is None\")\n        if '\"' not in var:\n            expr=u'd:feature[@var=\"%s\"]' % (var,)\n        elif \"'\" not in var:\n            expr=u\"d:feature[@var='%s']\" % (var,)\n        else:\n            raise ValueError(\"Invalid feature name\")\n\n        l=self.xpath_ctxt.xpathEval(to_utf8(expr))\n        if l:\n            return True\n        else:\n            return False", "response": "Checks if self contains the named feature."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef add_feature(self,var):\n        if self.has_feature(var):\n            return\n        n=self.xmlnode.newChild(None, \"feature\", None)\n        n.setProp(\"var\", to_utf8(var))", "response": "Adds a feature to the current object."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef remove_feature(self,var):\n        if not var:\n            raise ValueError(\"var is None\")\n        if '\"' not in var:\n            expr='d:feature[@var=\"%s\"]' % (var,)\n        elif \"'\" not in var:\n            expr=\"d:feature[@var='%s']\" % (var,)\n        else:\n            raise ValueError(\"Invalid feature name\")\n\n        l=self.xpath_ctxt.xpathEval(expr)\n        if not l:\n            return\n\n        for f in l:\n            f.unlinkNode()\n            f.freeNode()", "response": "Removes a feature from the current node."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nlisting the identities contained in self.", "response": "def get_identities(self):\n        \"\"\"List the identity objects contained in `self`.\n\n        :return: the list of identities.\n        :returntype: `list` of `DiscoIdentity`\"\"\"\n        ret=[]\n        l=self.xpath_ctxt.xpathEval(\"d:identity\")\n        if l is not None:\n            for i in l:\n                ret.append(DiscoIdentity(self,i))\n        return ret"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nset identities in the disco#info object.", "response": "def set_identities(self,identities):\n        \"\"\"Set identities in the disco#info object.\n\n        Remove all existing identities from `self`.\n\n        :Parameters:\n            - `identities`: list of identities or identity properties\n              (jid,node,category,type,name).\n        :Types:\n            - `identities`: sequence of `DiscoIdentity` or sequence of sequences\n        \"\"\"\n        for identity in self.identities:\n            identity.remove()\n        for identity in identities:\n            try:\n                self.add_identity(identity.name,identity.category,identity.type)\n            except AttributeError:\n                self.add_identity(*identity)"}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\nchecks if the item described by self belongs to the given category and type.", "response": "def identity_is(self,item_category,item_type=None):\n        \"\"\"Check if the item described by `self` belongs to the given category\n        and type.\n\n        :Parameters:\n            - `item_category`: the category name.\n            - `item_type`: the type name. If `None` then only the category is\n              checked.\n        :Types:\n            - `item_category`: `unicode`\n            - `item_type`: `unicode`\n\n        :return: `True` if `self` contains at least one <identity/> object with\n            given type and category.\n        :returntype: `bool`\"\"\"\n        if not item_category:\n            raise ValueError(\"bad category\")\n        if not item_type:\n            type_expr=u\"\"\n        elif '\"' not in item_type:\n            type_expr=u' and @type=\"%s\"' % (item_type,)\n        elif \"'\" not in item_type:\n            type_expr=u\" and @type='%s'\" % (item_type,)\n        else:\n            raise ValueError(\"Invalid type name\")\n        if '\"' not in item_category:\n            expr=u'd:identity[@category=\"%s\"%s]' % (item_category,type_expr)\n        elif \"'\" not in item_category:\n            expr=u\"d:identity[@category='%s'%s]\" % (item_category,type_expr)\n        else:\n            raise ValueError(\"Invalid category name\")\n\n        l=self.xpath_ctxt.xpathEval(to_utf8(expr))\n        if l:\n            return True\n        else:\n            return False"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\nadding an identity to the DiscoInfo object.", "response": "def add_identity(self,item_name,item_category=None,item_type=None):\n        \"\"\"Add an identity to the `DiscoInfo` object.\n\n        :Parameters:\n            - `item_name`: name of the item.\n            - `item_category`: category of the item.\n            - `item_type`: type of the item.\n        :Types:\n            - `item_name`: `unicode`\n            - `item_category`: `unicode`\n            - `item_type`: `unicode`\n\n        :returns: the identity created.\n        :returntype: `DiscoIdentity`\"\"\"\n        return DiscoIdentity(self,item_name,item_category,item_type)"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\ninitializing the Service Discovery process.", "response": "def fetch(self):\n        \"\"\"Initialize the Service Discovery process.\"\"\"\n        from ..iq import Iq\n        jid,node = self.address\n        iq = Iq(to_jid = jid, stanza_type = \"get\")\n        disco = self.disco_class(node)\n        iq.add_content(disco.xmlnode)\n        self.stream.set_response_handlers(iq,self.__response, self.__error,\n                self.__timeout)\n        self.stream.send(iq)"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\nhandle successful disco response.", "response": "def __response(self,stanza):\n        \"\"\"Handle successful disco response.\n\n        :Parameters:\n            - `stanza`: the stanza received.\n        :Types:\n            - `stanza`: `pyxmpp.stanza.Stanza`\"\"\"\n        try:\n            d=self.disco_class(stanza.get_query())\n            self.got_it(d)\n        except ValueError as e:\n            self.error(e)"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\nhandle disco error response.", "response": "def __error(self,stanza):\n        \"\"\"Handle disco error response.\n\n        :Parameters:\n            - `stanza`: the stanza received.\n        :Types:\n            - `stanza`: `pyxmpp.stanza.Stanza`\"\"\"\n        try:\n            self.error(stanza.get_error())\n        except ProtocolError:\n            from ..error import StanzaErrorNode\n            self.error(StanzaErrorNode(\"undefined-condition\"))"}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\ncreates an error response for a get or set iq stanza.", "response": "def make_error_response(self, cond):\n        \"\"\"Create error response for a \"get\" or \"set\" iq stanza.\n\n        :Parameters:\n            - `cond`: error condition name, as defined in XMPP specification.\n\n        :return: new `Iq` object with the same \"id\" as self, \"from\" and \"to\"\n            attributes swapped, type=\"error\" and containing <error /> element\n            plus payload of `self`.\n        :returntype: `Iq`\"\"\"\n\n        if self.stanza_type in (\"result\", \"error\"):\n            raise ValueError(\"Errors may not be generated for\"\n                                                \" 'result' and 'error' iq\")\n\n        stanza = Iq(stanza_type=\"error\", from_jid = self.to_jid,\n                        to_jid = self.from_jid, stanza_id = self.stanza_id,\n                        error_cond = cond)\n        if self._payload is None:\n            self.decode_payload()\n        for payload in self._payload:\n            # use Stanza.add_payload to skip the payload length check\n            Stanza.add_payload(stanza, payload)\n        return stanza"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef make_result_response(self):\n\n        if self.stanza_type not in (\"set\", \"get\"):\n            raise ValueError(\"Results may only be generated for\"\n                                                        \" 'set' or 'get' iq\")\n        stanza = Iq(stanza_type = \"result\", from_jid = self.to_jid,\n                        to_jid = self.from_jid, stanza_id = self.stanza_id)\n        return stanza", "response": "Create a result response for a get or set iq stanza."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef add_payload(self, payload):\n        if self._payload is None:\n            self.decode_payload()\n        if len(self._payload) >= 1:\n            raise ValueError(\"Cannot add more payload to Iq stanza\")\n        return Stanza.add_payload(self, payload)", "response": "Adds a payload element to the stanza. Fails if there is already some\n        payload element attached."}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nupdate the features element with StartTLS feature.", "response": "def make_stream_tls_features(self, stream, features):\n        \"\"\"Update the <features/> element with StartTLS feature.\n\n        [receiving entity only]\n\n        :Parameters:\n            - `features`: the <features/> element of the stream.\n        :Types:\n            - `features`: :etree:`ElementTree.Element`\n\n        :returns: updated <features/> element.\n        :returntype: :etree:`ElementTree.Element`\n        \"\"\"\n        if self.stream and stream is not self.stream:\n            raise ValueError(\"Single StreamTLSHandler instance can handle\"\n                                                            \" only one stream\")\n        self.stream = stream\n        if self.settings[\"starttls\"] and not stream.tls_established:\n            tls = ElementTree.SubElement(features, STARTTLS_TAG)\n            if self.settings[\"tls_require\"]:\n                ElementTree.SubElement(tls, REQUIRED_TAG)\n        return features"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\nprocessing incoming StartTLS related element of stream features.", "response": "def handle_stream_features(self, stream, features):\n        \"\"\"Process incoming StartTLS related element of <stream:features/>.\n\n        [initiating entity only]\n\n        \"\"\"\n        if self.stream and stream is not self.stream:\n            raise ValueError(\"Single StreamTLSHandler instance can handle\"\n                                                            \" only one stream\")\n        self.stream = stream\n        logger.debug(\" tls: handling features\")\n        element = features.find(STARTTLS_TAG)\n        if element is None:\n            logger.debug(\" tls: no starttls feature found\")\n            if self.settings[\"tls_require\"]:\n                raise TLSNegotiationFailed(\"StartTLS required,\"\n                                                \" but not supported by peer\")\n            return None\n        if len(features) == 1:\n            required = True\n        else:\n            required = element.find(REQUIRED_TAG) is not None\n        if stream.tls_established:\n            logger.warning(\"StartTLS offered when already established\")\n            return StreamFeatureNotHandled(\"StartTLS\", mandatory = required)\n\n        if self.settings[\"starttls\"]:\n            logger.debug(\"StartTLS negotiated\")\n            self._request_tls()\n            return StreamFeatureHandled(\"StartTLS\", mandatory = required)\n        else:\n            logger.debug(\" tls: not enabled\")\n            return StreamFeatureNotHandled(\"StartTLS\", mandatory = required)"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef _request_tls(self):\n        self.requested = True\n        element = ElementTree.Element(STARTTLS_TAG)\n        self.stream.write_element(element)", "response": "Request a TLS-encrypted connection.\n\n        [initiating entity only]"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nhandle the TLS proceed element.", "response": "def _process_tls_proceed(self, stream, element):\n        \"\"\"Handle the <proceed /> element.\n        \"\"\"\n        # pylint: disable-msg=W0613\n        if not self.requested:\n            logger.debug(\"Unexpected TLS element: {0!r}\".format(element))\n            return False\n        logger.debug(\" tls: <proceed/> received\")\n        self.requested = False\n        self._make_tls_connection()\n        return True"}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\ninitiates TLS connection. [initiating entity only]", "response": "def _make_tls_connection(self):\n        \"\"\"Initiate TLS connection.\n\n        [initiating entity only]\n        \"\"\"\n        logger.debug(\"Preparing TLS connection\")\n        if self.settings[\"tls_verify_peer\"]:\n            cert_reqs = ssl.CERT_REQUIRED\n        else:\n            cert_reqs = ssl.CERT_NONE\n        self.stream.transport.starttls(\n                    keyfile = self.settings[\"tls_key_file\"],\n                    certfile = self.settings[\"tls_cert_file\"],\n                    server_side = not self.stream.initiator,\n                    cert_reqs = cert_reqs,\n                    ssl_version = ssl.PROTOCOL_TLSv1,\n                    ca_certs = self.settings[\"tls_cacert_file\"],\n                    do_handshake_on_connect = False,\n                    )"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef handle_tls_connected_event(self, event):\n        if self.settings[\"tls_verify_peer\"]:\n            valid = self.settings[\"tls_verify_callback\"](event.stream,\n                                                        event.peer_certificate)\n            if not valid:\n                raise SSLError(\"Certificate verification failed\")\n        event.stream.tls_established = True\n        with event.stream.lock:\n            event.stream._restart_stream()", "response": "Verify the peer certificate on the TLSConnectedEvent."}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\ndefault certificate verification callback for TLS connections.", "response": "def is_certificate_valid(stream, cert):\n        \"\"\"Default certificate verification callback for TLS connections.\n\n        :Parameters:\n            - `cert`: certificate information\n        :Types:\n            - `cert`: `CertificateData`\n\n        :return: computed verification result.\n        \"\"\"\n        try:\n            logger.debug(\"tls_is_certificate_valid(cert = {0!r})\".format(cert))\n            if not cert:\n                logger.warning(\"No TLS certificate information received.\")\n                return False\n            if not cert.validated:\n                logger.warning(\"TLS certificate not validated.\")\n                return False\n            srv_type = stream.transport._dst_service # pylint: disable=W0212\n            if cert.verify_server(stream.peer, srv_type):\n                logger.debug(\" tls: certificate valid for {0!r}\"\n                                                        .format(stream.peer))\n                return True\n            else:\n                logger.debug(\" tls: certificate not valid for {0!r}\"\n                                                        .format(stream.peer))\n                return False\n        except:\n            logger.exception(\"Exception caught while checking a certificate\")\n            raise"}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\nparses the command-line arguments and runs the tool.", "response": "def main():\n    \"\"\"Parse the command-line arguments and run the tool.\"\"\"\n    parser = argparse.ArgumentParser(description = 'XMPP version checker',\n                                    parents = [XMPPSettings.get_arg_parser()])\n    parser.add_argument('source', metavar = 'SOURCE', \n                                        help = 'Source JID')\n    parser.add_argument('target', metavar = 'TARGET', nargs = '?',\n                            help = 'Target JID (default: domain of SOURCE)')\n    parser.add_argument('--debug',\n                        action = 'store_const', dest = 'log_level',\n                        const = logging.DEBUG, default = logging.INFO,\n                        help = 'Print debug messages')\n    parser.add_argument('--quiet', const = logging.ERROR,\n                        action = 'store_const', dest = 'log_level',\n                        help = 'Print only error messages')\n    args = parser.parse_args()\n    settings = XMPPSettings()\n    settings.load_arguments(args)\n    \n    if settings.get(\"password\") is None:\n        password = getpass(\"{0!r} password: \".format(args.source))\n        if sys.version_info.major < 3:\n            password = password.decode(\"utf-8\")\n        settings[\"password\"] = password\n\n    if sys.version_info.major < 3:\n        args.source = args.source.decode(\"utf-8\")\n\n    source = JID(args.source)\n\n    if args.target:\n        if sys.version_info.major < 3:\n            args.target = args.target.decode(\"utf-8\")\n        target = JID(args.target)\n    else:\n        target = JID(source.domain)\n\n    logging.basicConfig(level = args.log_level)\n\n    checker = VersionChecker(source, target, settings)\n    try:\n        checker.run()\n    except KeyboardInterrupt:\n        checker.disconnect()"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\nrequest the software version of the target JID after log-in.", "response": "def handle_authorized(self, event):\n        \"\"\"Request the software version of the target JID after log-in.\"\"\"\n        request_software_version(self.client, self.target_jid,\n                                        self.success, self.failure)"}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nset the value of the object in the dictionary.", "response": "def set_item(self, key, value, timeout = None, timeout_callback = None):\n        \"\"\"Set item of the dictionary.\n\n        :Parameters:\n            - `key`: the key.\n            - `value`: the object to store.\n            - `timeout`: timeout value for the object (in seconds from now).\n            - `timeout_callback`: function to be called when the item expires.\n              The callback should accept none, one (the key) or two (the key\n              and the value) arguments.\n        :Types:\n            - `key`: any hashable value\n            - `value`: any python object\n            - `timeout`: `int`\n            - `timeout_callback`: callable\n        \"\"\"\n        with self._lock:\n            logger.debug(\"expdict.__setitem__({0!r}, {1!r}, {2!r}, {3!r})\"\n                            .format(key, value, timeout, timeout_callback))\n            if not timeout:\n                timeout = self._default_timeout\n            self._timeouts[key] = (time.time() + timeout, timeout_callback)\n            return dict.__setitem__(self, key, value)"}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\ndoes the expiration of items that have expired by now.", "response": "def expire(self):\n        \"\"\"Do the expiration of dictionary items.\n\n        Remove items that expired by now from the dictionary.\n\n        :Return: time, in seconds, when the next item expires or `None`\n        :returntype: `float`\n        \"\"\"\n        with self._lock:\n            logger.debug(\"expdict.expire. timeouts: {0!r}\"\n                                                    .format(self._timeouts))\n            next_timeout = None\n            # iterate over a snapshot: _expire_item() deletes expired keys\n            for k in list(self._timeouts.keys()):\n                ret = self._expire_item(k)\n                if ret is not None:\n                    if next_timeout is None:\n                        next_timeout = ret\n                    else:\n                        next_timeout = min(next_timeout, ret)\n            return next_timeout"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef _expire_item(self, key):\n        (timeout, callback) = self._timeouts[key]\n        now = time.time()\n        if timeout <= now:\n            item = dict.pop(self, key)\n            del self._timeouts[key]\n            if callback:\n                try:\n                    callback(key, item)\n                except TypeError:\n                    try:\n                        callback(key)\n                    except TypeError:\n                        callback()\n            return None\n        else:\n            return timeout - now", "response": "Do the expiration of a dictionary item."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\ndecoding attributes of the stanza XML element and put them into the stanza properties.", "response": "def _decode_attributes(self):\n        \"\"\"Decode attributes of the stanza XML element\n        and put them into the stanza properties.\"\"\"\n        try:\n            from_jid = self._element.get('from')\n            if from_jid:\n                self._from_jid = JID(from_jid)\n            to_jid = self._element.get('to')\n            if to_jid:\n                self._to_jid = JID(to_jid)\n        except ValueError:\n            raise JIDMalformedProtocolError\n        self._stanza_type = self._element.get('type')\n        self._stanza_id = self._element.get('id')\n        lang = self._element.get(XML_LANG_QNAME)\n        if lang:\n            self._language = lang"}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\ndecodes the error element of the stanza.", "response": "def _decode_error(self):\n        \"\"\"Decode error element of the stanza.\"\"\"\n        error_qname = self._ns_prefix + \"error\"\n        for child in self._element:\n            if child.tag == error_qname:\n                self._error = StanzaErrorElement(child)\n                return\n        raise BadRequestProtocolError(\"Error element missing in\"\n                                                            \" an error stanza\")"}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\ncreates a deep copy of the stanza.", "response": "def copy(self):\n        \"\"\"Create a deep copy of the stanza.\n\n        :returntype: `Stanza`\"\"\"\n        result = Stanza(self.element_name, self.from_jid, self.to_jid,\n                        self.stanza_type, self.stanza_id, self.error,\n                        self._return_path())\n        if self._payload is None:\n            self.decode_payload()\n        for payload in self._payload:\n            result.add_payload(payload.copy())\n        return result"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef as_xml(self):\n        attrs = {}\n        if self._from_jid:\n            attrs['from'] = unicode(self._from_jid)\n        if self._to_jid:\n            attrs['to'] = unicode(self._to_jid)\n        if self._stanza_type:\n            attrs['type'] = self._stanza_type\n        if self._stanza_id:\n            attrs['id'] = self._stanza_id\n        if self._language:\n            attrs[XML_LANG_QNAME] = self._language\n        element = ElementTree.Element(self._element_qname, attrs)\n        if self._payload is None:\n            self.decode_payload()\n        for payload in self._payload:\n            element.append(payload.as_xml())\n        if self._error:\n            element.append(self._error.as_xml(\n                                        stanza_namespace = self._namespace))\n        return element", "response": "Return the XML stanza representation."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef get_xml(self):\n        if not self._dirty:\n            return self._element\n        element = self.as_xml()\n        self._element = element\n        self._dirty = False\n        return element", "response": "Return the XML stanza representation of the current object."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef decode_payload(self, specialize = False):\n        if self._payload is not None:\n            # already decoded\n            return\n        if self._element is None:\n            raise ValueError(\"This stanza has no element to decode\"\"\")\n        payload = []\n        if specialize:\n            factory = payload_factory\n        else:\n            factory = XMLPayload\n        for child in self._element:\n            if self.__class__ is not Stanza:\n                if child.tag.startswith(self._ns_prefix):\n                    continue\n            payload.append(factory(child))\n        self._payload = payload", "response": "Decode the payload from the element passed to the constructor."}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\nsets the stanza payload to a single item.", "response": "def set_payload(self, payload):\n        \"\"\"Set stanza payload to a single item.\n\n        All current stanza content will be dropped.\n        Marks the stanza dirty.\n\n        :Parameters:\n            - `payload`: XML element or stanza payload object to use\n        :Types:\n            - `payload`: :etree:`ElementTree.Element` or `StanzaPayload`\n        \"\"\"\n        if isinstance(payload, ElementClass):\n            self._payload = [ XMLPayload(payload) ]\n        elif isinstance(payload, StanzaPayload):\n            self._payload = [ payload ]\n        else:\n            raise TypeError(\"Bad payload type\")\n        self._dirty = True"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef add_payload(self, payload):\n        if self._payload is None:\n            self.decode_payload()\n        if isinstance(payload, ElementClass):\n            self._payload.append(XMLPayload(payload))\n        elif isinstance(payload, StanzaPayload):\n            self._payload.append(payload)\n        else:\n            raise TypeError(\"Bad payload type\")\n        self._dirty = True", "response": "Adds a new payload to the stanza."}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\nreturns a list of stanza payload objects.", "response": "def get_all_payload(self, specialize = False):\n        \"\"\"Return list of stanza payload objects.\n\n        :Parameters:\n            - `specialize`: If `True`, then return objects of specialized\n              `StanzaPayload` classes whenever possible, otherwise the\n              representation already available will be used (often\n              `XMLPayload`)\n\n        :Returntype: `list` of `StanzaPayload`\n        \"\"\"\n        if self._payload is None:\n            self.decode_payload(specialize)\n        elif specialize:\n            for i, payload in enumerate(self._payload):\n                if isinstance(payload, XMLPayload):\n                    klass = payload_class_for_element_name(\n                                                        payload.element.tag)\n                    if klass is not XMLPayload:\n                        payload = klass.from_xml(payload.element)\n                        self._payload[i] = payload\n        return list(self._payload)"}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nget the first payload element matching the given class and optional key.", "response": "def get_payload(self, payload_class, payload_key = None,\n                                                        specialize = False):\n        \"\"\"Get the first payload item matching the given class\n        and optional key.\n\n        Payloads may be addressed using a specific payload class or\n        via the generic `XMLPayload` element, though the `XMLPayload`\n        representation is available only as long as the element is not\n        requested by a more specific type.\n\n        :Parameters:\n            - `payload_class`: requested payload class, a subclass of\n              `StanzaPayload`. If `None` get the first payload in whatever\n              class is available.\n            - `payload_key`: optional key for additional match. When used\n              with `payload_class` = `XMLPayload` this selects the element to\n              match\n            - `specialize`: If `True`, and `payload_class` is `None` then\n              return object of a specialized `StanzaPayload` subclass whenever\n              possible\n        :Types:\n            - `payload_class`: `StanzaPayload`\n            - `specialize`: `bool`\n\n        :Return: payload element found or `None`\n        :Returntype: `StanzaPayload`\n        \"\"\"\n        if self._payload is None:\n            self.decode_payload()\n        if payload_class is None:\n            if self._payload:\n                payload = self._payload[0]\n                if specialize and isinstance(payload, XMLPayload):\n                    klass = payload_class_for_element_name(\n                                                        payload.element.tag)\n                    if klass is not XMLPayload:\n                        payload = klass.from_xml(payload.element)\n                        self._payload[0] = payload\n                
return payload\n            else:\n                return None\n        # pylint: disable=W0212\n        elements = payload_class._pyxmpp_payload_element_name\n        for i, payload in enumerate(self._payload):\n            if isinstance(payload, XMLPayload):\n                if payload_class is not XMLPayload:\n                    if payload.xml_element_name not in elements:\n                        continue\n                    payload = payload_class.from_xml(payload.element)\n            elif not isinstance(payload, payload_class):\n                continue\n            if payload_key is not None and payload_key != payload.handler_key():\n                continue\n            self._payload[i] = payload\n            return payload\n        return None"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef element_to_unicode(element):\n    if hasattr(ElementTree, 'tounicode'):\n        # pylint: disable=E1103\n        return ElementTree.tounicode(element)\n    elif sys.version_info.major < 3:\n        return unicode(ElementTree.tostring(element))\n    else:\n        return ElementTree.tostring(element, encoding = \"unicode\")", "response": "Serialize an XML element into a unicode string."}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nadding resource binding feature to the features element of the stream.", "response": "def make_stream_features(self, stream, features):\n        \"\"\"Add resource binding feature to the <features/> element of the\n        stream.\n\n        [receiving entity only]\n\n        :returns: updated <features/> element.\n        \"\"\"\n        self.stream = stream\n        if stream.peer_authenticated and not stream.peer.resource:\n            ElementTree.SubElement(features, FEATURE_BIND)"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef handle_stream_features(self, stream, features):\n        logger.debug(u\"Handling stream features: {0}\".format(\n                                        element_to_unicode(features)))\n        element = features.find(FEATURE_BIND)\n        if element is None:\n            logger.debug(\"No <bind/> in features\")\n            return None\n        resource = stream.settings[\"resource\"]\n        self.bind(stream, resource)\n        return StreamFeatureHandled(\"Resource binding\", mandatory = True)", "response": "Process incoming <stream:features/> element.\n\n        [initiating entity only]"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef bind(self, stream, resource):\n        self.stream = stream\n        stanza = Iq(stanza_type = \"set\")\n        payload = ResourceBindingPayload(resource = resource)\n        stanza.set_payload(payload)\n        self.stanza_processor.set_response_handlers(stanza,\n                                        self._bind_success, self._bind_error)\n        stream.send(stanza)\n        stream.event(BindingResourceEvent(resource))", "response": "Bind to a resource."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef _bind_success(self, stanza):\n        # pylint: disable-msg=R0201\n        payload = stanza.get_payload(ResourceBindingPayload)\n        jid = payload.jid\n        if not jid:\n            raise BadRequestProtocolError(u\"<jid/> element missing in\"\n                                                    \" the bind response\")\n        self.stream.me = jid\n        self.stream.event(AuthorizedEvent(self.stream.me))", "response": "Handle resource binding success."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\nserialize an XMPP element. Utility function for debugging or logging.", "response": "def serialize(element):\n    \"\"\"Serialize an XMPP element.\n\n    Utility function for debugging or logging.\n\n        :Parameters:\n            - `element`: the element to serialize\n        :Types:\n            - `element`: :etree:`ElementTree.Element`\n\n        :Return: serialized element\n        :Returntype: `unicode`\n    \"\"\"\n    if getattr(_THREAD, \"serializer\", None) is None:\n        _THREAD.serializer = XMPPSerializer(\"jabber:client\")\n        _THREAD.serializer.emit_head(None, None)\n    return _THREAD.serializer.emit_stanza(element)"}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nadd a new namespace prefix.", "response": "def add_prefix(self, namespace, prefix):\n        \"\"\"Add a new namespace prefix.\n\n        If the root element has not yet been emitted the prefix will\n        be declared there, otherwise the prefix will be declared on the\n        top-most element using this namespace in every stanza.\n\n        :Parameters:\n            - `namespace`: the namespace URI\n            - `prefix`: the prefix string\n        :Types:\n            - `namespace`: `unicode`\n            - `prefix`: `unicode`\n        \"\"\"\n        if prefix == \"xml\" and namespace != XML_NS:\n            raise ValueError(\"Cannot change 'xml' prefix meaning\")\n        self._prefixes[namespace] = prefix"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef emit_head(self, stream_from, stream_to, stream_id = None,\n                                            version = u'1.0', language = None):\n        \"\"\"Return the opening tag of the stream root element.\n\n        :Parameters:\n            - `stream_from`: the 'from' attribute of the stream. May be `None`.\n            - `stream_to`: the 'to' attribute of the stream. May be `None`.\n            - `version`: the 'version' of the stream.\n            - `language`: the 'xml:lang' of the stream\n        :Types:\n            - `stream_from`: `unicode`\n            - `stream_to`: `unicode`\n            - `version`: `unicode`\n            - `language`: `unicode`\n        \"\"\"\n        # pylint: disable-msg=R0913\n        self._root_prefixes = dict(STANDARD_PREFIXES)\n        self._root_prefixes[self.stanza_namespace] = None\n        for namespace, prefix in self._root_prefixes.items():\n            if not prefix or prefix == \"stream\":\n                continue\n            if namespace in STANDARD_PREFIXES or namespace in STANZA_NAMESPACES:\n                continue\n            self._root_prefixes[namespace] = prefix\n        tag = u\"<{0}:stream version={1}\".format(STANDARD_PREFIXES[STREAM_NS],\n                                                        quoteattr(version))\n        if stream_from:\n            tag += u\" from={0}\".format(quoteattr(stream_from))\n        if stream_to:\n            tag += u\" to={0}\".format(quoteattr(stream_to))\n        if stream_id is not None:\n            tag += u\" id={0}\".format(quoteattr(stream_id))\n        if language is not None:\n            tag += u\" xml:lang={0}\".format(quoteattr(language))\n        for namespace, prefix in self._root_prefixes.items():\n            if prefix == \"xml\":\n                continue\n            if prefix:\n                tag += u' xmlns:{0}={1}'.format(prefix, 
quoteattr(namespace))\n            else:\n                tag += u' xmlns={1}'.format(prefix, quoteattr(namespace))\n        tag += u\">\"\n        self._head_emitted = True\n        return tag", "response": "Return the opening tag of the stream root element."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\nsplit an element or attribute QName into namespace and local name.", "response": "def _split_qname(self, name, is_element):\n        \"\"\"Split an element or attribute QName into namespace and local\n        name.\n\n        :Parameters:\n            - `name`: element or attribute QName\n            - `is_element`: `True` for an element, `False` for an attribute\n        :Types:\n            - `name`: `unicode`\n            - `is_element`: `bool`\n\n        :Return: namespace URI, local name\n        :returntype: `unicode`, `unicode`\"\"\"\n        if name.startswith(u\"{\"):\n            namespace, name = name[1:].split(u\"}\", 1)\n            if namespace in STANZA_NAMESPACES:\n                namespace = self.stanza_namespace\n        elif is_element:\n            raise ValueError(u\"Element with no namespace: {0!r}\".format(name))\n        else:\n            namespace = None\n        return namespace, name"}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\nmakes up a new namespace prefix which won't conflict with `_prefixes` or with prefixes declared in the current scope.", "response": "def _make_prefix(self, declared_prefixes):\n        \"\"\"Make up a new namespace prefix, which won't conflict\n        with `_prefixes` and prefixes declared in the current scope.\n\n        :Parameters:\n            - `declared_prefixes`: namespace to prefix mapping for the current\n              scope\n        :Types:\n            - `declared_prefixes`: `unicode` to `unicode` dictionary\n\n        :Returns: a new prefix\n        :Returntype: `unicode`\n        \"\"\"\n        used_prefixes = set(self._prefixes.values())\n        used_prefixes |= set(declared_prefixes.values())\n        while True:\n            prefix = u\"ns{0}\".format(self._next_id)\n            self._next_id += 1\n            if prefix not in used_prefixes:\n                break\n        return prefix"}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nconverting a QName ('{namespace-uri}local-name') into a namespace-prefixed tag or attribute name.", "response": "def _make_prefixed(self, name, is_element, declared_prefixes, declarations):\n        \"\"\"Return namespace-prefixed tag or attribute name.\n\n        Add appropriate declaration to `declarations` when necessary.\n\n        If no prefix for an element namespace is defined, make the element's\n        namespace default (no prefix). For attributes, make up a prefix in such\n        case.\n\n        :Parameters:\n            - `name`: QName ('{namespace-uri}local-name')\n              to convert\n            - `is_element`: `True` for element, `False` for an attribute\n            - `declared_prefixes`: mapping of prefixes already declared\n              at this scope\n            - `declarations`: XMLNS declarations on the current element.\n        :Types:\n            - `name`: `unicode`\n            - `is_element`: `bool`\n            - `declared_prefixes`: `unicode` to `unicode` dictionary\n            - `declarations`: `unicode` to `unicode` dictionary\n\n        :Returntype: `unicode`\"\"\"\n        namespace, name = self._split_qname(name, is_element)\n        if namespace is None:\n            prefix = None\n        elif namespace in declared_prefixes:\n            prefix = declared_prefixes[namespace]\n        elif namespace in self._prefixes:\n            prefix = self._prefixes[namespace]\n            declarations[namespace] = prefix\n            declared_prefixes[namespace] = prefix\n        else:\n            if is_element:\n                prefix = None\n            else:\n                prefix = self._make_prefix(declared_prefixes)\n            declarations[namespace] = prefix\n            declared_prefixes[namespace] = prefix\n        if prefix:\n            return prefix + u\":\" + name\n        else:\n            return name"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef _make_ns_declarations(declarations, declared_prefixes):\n        result = []\n        for namespace, prefix in declarations.items():\n            if prefix:\n                result.append(u' xmlns:{0}={1}'.format(prefix, quoteattr(\n                                                                namespace)))\n            else:\n                result.append(u' xmlns={1}'.format(prefix, quoteattr(\n                                                                namespace)))\n            for d_namespace, d_prefix in list(declared_prefixes.items()):\n                if (not prefix and not d_prefix) or d_prefix == prefix:\n                    if namespace != d_namespace:\n                        del declared_prefixes[d_namespace]\n        return u\" \".join(result)", "response": "Build namespace declarations and remove obsoleted mappings\n        from declared_prefixes."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef _emit_element(self, element, level, declared_prefixes):\n        declarations = {}\n        declared_prefixes = dict(declared_prefixes)\n        name = element.tag\n        prefixed = self._make_prefixed(name, True, declared_prefixes,\n                                                                declarations)\n        start_tag = u\"<{0}\".format(prefixed)\n        end_tag = u\"</{0}>\".format(prefixed)\n        for name, value in element.items():\n            prefixed = self._make_prefixed(name, False, declared_prefixes,\n                                                                declarations)\n            start_tag += u' {0}={1}'.format(prefixed, quoteattr(value))\n\n        declarations = self._make_ns_declarations(declarations,\n                                                        declared_prefixes)\n        if declarations:\n            start_tag += u\" \" + declarations\n        children = []\n        for child in element:\n            children.append(self._emit_element(child, level +1,\n                                                        declared_prefixes))\n        if not children and not element.text:\n            start_tag += u\"/>\"\n            end_tag = u\"\"\n            text = u\"\"\n        else:\n            start_tag += u\">\"\n            if level > 0 and element.text:\n                text = escape(element.text)\n            else:\n                text = u\"\"\n        if level > 1 and element.tail:\n            tail = escape(element.tail)\n        else:\n            tail = u\"\"\n        return start_tag + text + u''.join(children) + end_tag + tail", "response": "Recursive XML element serializer."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef emit_stanza(self, element):\n        if not self._head_emitted:\n            raise RuntimeError(\".emit_head() must be called first.\")\n        string = self._emit_element(element, level = 1,\n                                    declared_prefixes = self._root_prefixes)\n        return remove_evil_characters(string)", "response": "Serialize a stanza. Must be called after emit_head; raises RuntimeError otherwise."}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nfilters a list of mechanisms to include those that succeed with the provided properties and are secure enough.", "response": "def filter_mechanism_list(mechanisms, properties, allow_insecure = False,\n                            server_side = False):\n    \"\"\"Filter a mechanisms list only to include those mechanisms that can\n    succeed with the provided properties and are secure enough.\n\n    :Parameters:\n        - `mechanisms`: list of the mechanisms names\n        - `properties`: available authentication properties\n        - `allow_insecure`: allow insecure mechanisms\n        - `server_side`: look the mechanisms up among the server-side\n          implementations instead of the client-side ones\n    :Types:\n        - `mechanisms`: sequence of `unicode`\n        - `properties`: mapping\n        - `allow_insecure`: `bool`\n        - `server_side`: `bool`\n\n    :returntype: `list` of `unicode`\n    \"\"\"\n    # pylint: disable=W0212\n    result = []\n    for mechanism in mechanisms:\n        try:\n            if server_side:\n                klass = SERVER_MECHANISMS_D[mechanism]\n            else:\n                klass = CLIENT_MECHANISMS_D[mechanism]\n        except KeyError:\n            logger.debug(\" skipping {0} - not supported\".format(mechanism))\n            continue\n        secure = properties.get(\"security-layer\")\n        if not allow_insecure and not klass._pyxmpp_sasl_secure and not secure:\n            logger.debug(\" skipping {0}, as it is not secure\".format(mechanism))\n            continue\n        if not klass.are_properties_sufficient(properties):\n            logger.debug(\" skipping {0}, as the properties are not sufficient\"\n                                                            .format(mechanism))\n            continue\n        result.append(mechanism)\n    return result"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\nhandle a received error stanza.", "response": "def error(self,stanza):\n        \"\"\"\n        Called when an error stanza is received.\n\n        :Parameters:\n            - `stanza`: the stanza received.\n        :Types:\n            - `stanza`: `pyxmpp.stanza.Stanza`\n        \"\"\"\n        err=stanza.get_error()\n        self.__logger.debug(\"Error from: %r Condition: %r\"\n                % (stanza.get_from(),err.get_condition()))"}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nupdates user information from a presence stanza.", "response": "def update_presence(self,presence):\n        \"\"\"\n        Update user information.\n\n        :Parameters:\n            - `presence`: a presence stanza with user information update.\n        :Types:\n            - `presence`: `MucPresence`\n        \"\"\"\n        self.presence=MucPresence(presence)\n        t=presence.get_type()\n        if t==\"unavailable\":\n            self.role=\"none\"\n            self.affiliation=\"none\"\n        self.room_jid=self.presence.get_from()\n        self.nick=self.room_jid.resource\n        mc=self.presence.get_muc_child()\n        if isinstance(mc,MucUserX):\n            items=mc.get_items()\n            for item in items:\n                if not isinstance(item,MucItem):\n                    continue\n                if item.role:\n                    self.role=item.role\n                if item.affiliation:\n                    self.affiliation=item.affiliation\n                if item.jid:\n                    self.real_jid=item.jid\n                if item.nick:\n                    self.new_nick=item.nick\n                break"}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\ngetting a user from the room.", "response": "def get_user(self,nick_or_jid,create=False):\n        \"\"\"\n        Get a room user with given nick or JID.\n\n        :Parameters:\n            - `nick_or_jid`: the nickname or room JID of the user requested.\n            - `create`: if `True` and `nick_or_jid` is a JID, then a new\n              user object will be created if there is no such user in the room.\n        :Types:\n            - `nick_or_jid`: `unicode` or `JID`\n            - `create`: `bool`\n\n        :return: the named user or `None`\n        :returntype: `MucRoomUser`\n        \"\"\"\n        if isinstance(nick_or_jid,JID):\n            if not nick_or_jid.resource:\n                return None\n            for u in self.users.values():\n                if nick_or_jid in (u.room_jid,u.real_jid):\n                    return u\n            if create:\n                return MucRoomUser(nick_or_jid)\n            else:\n                return None\n        return self.users.get(nick_or_jid)"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef set_stream(self,stream):\n        _unused = stream\n        if self.joined and self.handler:\n            self.handler.user_left(self.me,None)\n        self.joined=False", "response": "Called when current stream changes."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef join(self, password=None, history_maxchars = None,\n            history_maxstanzas = None, history_seconds = None, history_since = None):\n        \"\"\"\n        Send a join request for the room.\n\n        :Parameters:\n            - `password`: password to the room.\n            - `history_maxchars`: limit of the total number of characters in\n              history.\n            - `history_maxstanzas`: limit of the total number of messages in\n              history.\n            - `history_seconds`: send only messages received in the last\n              `history_seconds` seconds.\n            - `history_since`: Send only the messages received since the\n              dateTime specified (UTC).\n        :Types:\n            - `password`: `unicode`\n            - `history_maxchars`: `int`\n            - `history_maxstanzas`: `int`\n            - `history_seconds`: `int`\n            - `history_since`: `datetime.datetime`\n        \"\"\"\n        if self.joined:\n            raise RuntimeError(\"Room is already joined\")\n        p=MucPresence(to_jid=self.room_jid)\n        p.make_join_request(password, history_maxchars, history_maxstanzas,\n                history_seconds, history_since)\n        self.manager.stream.send(p)", "response": "Send a join request to the room."}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nsends a leave request for the room.", "response": "def leave(self):\n        \"\"\"\n        Send a leave request for the room.\n        \"\"\"\n        if self.joined:\n            p=MucPresence(to_jid=self.room_jid,stanza_type=\"unavailable\")\n            self.manager.stream.send(p)"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef send_message(self,body):\n        m=Message(to_jid=self.room_jid.bare(),stanza_type=\"groupchat\",body=body)\n        self.manager.stream.send(m)", "response": "Send a message to the room."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef set_subject(self,subject):\n        m=Message(to_jid=self.room_jid.bare(),stanza_type=\"groupchat\",subject=subject)\n        self.manager.stream.send(m)", "response": "Send a subject change request to the room."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef change_nick(self,new_nick):\n        new_room_jid=JID(self.room_jid.node,self.room_jid.domain,new_nick)\n        p=Presence(to_jid=new_room_jid)\n        self.manager.stream.send(p)", "response": "Send a nick change request to the room."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef get_room_jid(self,nick=None):\n        if nick is None:\n            return self.room_jid\n        return JID(self.room_jid.node,self.room_jid.domain,nick)", "response": "Get the room JID for given nick."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef process_available_presence(self,stanza):\n        fr=stanza.get_from()\n        if not fr.resource:\n            return\n        nick=fr.resource\n        user=self.users.get(nick)\n        if user:\n            old_user=MucRoomUser(user)\n            user.update_presence(stanza)\n            user.nick=nick\n        else:\n            old_user=None\n            user=MucRoomUser(stanza)\n            self.users[user.nick]=user\n        self.handler.presence_changed(user,stanza)\n        if fr==self.room_jid and not self.joined:\n            self.joined=True\n            self.me=user\n            mc=stanza.get_muc_child()\n            if isinstance(mc,MucUserX):\n                status = [i for i in mc.get_items() if isinstance(i,MucStatus) and i.code==201]\n                if status:\n                    self.configured = False\n                    self.handler.room_created(stanza)\n            if self.configured is None:\n                self.configured = True\n        if not old_user or old_user.role==\"none\":\n            self.handler.user_joined(user,stanza)\n        else:\n            if old_user.nick!=user.nick:\n                self.handler.nick_changed(user,old_user.nick,stanza)\n                if old_user.room_jid==self.room_jid:\n                    self.room_jid=fr\n            if old_user.role!=user.role:\n                self.handler.role_changed(user,old_user.role,user.role,stanza)\n            if old_user.affiliation!=user.affiliation:\n                self.handler.affiliation_changed(user,old_user.affiliation,user.affiliation,stanza)", "response": "Process available presence from the room."}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nprocesses an unavailable presence stanza.", "response": "def process_unavailable_presence(self,stanza):\n        \"\"\"\n        Process <presence type=\"unavailable\"/> received from the room.\n\n        :Parameters:\n            - `stanza`: the stanza received.\n        :Types:\n            - `stanza`: `MucPresence`\n        \"\"\"\n        fr=stanza.get_from()\n        if not fr.resource:\n            return\n        nick=fr.resource\n        user=self.users.get(nick)\n        if user:\n            old_user=MucRoomUser(user)\n            user.update_presence(stanza)\n            self.handler.presence_changed(user,stanza)\n            if user.new_nick:\n                mc=stanza.get_muc_child()\n                if isinstance(mc,MucUserX):\n                    renames=[i for i in mc.get_items() if isinstance(i,MucStatus) and i.code==303]\n                    if renames:\n                        self.users[user.new_nick]=user\n                        del self.users[nick]\n                        return\n        else:\n            old_user=None\n            user=MucRoomUser(stanza)\n            self.users[user.nick]=user\n            self.handler.presence_changed(user,stanza)\n        if fr==self.room_jid and self.joined:\n            self.joined=False\n            self.handler.user_left(user,stanza)\n            self.manager.forget(self)\n            self.me=user\n        elif old_user:\n            self.handler.user_left(user,stanza)"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef process_groupchat_message(self,stanza):\n        fr=stanza.get_from()\n        user=self.get_user(fr,True)\n        s=stanza.get_subject()\n        if s:\n            self.subject=s\n            self.handler.subject_changed(user,stanza)\n        else:\n            self.handler.message_received(user,stanza)", "response": "Process the groupchat message received from the room."}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nprocess the successful result of a room configuration form request.", "response": "def process_configuration_form_success(self, stanza):\n        \"\"\"\n        Process successful result of a room configuration form request.\n\n        :Parameters:\n            - `stanza`: the stanza received.\n        :Types:\n            - `stanza`: `Iq`\n        \"\"\"\n        if stanza.get_query_ns() != MUC_OWNER_NS:\n            raise ValueError(\"Bad result namespace\") # TODO: ProtocolError\n        query = stanza.get_query()\n        form = None\n        for el in xml_element_ns_iter(query.children, DATAFORM_NS):\n            form = Form(el)\n            break\n        if not form:\n            raise ValueError(\"No form received\") # TODO: ProtocolError\n        self.configuration_form = form\n        self.handler.configuration_form_received(form)"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef request_configuration_form(self):\n        iq = Iq(to_jid = self.room_jid.bare(), stanza_type = \"get\")\n        iq.new_query(MUC_OWNER_NS, \"query\")\n        self.manager.stream.set_response_handlers(\n                iq, self.process_configuration_form_success, self.process_configuration_form_error)\n        self.manager.stream.send(iq)\n        return iq.get_id()", "response": "Request a configuration form for the room."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef process_configuration_success(self, stanza):\n        _unused = stanza\n        self.configured = True\n        self.handler.room_configured()", "response": "Process a success response to a room configuration request: mark the room as configured and notify the handler."}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\nconfigures the room using the provided data.", "response": "def configure_room(self, form):\n        \"\"\"\n        Configure the room using the provided data.\n        Do nothing if the provided form is of type 'cancel'.\n\n        :Parameters:\n            - `form`: the configuration parameters. Should be a 'submit' form made by filling-in\n              the configuration form retrieved using `self.request_configuration_form` or\n              a 'cancel' form.\n        :Types:\n            - `form`: `Form`\n\n        :return: id of the request stanza or `None` if a 'cancel' form was provided.\n        :returntype: `unicode`\n        \"\"\"\n\n        if form.type == \"cancel\":\n            return None\n        elif form.type != \"submit\":\n            raise ValueError(\"A 'submit' form required to configure a room\")\n        iq = Iq(to_jid = self.room_jid.bare(), stanza_type = \"set\")\n        query = iq.new_query(MUC_OWNER_NS, \"query\")\n        form.as_xml(query)\n        self.manager.stream.set_response_handlers(\n                iq, self.process_configuration_success, self.process_configuration_error)\n        self.manager.stream.send(iq)\n        return iq.get_id()"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef request_instant_room(self):\n        if self.configured:\n            raise RuntimeError(\"Instant room may be requested for unconfigured room only\")\n        form = Form(\"submit\")\n        return self.configure_room(form)", "response": "Request an instant room."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\nchange the stream assigned to this object.", "response": "def set_stream(self,stream):\n        \"\"\"\n        Change the stream assigned to `self`.\n\n        :Parameters:\n            - `stream`: the new stream to be assigned to `self`.\n        :Types:\n            - `stream`: `pyxmpp.stream.Stream`\n        \"\"\"\n        self.jid=stream.me\n        self.stream=stream\n        for r in self.rooms.values():\n            r.set_stream(stream)"}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\nassigns MUC stanza handlers to the stream.", "response": "def set_handlers(self,priority=10):\n        \"\"\"\n        Assign MUC stanza handlers to the `self.stream`.\n\n        :Parameters:\n            - `priority`: priority for the handlers.\n        :Types:\n            - `priority`: `int`\n        \"\"\"\n        self.stream.set_message_handler(\"groupchat\",self.__groupchat_message,None,priority)\n        self.stream.set_message_handler(\"error\",self.__error_message,None,priority)\n        self.stream.set_presence_handler(\"available\",self.__presence_available,None,priority)\n        self.stream.set_presence_handler(\"unavailable\",self.__presence_unavailable,None,priority)\n        self.stream.set_presence_handler(\"error\",self.__presence_error,None,priority)"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef join(self, room, nick, handler, password = None, history_maxchars = None,\n            history_maxstanzas = None, history_seconds = None, history_since = None):\n        \"\"\"\n        Create and return a new room state object and request joining\n        to a MUC room.\n\n        :Parameters:\n            - `room`: the name of a room to be joined\n            - `nick`: the nickname to be used in the room\n            - `handler`: is an object to handle room events.\n            - `password`: password for the room, if any\n            - `history_maxchars`: limit of the total number of characters in\n              history.\n            - `history_maxstanzas`: limit of the total number of messages in\n              history.\n            - `history_seconds`: send only messages received in the last\n              `history_seconds` seconds.\n            - `history_since`: Send only the messages received since the\n              dateTime specified (UTC).\n\n        :Types:\n            - `room`: `JID`\n            - `nick`: `unicode`\n            - `handler`: `MucRoomHandler`\n            - `password`: `unicode`\n            - `history_maxchars`: `int`\n            - `history_maxstanzas`: `int`\n            - `history_seconds`: `int`\n            - `history_since`: `datetime.datetime`\n\n        :return: the room state object created.\n        :returntype: `MucRoomState`\n        \"\"\"\n\n        if not room.node or room.resource:\n            raise ValueError(\"Invalid room JID\")\n\n        room_jid = JID(room.node, room.domain, nick)\n\n        cur_rs = self.rooms.get(room_jid.bare().as_unicode())\n        if cur_rs and cur_rs.joined:\n            raise RuntimeError(\"Room already joined\")\n\n        rs=MucRoomState(self, self.stream.me, room_jid, handler)\n        self.rooms[room_jid.bare().as_unicode()]=rs\n        rs.join(password, 
history_maxchars, history_maxstanzas,\n            history_seconds, history_since)\n        return rs", "response": "Creates and returns a new MucRoomState object and requests joining the specified room."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef forget(self,rs):\n        try:\n            del self.rooms[rs.room_jid.bare().as_unicode()]\n        except KeyError:\n            pass", "response": "Remove a room from the list of managed rooms; do nothing if the room is not managed."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef __groupchat_message(self,stanza):\n        fr=stanza.get_from()\n        key=fr.bare().as_unicode()\n        rs=self.rooms.get(key)\n        if not rs:\n            self.__logger.debug(\"groupchat message from unknown source\")\n            return False\n        rs.process_groupchat_message(stanza)\n        return True", "response": "Process a groupchat message from a MUC room."}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\nprocesses an error message from a MUC room.", "response": "def __error_message(self,stanza):\n        \"\"\"Process an error message from a MUC room.\n\n        :Parameters:\n            - `stanza`: the stanza received.\n        :Types:\n            - `stanza`: `Message`\n\n        :return: `True` if the message was properly recognized as directed to\n            one of the managed rooms, `False` otherwise.\n        :returntype: `bool`\"\"\"\n        fr=stanza.get_from()\n        key=fr.bare().as_unicode()\n        rs=self.rooms.get(key)\n        if not rs:\n            return False\n        rs.process_error_message(stanza)\n        return True"}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nprocessing a presence error from a MUC room.", "response": "def __presence_error(self,stanza):\n        \"\"\"Process a presence error from a MUC room.\n\n        :Parameters:\n            - `stanza`: the stanza received.\n        :Types:\n            - `stanza`: `Presence`\n\n        :return: `True` if the stanza was properly recognized as generated by\n            one of the managed rooms, `False` otherwise.\n        :returntype: `bool`\"\"\"\n        fr=stanza.get_from()\n        key=fr.bare().as_unicode()\n        rs=self.rooms.get(key)\n        if not rs:\n            return False\n        rs.process_error_presence(stanza)\n        return True"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nprocessing an available presence from a MUC room.", "response": "def __presence_available(self,stanza):\n        \"\"\"Process an available presence from a MUC room.\n\n        :Parameters:\n            - `stanza`: the stanza received.\n        :Types:\n            - `stanza`: `Presence`\n\n        :return: `True` if the stanza was properly recognized as generated by\n            one of the managed rooms, `False` otherwise.\n        :returntype: `bool`\"\"\"\n        fr=stanza.get_from()\n        key=fr.bare().as_unicode()\n        rs=self.rooms.get(key)\n        if not rs:\n            return False\n        rs.process_available_presence(MucPresence(stanza))\n        return True"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef __presence_unavailable(self,stanza):\n        fr=stanza.get_from()\n        key=fr.bare().as_unicode()\n        rs=self.rooms.get(key)\n        if not rs:\n            return False\n        rs.process_unavailable_presence(MucPresence(stanza))\n        return True", "response": "Process an unavailable presence from a MUC room."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef get(self, key, local_default = None, required = False):\n        # pylint: disable-msg=W0221\n        if key in self._settings:\n            return self._settings[key]\n        if local_default is not None:\n            return local_default\n        if key in self._defs:\n            setting_def = self._defs[key]\n            if setting_def.default is not None:\n                return setting_def.default\n            factory = setting_def.factory\n            if factory is None:\n                return None\n            value = factory(self)\n            if setting_def.cache is True:\n                setting_def.default = value\n            return value\n        if required:\n            raise KeyError(key)\n        return local_default", "response": "Get a parameter value."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef load_arguments(self, args):\n        for name, setting in self._defs.items():\n            if sys.version_info.major < 3:\n                # pylint: disable-msg=W0404\n                from locale import getpreferredencoding\n                encoding = getpreferredencoding()\n                name = name.encode(encoding, \"replace\")\n            attr = \"pyxmpp2_\" + name\n            try:\n                self[setting.name] = getattr(args, attr)\n            except AttributeError:\n                pass", "response": "Load settings from the argument parser output."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\nadding a new setting definition.", "response": "def add_setting(cls, name, type = unicode, default = None, factory = None,\n                        cache = False, default_d = None, doc = None,\n                        cmdline_help = None, validator = None, basic = False):\n        \"\"\"Add a new setting definition.\n\n        :Parameters:\n            - `name`: setting name\n            - `type`: setting type object or type description\n            - `default`: default value\n            - `factory`: default value factory\n            - `cache`: if `True` the `factory` will be called only once\n              and its value stored as a constant default.\n            - `default_d`: description of the default value\n            - `doc`: setting documentation\n            - `cmdline_help`: command line argument description. When not\n              provided then the setting won't be available as a command-line\n              option\n            - `basic`: when `True` the option is considered a basic option -\n              one of those which should usually stay configurable in\n              an application.\n            - `validator`: function validating command-line option value string\n              and returning proper value for the settings objects. 
Defaults\n              to `type`.\n        :Types:\n            - `name`: `unicode`\n            - `type`: type or `unicode`\n            - `factory`: a callable\n            - `cache`: `bool`\n            - `default_d`: `unicode`\n            - `doc`: `unicode`\n            - `cmdline_help`: `unicode`\n            - `basic`: `bool`\n            - `validator`: a callable\n        \"\"\"\n        # pylint: disable-msg=W0622,R0913\n        setting_def = _SettingDefinition(name, type, default, factory,\n                                            cache, default_d, doc,\n                                            cmdline_help, validator, basic)\n        if name not in cls._defs:\n            cls._defs[name] = setting_def\n            return\n        duplicate = cls._defs[name]\n        if duplicate.type != setting_def.type:\n            raise ValueError(\"Setting duplicate, with a different type\")\n        if duplicate.default != setting_def.default:\n            raise ValueError(\"Setting duplicate, with a different default\")\n        if duplicate.factory != setting_def.factory:\n            raise ValueError(\"Setting duplicate, with a different factory\")"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef validate_string_list(value):\n        try:\n            if sys.version_info.major < 3:\n                # pylint: disable-msg=W0404\n                from locale import getpreferredencoding\n                encoding = getpreferredencoding()\n                value = value.decode(encoding)\n            return [x.strip() for x in value.split(u\",\")]\n        except (AttributeError, TypeError, UnicodeError):\n            raise ValueError(\"Bad string list\")", "response": "Validator for string lists to be used with add_setting."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef get_int_range_validator(start, stop):\n        def validate_int_range(value):\n            \"\"\"Integer range validator.\"\"\"\n            value = int(value)\n            if value >= start and value < stop:\n                return value\n            raise ValueError(\"Not in <{0},{1}) range\".format(start, stop))\n        return validate_int_range", "response": "Return a validator function that converts a value to an integer and checks that it lies in the half-open range [start, stop)."}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\nlists all known settings.", "response": "def list_all(cls, basic = None):\n        \"\"\"List known settings.\n\n        :Parameters:\n            - `basic`: When `True` then limit output to the basic settings,\n              when `False` list only the extra settings.\n        \"\"\"\n        if basic is None:\n            return [s for s in cls._defs]\n        else:\n            return [s.name for s in cls._defs.values() if s.basic == basic]"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\ncreating an argument parser for the application.", "response": "def get_arg_parser(cls, settings = None, option_prefix = u'--',\n                                                            add_help = False):\n        \"\"\"Make a command-line option parser.\n\n        The returned parser may be used as a parent parser for application\n        argument parser.\n\n        :Parameters:\n            - `settings`: list of PyXMPP2 settings to consider. By default\n              all 'basic' settings are provided.\n            - `option_prefix`: custom prefix for PyXMPP2 options. E.g.\n              ``'--xmpp'`` to differentiate them from not xmpp-related\n              application options.\n            - `add_help`: when `True` a '--help' option will be included\n              (probably already added in the application parser object)\n        :Types:\n            - `settings`: list of `unicode`\n            - `option_prefix`: `str`\n            - `add_help`:\n\n        :return: an argument parser object.\n        :returntype: :std:`argparse.ArgumentParser`\n        \"\"\"\n        # pylint: disable-msg=R0914,R0912\n        parser = argparse.ArgumentParser(add_help = add_help,\n                                            prefix_chars = option_prefix[0])\n        if settings is None:\n            settings = cls.list_all(basic = True)\n\n        if sys.version_info.major < 3:\n            # pylint: disable-msg=W0404\n            from locale import getpreferredencoding\n            encoding = getpreferredencoding()\n            def decode_string_option(value):\n                \"\"\"Decode a string option.\"\"\"\n                return value.decode(encoding)\n\n        for name in settings:\n            if name not in cls._defs:\n                logger.debug(\"get_arg_parser: ignoring unknown option {0}\"\n                                                                .format(name))\n                
return\n            setting = cls._defs[name]\n            if not setting.cmdline_help:\n                logger.debug(\"get_arg_parser: option {0} has no cmdline\"\n                                                                .format(name))\n                return\n            if sys.version_info.major < 3:\n                name = name.encode(encoding, \"replace\")\n            option = option_prefix + name.replace(\"_\", \"-\")\n            dest = \"pyxmpp2_\" + name\n            if setting.validator:\n                opt_type = setting.validator\n            elif setting.type is unicode and sys.version_info.major < 3:\n                opt_type = decode_string_option\n            else:\n                opt_type = setting.type\n            if setting.default_d:\n                default_s = setting.default_d\n                if sys.version_info.major < 3:\n                    default_s = default_s.encode(encoding, \"replace\")\n            elif setting.default is not None:\n                default_s = repr(setting.default)\n            else:\n                default_s = None\n            opt_help = setting.cmdline_help\n            if sys.version_info.major < 3:\n                opt_help = opt_help.encode(encoding, \"replace\")\n            if default_s:\n                opt_help += \" (Default: {0})\".format(default_s)\n            if opt_type is bool:\n                opt_action = _YesNoAction\n            else:\n                opt_action = \"store\"\n            parser.add_argument(option,\n                                action = opt_action,\n                                default = setting.default,\n                                type = opt_type,\n                                help = opt_help,\n                                metavar = name.upper(),\n                                dest = dest)\n        return parser"}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\ncompare two International Domain Names.", "response": "def are_domains_equal(domain1, domain2):\n    \"\"\"Compare two International Domain Names.\n\n    :Parameters:\n        - `domain1`: domain name to compare\n        - `domain2`: domain name to compare\n    :Types:\n        - `domain1`: `unicode`\n        - `domain2`: `unicode`\n\n    :return: `True` if `domain1` and `domain2` are equal as domain names.\"\"\"\n\n    domain1 = domain1.encode(\"idna\")\n    domain2 = domain2.encode(\"idna\")\n    return domain1.lower() == domain2.lower()"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef _validate_ip_address(family, address):\n    try:\n        info = socket.getaddrinfo(address, 0, family, socket.SOCK_STREAM, 0,\n                                                        socket.AI_NUMERICHOST)\n    except socket.gaierror as err:\n        logger.debug(\"gaierror: {0} for {1!r}\".format(err, address))\n        raise ValueError(\"Bad IP address\")\n\n    if not info:\n        logger.debug(\"getaddrinfo result empty\")\n        raise ValueError(\"Bad IP address\")\n    addr = info[0][4]\n    logger.debug(\" got address: {0!r}\".format(addr))\n\n    try:\n        return socket.getnameinfo(addr, socket.NI_NUMERICHOST)[0]\n    except socket.gaierror as err:\n        logger.debug(\"gaierror: {0} for {1!r}\".format(err, addr))\n        raise ValueError(\"Bad IP address\")", "response": "Check if the given IP address is valid and return it in a normalized form."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\nreturn jid tuple from a Unicode string.", "response": "def __from_unicode(cls, data, check = True):\n        \"\"\"Return jid tuple from a Unicode string.\n\n        :Parameters:\n            - `data`: the JID string\n            - `check`: when `False` then the JID is not checked for\n              specification compliance.\n\n        :Return: (localpart, domainpart, resourcepart) tuple\"\"\"\n        parts1 = data.split(u\"/\", 1)\n        parts2 = parts1[0].split(u\"@\", 1)\n        if len(parts2) == 2:\n            local = parts2[0]\n            domain = parts2[1]\n            if check:\n                local = cls.__prepare_local(local)\n                domain = cls.__prepare_domain(domain)\n        else:\n            local = None\n            domain = parts2[0]\n            if check:\n                domain = cls.__prepare_domain(domain)\n        if len(parts1) == 2:\n            resource = parts1[1]\n            if check:\n                resource = cls.__prepare_resource(parts1[1])\n        else:\n            resource = None\n        if not domain:\n            raise JIDError(\"Domain is required in JID.\")\n        return (local, domain, resource)"}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nprepare the local part of the JID.", "response": "def __prepare_local(data):\n        \"\"\"Prepare localpart of the JID\n\n        :Parameters:\n            - `data`: localpart of the JID\n        :Types:\n            - `data`: `unicode`\n\n        :raise JIDError: if the local name is too long.\n        :raise pyxmpp.xmppstringprep.StringprepError: if the\n            local name fails Nodeprep preparation.\"\"\"\n        if not data:\n            return None\n        data = unicode(data)\n        try:\n            local = NODEPREP.prepare(data)\n        except StringprepError, err:\n            raise JIDError(u\"Local part invalid: {0}\".format(err))\n        if len(local.encode(\"utf-8\")) > 1023:\n            raise JIDError(u\"Local part too long\")\n        return local"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\nprepare the domain part of the JID.", "response": "def __prepare_domain(data):\n        \"\"\"Prepare domainpart of the JID.\n\n        :Parameters:\n            - `data`: Domain part of the JID\n        :Types:\n            - `data`: `unicode`\n\n        :raise JIDError: if the domain name is too long.\n        \"\"\"\n        # pylint: disable=R0912\n        if not data:\n            raise JIDError(\"Domain must be given\")\n        data = unicode(data)\n        if not data:\n            raise JIDError(\"Domain must be given\")\n        if u'[' in data:\n            if data[0] == u'[' and data[-1] == u']':\n                try:\n                    addr = _validate_ip_address(socket.AF_INET6, data[1:-1])\n                    return \"[{0}]\".format(addr)\n                except ValueError, err:\n                    logger.debug(\"ValueError: {0}\".format(err))\n                    raise JIDError(u\"Invalid IPv6 literal in JID domainpart\")\n            else:\n                raise JIDError(u\"Invalid use of '[' or ']' in JID domainpart\")\n        elif data[0].isdigit() and data[-1].isdigit():\n            try:\n                addr = _validate_ip_address(socket.AF_INET, data)\n            except ValueError, err:\n                logger.debug(\"ValueError: {0}\".format(err))\n        data = UNICODE_DOT_RE.sub(u\".\", data)\n        data = data.rstrip(u\".\")\n        labels = data.split(u\".\")\n        try:\n            labels = [idna.nameprep(label) for label in labels]\n        except UnicodeError:\n            raise JIDError(u\"Domain name invalid\")\n        for label in labels:\n            if not STD3_LABEL_RE.match(label):\n                raise JIDError(u\"Domain name invalid\")\n            try:\n                idna.ToASCII(label)\n            except UnicodeError:\n                raise JIDError(u\"Domain name invalid\")\n        domain = u\".\".join(labels)\n        if 
len(domain.encode(\"utf-8\")) > 1023:\n            raise JIDError(u\"Domain name too long\")\n        return domain"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef __prepare_resource(data):\n        if not data:\n            return None\n        data = unicode(data)\n        try:\n            resource = RESOURCEPREP.prepare(data)\n        except StringprepError as err:\n            raise JIDError(u\"Resource part invalid: {0}\".format(err))\n        if len(resource.encode(\"utf-8\")) > 1023:\n            raise JIDError(\"Resource name too long\")\n        return resource", "response": "Prepare the resourcepart of the JID."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef as_unicode(self):\n        result = self.domain\n        if self.local:\n            result = self.local + u'@' + result\n        if self.resource:\n            result = result + u'/' + self.resource\n        if result not in JID.cache:\n            JID.cache[result] = self\n        return result", "response": "Unicode string JID representation."}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\nchecks if an IPv6 socket is available.", "response": "def is_ipv6_available():\n    \"\"\"Check if IPv6 is available.\n\n    :Return: `True` when an IPv6 socket can be created.\n    \"\"\"\n    try:\n        socket.socket(socket.AF_INET6).close()\n    except (socket.error, AttributeError):\n        return False\n    return True"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\ncheck if an IPv4 socket is available.", "response": "def is_ipv4_available():\n    \"\"\"Check if IPv4 is available.\n\n    :Return: `True` when an IPv4 socket can be created.\n    \"\"\"\n    try:\n        socket.socket(socket.AF_INET).close()\n    except socket.error:\n        return False\n    return True"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef shuffle_srv(records):\n    if not records:\n        return []\n    ret = []\n    while len(records) > 1:\n        weight_sum = 0\n        for rrecord in records:\n            weight_sum += rrecord.weight + 0.1\n        thres = random.random() * weight_sum\n        weight_sum = 0\n        for rrecord in records:\n            weight_sum += rrecord.weight + 0.1\n            if thres < weight_sum:\n                records.remove(rrecord)\n                ret.append(rrecord)\n                break\n    ret.append(records[0])\n    return ret", "response": "Randomly reorder the given list of SRV records using their weights."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef reorder_srv(records):\n    records = list(records)\n    records.sort()\n    ret = []\n    tmp = []\n    for rrecord in records:\n        if not tmp or rrecord.priority == tmp[0].priority:\n            tmp.append(rrecord)\n            continue\n        ret += shuffle_srv(tmp)\n        tmp = [rrecord]\n    if tmp:\n        ret += shuffle_srv(tmp)\n    return ret", "response": "Reorder the given list of SRV records using their priorities and weights."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef stop(self):\n        with self.lock:\n            for dummy in self.threads:\n                self.queue.put(None)", "response": "Stop the resolver threads."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef _start_thread(self):\n        with self.lock:\n            if self.threads and self.queue.empty():\n                return\n            if len(self.threads) >= self.max_threads:\n                return\n            thread_n = self.last_thread_n + 1\n            self.last_thread_n = thread_n\n            thread = threading.Thread(target = self._run,\n                            name = \"{0!r} #{1}\".format(self, thread_n),\n                            args = (thread_n,))\n            self.threads.append(thread)\n            thread.daemon = True\n            thread.start()", "response": "Start a new worker thread, unless threads already exist and the queue is empty or the maximum number of threads has been reached."}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nresolve an address to a record.", "response": "def resolve_address(self, hostname, callback, allow_cname = True):\n        \"\"\"Start looking up an A or AAAA record.\n\n        `callback` will be called with a list of (family, address) tuples\n        on success. Family is :std:`socket.AF_INET` or :std:`socket.AF_INET6`,\n        the address is IPv4 or IPv6 literal. The list will be empty on error.\n\n        :Parameters:\n            - `hostname`: the host name to look up\n            - `callback`: a function to be called with a list of received\n              addresses\n            - `allow_cname`: `True` if CNAMEs should be followed\n        :Types:\n            - `hostname`: `unicode`\n            - `callback`: function accepting a single argument\n            - `allow_cname`: `bool`\n        \"\"\"\n        if self.settings[\"ipv6\"]:\n            if self.settings[\"ipv4\"]:\n                family = socket.AF_UNSPEC\n            else:\n                family = socket.AF_INET6\n        elif self.settings[\"ipv4\"]:\n            family = socket.AF_INET\n        else:\n            logger.warning(\"Neither IPv6 or IPv4 allowed.\")\n            callback([])\n            return\n        try:\n            ret = socket.getaddrinfo(hostname, 0, family, socket.SOCK_STREAM, 0)\n        except socket.gaierror, err:\n            logger.warning(\"Couldn't resolve {0!r}: {1}\".format(hostname,\n                                                                        err))\n            callback([])\n            return\n        except IOError as err:\n            logger.warning(\"Couldn't resolve {0!r}, unexpected error: {1}\"\n                                                        .format(hostname,err))\n            callback([])\n            return\n        if family == socket.AF_UNSPEC:\n            tmp = ret\n            if self.settings[\"prefer_ipv6\"]:\n                ret = [ addr for 
addr in tmp if addr[0] == socket.AF_INET6 ]\n                ret += [ addr for addr in tmp if addr[0] == socket.AF_INET ]\n            else:\n                ret = [ addr for addr in tmp if addr[0] == socket.AF_INET ]\n                ret += [ addr for addr in tmp if addr[0] == socket.AF_INET6 ]\n        callback([(addr[0], addr[4][0]) for addr in ret])"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nstarting an XMPP session, sending a message, and then exiting. :Parameters: - `source_jid`: sender JID - `password`: sender password - `target_jid`: recipient JID - `body`: message body - `subject`: message subject - `message_type`: message type - `message_thread`: message thread id - `settings`: other settings :Types: - `source_jid`: `pyxmpp2.jid.JID` or `basestring` - `password`: `basestring` - `target_jid`: `pyxmpp.jid.JID` or `basestring` - `body`: `basestring` - `subject`: `basestring` - `message_type`: `basestring` - `settings`: `pyxmpp2.settings.XMPPSettings`", "response": "def send_message(source_jid, password, target_jid, body, subject = None,\n                message_type = \"chat\", message_thread = None, settings = None):\n    \"\"\"Start an XMPP session and send a message, then exit.\n\n    :Parameters:\n        - `source_jid`: sender JID\n        - `password`: sender password\n        - `target_jid`: recipient JID\n        - `body`: message body\n        - `subject`: message subject\n        - `message_type`: message type\n        - `message_thread`: message thread id\n        - `settings`: other settings\n    :Types:\n        - `source_jid`: `pyxmpp2.jid.JID` or `basestring`\n        - `password`: `basestring`\n        - `target_jid`: `pyxmpp.jid.JID` or `basestring`\n        - `body`: `basestring`\n        - `subject`: `basestring`\n        - `message_type`: `basestring`\n        - `settings`: `pyxmpp2.settings.XMPPSettings`\n    \"\"\"\n    # pylint: disable=R0913,R0912\n    if sys.version_info.major < 3:\n        # pylint: disable-msg=W0404\n        from locale import getpreferredencoding\n        encoding = getpreferredencoding()\n        if isinstance(source_jid, str):\n            source_jid = source_jid.decode(encoding)\n        if isinstance(password, str):\n            password = password.decode(encoding)\n        if isinstance(target_jid, str):\n            target_jid = 
target_jid.decode(encoding)\n        if isinstance(body, str):\n            body = body.decode(encoding)\n        if isinstance(message_type, str):\n            message_type = message_type.decode(encoding)\n        if isinstance(message_thread, str):\n            message_thread = message_thread.decode(encoding)\n\n    if not isinstance(source_jid, JID):\n        source_jid = JID(source_jid)\n    if not isinstance(target_jid, JID):\n        target_jid = JID(target_jid)\n\n    msg = Message(to_jid = target_jid, body = body, subject = subject,\n                                                    stanza_type = message_type)\n    def action(client):\n        \"\"\"Send a message `msg` via a client.\"\"\"\n        client.stream.send(msg)\n\n    if settings is None:\n        settings = XMPPSettings({\"starttls\": True, \"tls_verify_peer\": False})\n\n    if password is not None:\n        settings[\"password\"] = password\n\n    handler = FireAndForget(source_jid, action, settings)\n    try:\n        handler.run()\n    except KeyboardInterrupt:\n        handler.disconnect()\n        raise"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\nestablishing a client connection to a server.", "response": "def connect(self,server=None,port=None):\n        \"\"\"Establish a client connection to a server.\n\n        [component only]\n\n        :Parameters:\n            - `server`: name or address of the server to use.  If not given\n              then use the one specified when creating the object.\n            - `port`: port number of the server to use.  If not given then use\n              the one specified when creating the object.\n\n        :Types:\n            - `server`: `unicode`\n            - `port`: `int`\"\"\"\n        self.lock.acquire()\n        try:\n            self._connect(server,port)\n        finally:\n            self.lock.release()"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef _compute_handshake(self):\n        return hashlib.sha1(to_utf8(self.stream_id)+to_utf8(self.secret)).hexdigest()", "response": "Compute the authentication handshake value."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef _auth(self):\n        if self.authenticated:\n            self.__logger.debug(\"_auth: already authenticated\")\n            return\n        self.__logger.debug(\"doing handshake...\")\n        hash_value=self._compute_handshake()\n        n=common_root.newTextChild(None,\"handshake\",hash_value)\n        self._write_node(n)\n        n.unlinkNode()\n        n.freeNode()\n        self.__logger.debug(\"handshake hash sent.\")", "response": "Authenticate on the server."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef _process_node(self,node):\n        ns=node.ns()\n        ns_uri=None\n        if ns:\n            ns_uri=ns.getContent()\n        if (not ns or ns_uri==\"jabber:component:accept\") and node.name==\"handshake\":\n            if self.initiator and not self.authenticated:\n                self.authenticated=1\n                self.state_change(\"authenticated\",self.me)\n                self._post_auth()\n                return\n            elif not self.authenticated and node.getContent()==self._compute_handshake():\n                self.peer=self.me\n                n=common_doc.newChild(None,\"handshake\",None)\n                self._write_node(n)\n                n.unlinkNode()\n                n.freeNode()\n                self.peer_authenticated=1\n                self.state_change(\"authenticated\",self.peer)\n                self._post_auth()\n                return\n            else:\n                self._send_stream_error(\"not-authorized\")\n                raise FatalComponentStreamError(\"Handshake error.\")\n\n        if ns_uri in (\"jabber:component:accept\",\"jabber:client\",\"jabber:server\"):\n            stanza=stanza_factory(node)\n            self.lock.release()\n            try:\n                self.process_stanza(stanza)\n            finally:\n                self.lock.acquire()\n                stanza.free()\n            return\n        return Stream._process_node(self,node)", "response": "Process the first level element of the stream."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef _set_state(self, state):\n        logger.debug(\" _set_state({0!r})\".format(state))\n        self._state = state\n        self._state_cond.notify()", "response": "Set the state of the object."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nstarting establishing TCP connection with given address.", "response": "def connect(self, addr, port = None, service = None):\n        \"\"\"Start establishing TCP connection with given address.\n\n        One of: `port` or `service` must be provided and `addr` must be\n        a domain name and not an IP address if `port` is not given.\n\n        When `service` is given try an SRV lookup for that service\n        at domain `addr`. If `service` is not given or `addr` is an IP address,\n        or the SRV lookup fails, connect to `port` at host `addr` directly.\n\n        [initiating entity only]\n\n        :Parameters:\n            - `addr`: peer name or IP address\n            - `port`: port number to connect to\n            - `service`: service name (to be resolved using SRV DNS records)\n        \"\"\"\n        with self.lock:\n            self._connect(addr, port, service)"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef _connect(self, addr, port, service):\n        self._dst_name = addr\n        self._dst_port = port\n        family = None\n        try:\n            res = socket.getaddrinfo(addr, port, socket.AF_UNSPEC,\n                                socket.SOCK_STREAM, 0, socket.AI_NUMERICHOST)\n            family = res[0][0]\n            sockaddr = res[0][4]\n        except socket.gaierror:\n            family = None\n            sockaddr = None\n\n        if family is not None:\n            if not port:\n                raise ValueError(\"No port number given with literal IP address\")\n            self._dst_service = None\n            self._family = family\n            self._dst_addrs = [(family, sockaddr)]\n            self._set_state(\"connect\")\n        elif service is not None:\n            self._dst_service = service\n            self._set_state(\"resolve-srv\")\n            self._dst_name = addr\n        elif port:\n            self._dst_nameports = [(self._dst_name, self._dst_port)]\n            self._dst_service = None\n            self._set_state(\"resolve-hostname\")\n        else:\n            raise ValueError(\"No port number and no SRV service name given\")", "response": "Start establishing a connection to the given address: connect directly when `addr` is an IP literal, otherwise schedule an SRV lookup (when `service` is given) or a hostname lookup (when only `port` is given)."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef _resolve_srv(self):\n        resolver = self.settings[\"dns_resolver\"] # pylint: disable=W0621\n        self._set_state(\"resolving-srv\")\n        self.event(ResolvingSRVEvent(self._dst_name, self._dst_service))\n        resolver.resolve_srv(self._dst_name, self._dst_service, \"tcp\",\n                                                    callback = self._got_srv)", "response": "Start resolving the SRV record."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef _got_srv(self, addrs):\n        with self.lock:\n            if not addrs:\n                self._dst_service = None\n                if self._dst_port:\n                    self._dst_nameports = [(self._dst_name, self._dst_port)]\n                else:\n                    self._dst_nameports = []\n                    self._set_state(\"aborted\")\n                    raise DNSError(\"Could not resolve SRV for service {0!r}\"\n                            \" on host {1!r} and fallback port number not given\"\n                                    .format(self._dst_service, self._dst_name))\n            elif addrs == [(\".\", 0)]:\n                self._dst_nameports = []\n                self._set_state(\"aborted\")\n                raise DNSError(\"Service {0!r} not available on host {1!r}\"\n                                    .format(self._dst_service, self._dst_name))\n            else:\n                self._dst_nameports = addrs\n            self._set_state(\"resolve-hostname\")", "response": "Handle a SRV lookup result."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\nstart hostname resolution for the next name to try.", "response": "def _resolve_hostname(self):\n        \"\"\"Start hostname resolution for the next name to try.\n\n        [called with `lock` acquired]\n        \"\"\"\n        self._set_state(\"resolving-hostname\")\n        resolver = self.settings[\"dns_resolver\"] # pylint: disable=W0621\n        logger.debug(\"_dst_nameports: {0!r}\".format(self._dst_nameports))\n        name, port = self._dst_nameports.pop(0)\n        self._dst_hostname = name\n        resolver.resolve_address(name, callback = partial(\n                                self._got_addresses, name, port),\n                                allow_cname = self._dst_service is None)\n        self.event(ResolvingAddressEvent(name))"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef _got_addresses(self, name, port, addrs):\n        with self.lock:\n            if not addrs:\n                if self._dst_nameports:\n                    self._set_state(\"resolve-hostname\")\n                    return\n                else:\n                    self._dst_addrs = []\n                    self._set_state(\"aborted\")\n                    raise DNSError(\"Could not resolve address record for {0!r}\"\n                                                                .format(name))\n            self._dst_addrs = [ (family, (addr, port)) for (family, addr)\n                                                                    in addrs ]\n            self._set_state(\"connect\")", "response": "Handle a DNS address record lookup result."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nstarting connecting to the next address on the _dst_addrs list.", "response": "def _start_connect(self):\n        \"\"\"Start connecting to the next address on the `_dst_addrs` list.\n\n        [ called with `lock` acquired ]\n\n        \"\"\"\n        family, addr = self._dst_addrs.pop(0)\n        self._socket = socket.socket(family, socket.SOCK_STREAM)\n        self._socket.setblocking(False)\n        self._dst_addr = addr\n        self._family  = family\n        try:\n            self._socket.connect(addr)\n        except socket.error, err:\n            logger.debug(\"Connect error: {0}\".format(err))\n            if err.args[0] in BLOCKING_ERRORS:\n                self._set_state(\"connecting\")\n                self._write_queue.append(ContinueConnect())\n                self._write_queue_cond.notify()\n                self.event(ConnectingEvent(addr))\n                return\n            elif self._dst_addrs:\n                self._set_state(\"connect\")\n                return\n            elif self._dst_nameports:\n                self._set_state(\"resolve-hostname\")\n                return\n            else:\n                self._socket.close()\n                self._socket = None\n                self._set_state(\"aborted\")\n                self._write_queue.clear()\n                self._write_queue_cond.notify()\n                raise\n        self._connected()"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef _continue_connect(self):\n        try:\n            self._socket.connect(self._dst_addr)\n        except socket.error, err:\n            logger.debug(\"Connect error: {0}\".format(err))\n            if err.args[0] == errno.EISCONN:\n                pass\n            elif err.args[0] in BLOCKING_ERRORS:\n                return None\n            elif self._dst_addrs:\n                self._set_state(\"connect\")\n                return None\n            elif self._dst_nameports:\n                self._set_state(\"resolve-hostname\")\n                return None\n            else:\n                self._socket.close()\n                self._socket = None\n                self._set_state(\"aborted\")\n                raise\n        self._connected()", "response": "Continue connecting.\n\n        [called with `lock` acquired]\n\n        :Return: `True` when just connected"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\nwrite raw data to the socket.", "response": "def _write(self, data):\n        \"\"\"Write raw data to the socket.\n\n        :Parameters:\n            - `data`: data to send\n        :Types:\n            - `data`: `bytes`\n        \"\"\"\n        OUT_LOGGER.debug(\"OUT: %r\", data)\n        if self._hup or not self._socket:\n            raise PyXMPPIOError(u\"Connection closed.\")\n        try:\n            while data:\n                try:\n                    sent = self._socket.send(data)\n                except ssl.SSLError, err:\n                    if err.args[0] == ssl.SSL_ERROR_WANT_WRITE:\n                        continue\n                    else:\n                        raise\n                except socket.error, err:\n                    if err.args[0] == errno.EINTR:\n                        continue\n                    if err.args[0] in BLOCKING_ERRORS:\n                        wait_for_write(self._socket)\n                        continue\n                    raise\n                data = data[sent:]\n        except (IOError, OSError, socket.error), err:\n            raise PyXMPPIOError(u\"IO Error: {0}\".format(err))"}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nsetting the stream the target for this transport instance.", "response": "def set_target(self, stream):\n        \"\"\"Make the `stream` the target for this transport instance.\n\n        The 'stream_start', 'stream_end' and 'stream_element' methods\n        of the target will be called when appropriate content is received.\n\n        :Parameters:\n            - `stream`: the stream handler to receive stream content\n              from the transport\n        :Types:\n            - `stream`: `StreamBase`\n        \"\"\"\n        with self.lock:\n            if self._stream:\n                raise ValueError(\"Target stream already set\")\n            self._stream = stream\n            self._reader = StreamReader(stream)"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef send_stream_head(self, stanza_namespace, stream_from, stream_to,\n                        stream_id = None, version = u'1.0', language = None):\n        \"\"\"\n        Send stream head via the transport.\n\n        :Parameters:\n            - `stanza_namespace`: namespace of stream stanzas (e.g.\n              'jabber:client')\n            - `stream_from`: the 'from' attribute of the stream. May be `None`.\n            - `stream_to`: the 'to' attribute of the stream. May be `None`.\n            - `version`: the 'version' of the stream.\n            - `language`: the 'xml:lang' of the stream\n        :Types:\n            - `stanza_namespace`: `unicode`\n            - `stream_from`: `unicode`\n            - `stream_to`: `unicode`\n            - `version`: `unicode`\n            - `language`: `unicode`\n        \"\"\"\n        # pylint: disable=R0913\n        with self.lock:\n            self._serializer = XMPPSerializer(stanza_namespace,\n                                            self.settings[\"extra_ns_prefixes\"])\n            head = self._serializer.emit_head(stream_from, stream_to,\n                                                stream_id, version, language)\n            self._write(head.encode(\"utf-8\"))", "response": "Send a stream head via the transport."}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nsends the stream tail via the transport.", "response": "def send_stream_tail(self):\n        \"\"\"\n        Send stream tail via the transport.\n        \"\"\"\n        with self.lock:\n            if not self._socket or self._hup:\n                logger.debug(u\"Cannot send stream closing tag: already closed\")\n                return\n            data = self._serializer.emit_tail()\n            try:\n                self._write(data.encode(\"utf-8\"))\n            except (IOError, SystemError, socket.error), err:\n                logger.debug(u\"Sending stream closing tag failed: {0}\"\n                                                                .format(err))\n            self._serializer = None\n            self._hup = True\n            if self._tls_state is None:\n                try:\n                    self._socket.shutdown(socket.SHUT_WR)\n                except socket.error:\n                    pass\n            self._set_state(\"closing\")\n            self._write_queue.clear()\n            self._write_queue_cond.notify()"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\nsending an element via the transport.", "response": "def send_element(self, element):\n        \"\"\"\n        Send an element via the transport.\n        \"\"\"\n        with self.lock:\n            if self._eof or self._socket is None or not self._serializer:\n                logger.debug(\"Dropping element: {0}\".format(\n                                                element_to_unicode(element)))\n                return\n            data = self._serializer.emit_stanza(element)\n            self._write(data.encode(\"utf-8\"))"}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nreturns True if the I/O channel can be read.", "response": "def is_readable(self):\n        \"\"\"\n        :Return: `True` when the I/O channel can be read\n        \"\"\"\n        return self._socket is not None and not self._eof and (\n                    self._state in (\"connected\", \"closing\")\n                        or self._state == \"tls-handshake\"\n                                        and self._tls_state == \"want_read\")"}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nwaiting until the channel is readable.", "response": "def wait_for_readability(self):\n        \"\"\"\n        Stop current thread until the channel is readable.\n\n        :Return: `False` if it won't be readable (e.g. is closed)\n        \"\"\"\n        with self.lock:\n            while True:\n                if self._socket is None or self._eof:\n                    return False\n                if self._state in (\"connected\", \"closing\"):\n                    return True\n                if self._state == \"tls-handshake\" and \\\n                                            self._tls_state == \"want_read\":\n                    return True\n                self._state_cond.wait()"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef wait_for_writability(self):\n        with self.lock:\n            while True:\n                if self._state in (\"closing\", \"closed\", \"aborted\"):\n                    return False\n                if self._socket and bool(self._write_queue):\n                    return True\n                self._write_queue_cond.wait()\n        return False", "response": "Wait until the channel is writable."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef handle_write(self):\n        with self.lock:\n            logger.debug(\"handle_write: queue: {0!r}\".format(self._write_queue))\n            try:\n                job = self._write_queue.popleft()\n            except IndexError:\n                return\n            if isinstance(job, WriteData):\n                self._do_write(job.data) # pylint: disable=E1101\n            elif isinstance(job, ContinueConnect):\n                self._continue_connect()\n            elif isinstance(job, StartTLS):\n                self._initiate_starttls(**job.kwargs)\n            elif isinstance(job, TLSHandshake):\n                self._continue_tls_handshake()\n            else:\n                raise ValueError(\"Unrecognized job in the write queue: \"\n                                        \"{0!r}\".format(job))", "response": "Handle the channel writable state, e.g. send buffered data via the socket."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef starttls(self, **kwargs):\n        with self.lock:\n            self.event(TLSConnectingEvent())\n            self._write_queue.append(StartTLS(**kwargs))\n            self._write_queue_cond.notify()", "response": "Request a TLS handshake on the socket and switch\n        to encrypted output."}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\nreturns the peer certificate.", "response": "def getpeercert(self):\n        \"\"\"Return the peer certificate.\n\n        :ReturnType: `pyxmpp2.cert.Certificate`\n        \"\"\"\n        with self.lock:\n            if not self._socket or self._tls_state != \"connected\":\n                raise ValueError(\"Not TLS-connected\")\n            return get_certificate_from_ssl_socket(self._socket)"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef _initiate_starttls(self, **kwargs):\n        if self._tls_state == \"connected\":\n            raise RuntimeError(\"Already TLS-connected\")\n        kwargs[\"do_handshake_on_connect\"] = False\n        logger.debug(\"Wrapping the socket into ssl\")\n        self._socket = ssl.wrap_socket(self._socket, **kwargs)\n        self._set_state(\"tls-handshake\")\n        self._continue_tls_handshake()", "response": "Initiate TLS handshake over the socket."}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\ncontinue a TLS handshake.", "response": "def _continue_tls_handshake(self):\n        \"\"\"Continue a TLS handshake.\"\"\"\n        try:\n            logger.debug(\" do_handshake()\")\n            self._socket.do_handshake()\n        except ssl.SSLError, err:\n            if err.args[0] == ssl.SSL_ERROR_WANT_READ:\n                self._tls_state = \"want_read\"\n                logger.debug(\"   want_read\")\n                self._state_cond.notify()\n                return\n            elif err.args[0] == ssl.SSL_ERROR_WANT_WRITE:\n                self._tls_state = \"want_write\"\n                logger.debug(\"   want_write\")\n                self._write_queue.appendleft(TLSHandshake)\n                return\n            else:\n                raise\n        self._tls_state = \"connected\"\n        self._set_state(\"connected\")\n        self._auth_properties['security-layer'] = \"TLS\"\n        if \"tls-unique\" in CHANNEL_BINDING_TYPES:\n            try:\n                # pylint: disable=E1103\n                tls_unique = self._socket.get_channel_binding(\"tls-unique\")\n            except ValueError:\n                pass\n            else:\n                self._auth_properties['channel-binding'] = {\n                                                    \"tls-unique\": tls_unique}\n        try:\n            cipher = self._socket.cipher()\n        except AttributeError:\n            # SSLSocket.cipher doesn't work on PyPy\n            cipher = \"unknown\"\n        cert = get_certificate_from_ssl_socket(self._socket)\n        self.event(TLSConnectedEvent(cipher, cert))"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef handle_read(self):\n        with self.lock:\n            logger.debug(\"handle_read()\")\n            if self._eof or self._socket is None:\n                return\n            if self._state == \"tls-handshake\":\n                while True:\n                    logger.debug(\"tls handshake read...\")\n                    self._continue_tls_handshake()\n                    logger.debug(\"  state: {0}\".format(self._tls_state))\n                    if self._tls_state != \"want_read\":\n                        break\n            elif self._tls_state == \"connected\":\n                while self._socket and not self._eof:\n                    logger.debug(\"tls socket read...\")\n                    try:\n                        data = self._socket.read(4096)\n                    except ssl.SSLError, err:\n                        if err.args[0] == ssl.SSL_ERROR_WANT_READ:\n                            break\n                        elif err.args[0] == ssl.SSL_ERROR_WANT_WRITE:\n                            break\n                        else:\n                            raise\n                    except socket.error, err:\n                        if err.args[0] == errno.EINTR:\n                            continue\n                        elif err.args[0] in BLOCKING_ERRORS:\n                            break\n                        elif err.args[0] == errno.ECONNRESET:\n                            logger.warning(\"Connection reset by peer\")\n                            data = None\n                        else:\n                            raise\n                    self._feed_reader(data)\n            else:\n                while self._socket and not self._eof:\n                    logger.debug(\"raw socket read...\")\n                    try:\n                        data = self._socket.recv(4096)\n                    except socket.error, err:\n 
                       if err.args[0] == errno.EINTR:\n                            continue\n                        elif err.args[0] in BLOCKING_ERRORS:\n                            break\n                        elif err.args[0] == errno.ECONNRESET:\n                            logger.warning(\"Connection reset by peer\")\n                            data = None\n                        else:\n                            raise\n                    self._feed_reader(data)", "response": "Handle the channel readable state, e.g. read from a socket."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\nhandling the channel hungup state.", "response": "def handle_hup(self):\n        \"\"\"\n        Handle the 'channel hungup' state. The handler should not be writable\n        after this.\n        \"\"\"\n        with self.lock:\n            if self._state == 'connecting' and self._dst_addrs:\n                self._hup = False\n                self._set_state(\"connect\")\n                return\n        self._hup = True"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef handle_err(self):\n        with self.lock:\n            if self._state == 'connecting' and self._dst_addrs:\n                self._hup = False\n                self._set_state(\"connect\")\n                return\n            self._socket.close()\n            self._socket = None\n            self._set_state(\"aborted\")\n            self._write_queue.clear()\n            self._write_queue_cond.notify()\n        raise PyXMPPIOError(\"Unhandled error on socket\")", "response": "Handle an error reported by the socket."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef disconnect(self):\n        logger.debug(\"TCPTransport.disconnect()\")\n        with self.lock:\n            if self._socket is None:\n                if self._state != \"closed\":\n                    self.event(DisconnectedEvent(self._dst_addr))\n                    self._set_state(\"closed\")\n                return\n            if self._hup or not self._serializer:\n                self._close()\n            else:\n                self.send_stream_tail()", "response": "Disconnect the stream gracefully."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef _close(self):\n        if self._state != \"closed\":\n            self.event(DisconnectedEvent(self._dst_addr))\n            self._set_state(\"closed\")\n        if self._socket is None:\n            return\n        try:\n            self._socket.shutdown(socket.SHUT_RDWR)\n        except socket.error:\n            pass\n        self._socket.close()\n        self._socket = None\n        self._write_queue.clear()\n        self._write_queue_cond.notify()", "response": "Close the socket and close the internal state."}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\nfeeds the stream reader with data received.", "response": "def _feed_reader(self, data):\n        \"\"\"Feed the stream reader with data received.\n\n        [ called with `lock` acquired ]\n\n        If `data` is None or empty, then stream end (peer disconnected) is\n        assumed and the stream is closed.\n\n        `lock` is acquired during the operation.\n\n        :Parameters:\n            - `data`: data received from the stream socket.\n        :Types:\n            - `data`: `unicode`\n        \"\"\"\n        IN_LOGGER.debug(\"IN: %r\", data)\n        if data:\n            self.lock.release() # not to deadlock with the stream\n            try:\n                self._reader.feed(data)\n            finally:\n                self.lock.acquire()\n        else:\n            self._eof = True\n            self.lock.release() # not to deadlock with the stream\n            try:\n                self._stream.stream_eof()\n            finally:\n                self.lock.acquire()\n            if not self._serializer:\n                if self._state != \"closed\":\n                    self.event(DisconnectedEvent(self._dst_addr))\n                    self._set_state(\"closed\")"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef event(self, event):\n        logger.debug(u\"TCP transport event: {0}\".format(event))\n        if self._stream:\n            event.stream = self._stream\n        self._event_queue.put(event)", "response": "Pass an event to the target stream or just log it."}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nhandle the start tag.", "response": "def start(self, tag, attrs):\n        \"\"\"Handle the start tag.\n\n        Call the handler's 'stream_start' method with\n        an empty root element if it is top level.\n\n        For lower level tags use :etree:`ElementTree.TreeBuilder` to collect\n        them.\n        \"\"\"\n        if self._level == 0:\n            self._root = ElementTree.Element(tag, attrs)\n            self._handler.stream_start(self._root)\n        if self._level < 2:\n            self._builder = ElementTree.TreeBuilder()\n        self._level += 1\n        return self._builder.start(tag, attrs)"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\nhandle an end tag.", "response": "def end(self, tag):\n        \"\"\"Handle an end tag.\n\n        Call the handler's 'stream_end' method with\n        the root element (built by the `start` method).\n\n        On the first level below root, send the built element tree\n        to the handler via the 'stanza methods'.\n\n        Any tag below will be just added to the tree builder.\n        \"\"\"\n        self._level -= 1\n        if self._level < 0:\n            self._handler.stream_parse_error(u\"Unexpected end tag for: {0!r}\"\n                                                                .format(tag))\n            return\n        if self._level == 0:\n            if tag != self._root.tag:\n                self._handler.stream_parse_error(u\"Unexpected end tag for:\"\n                            \" {0!r} (stream end tag expected)\".format(tag))\n                return\n            self._handler.stream_end()\n            return\n        element = self._builder.end(tag)\n        if self._level == 1:\n            self._handler.stream_element(element)"}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\nfeeds the parser with a chunk of data. Appropriate methods of handler will be called whenever something interesting is found.", "response": "def feed(self, data):\n        \"\"\"Feed the parser with a chunk of data. Appropriate methods\n        of `handler` will be called whenever something interesting is\n        found.\n\n        :Parameters:\n            - `data`: the chunk of data to parse.\n        :Types:\n            - `data`: `str`\"\"\"\n        with self.lock:\n            if self.in_use:\n                raise StreamParseError(\"StreamReader.feed() is not reentrant!\")\n            self.in_use = True\n            try:\n                if not self._started:\n                    # workaround for lxml bug when fed with a big chunk at once\n                    if len(data) > 1:\n                        self.parser.feed(data[:1])\n                        data = data[1:]\n                    self._started = True\n                if data:\n                    self.parser.feed(data)\n                else:\n                    self.parser.close()\n            except ElementTree.ParseError, err:\n                self.handler.stream_parse_error(unicode(err))\n            finally:\n                self.in_use = False"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef _from_xml(self, element):\n        # pylint: disable-msg=R0912\n        if element.tag != self.error_qname:\n            raise ValueError(u\"{0!r} is not a {1!r} element\".format(\n                                                    element, self.error_qname))\n        lang = element.get(XML_LANG_QNAME, None)\n        if lang:\n            self.language = lang\n        self.condition = None\n        for child in element:\n            if child.tag.startswith(self.cond_qname_prefix):\n                if self.condition is not None:\n                    logger.warning(\"Multiple conditions in XMPP error element.\")\n                    continue\n                self.condition = deepcopy(child)\n            elif child.tag == self.text_qname:\n                lang = child.get(XML_LANG_QNAME, None)\n                if lang:\n                    self.language = lang\n                self.text = child.text.strip()\n            else:\n                bad = False\n                for prefix in (STREAM_QNP, STANZA_CLIENT_QNP, STANZA_SERVER_QNP,\n                                            STANZA_ERROR_QNP, STREAM_ERROR_QNP):\n                    if child.tag.startswith(prefix):\n                        logger.warning(\"Unexpected stream-namespaced\"\n                                                        \" element in error.\")\n                        bad = True\n                        break\n                if not bad:\n                    self.custom_condition.append( deepcopy(child) )\n        if self.condition is None:\n            self.condition = ElementTree.Element(self.cond_qname_prefix\n                                                    + \"undefined-condition\")\n        if self.condition.tag in OBSOLETE_CONDITIONS:\n            new_cond_name = OBSOLETE_CONDITIONS[self.condition.tag]\n            self.condition = ElementTree.Element(new_cond_name)", 
"response": "Initialize an ErrorElement object from an XMPP error element."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef as_xml(self):\n        result = ElementTree.Element(self.error_qname)\n        result.append(deepcopy(self.condition))\n        if self.text:\n            text = ElementTree.SubElement(result, self.text_qname)\n            if self.language:\n                text.set(XML_LANG_QNAME, self.language)\n            text.text = self.text\n        return result", "response": "Return the XML error representation."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef _from_xml(self, element):\n        ErrorElement._from_xml(self, element)\n        error_type = element.get(u\"type\")\n        if error_type:\n            self.error_type = error_type", "response": "Initialize an ErrorElement object from an XML element."}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nreturn the XML error representation of the error element.", "response": "def as_xml(self, stanza_namespace = None): # pylint: disable-msg=W0221\n        \"\"\"Return the XML error representation.\n\n        :Parameters:\n            - `stanza_namespace`: namespace URI of the containing stanza\n        :Types:\n            - `stanza_namespace`: `unicode`\n\n        :returntype: :etree:`ElementTree.Element`\"\"\"\n        if stanza_namespace:\n            self.error_qname = \"{{{0}}}error\".format(stanza_namespace)\n            self.text_qname = \"{{{0}}}text\".format(stanza_namespace)\n        result = ErrorElement.as_xml(self)\n        result.set(\"type\", self.error_type)\n        return result"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef run(self):\n        while not self._quit:\n            interval = self.settings[\"poll_interval\"]\n            if self.io_handler.is_writable():\n                logger.debug(\"{0}: writable\".format(self.name))\n                fileno = self.io_handler\n                if fileno:\n                    writable = wait_for_write(fileno, interval)\n                    if writable:\n                        self.io_handler.handle_write()\n            else:\n                logger.debug(\"{0}: waiting for writaility\".format(self.name))\n                if not self.io_handler.wait_for_writability():\n                    break", "response": "The thread function. Loop waiting for the handler and socket to become writable, then call `handle_write`."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef _run_io_threads(self, handler):\n        reader = ReadingThread(self.settings, handler, daemon = self.daemon,\n                                                exc_queue = self.exc_queue)\n        writter = WrittingThread(self.settings, handler, daemon = self.daemon,\n                                                exc_queue = self.exc_queue)\n        self.io_threads += [reader, writter]\n        reader.start()\n        writter.start()", "response": "Start threads for an IOHandler."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef _remove_io_handler(self, handler):\n        if handler not in self.io_handlers:\n            return\n        self.io_handlers.remove(handler)\n        for thread in self.io_threads:\n            if thread.io_handler is handler:\n                thread.stop()", "response": "Remove an IOHandler from the pool."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef _add_timeout_handler(self, handler):\n        self.timeout_handlers.append(handler)\n        if self.event_thread is None:\n            return\n        self._run_timeout_threads(handler)", "response": "Add a TimeoutHandler to the pool."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef _run_timeout_threads(self, handler):\n        # pylint: disable-msg=W0212\n        for dummy, method in inspect.getmembers(handler, callable):\n            if not hasattr(method, \"_pyxmpp_timeout\"):\n                continue\n            thread = TimeoutThread(method, daemon = self.daemon,\n                                                    exc_queue = self.exc_queue)\n            self.timeout_threads.append(thread)\n            thread.start()", "response": "Start threads for a TimeoutHandler."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nremove a TimeoutHandler from the pool.", "response": "def _remove_timeout_handler(self, handler):\n        \"\"\"Remove a TimeoutHandler from the pool.\n        \"\"\"\n        if handler not in self.timeout_handlers:\n            return\n        self.timeout_handlers.remove(handler)\n        for thread in self.timeout_threads:\n            if thread.method.im_self is handler:\n                thread.stop()"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef stop(self, join = False, timeout = None):\n        logger.debug(\"Closing the io handlers...\")\n        for handler in self.io_handlers:\n            handler.close()\n        if self.event_thread and self.event_thread.is_alive():\n            logger.debug(\"Sending the QUIT signal\")\n            self.event_queue.put(QUIT)\n        logger.debug(\"  sent\")\n        threads = self.io_threads + self.timeout_threads\n        for thread in threads:\n            logger.debug(\"Stopping thread: {0!r}\".format(thread))\n            thread.stop()\n        if not join:\n            return\n        if self.event_thread:\n            threads.append(self.event_thread)\n        if timeout is None:\n            for thread in threads:\n                thread.join()\n        else:\n            timeout1 = (timeout * 0.01) / len(threads)\n            threads_left = []\n            for thread in threads:\n                logger.debug(\"Quick-joining thread {0!r}...\".format(thread))\n                thread.join(timeout1)\n                if thread.is_alive():\n                    logger.debug(\"  thread still alive\".format(thread))\n                    threads_left.append(thread)\n            if threads_left:\n                timeout2 = (timeout * 0.99) / len(threads_left)\n                for thread in threads_left:\n                    logger.debug(\"Joining thread {0!r}...\".format(thread))\n                    thread.join(timeout2)\n        self.io_threads = []\n        self.event_thread = None", "response": "Stop the threads.\n\n        :Parameters:\n            - `join`: join the threads (wait until they exit)\n            - `timeout`: maximum time (in seconds) to wait when `join` is\n              `True`.  No limit when `timeout` is `None`."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef loop_iteration(self, timeout = 0.1):\n        try:\n            exc_info = self.exc_queue.get(True, timeout)[1]\n        except Queue.Empty:\n            return\n        exc_type, exc_value, ext_stack = exc_info\n        raise exc_type, exc_value, ext_stack", "response": "Wait up to `timeout` seconds for an exception queued by a worker thread and re-raise it."}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nreset the legacy object state making the object ready to handle new connections.", "response": "def _reset(self):\n        \"\"\"Reset the `LegacyClientStream` object state, making the object ready\n        to handle new connections.\"\"\"\n        ClientStream._reset(self)\n        self.available_auth_methods = None\n        self.auth_stanza = None\n        self.registration_callback = None"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\ninitialize authentication when the connection is established and we are the initiator.", "response": "def _post_connect(self):\n        \"\"\"Initialize authentication when the connection is established\n        and we are the initiator.\"\"\"\n        if not self.initiator:\n            if \"plain\" in self.auth_methods or \"digest\" in self.auth_methods:\n                self.set_iq_get_handler(\"query\",\"jabber:iq:auth\",\n                            self.auth_in_stage1)\n                self.set_iq_set_handler(\"query\",\"jabber:iq:auth\",\n                            self.auth_in_stage2)\n        elif self.registration_callback:\n            iq = Iq(stanza_type = \"get\")\n            iq.set_content(Register())\n            self.set_response_handlers(iq, self.registration_form_received, self.registration_error)\n            self.send(iq)\n            return\n        ClientStream._post_connect(self)"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef _try_auth(self):\n        if self.authenticated:\n            self.__logger.debug(\"try_auth: already authenticated\")\n            return\n        self.__logger.debug(\"trying auth: %r\" % (self._auth_methods_left,))\n        if not self._auth_methods_left:\n            raise LegacyAuthenticationError(\"No allowed authentication methods available\")\n        method=self._auth_methods_left[0]\n        if method.startswith(\"sasl:\"):\n            return ClientStream._try_auth(self)\n        elif method not in (\"plain\",\"digest\"):\n            self._auth_methods_left.pop(0)\n            self.__logger.debug(\"Skipping unknown auth method: %s\" % method)\n            return self._try_auth()\n        elif self.available_auth_methods is not None:\n            if method in self.available_auth_methods:\n                self._auth_methods_left.pop(0)\n                self.auth_method_used=method\n                if method==\"digest\":\n                    self._digest_auth_stage2(self.auth_stanza)\n                else:\n                    self._plain_auth_stage2(self.auth_stanza)\n                self.auth_stanza=None\n                return\n            else:\n                self.__logger.debug(\"Skipping unavailable auth method: %s\" % method)\n        else:\n            self._auth_stage1()", "response": "Try to authenticate using the first one of allowed authentication\n        methods left."}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\nhandles the first stage of legacy authentication.", "response": "def auth_in_stage1(self,stanza):\n        \"\"\"Handle the first stage (<iq type='get'/>) of legacy (\"plain\" or\n        \"digest\") authentication.\n\n        [server only]\"\"\"\n        self.lock.acquire()\n        try:\n            if \"plain\" not in self.auth_methods and \"digest\" not in self.auth_methods:\n                iq=stanza.make_error_response(\"not-allowed\")\n                self.send(iq)\n                return\n\n            iq=stanza.make_result_response()\n            q=iq.new_query(\"jabber:iq:auth\")\n            q.newChild(None,\"username\",None)\n            q.newChild(None,\"resource\",None)\n            if \"plain\" in self.auth_methods:\n                q.newChild(None,\"password\",None)\n            if \"digest\" in self.auth_methods:\n                q.newChild(None,\"digest\",None)\n            self.send(iq)\n            iq.free()\n        finally:\n            self.lock.release()"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef auth_in_stage2(self,stanza):\n        self.lock.acquire()\n        try:\n            if \"plain\" not in self.auth_methods and \"digest\" not in self.auth_methods:\n                iq=stanza.make_error_response(\"not-allowed\")\n                self.send(iq)\n                return\n\n            username=stanza.xpath_eval(\"a:query/a:username\",{\"a\":\"jabber:iq:auth\"})\n            if username:\n                username=from_utf8(username[0].getContent())\n            resource=stanza.xpath_eval(\"a:query/a:resource\",{\"a\":\"jabber:iq:auth\"})\n            if resource:\n                resource=from_utf8(resource[0].getContent())\n            if not username or not resource:\n                self.__logger.debug(\"No username or resource found in auth request\")\n                iq=stanza.make_error_response(\"bad-request\")\n                self.send(iq)\n                return\n\n            if stanza.xpath_eval(\"a:query/a:password\",{\"a\":\"jabber:iq:auth\"}):\n                if \"plain\" not in self.auth_methods:\n                    iq=stanza.make_error_response(\"not-allowed\")\n                    self.send(iq)\n                    return\n                else:\n                    return self._plain_auth_in_stage2(username,resource,stanza)\n            if stanza.xpath_eval(\"a:query/a:digest\",{\"a\":\"jabber:iq:auth\"}):\n                if \"digest\" not in self.auth_methods:\n                    iq=stanza.make_error_response(\"not-allowed\")\n                    self.send(iq)\n                    return\n                else:\n                    return self._digest_auth_in_stage2(username,resource,stanza)\n        finally:\n            self.lock.release()", "response": "Handle the second stage of legacy authentication."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\ndoing the first stage of legacy authentication.", "response": "def _auth_stage1(self):\n        \"\"\"Do the first stage (<iq type='get'/>) of legacy (\"plain\" or\n        \"digest\") authentication.\n\n        [client only]\"\"\"\n        iq=Iq(stanza_type=\"get\")\n        q=iq.new_query(\"jabber:iq:auth\")\n        q.newTextChild(None,\"username\",to_utf8(self.my_jid.node))\n        q.newTextChild(None,\"resource\",to_utf8(self.my_jid.resource))\n        self.send(iq)\n        self.set_response_handlers(iq,self.auth_stage2,self.auth_error,\n                            self.auth_timeout,timeout=60)\n        iq.free()"}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\nhandles legacy authentication timeout.", "response": "def auth_timeout(self):\n        \"\"\"Handle legacy authentication timeout.\n\n        [client only]\"\"\"\n        self.lock.acquire()\n        try:\n            self.__logger.debug(\"Timeout while waiting for jabber:iq:auth result\")\n            if self._auth_methods_left:\n                self._auth_methods_left.pop(0)\n        finally:\n            self.lock.release()"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef auth_error(self,stanza):\n        self.lock.acquire()\n        try:\n            err=stanza.get_error()\n            ae=err.xpath_eval(\"e:*\",{\"e\":\"jabber:iq:auth:error\"})\n            if ae:\n                ae=ae[0].name\n            else:\n                ae=err.get_condition().name\n            raise LegacyAuthenticationError(\"Authentication error condition: %s\"\n                        % (ae,))\n        finally:\n            self.lock.release()", "response": "Handle legacy authentication error."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\nhandle the first stage authentication response.", "response": "def auth_stage2(self,stanza):\n        \"\"\"Handle the first stage authentication response (result of the <iq\n        type=\"get\"/>).\n\n        [client only]\"\"\"\n        self.lock.acquire()\n        try:\n            self.__logger.debug(\"Processing auth response...\")\n            self.available_auth_methods=[]\n            if (stanza.xpath_eval(\"a:query/a:digest\",{\"a\":\"jabber:iq:auth\"}) and self.stream_id):\n                self.available_auth_methods.append(\"digest\")\n            if (stanza.xpath_eval(\"a:query/a:password\",{\"a\":\"jabber:iq:auth\"})):\n                self.available_auth_methods.append(\"plain\")\n            self.auth_stanza=stanza.copy()\n            self._try_auth()\n        finally:\n            self.lock.release()"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef _plain_auth_stage2(self, _unused):\n        iq=Iq(stanza_type=\"set\")\n        q=iq.new_query(\"jabber:iq:auth\")\n        q.newTextChild(None,\"username\",to_utf8(self.my_jid.node))\n        q.newTextChild(None,\"resource\",to_utf8(self.my_jid.resource))\n        q.newTextChild(None,\"password\",to_utf8(self.password))\n        self.send(iq)\n        self.set_response_handlers(iq,self.auth_finish,self.auth_error)\n        iq.free()", "response": "Do the second stage of legacy plain authentication."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef _plain_auth_in_stage2(self, username, _unused, stanza):\n        password=stanza.xpath_eval(\"a:query/a:password\",{\"a\":\"jabber:iq:auth\"})\n        if password:\n            password=from_utf8(password[0].getContent())\n        if not password:\n            self.__logger.debug(\"No password found in plain auth request\")\n            iq=stanza.make_error_response(\"bad-request\")\n            self.send(iq)\n            return\n\n        if self.check_password(username,password):\n            iq=stanza.make_result_response()\n            self.send(iq)\n            self.peer_authenticated=True\n            self.auth_method_used=\"plain\"\n            self.state_change(\"authorized\",self.peer)\n            self._post_auth()\n        else:\n            self.__logger.debug(\"Plain auth failed\")\n            iq=stanza.make_error_response(\"bad-request\")\n            e=iq.get_error()\n            e.add_custom_condition('jabber:iq:auth:error',\"user-unauthorized\")\n            self.send(iq)", "response": "Handle the second stage of plain auth."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef _digest_auth_stage2(self, _unused):\n        iq=Iq(stanza_type=\"set\")\n        q=iq.new_query(\"jabber:iq:auth\")\n        q.newTextChild(None,\"username\",to_utf8(self.my_jid.node))\n        q.newTextChild(None,\"resource\",to_utf8(self.my_jid.resource))\n\n        digest = hashlib.sha1(to_utf8(self.stream_id)+to_utf8(self.password)).hexdigest()\n\n        q.newTextChild(None,\"digest\",digest)\n        self.send(iq)\n        self.set_response_handlers(iq,self.auth_finish,self.auth_error)\n        iq.free()", "response": "Do the second stage of legacy digest authentication."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef _digest_auth_in_stage2(self, username, _unused, stanza):\n        digest=stanza.xpath_eval(\"a:query/a:digest\",{\"a\":\"jabber:iq:auth\"})\n        if digest:\n            digest=digest[0].getContent()\n        if not digest:\n            self.__logger.debug(\"No digest found in digest auth request\")\n            iq=stanza.make_error_response(\"bad-request\")\n            self.send(iq)\n            return\n\n        password,pwformat=self.get_password(username)\n        if not password or pwformat!=\"plain\":\n            iq=stanza.make_error_response(\"bad-request\")\n            e=iq.get_error()\n            e.add_custom_condition('jabber:iq:auth:error',\"user-unauthorized\")\n            self.send(iq)\n            return\n\n        mydigest = hashlib.sha1(to_utf8(self.stream_id)+to_utf8(password)).hexdigest()\n\n        if mydigest==digest:\n            iq=stanza.make_result_response()\n            self.send(iq)\n            self.peer_authenticated=True\n            self.auth_method_used=\"digest\"\n            self.state_change(\"authorized\",self.peer)\n            self._post_auth()\n        else:\n            self.__logger.debug(\"Digest auth failed: %r != %r\" % (digest,mydigest))\n            iq=stanza.make_error_response(\"bad-request\")\n            e=iq.get_error()\n            e.add_custom_condition('jabber:iq:auth:error',\"user-unauthorized\")\n            self.send(iq)", "response": "Handle the second stage of legacy digest authentication."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef auth_finish(self, _unused):\n        self.lock.acquire()\n        try:\n            self.__logger.debug(\"Authenticated\")\n            self.authenticated=True\n            self.state_change(\"authorized\",self.my_jid)\n            self._post_auth()\n        finally:\n            self.lock.release()", "response": "Handle success of the legacy authentication."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef registration_error(self, stanza):\n        self.lock.acquire()\n        try:\n            err=stanza.get_error()\n            ae=err.xpath_eval(\"e:*\",{\"e\":\"jabber:iq:auth:error\"})\n            if ae:\n                ae=ae[0].name\n            else:\n                ae=err.get_condition().name\n            raise RegistrationError(\"Authentication error condition: %s\" % (ae,))\n        finally:\n            self.lock.release()", "response": "Handle in-band registration error."}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nhandle the registration form received.", "response": "def registration_form_received(self, stanza):\n        \"\"\"Handle registration form received.\n\n        [client only]\n\n        Call self.registration_callback with the stanza and the registration\n        form received as the arguments. The filled-in form is submitted\n        separately with `submit_registration_form`.\n\n        :Parameters:\n            - `stanza`: the stanza received.\n        :Types:\n            - `stanza`: `pyxmpp.iq.Iq`\"\"\"\n        self.lock.acquire()\n        try:\n            self.__register = Register(stanza.get_query())\n            self.registration_callback(stanza, self.__register.get_form())\n        finally:\n            self.lock.release()"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nsubmit a registration form.", "response": "def submit_registration_form(self, form):\n        \"\"\"Submit a registration form.\n\n        [client only]\n\n        :Parameters:\n            - `form`: the filled-in form. When form is `None` or its type is\n              \"cancel\" the registration is to be canceled.\n\n        :Types:\n            - `form`: `pyxmpp.jabber.dataforms.Form`\"\"\"\n        self.lock.acquire()\n        try:\n            if form and form.type!=\"cancel\":\n                self.registration_form = form\n                iq = Iq(stanza_type = \"set\")\n                iq.set_content(self.__register.submit_form(form))\n                self.set_response_handlers(iq, self.registration_success, self.registration_error)\n                self.send(iq)\n            else:\n                self.__register = None\n        finally:\n            self.lock.release()"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef registration_success(self, stanza):\n        _unused = stanza\n        self.lock.acquire()\n        try:\n            self.state_change(\"registered\", self.registration_form)\n            if ('FORM_TYPE' in self.registration_form\n                    and self.registration_form['FORM_TYPE'].value == 'jabber:iq:register'):\n                if 'username' in self.registration_form:\n                    self.my_jid = JID(self.registration_form['username'].value,\n                            self.my_jid.domain, self.my_jid.resource)\n                if 'password' in self.registration_form:\n                    self.password = self.registration_form['password'].value\n            self.registration_callback = None\n            self._post_connect()\n        finally:\n            self.lock.release()", "response": "Handle registration success.\n\n        [client only]\n\n        Clean up registration stuff, change state to \"registered\" and initialize\n        authentication.\n\n        :Parameters:\n            - `stanza`: the stanza received.\n        :Types:\n            - `stanza`: `pyxmpp.iq.Iq`"}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\nadds an I/O handler to the loop.", "response": "def _add_io_handler(self, handler):\n        \"\"\"Add an I/O handler to the loop.\"\"\"\n        logger.debug('adding io handler: %r', handler)\n        self._unprepared_handlers[handler] = None\n        self._configure_io_handler(handler)"}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nregister an io-handler at the polling object.", "response": "def _configure_io_handler(self, handler):\n        \"\"\"Register an io-handler at the polling object.\"\"\"\n        if self.check_events():\n            return\n        if handler in self._unprepared_handlers:\n            old_fileno = self._unprepared_handlers[handler]\n            prepared = self._prepare_io_handler(handler)\n        else:\n            old_fileno = None\n            prepared = True\n        fileno = handler.fileno()\n        if old_fileno is not None and fileno != old_fileno:\n            del self._handlers[old_fileno]\n            # remove_handler won't raise something like KeyError if the fd\n            # isn't registered; it will just print a debug log.\n            self.io_loop.remove_handler(old_fileno)\n        if not prepared:\n            self._unprepared_handlers[handler] = fileno\n        if not fileno:\n            return\n        update = fileno in self._handlers\n        events = ioloop.IOLoop.NONE\n        if handler.is_readable():\n            logger.debug(\" {0!r} readable\".format(handler))\n            events |= ioloop.IOLoop.READ\n        if handler.is_writable():\n            logger.debug(\" {0!r} writable\".format(handler))\n            events |= ioloop.IOLoop.WRITE\n\n        if self._handlers.get(fileno, None) == events:\n            return\n        self._handlers[fileno] = events\n        if events:\n            logger.debug(\" registering {0!r} handler fileno {1} for\"\n                            \" events {2}\".format(handler, fileno, events))\n            if update:\n                self.io_loop.update_handler(fileno, events)\n            else:\n                self.io_loop.add_handler(\n                    fileno, partial(self._handle_event, handler), events\n                )"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\ncall the `interfaces.IOHandler.prepare` method and return True if the handler is ready and False otherwise.", "response": "def _prepare_io_handler(self, handler):\n        \"\"\"Call the `interfaces.IOHandler.prepare` method and\n        remove the handler from unprepared handler list when done.\n        \"\"\"\n        logger.debug(\" preparing handler: {0!r}\".format(handler))\n        ret = handler.prepare()\n        logger.debug(\"   prepare result: {0!r}\".format(ret))\n        if isinstance(ret, HandlerReady):\n            del self._unprepared_handlers[handler]\n            prepared = True\n        elif isinstance(ret, PrepareAgain):\n            if ret.timeout is not None:\n                now = time.time()\n                self.io_loop.add_timeout(\n                    now + ret.timeout,\n                    partial(self._configure_io_handler, handler)\n                )\n            else:\n                self.io_loop.add_callback(\n                    partial(self._configure_io_handler, handler)\n                )\n            prepared = False\n        else:\n            raise TypeError(\"Unexpected result type from prepare()\")\n        return prepared"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef _remove_io_handler(self, handler):\n        if handler in self._unprepared_handlers:\n            old_fileno = self._unprepared_handlers[handler]\n            del self._unprepared_handlers[handler]\n        else:\n            old_fileno = handler.fileno()\n        if old_fileno is not None:\n            del self._handlers[old_fileno]\n            self.io_loop.remove_handler(handler.fileno())", "response": "Remove an I/O handler from the loop."}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\nhandles events from the IOLoop.", "response": "def _handle_event(self, handler, fd, event):\n        \"\"\"handle I/O events\"\"\"\n        # pylint: disable=C0103\n        logger.debug('_handle_event: %r, %r, %r', handler, fd, event)\n        if event & ioloop.IOLoop.ERROR:\n            handler.handle_hup()\n        if event & ioloop.IOLoop.READ:\n            handler.handle_read()\n        if event & ioloop.IOLoop.WRITE:\n            handler.handle_write()\n        self._configure_io_handler(handler)"}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nrequesting software version information from a remote entity.", "response": "def request_software_version(stanza_processor, target_jid, callback,\n                                                    error_callback = None):\n    \"\"\"Request software version information from a remote entity.\n\n    When a valid response is received the `callback` will be called\n    with a `VersionPayload` instance as its only argument. The object will\n    provide the requested information.\n\n    In case of error stanza received or invalid response the `error_callback`\n    (if provided) will be called with the offending stanza (which can\n    be ``<iq type='error'/>`` or ``<iq type='result'>``) as its argument.\n\n    The same function will be called on timeout, with the argument set to\n    `None`.\n\n    :Parameters:\n        - `stanza_processor`: an object used to send the query and handle\n          response. E.g. a `pyxmpp2.client.Client` instance\n        - `target_jid`: the JID of the entity to query\n        - `callback`: function to be called with a valid response\n        - `error_callback`: function to be called on error\n    :Types:\n        - `stanza_processor`: `StanzaProcessor`\n        - `target_jid`: `JID`\n    \"\"\"\n    stanza = Iq(to_jid = target_jid, stanza_type = \"get\")\n    payload = VersionPayload()\n    stanza.set_payload(payload)\n    def wrapper(stanza):\n        \"\"\"Wrapper for the user-provided `callback` that extracts the payload\n        from stanza received.\"\"\"\n        payload = stanza.get_payload(VersionPayload)\n        if payload is None:\n            if error_callback:\n                error_callback(stanza)\n            else:\n                logger.warning(\"Invalid version query response.\")\n        else:\n            callback(payload)\n    stanza_processor.set_response_handlers(stanza, wrapper, error_callback)\n    stanza_processor.send(stanza)"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef _os_name_factory(settings):\n    # pylint: disable-msg=W0613,W0142\n    return u\"{0} {1} {2}\".format(platform.system(), platform.release(),\n                                                        platform.machine())", "response": "Factory for the default software_os setting."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef initiate(self, transport, to = None):\n        with self.lock:\n            self.initiator = True\n            self.transport = transport\n            transport.set_target(self)\n            if to:\n                self.peer = JID(to)\n            else:\n                self.peer = None\n            if transport.is_connected():\n                self._initiate()", "response": "Initiate an XMPP connection over the transport."}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\nreceives an XMPP connection over the transport.", "response": "def receive(self, transport, myname):\n        \"\"\"Receive an XMPP connection over the `transport`.\n\n        :Parameters:\n            - `transport`: an XMPP transport instance\n            - `myname`: local stream endpoint name.\n        \"\"\"\n        with self.lock:\n            self.transport = transport\n            transport.set_target(self)\n            self.me = JID(myname)\n            self.initiator = False\n            self._setup_stream_element_handlers()"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef _setup_stream_element_handlers(self):\n        # pylint: disable-msg=W0212\n        if self.initiator:\n            mode = \"initiator\"\n        else:\n            mode = \"receiver\"\n        self._element_handlers = {}\n        for handler in self.handlers:\n            if not isinstance(handler, StreamFeatureHandler):\n                continue\n            for _unused, meth in inspect.getmembers(handler, callable):\n                if not hasattr(meth, \"_pyxmpp_stream_element_handled\"):\n                    continue\n                element_handled = meth._pyxmpp_stream_element_handled\n                if element_handled in self._element_handlers:\n                    # use only the first matching handler\n                    continue\n                if meth._pyxmpp_usage_restriction in (None, mode):\n                    self._element_handlers[element_handled] = meth", "response": "Sets up stream element handlers."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\nhandling a stream event.", "response": "def event(self, event): # pylint: disable-msg=R0201\n        \"\"\"Handle a stream event.\n\n        Called when connection state is changed.\n\n        Should not be called with self.lock acquired!\n        \"\"\"\n        event.stream = self\n        logger.debug(u\"Stream event: {0}\".format(event))\n        self.settings[\"event_queue\"].put(event)\n        return False"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\ncalling when the transport has been connected.", "response": "def transport_connected(self):\n        \"\"\"Called when transport has been connected.\n\n        Send the stream head if initiator.\n        \"\"\"\n        with self.lock:\n            if self.initiator:\n                if self._output_state is None:\n                    self._initiate()"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\nprocess the stream start tag received from peer.", "response": "def stream_start(self, element):\n        \"\"\"Process <stream:stream> (stream start) tag received from peer.\n\n        `lock` is acquired when this method is called.\n\n        :Parameters:\n            - `element`: root element (empty) created by the parser\"\"\"\n        with self.lock:\n            logger.debug(\"input document: \" + element_to_unicode(element))\n            if not element.tag.startswith(STREAM_QNP):\n                self._send_stream_error(\"invalid-namespace\")\n                raise FatalStreamError(\"Bad stream namespace\")\n            if element.tag != STREAM_ROOT_TAG:\n                self._send_stream_error(\"bad-format\")\n                raise FatalStreamError(\"Bad root element\")\n\n            if self._input_state == \"restart\":\n                event = StreamRestartedEvent(self.peer)\n            else:\n                event = StreamConnectedEvent(self.peer)\n            self._input_state = \"open\"\n            version = element.get(\"version\")\n            if version:\n                try:\n                    major, minor = version.split(\".\", 1)\n                    major, minor = int(major), int(minor)\n                except ValueError:\n                    self._send_stream_error(\"unsupported-version\")\n                    raise FatalStreamError(\"Unsupported protocol version.\")\n                self.version = (major, minor)\n            else:\n                self.version = (0, 9)\n\n            if self.version[0] != 1 and self.version != (0, 9):\n                self._send_stream_error(\"unsupported-version\")\n                raise FatalStreamError(\"Unsupported protocol version.\")\n\n            peer_lang = element.get(XML_LANG_QNAME)\n            self.peer_language = peer_lang\n            if not self.initiator:\n                lang = None\n                languages = 
self.settings[\"languages\"]\n                while peer_lang:\n                    if peer_lang in languages:\n                        lang = peer_lang\n                        break\n                    match = LANG_SPLIT_RE.match(peer_lang)\n                    if not match:\n                        break\n                    peer_lang = match.group(0)\n                if lang:\n                    self.language = lang\n\n            if self.initiator:\n                self.stream_id = element.get(\"id\")\n                peer = element.get(\"from\")\n                if peer:\n                    peer = JID(peer)\n                if self.peer:\n                    if peer and peer != self.peer:\n                        logger.debug(\"peer hostname mismatch: {0!r} != {1!r}\"\n                                                    .format(peer, self.peer))\n                self.peer = peer\n            else:\n                to = element.get(\"to\")\n                if to:\n                    to = self.check_to(to)\n                    if not to:\n                        self._send_stream_error(\"host-unknown\")\n                        raise FatalStreamError('Bad \"to\"')\n                    self.me = JID(to)\n                peer = element.get(\"from\")\n                if peer:\n                    peer = JID(peer)\n                self._send_stream_start(self.generate_id(), stream_to = peer)\n                self._send_stream_features()\n            self.event(event)"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef stream_end(self):\n        logger.debug(\"Stream ended\")\n        with self.lock:\n            self._input_state = \"closed\"\n            self.transport.disconnect()\n            self._output_state = \"closed\"", "response": "Process the end of the stream."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\nsending stream start tag.", "response": "def _send_stream_start(self, stream_id = None, stream_to = None):\n        \"\"\"Send stream start tag.\"\"\"\n        if self._output_state in (\"open\", \"closed\"):\n            raise StreamError(\"Stream start already sent\")\n        if not self.language:\n            self.language = self.settings[\"language\"]\n        if stream_to:\n            stream_to = unicode(stream_to)\n        elif self.peer and self.initiator:\n            stream_to = unicode(self.peer)\n        stream_from = None\n        if self.me and (self.tls_established or not self.initiator):\n            stream_from = unicode(self.me)\n        if stream_id:\n            self.stream_id = stream_id\n        else:\n            self.stream_id = None\n        self.transport.send_stream_head(self.stanza_namespace,\n                                        stream_from, stream_to,\n                                    self.stream_id, language = self.language)\n        self._output_state = \"open\""}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef _send_stream_error(self, condition):\n        if self._output_state == \"closed\":\n            return\n        if self._output_state in (None, \"restart\"):\n            self._send_stream_start()\n        element = StreamErrorElement(condition).as_xml()\n        self.transport.send_element(element)\n        self.transport.disconnect()\n        self._output_state = \"closed\"", "response": "Send a stream error element with the given condition to the peer, then disconnect the transport and mark the output as closed."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nrestarting the stream as needed after SASL and StartTLS negotiation.", "response": "def _restart_stream(self):\n        \"\"\"Restart the stream as needed after SASL and StartTLS negotiation.\"\"\"\n        self._input_state = \"restart\"\n        self._output_state = \"restart\"\n        self.features = None\n        self.transport.restart()\n        if self.initiator:\n            self._send_stream_start(self.stream_id)"}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\ncreate the <features/> element for the stream.", "response": "def _make_stream_features(self):\n        \"\"\"Create the <features/> element for the stream.\n\n        [receiving entity only]\n\n        :returns: new <features/> element\n        :returntype: :etree:`ElementTree.Element`\"\"\"\n        features = ElementTree.Element(FEATURES_TAG)\n        for handler in self._stream_feature_handlers:\n            handler.make_stream_features(self, features)\n        return features"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef _send_stream_features(self):\n        self.features = self._make_stream_features()\n        self._write_element(self.features)", "response": "Send stream <features/>.\n\n        [receiving entity only]"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef _send(self, stanza):\n        self.fix_out_stanza(stanza)\n        element = stanza.as_xml()\n        self._write_element(element)", "response": "Same as send but assumes lock is acquired."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef _process_element(self, element):\n        tag = element.tag\n        if tag in self._element_handlers:\n            handler = self._element_handlers[tag]\n            logger.debug(\"Passing element {0!r} to method {1!r}\"\n                                                .format(element, handler))\n            handled = handler(self, element)\n            if handled:\n                return\n        if tag.startswith(self._stanza_namespace_p):\n            stanza = stanza_factory(element, self, self.language)\n            self.uplink_receive(stanza)\n        elif tag == ERROR_TAG:\n            error = StreamErrorElement(element)\n            self.process_stream_error(error)\n        elif tag == FEATURES_TAG:\n            logger.debug(\"Got features element: {0}\".format(serialize(element)))\n            self._got_features(element)\n        else:\n            logger.debug(\"Unhandled element: {0}\".format(serialize(element)))\n            logger.debug(\" known handlers: {0!r}\".format(\n                                                    self._element_handlers))", "response": "Process the first level element of the stream."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef uplink_receive(self, stanza):\n        with self.lock:\n            if self.stanza_route:\n                self.stanza_route.uplink_receive(stanza)\n            else:\n                logger.debug(u\"Stanza dropped (no route): {0!r}\".format(stanza))", "response": "Handle a stanza received from the stream."}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nprocessing an error element received from the server.", "response": "def process_stream_error(self, error):\n        \"\"\"Process stream error element received.\n\n        :Parameters:\n            - `error`: error received\n\n        :Types:\n            - `error`: `StreamErrorElement`\n        \"\"\"\n        # pylint: disable-msg=R0201\n        logger.debug(\"Unhandled stream error: condition: {0} {1!r}\"\n                            .format(error.condition_name, error.serialize()))"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\nprocess incoming <stream : features/> element.", "response": "def _got_features(self, features):\n        \"\"\"Process incoming <stream:features/> element.\n\n        [initiating entity only]\n\n        The received features node is available in `features`.\"\"\"\n        self.features = features\n        logger.debug(\"got features, passing to event handlers...\")\n        handled = self.event(GotFeaturesEvent(self.features))\n        logger.debug(\"  handled: {0}\".format(handled))\n        if not handled:\n            mandatory_handled = []\n            mandatory_not_handled = []\n            logger.debug(\"  passing to stream features handlers: {0}\"\n                                    .format(self._stream_feature_handlers))\n            for handler in self._stream_feature_handlers:\n                ret = handler.handle_stream_features(self, self.features)\n                if ret is None:\n                    continue\n                elif isinstance(ret, StreamFeatureHandled):\n                    if ret.mandatory:\n                        mandatory_handled.append(unicode(ret))\n                        break\n                    break\n                elif isinstance(ret, StreamFeatureNotHandled):\n                    if ret.mandatory:\n                        mandatory_not_handled.append(unicode(ret))\n                        break\n                else:\n                    raise ValueError(\"Wrong value returned from a stream\"\n                            \" feature handler: {0!r}\".format(ret))\n            if mandatory_not_handled and not mandatory_handled:\n                self.send_stream_error(\"unsupported-feature\")\n                raise FatalStreamError(\n                        u\"Unsupported mandatory-to-implement features: \"\n                                        + u\" \".join(mandatory_not_handled))"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef set_peer_authenticated(self, peer, restart_stream = False):\n        with self.lock:\n            self.peer_authenticated = True\n            self.peer = peer\n            if restart_stream:\n                self._restart_stream()\n        self.event(AuthenticatedEvent(self.peer))", "response": "Mark the other side of the stream authenticated as peer"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef set_authenticated(self, me, restart_stream = False):\n        with self.lock:\n            self.authenticated = True\n            self.me = me\n            if restart_stream:\n                self._restart_stream()\n        self.event(AuthenticatedEvent(self.me))", "response": "Mark the local JID just authenticated as me."}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\ninitiate an XMPP connection over the transport.", "response": "def initiate(self, transport, to = None):\n        \"\"\"Initiate an XMPP connection over the `transport`.\n\n        :Parameters:\n            - `transport`: an XMPP transport instance\n            - `to`: peer name (defaults to own jid domain part)\n        \"\"\"\n        if to is None:\n            to = JID(self.me.domain)\n        return StreamBase.initiate(self, transport, to)"}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nreceives an XMPP connection over the transport.", "response": "def receive(self, transport, myname = None):\n        \"\"\"Receive an XMPP connection over the `transport`.\n\n        :Parameters:\n            - `transport`: an XMPP transport instance\n            - `myname`: local stream endpoint name (defaults to own jid domain\n              part).\n        \"\"\"\n        if myname is None:\n            myname = JID(self.me.domain)\n        return StreamBase.receive(self, transport, myname)"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\nfixing outgoing stanza.", "response": "def fix_out_stanza(self, stanza):\n        \"\"\"Fix outgoing stanza.\n\n        On a client clear the sender JID. On a server set the sender\n        address to the own JID if the address is not set yet.\"\"\"\n        StreamBase.fix_out_stanza(self, stanza)\n        if self.initiator:\n            if stanza.from_jid:\n                stanza.from_jid = None\n        else:\n            if not stanza.from_jid:\n                stanza.from_jid = self.me"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef fix_in_stanza(self, stanza):\n        StreamBase.fix_in_stanza(self, stanza)\n        if not self.initiator:\n            if stanza.from_jid != self.peer:\n                stanza.set_from(self.peer)", "response": "Fix an incoming stanza.\n\n        On a server, if the sender is not the authorized client JID, set the 'from' address to the authorized client JID."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef loop_iteration(self, timeout = 60):\n        if self.check_events():\n            return 0\n        next_timeout, sources_handled = self._call_timeout_handlers()\n        if self._quit:\n            return sources_handled\n        if next_timeout is not None:\n            timeout = min(next_timeout, timeout)\n        readable, writable, next_timeout = self._prepare_handlers()\n        if next_timeout is not None:\n            timeout = min(next_timeout, timeout)\n        if not readable and not writable:\n            readable, writable, _unused = [], [], None\n            time.sleep(timeout)\n        else:\n            logger.debug(\"select({0!r}, {1!r}, [], {2!r})\"\n                                    .format( readable, writable,timeout))\n            readable, writable, _unused = select.select(\n                                            readable, writable, [], timeout)\n        for handler in readable:\n            handler.handle_read()\n            sources_handled += 1\n        for handler in writable:\n            handler.handle_write()\n            sources_handled += 1\n        return sources_handled", "response": "Run a single loop iteration: check for scheduled events and available I/O, then run the matching handlers.\n\n        Returns the number of event sources handled."}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nprepare the I/O handlers.", "response": "def _prepare_handlers(self):\n        \"\"\"Prepare the I/O handlers.\n\n        :Return: (readable, writable, timeout) tuple. 'readable' is the list\n            of readable handlers, 'writable' - the list of writable handlers,\n            'timeout' the suggested maximum timeout for this loop iteration or\n            `None`\n        \"\"\"\n        timeout = None\n        readable = []\n        writable = []\n        for handler in self._handlers:\n            if handler not in self._prepared:\n                logger.debug(\" preparing handler: {0!r}\".format(handler))\n                ret = handler.prepare()\n                logger.debug(\"   prepare result: {0!r}\".format(ret))\n                if isinstance(ret, HandlerReady):\n                    self._prepared.add(handler)\n                elif isinstance(ret, PrepareAgain):\n                    if ret.timeout is not None:\n                        if timeout is None:\n                            timeout = ret.timeout\n                        else:\n                            timeout = min(timeout, ret.timeout)\n                else:\n                    raise TypeError(\"Unexpected result type from prepare()\")\n            if not handler.fileno():\n                logger.debug(\" {0!r}: no fileno\".format(handler))\n                continue\n            if handler.is_readable():\n                logger.debug(\" {0!r} readable\".format(handler))\n                readable.append(handler)\n            if handler.is_writable():\n                logger.debug(\" {0!r} writable\".format(handler))\n                writable.append(handler)\n        return readable, writable, timeout"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\ninitializing the object from an xml. dom Node object.", "response": "def __from_xml(self, xmlnode):\n        \"\"\"Initialize `Register` from an XML node.\n\n        :Parameters:\n            - `xmlnode`: the jabber:x:register XML element.\n        :Types:\n            - `xmlnode`: `libxml2.xmlNode`\"\"\"\n\n        self.__logger.debug(\"Converting jabber:iq:register element from XML\")\n        if xmlnode.type!=\"element\":\n            raise ValueError(\"XML node is not a jabber:iq:register element (not an element)\")\n        ns=get_node_ns_uri(xmlnode)\n        if ns and ns!=REGISTER_NS or xmlnode.name!=\"query\":\n            raise ValueError(\"XML node is not a jabber:iq:register element\")\n\n        for element in xml_element_iter(xmlnode.children):\n            ns = get_node_ns_uri(element)\n            if ns == DATAFORM_NS and element.name == \"x\" and not self.form:\n                self.form = Form(element)\n            elif ns != REGISTER_NS:\n                continue\n            name = element.name\n            if name == \"instructions\" and not self.instructions:\n                self.instructions = from_utf8(element.getContent())\n            elif name == \"registered\":\n                self.registered = True\n            elif name == \"remove\":\n                self.remove = True\n            elif name in legacy_fields and not getattr(self, name):\n                value = from_utf8(element.getContent())\n                if value is None:\n                    value = u\"\"\n                self.__logger.debug(u\"Setting legacy field %r to %r\" % (name, value))\n                setattr(self, name, value)"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef complete_xml_element(self, xmlnode, doc):\n        ns = xmlnode.ns()\n        if self.instructions is not None:\n            xmlnode.newTextChild(ns, \"instructions\", to_utf8(self.instructions))\n        if self.form:\n            self.form.as_xml(xmlnode, doc)\n        if self.remove:\n            xmlnode.newChild(ns, \"remove\", None)\n        else:\n            if self.registered:\n                xmlnode.newChild(ns, \"registered\", None)\n            for field in legacy_fields:\n                value = getattr(self, field)\n                if value is not None:\n                    xmlnode.newTextChild(ns, field, to_utf8(value))", "response": "Complete the XML node with the content of the current element."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nreturning a data form for the Register object.", "response": "def get_form(self, form_type = \"form\"):\n        \"\"\"Return Data Form for the `Register` object.\n\n        Convert legacy fields to a data form if `self.form` is `None`, return `self.form` otherwise.\n\n        :Parameters:\n            - `form_type`: If \"form\", then a form to fill-in should be\n              returned. If \"submit\", then a form with submitted data.\n        :Types:\n            - `form_type`: `unicode`\n\n        :return: `self.form` or a form created from the legacy fields\n        :returntype: `pyxmpp.jabber.dataforms.Form`\"\"\"\n\n        if self.form:\n            if self.form.type != form_type:\n                raise ValueError(\"Bad form type in the jabber:iq:register element\")\n            return self.form\n\n        form = Form(form_type, instructions = self.instructions)\n        form.add_field(\"FORM_TYPE\", [u\"jabber:iq:register\"], \"hidden\")\n        for field in legacy_fields:\n            field_type, field_label = legacy_fields[field]\n            value = getattr(self, field)\n            if value is None:\n                continue\n            if form_type == \"form\":\n                if not value:\n                    value = None\n                form.add_field(name = field, field_type = field_type, label = field_label,\n                        value = value, required = True)\n            else:\n                form.add_field(name = field, value = value)\n        return form"}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\nconverts form data to legacy fields and return a new Register object.", "response": "def submit_form(self, form):\n        \"\"\"Make `Register` object for submitting the registration form.\n\n        Convert form data to legacy fields if `self.form` is `None`.\n\n        :Parameters:\n            - `form`: The form to submit. Its type doesn't have to be \"submit\"\n              (a \"submit\" form will be created here), so it could be the form\n              obtained from `get_form` just with the data entered.\n\n        :return: new registration element\n        :returntype: `Register`\"\"\"\n\n        result = Register()\n        if self.form:\n            result.form = form.make_submit()\n            return result\n\n        if \"FORM_TYPE\" not in form or \"jabber:iq:register\" not in form[\"FORM_TYPE\"].values:\n            raise ValueError(\"FORM_TYPE is not jabber:iq:register\")\n\n        for field in legacy_fields:\n            self.__logger.debug(u\"submitted field %r\" % (field, ))\n            value = getattr(self, field)\n            try:\n                form_value = form[field].value\n            except KeyError:\n                if value:\n                    raise ValueError(\"Required field with no value!\")\n                continue\n            setattr(result, field, form_value)\n\n        return result"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef get_delays(stanza):\n    delays=[]\n    n=stanza.xmlnode.children\n    while n:\n        if n.type==\"element\" and get_node_ns_uri(n)==DELAY_NS and n.name==\"x\":\n            delays.append(Delay(n))\n        n=n.next\n    delays.sort()\n    return delays", "response": "Get the jabber:x:delay elements from the stanza as a sorted list of Delay objects."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef from_xml(self,xmlnode):\n        if xmlnode.type!=\"element\":\n            raise ValueError(\"XML node is not a jabber:x:delay element (not an element)\")\n        ns=get_node_ns_uri(xmlnode)\n        if ns and ns!=DELAY_NS or xmlnode.name!=\"x\":\n            raise ValueError(\"XML node is not a jabber:x:delay element\")\n        stamp=xmlnode.prop(\"stamp\")\n        if stamp.endswith(\"Z\"):\n            stamp=stamp[:-1]\n        if \"-\" in stamp:\n            stamp=stamp.split(\"-\",1)[0]\n        try:\n            tm = time.strptime(stamp, \"%Y%m%dT%H:%M:%S\")\n        except ValueError:\n            raise BadRequestProtocolError(\"Bad timestamp\")\n        tm=tm[0:8]+(0,)\n        self.timestamp=datetime.datetime.fromtimestamp(time.mktime(tm))\n        delay_from=from_utf8(xmlnode.prop(\"from\"))\n        if delay_from:\n            try:\n                self.delay_from = JID(delay_from)\n            except JIDError:\n                raise JIDMalformedProtocolError(\"Bad JID in the jabber:x:delay 'from' attribute\")\n        else:\n            self.delay_from = None\n        self.reason = from_utf8(xmlnode.getContent())", "response": "Initialize a Delay object from an xml. xmlNode."}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\ncompletes the XML node with self content.", "response": "def complete_xml_element(self, xmlnode, _unused):\n        \"\"\"Complete the XML node with `self` content.\n\n        Should be overriden in classes derived from `StanzaPayloadObject`.\n\n        :Parameters:\n            - `xmlnode`: XML node with the element being built. It has already\n              right name and namespace, but no attributes or content.\n            - `_unused`: document to which the element belongs.\n        :Types:\n            - `xmlnode`: `libxml2.xmlNode`\n            - `_unused`: `libxml2.xmlDoc`\"\"\"\n        tm=self.timestamp.strftime(\"%Y%m%dT%H:%M:%S\")\n        xmlnode.setProp(\"stamp\",tm)\n        if self.delay_from:\n            xmlnode.setProp(\"from\",self.delay_from.as_utf8())\n        if self.reason:\n            xmlnode.setContent(to_utf8(self.reason))"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef main():\n    parser = argparse.ArgumentParser(description = 'XMPP echo bot',\n                                    parents = [XMPPSettings.get_arg_parser()])\n    parser.add_argument('jid', metavar = 'JID', \n                                        help = 'The bot JID')\n    parser.add_argument('--debug',\n                        action = 'store_const', dest = 'log_level',\n                        const = logging.DEBUG, default = logging.INFO,\n                        help = 'Print debug messages')\n    parser.add_argument('--quiet', const = logging.ERROR,\n                        action = 'store_const', dest = 'log_level',\n                        help = 'Print only error messages')\n    parser.add_argument('--trace', action = 'store_true',\n                        help = 'Print XML data sent and received')\n\n    args = parser.parse_args()\n    settings = XMPPSettings({\n                            \"software_name\": \"Echo Bot\"\n                            })\n    settings.load_arguments(args)\n\n    if settings.get(\"password\") is None:\n        password = getpass(\"{0!r} password: \".format(args.jid))\n        if sys.version_info.major < 3:\n            password = password.decode(\"utf-8\")\n        settings[\"password\"] = password\n\n    if sys.version_info.major < 3:\n        args.jid = args.jid.decode(\"utf-8\")\n\n    logging.basicConfig(level = args.log_level)\n    if args.trace:\n        print \"enabling trace\"\n        handler = logging.StreamHandler()\n        handler.setLevel(logging.DEBUG)\n        for logger in (\"pyxmpp2.IN\", \"pyxmpp2.OUT\"):\n            logger = logging.getLogger(logger)\n            logger.setLevel(logging.DEBUG)\n            logger.addHandler(handler)\n            logger.propagate = False\n\n    bot = EchoBot(JID(args.jid), settings)\n    try:\n        bot.run()\n    except KeyboardInterrupt:\n        
bot.disconnect()", "response": "Parse the command-line arguments and run the bot."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef handle_message(self, stanza):\n        if stanza.subject:\n            subject = u\"Re: \" + stanza.subject\n        else:\n            subject = None\n        msg = Message(stanza_type = stanza.stanza_type,\n                        from_jid = stanza.to_jid, to_jid = stanza.from_jid,\n                        subject = subject, body = stanza.body,\n                        thread = stanza.thread)\n        return msg", "response": "Build a reply that echoes the received message back to its sender: swap the sender and recipient JIDs, prefix any subject with 'Re: ', and copy the stanza type, body and thread."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\naccept any incoming connections.", "response": "def handle_read(self):\n        \"\"\"\n        Accept any incoming connections.\n        \"\"\"\n        with self._lock:\n            logger.debug(\"handle_read()\")\n            if self._socket is None:\n                return\n            while True:\n                try:\n                    sock, address = self._socket.accept()\n                except socket.error, err:\n                    if err.args[0] in BLOCKING_ERRORS:\n                        break\n                    else:\n                        raise\n                logger.debug(\"Accepted connection from: {0!r}\".format(address))\n                self._target(sock, address)"}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\ndecodes the stanza subelements.", "response": "def _decode_subelements(self):\n        \"\"\"Decode the stanza subelements.\"\"\"\n        for child in self._element:\n            if child.tag == self._show_tag:\n                self._show = child.text\n            elif child.tag == self._status_tag:\n                self._status = child.text\n            elif child.tag == self._priority_tag:\n                try:\n                    self._priority = int(child.text.strip())\n                    if self._priority < -128 or self._priority > 127:\n                        raise ValueError\n                except ValueError:\n                    raise BadRequestProtocolError(\n                                            \"Presence priority not an integer\")"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef as_xml(self):\n        result = Stanza.as_xml(self)\n        if self._show:\n            child = ElementTree.SubElement(result, self._show_tag)\n            child.text = self._show\n        if self._status:\n            child = ElementTree.SubElement(result, self._status_tag)\n            child.text = self._status\n        if self._priority:\n            child = ElementTree.SubElement(result, self._priority_tag)\n            child.text = unicode(self._priority)\n        return result", "response": "Return the XML stanza representation."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef copy(self):\n        result = Presence(None, self.from_jid, self.to_jid,\n                        self.stanza_type, self.stanza_id, self.error,\n                        self._return_path(),\n                        self._show, self._status, self._priority)\n        if self._payload is None:\n            self.decode_payload()\n        for payload in self._payload:\n            result.add_payload(payload.copy())\n        return result", "response": "Create a deep copy of the stanza."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef make_accept_response(self):\n        if self.stanza_type not in (\"subscribe\", \"subscribed\",\n                                                \"unsubscribe\", \"unsubscribed\"):\n            raise ValueError(\"Results may only be generated for 'subscribe',\"\n                \"'subscribed','unsubscribe' or 'unsubscribed' presence\")\n        stanza = Presence(stanza_type = ACCEPT_RESPONSES[self.stanza_type],\n                            from_jid = self.to_jid, to_jid = self.from_jid,\n                                                    stanza_id = self.stanza_id)\n        return stanza", "response": "Create an accept response stanza."}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\ncreates a deny response for the current presence stanza.", "response": "def make_deny_response(self):\n        \"\"\"Create \"deny\" response for the \"subscribe\" / \"subscribed\" /\n        \"unsubscribe\" / \"unsubscribed\" presence stanza.\n\n        :return: new presence stanza.\n        :returntype: `Presence`\n        \"\"\"\n        if self.stanza_type not in (\"subscribe\", \"subscribed\",\n                                                \"unsubscribe\", \"unsubscribed\"):\n            raise ValueError(\"Results may only be generated for 'subscribe',\"\n                \"'subscribed','unsubscribe' or 'unsubscribed' presence\")\n        stanza = Presence(stanza_type = DENY_RESPONSES[self.stanza_type],\n                            from_jid = self.to_jid, to_jid = self.from_jid,\n                                                    stanza_id = self.stanza_id)\n        return stanza"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef make_error_response(self, cond):\n\n        if self.stanza_type == \"error\":\n            raise ValueError(\"Errors may not be generated in response\"\n                                                                \" to errors\")\n\n        stanza = Presence(stanza_type = \"error\", from_jid = self.from_jid,\n                            to_jid = self.to_jid, stanza_id = self.stanza_id,\n                            status = self._status, show = self._show,\n                            priority = self._priority, error_cond = cond)\n\n        if self._payload is None:\n            self.decode_payload()\n\n        for payload in self._payload:\n            stanza.add_payload(payload)\n\n        return stanza", "response": "Create an error response for any non-error presence stanza."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\nactivate an existing billing plan in a CREATED state.", "response": "def activate(self):\n\t\t\"\"\"\n\t\tActivate a plan in a CREATED state.\n\t\t\"\"\"\n\t\tobj = self.find_paypal_object()\n\t\tif obj.state == enums.BillingPlanState.CREATED:\n\t\t\tsuccess = obj.activate()\n\t\t\tif not success:\n\t\t\t\traise PaypalApiError(\"Failed to activate plan: %r\" % (obj.error))\n\t\t# Resync the updated data to the database\n\t\tself.get_or_update_from_api_data(obj, always_sync=True)\n\t\treturn obj"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef execute(self):\n\t\t# Save the execution time first.\n\t\t# If execute() fails, executed_at will be set, with no executed_agreement set.\n\t\tself.executed_at = now()\n\t\tself.save()\n\n\t\twith transaction.atomic():\n\t\t\tret = BillingAgreement.execute(self.id)\n\t\t\tret.user = self.user\n\t\t\tret.save()\n\t\t\tself.executed_agreement = ret\n\t\t\tself.save()\n\n\t\treturn ret", "response": "Execute the PreparedBillingAgreement by creating and executing a\n\t matching BillingAgreement."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef from_request(cls, request, webhook_id=PAYPAL_WEBHOOK_ID):\n\n\t\theaders = fix_django_headers(request.META)\n\t\tassert headers\n\t\ttry:\n\t\t\tbody = request.body.decode(request.encoding or \"utf-8\")\n\t\texcept Exception:\n\t\t\tbody = \"(error decoding body)\"\n\n\t\tip = request.META[\"REMOTE_ADDR\"]\n\t\tobj = cls.objects.create(headers=headers, body=body, remote_ip=ip)\n\n\t\ttry:\n\t\t\tobj.valid = obj.verify(PAYPAL_WEBHOOK_ID)\n\t\t\tif obj.valid:\n\t\t\t\t# Process the item (do not save it, it'll get saved below)\n\t\t\t\tobj.process(save=False)\n\t\texcept Exception as e:\n\t\t\tmax_length = WebhookEventTrigger._meta.get_field(\"exception\").max_length\n\t\t\tobj.exception = str(e)[:max_length]\n\t\t\tobj.traceback = format_exc()\n\t\tfinally:\n\t\t\tobj.save()\n\n\t\treturn obj", "response": "Create a WebhookEventTrigger object from a Django request object."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nrun the command line interface.", "response": "def run(*args):\n\t\"\"\"\n\tCheck and/or create Django migrations.\n\n\tIf --check is present in the arguments then migrations are checked only.\n\t\"\"\"\n\tif not settings.configured:\n\t\tsettings.configure(**DEFAULT_SETTINGS)\n\n\tdjango.setup()\n\n\tparent = os.path.dirname(os.path.abspath(__file__))\n\tsys.path.insert(0, parent)\n\n\tif \"--check\" in args:\n\t\tcheck_migrations()\n\telse:\n\t\tdjango.core.management.call_command(\"makemigrations\", APP_NAME, *args)"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef check_paypal_api_key(app_configs=None, **kwargs):\n\tmessages = []\n\n\tmode = getattr(djpaypal_settings, \"PAYPAL_MODE\", None)\n\tif mode not in VALID_MODES:\n\t\tmsg = \"Invalid PAYPAL_MODE specified: {}.\".format(repr(mode))\n\t\thint = \"PAYPAL_MODE must be one of {}\".format(\", \".join(repr(k) for k in VALID_MODES))\n\t\tmessages.append(checks.Critical(msg, hint=hint, id=\"djpaypal.C001\"))\n\n\tfor setting in \"PAYPAL_CLIENT_ID\", \"PAYPAL_CLIENT_SECRET\":\n\t\tif not getattr(djpaypal_settings, setting, None):\n\t\t\tmsg = \"Invalid value specified for {}\".format(setting)\n\t\t\thint = \"Add PAYPAL_CLIENT_ID and PAYPAL_CLIENT_SECRET to your settings.\"\n\t\t\tmessages.append(checks.Critical(msg, hint=hint, id=\"djpaypal.C002\"))\n\n\treturn messages", "response": "Check that the Paypal API keys are configured correctly."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\ncreating the upstream applications.", "response": "async def _create_upstream_applications(self):\n        \"\"\"\n        Create the upstream applications.\n        \"\"\"\n        loop = asyncio.get_event_loop()\n        for steam_name, ApplicationsCls in self.applications.items():\n            application = ApplicationsCls(self.scope)\n            upstream_queue = asyncio.Queue()\n            self.application_streams[steam_name] = upstream_queue\n            self.application_futures[steam_name] = loop.create_task(\n                application(\n                    upstream_queue.get,\n                    partial(self.dispatch_downstream, steam_name=steam_name)\n                )\n            )"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\nasync def send_upstream(self, message, stream_name=None):\n        if stream_name is None:\n            for steam_queue in self.application_streams.values():\n                await steam_queue.put(message)\n            return\n        steam_queue = self.application_streams.get(stream_name)\n        if steam_queue is None:\n            raise ValueError(\"Invalid multiplexed frame received (stream not mapped)\")\n        await steam_queue.put(message)", "response": "Send a message upstream to a de-multiplexed application.\n\n        If stream_name is None, the message is sent to all upstream applications. Raises ValueError if the named stream is not mapped."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\nasync def dispatch_downstream(self, message, steam_name):\n        handler = getattr(self, get_handler_name(message), None)\n        if handler:\n            await handler(message, stream_name=steam_name)\n        else:\n            # if there is not handler then just pass the message further downstream.\n            await self.base_send(message)", "response": "Dispatch a message to the downstream."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\nasync def receive_json(self, content, **kwargs):\n        # Check the frame looks good\n        if isinstance(content, dict) and \"stream\" in content and \"payload\" in content:\n            # Match it to a channel\n            steam_name = content[\"stream\"]\n            payload = content[\"payload\"]\n            # block upstream frames\n            if steam_name not in self.applications_accepting_frames:\n                raise ValueError(\"Invalid multiplexed frame received (stream not mapped)\")\n            # send it on to the application that handles this stream\n            await self.send_upstream(\n                message={\n                    \"type\": \"websocket.receive\",\n                    \"text\": await self.encode_json(payload)\n                },\n                stream_name=steam_name\n            )\n            return\n        else:\n            raise ValueError(\"Invalid multiplexed **frame received (no channel/payload key)\")", "response": "Route the message to the upstream application that handles its stream."}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nhandle the disconnect message and propagate it to all upstream applications.", "response": "async def websocket_disconnect(self, message):\n        \"\"\"\n        Handle the disconnect message.\n\n        This is propagated to all upstream applications.\n        \"\"\"\n        # set this flag so as to ensure we don't send a downstream `websocket.close` message due to all\n        # child applications closing.\n        self.closing = True\n        # inform all children\n        await self.send_upstream(message)\n        await super().websocket_disconnect(message)"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\nasync def disconnect(self, code):\n        try:\n            await asyncio.wait(\n                self.application_futures.values(),\n                return_when=asyncio.ALL_COMPLETED,\n                timeout=self.application_close_timeout\n            )\n        except asyncio.TimeoutError:\n            pass", "response": "Wait for the child applications to finish, up to the configured close timeout."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\nasync def websocket_send(self, message, stream_name):\n        text = message.get(\"text\")\n        # todo what to do on binary!\n        json = await self.decode_json(text)\n        data = {\n            \"stream\": stream_name,\n            \"payload\": json\n        }\n        await self.send_json(data)", "response": "Capture downstream websocket.send messages from the upstream applications."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\nasync def websocket_accept(self, message, stream_name):\n        is_first = not self.applications_accepting_frames\n        self.applications_accepting_frames.add(stream_name)\n        # accept the connection after the first upstream application accepts.\n        if is_first:\n            await self.accept()", "response": "Intercept downstream websocket.accept message."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\nasync def websocket_close(self, message, stream_name):\n        if stream_name in self.applications_accepting_frames:\n            # remove from the set of upstream streams that can receive new messages\n            self.applications_accepting_frames.remove(stream_name)\n\n        # we are already closing due to an upstream websocket.disconnect command\n\n        if self.closing:\n            return\n        # if none of the upstream applications are listening we need to close.\n        if not self.applications_accepting_frames:\n            await self.close(message.get(\"code\"))", "response": "Handle downstream websocket.close message."}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nupdate interpolation data :param list(float) x_list: x values :param list(float) y_list: y values", "response": "def update(self, x_list=list(), y_list=list()):\n        \"\"\"\n        update interpolation data\n        :param list(float) x_list: x values\n        :param list(float) y_list: y values\n        \"\"\"\n        if not y_list:\n            for x in x_list:\n                if x in self.x_list:\n                    i = self.x_list.index(float(x))\n                    self.x_list.pop(i)\n                    self.y_list.pop(i)\n        else:\n            # materialize as lists: in Python 3, map() returns a one-shot iterator\n            x_list = list(map(float, x_list))\n            y_list = list(map(float, y_list))\n            data = [(x, y) for x, y in zip(self.x_list, self.y_list) if x not in x_list]\n            data.extend(zip(x_list, y_list))\n            data = sorted(data)\n            self.x_list = [float(x) for (x, y) in data]\n            self.y_list = [float(y) for (x, y) in data]"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nreturn the interval of the interpolation in which x lies.", "response": "def get_interval(x, intervals):\n        \"\"\"\n        finds interval of the interpolation in which x lies.\n        :param x:\n        :param intervals: the interpolation intervals\n        :return:\n        \"\"\"\n        n = len(intervals)\n        if n < 2:\n            return intervals[0]\n        # integer division: a float cannot be used as an index in Python 3\n        n2 = n // 2\n        if x < intervals[n2][0]:\n            return spline.get_interval(x, intervals[:n2])\n        else:\n            return spline.get_interval(x, intervals[n2:])"}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\nsets the interpolation coefficients for the single polynomials of the spline.", "response": "def set_interpolation_coefficients(self):\n        \"\"\"\n        computes the coefficients for the single polynomials of the spline.\n        \"\"\"\n\n        left_boundary_slope = 0\n        right_boundary_slope = 0\n\n        if isinstance(self.boundary_condition, tuple):\n            left_boundary_slope = self.boundary_condition[0]\n            right_boundary_slope = self.boundary_condition[1]\n        elif self.boundary_condition is None:\n            pass\n        else:\n            msg = 'The given object {} of type {} is not a valid condition ' \\\n                  'for the border'.format(self.boundary_condition, type(self.boundary_condition))\n            raise ValueError(msg)\n\n        # getting the values such that we get a continuous second derivative\n        # by solving a system of linear equations\n\n        # setup the matrix\n        n = len(self.x_list)\n        mat = numpy.zeros((n, n))\n        b = numpy.zeros((n, 1))\n        x = self.x_list\n        y = self.y_list\n\n        if n > 2:\n            for i in range(1, n - 1):\n                mat[i, i - 1] = 1.0 / (x[i] - x[i - 1])\n                mat[i, i + 1] = 1.0 / (x[i + 1] - x[i])\n                mat[i, i] = 2 * (mat[i, i - 1] + mat[i, i + 1])\n\n                b[i, 0] = 3 * ((y[i] - y[i - 1]) / (x[i] - x[i - 1]) ** 2\n                               + (y[i + 1] - y[i]) / (x[i + 1] - x[i]) ** 2)\n        elif n < 2:\n            raise ValueError('too less points for interpolation')\n\n        if self.boundary_condition is None:  # not a knot\n            mat[0, 0] = 1.0 / (x[1] - x[0]) ** 2\n            mat[0, 2] = -1.0 / (x[2] - x[1]) ** 2\n            mat[0, 1] = mat[0, 0] + mat[0, 2]\n\n            b[0, 0] = 2.0 * ((y[1] - y[0]) / (x[1] - x[0]) ** 3\n                             - (y[2] - 
y[1]) / (x[2] - x[1]) ** 3)\n\n            mat[n - 1, n - 3] = 1.0 / (x[n - 2] - x[n - 3]) ** 2\n            mat[n - 1, n - 1] = -1.0 / (x[n - 1] - x[n - 2]) ** 2\n            mat[n - 1, n - 2] = mat[n - 1, n - 3] + mat[n - 1, n - 1]\n\n            b[n - 1, 0] = 2.0 * ((y[n - 2] - y[n - 3]) / (x[n - 2] - x[n - 3]) ** 3\n                                 - (y[n - 1] - y[n - 2]) / (x[n - 1] - x[n - 2]) ** 3)\n        else:\n            mat[0, 0] = 2.0 / (x[1] - x[0])\n            mat[0, 1] = 1.0 / (x[1] - x[0])\n\n            b[0, 0] = 3 * (y[1] - y[0]) / (x[1] - x[0]) ** 2 - 0.5 * left_boundary_slope\n\n            mat[n - 1, n - 2] = 1.0 / (x[n - 1] - x[n - 2])\n            mat[n - 1, n - 1] = 2.0 / (x[n - 1] - x[n - 2])\n\n            b[n - 1, 0] = 3 * (y[n - 1] - y[n - 2]) / (x[n - 1] - x[n - 2]) ** 2 + 0.5 * right_boundary_slope\n\n        k = numpy.linalg.solve(mat, b)\n\n        for i in range(1, n):\n            c1 = k[i - 1, 0] * (x[i] - x[i - 1]) - (y[i] - y[i - 1])\n            c2 = -k[i, 0] * (x[i] - x[i - 1]) + (y[i] - y[i - 1])\n            self.interpolation_coefficients.append([c1, c2])"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef cast(cls, fx_spot, domestic_curve=None, foreign_curve=None):\n        assert domestic_curve.origin == foreign_curve.origin\n        return cls(fx_spot, domestic_curve=domestic_curve, foreign_curve=foreign_curve)", "response": "Create a new FxCurve object from fx spot rate and domestic discount curve and foreign discount curve."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\nadd contents to self.", "response": "def add(self, foreign_currency, foreign_curve=None, fx_spot=1.0):\n        \"\"\"\n        adds contents to FxShelf.\n        If curve is FxCurve or FxDict, spot should turn curve.currency into self.currency,\n        else spot should turn currency into self.currency by\n        N in EUR * spot = N in USD for currency = EUR and self.currency = USD\n        \"\"\"\n        assert isinstance(foreign_currency, type(self.currency))\n        assert isinstance(foreign_curve, curve.RateCurve)\n        assert isinstance(fx_spot, float)\n\n        # create missing FxCurves\n        self[self.currency, foreign_currency] = FxCurve.cast(fx_spot, self.domestic_curve, foreign_curve)\n        self[foreign_currency, self.currency] = FxCurve.cast(1 / fx_spot, foreign_curve, self.domestic_curve)\n        # update relevant FxCurves\n        f = foreign_currency\n        new = dict()\n        for d, s in self:\n            if s is self.currency and d is not foreign_currency:\n                triangulated = self[d, s](self.domestic_curve.origin) * fx_spot\n                if (d, f) in self:\n                    self[d, f].foreign_curve = foreign_curve\n                    self[d, f].fx_spot = triangulated\n                    self[f, d].domestic_curve = foreign_curve\n                    self[f, d].fx_spot = 1 / triangulated\n                else:\n                    new[d, f] = FxCurve.cast(triangulated, self[d, s].domestic_curve, foreign_curve)\n                    new[f, d] = FxCurve.cast(1 / triangulated, foreign_curve, self[d, s].domestic_curve)\n        self.update(new)"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef _frange(start, stop=None, step=None):\n    if stop is None:\n        stop = start\n        start = 0.0\n    if step is None:\n        step = 1.0\n    r = start\n    while r < stop:\n        yield r\n        r += step", "response": "A range-like generator for float inputs."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef interest_accrued(self, valuation_date):\n        return sum([l.interest_accrued(valuation_date) for l in self.legs if hasattr(l, 'interest_accrued')])", "response": "Return the interest accrued as of the valuation date, summed over all legs that support it."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef get_secrets(prefixes, relative_paths):\n    try:\n        return os.path.join(sys._MEIPASS, relative_paths[-1])\n    except Exception:\n        for prefix in prefixes:\n            for relative_path in relative_paths:\n                path = os.path.join(prefix, relative_path)\n                if os.path.exists(path):\n                    return path\n        else:\n            return None", "response": "Return the first existing path built from the prefixes and relative_paths, or None if none exists. If sys._MEIPASS is set, return the last relative path joined to it instead."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef multiglob_compile(globs, prefix=False):\n    if not globs:\n        # An empty globs list should only match empty strings\n        return re.compile('^$')\n    elif prefix:\n        globs = [x + '*' for x in globs]\n    return re.compile('|'.join(fnmatch.translate(x) for x in globs))", "response": "Generates a single A or B or C regex from a list of shell globs."}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\ngenerates a SHA1 hash from a potentially long file.", "response": "def hashFile(handle, want_hex=False, limit=None, chunk_size=CHUNK_SIZE):\n    \"\"\"Generate a hash from a potentially long file.\n    Digesting will obey :const:`CHUNK_SIZE` to conserve memory.\n\n    :param handle: A file-like object or path to hash from.\n    :param want_hex: If ``True``, returned hash will be hex-encoded.\n    :type want_hex: :class:`~__builtins__.bool`\n\n    :param limit: Maximum number of bytes to read (rounded up to a multiple of\n        ``CHUNK_SIZE``)\n    :type limit: :class:`~__builtins__.int`\n\n    :param chunk_size: Size of ``read`` operations in bytes.\n    :type chunk_size: :class:`~__builtins__.int`\n\n\n    :rtype: :class:`~__builtins__.bytes` or :class:`~__builtins__.str`\n    :returns: A binary or hex-encoded SHA1 hash.\n\n    .. note:: It is your responsibility to close any file-like objects you pass\n        in\n    \"\"\"\n    fhash, read = hashlib.sha1(), 0\n    if isinstance(handle, str):\n        handle = open(handle, 'rb')\n\n    if limit:\n        chunk_size = min(chunk_size, limit)\n\n    # Chunked digest generation (conserve memory)\n    # The sentinel is b'' because the file is read in binary mode.\n    for block in iter(lambda: handle.read(chunk_size), b''):\n        fhash.update(block)\n        read += chunk_size\n        if limit is not None and 0 < limit <= read:\n            break\n\n    return want_hex and fhash.hexdigest() or fhash.digest()"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef getPaths(roots, ignores=None):\n    paths, count, ignores = [], 0, ignores or []\n\n    # Prepare the ignores list for most efficient use\n    ignore_re = multiglob_compile(ignores, prefix=False)\n\n    for root in roots:\n        # For safety, only use absolute, real paths.\n        root = os.path.realpath(root)\n\n        # Handle directly-referenced filenames properly\n        # (And override ignores to \"do as I mean, not as I say\")\n        if os.path.isfile(root):\n            paths.append(root)\n            continue\n\n        for fldr in os.walk(root):\n            out.write(\"Gathering file paths to compare... (%d files examined)\"\n                      % count)\n\n            # Don't even descend into IGNOREd directories.\n            for subdir in fldr[1]:\n                dirpath = os.path.join(fldr[0], subdir)\n                if ignore_re.match(dirpath):\n                    fldr[1].remove(subdir)\n\n            for filename in fldr[2]:\n                filepath = os.path.join(fldr[0], filename)\n                if ignore_re.match(filepath):\n                    continue  # Skip IGNOREd files.\n\n                paths.append(filepath)\n                count += 1\n\n    out.write(\"Found %s files to be compared for duplication.\" % (len(paths)),\n              newline=True)\n    return paths", "response": "Recursively walk a set of paths and return a listing of contained files."}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\ngroup a set of paths according to a function.", "response": "def groupBy(groups_in, classifier, fun_desc='?', keep_uniques=False,\n            *args, **kwargs):\n    \"\"\"Subdivide groups of paths according to a function.\n\n    :param groups_in: Grouped sets of paths.\n    :type groups_in: :class:`~__builtins__.dict` of iterables\n\n    :param classifier: Function to group a list of paths by some attribute.\n    :type classifier: ``function(list, *args, **kwargs) -> str``\n\n    :param fun_desc: Human-readable term for what the classifier operates on.\n        (Used in log messages)\n    :type fun_desc: :class:`~__builtins__.str`\n\n    :param keep_uniques: If ``False``, discard groups with only one member.\n    :type keep_uniques: :class:`~__builtins__.bool`\n\n\n    :returns: A dict mapping classifier keys to groups of matches.\n    :rtype: :class:`~__builtins__.dict`\n\n\n    :attention: Grouping functions generally use a :class:`~__builtins__.set`\n        ``groups`` as extra protection against accidentally counting a given\n        file twice. (Complimentary to use of :func:`os.path.realpath` in\n        :func:`~fastdupes.getPaths`)\n\n    .. todo:: Find some way to bring back the file-by-file status text\n    \"\"\"\n    groups, count, group_count = {}, 0, len(groups_in)\n    for pos, paths in enumerate(groups_in.values()):\n        out.write(\"Subdividing group %d of %d by %s... 
(%d files examined, %d \"\n                  \"in current group)\" % (\n                      pos + 1, group_count, fun_desc, count, len(paths)\n                  ))\n\n        for key, group in classifier(paths, *args, **kwargs).items():\n            groups.setdefault(key, set()).update(group)\n            count += len(group)\n\n    if not keep_uniques:\n        # Return only the groups with more than one file.\n        groups = dict([(x, groups[x]) for x in groups if len(groups[x]) > 1])\n\n    out.write(\"Found %s sets of files with identical %s. (%d files examined)\"\n              % (len(groups), fun_desc, count), newline=True)\n    return groups"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef groupify(function):\n\n    @wraps(function)\n    def wrapper(paths, *args, **kwargs):  # pylint: disable=missing-docstring\n        groups = {}\n\n        for path in paths:\n            key = function(path, *args, **kwargs)\n            if key is not None:\n                groups.setdefault(key, set()).add(path)\n\n        return groups\n    return wrapper", "response": "Decorator that turns a function mapping one path to a key into one that maps a list of paths to a dict of key-to-group sets."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef sizeClassifier(path, min_size=DEFAULTS['min_size']):\n    filestat = _stat(path)\n    if stat.S_ISLNK(filestat.st_mode):\n        return  # Skip symlinks.\n\n    if filestat.st_size < min_size:\n        return  # Skip files below the size limit\n\n    return filestat.st_size", "response": "Sort a file into a group based on on-disk size."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef groupByContent(paths):\n    handles, results = [], []\n\n    # Silently ignore files we don't have permission to read.\n    hList = []\n    for path in paths:\n        try:\n            hList.append((path, open(path, 'rb'), ''))\n        except IOError:\n            pass  # TODO: Verbose-mode output here.\n    handles.append(hList)\n\n    while handles:\n        # Process more blocks.\n        more, done = compareChunks(handles.pop(0))\n\n        # Add the results to the top-level lists.\n        handles.extend(more)\n        results.extend(done)\n\n    # Keep the same API as the others.\n    return dict((x[0], x) for x in results)", "response": "This function takes a list of paths and returns a dict mapping one path to a list of all paths with the same contents."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef compareChunks(handles, chunk_size=CHUNK_SIZE):\n    chunks = [(path, fh, fh.read(chunk_size)) for path, fh, _ in handles]\n    more, done = [], []\n\n    # While there are combinations not yet tried...\n    while chunks:\n        # Compare the first chunk to all successive chunks\n        matches, non_matches = [chunks[0]], []\n        for chunk in chunks[1:]:\n            if matches[0][2] == chunk[2]:\n                matches.append(chunk)\n            else:\n                non_matches.append(chunk)\n        # Check for EOF or obviously unique files\n        if len(matches) == 1 or matches[0][2] == \"\":\n            for x in matches:\n                x[1].close()\n            done.append([x[0] for x in matches])\n        else:\n            more.append(matches)\n        chunks = non_matches\n\n    return more, done", "response": "This function takes a list of file handles and returns a list of lists of lists that are fed back into this function individually."}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\ndisplays a list of files and prompt for ones to be kept.", "response": "def pruneUI(dupeList, mainPos=1, mainLen=1):\n    \"\"\"Display a list of files and prompt for ones to be kept.\n\n    The user may enter ``all`` or one or more numbers separated by spaces\n    and/or commas.\n\n    .. note:: It is impossible to accidentally choose to keep none of the\n        displayed files.\n\n    :param dupeList: A list duplicate file paths\n    :param mainPos: Used to display \"set X of Y\"\n    :param mainLen: Used to display \"set X of Y\"\n    :type dupeList: :class:`~__builtins__.list`\n    :type mainPos: :class:`~__builtins__.int`\n    :type mainLen: :class:`~__builtins__.int`\n\n    :returns: A list of files to be deleted.\n    :rtype: :class:`~__builtins__.list`\n    \"\"\"\n    dupeList = sorted(dupeList)\n    print()\n    for pos, val in enumerate(dupeList):\n        print(\"%d) %s\" % (pos + 1, val))\n    while True:\n        choice = input(\"[%s/%s] Keepers: \" % (mainPos, mainLen)).strip()\n        if not choice:\n            print(\"Please enter a space/comma-separated list of numbers or \"\n                  \"'all'.\")\n            continue\n        elif choice.lower() == 'all':\n            return []\n        try:\n            out = [int(x) - 1 for x in choice.replace(',', ' ').split()]\n            return [val for pos, val in enumerate(dupeList) if pos not in out]\n        except ValueError:\n            print(\"Invalid choice. Please enter a space/comma-separated list \"\n                  \"of numbers or 'all'.\")"}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\ndeletes duplicate files from groups of paths, keeping the preferred copies.", "response": "def delete_dupes(groups, prefer_list=None, interactive=True, dry_run=False):\n    \"\"\"Code to handle the :option:`--delete` command-line option.\n\n    :param groups: A list of groups of paths.\n    :type groups: iterable\n\n    :param prefer_list: A whitelist to be compiled by\n        :func:`~fastdupes.multiglob_compile` and used to skip some prompts.\n\n    :param interactive: If ``False``, assume the user wants to keep all copies\n        when a prompt would otherwise be displayed.\n    :type interactive: :class:`~__builtins__.bool`\n\n    :param dry_run: If ``True``, only pretend to delete files.\n    :type dry_run: :class:`~__builtins__.bool`\n\n    .. todo:: Add a secondary check for symlinks for safety.\n    \"\"\"\n    prefer_list = prefer_list or []\n    prefer_re = multiglob_compile(prefer_list, prefix=True)\n\n    for pos, group in enumerate(groups.values()):\n        preferred = [x for x in group if prefer_re.match(x)]\n        pruneList = [x for x in group if x not in preferred]\n        if not preferred:\n            if interactive:\n                pruneList = pruneUI(group, pos + 1, len(groups))\n                preferred = [x for x in group if x not in pruneList]\n            else:\n                preferred, pruneList = pruneList, []\n\n        assert preferred  # Safety check\n        for path in pruneList:\n            print(\"Removing %s\" % path)\n            if not dry_run:\n                os.remove(path)"}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nwrites the given text to the current line.", "response": "def write(self, text, newline=False):\n        \"\"\"Use ``\\\\r`` to overdraw the current line with the given text.\n\n        This function transparently handles tracking how much overdrawing is\n        necessary to erase the previous line when used consistently.\n\n        :param text: The text to be outputted\n        :param newline: Whether to start a new line and reset the length count.\n        :type text: :class:`~__builtins__.str`\n        :type newline: :class:`~__builtins__.bool`\n        \"\"\"\n        if not self.isatty:\n            self.fobj.write('%s\\n' % text)\n            return\n\n        msg_len = len(text)\n        self.max_len = max(self.max_len, msg_len)\n\n        self.fobj.write(\"\\r%-*s\" % (self.max_len, text))\n        if newline or not self.isatty:\n            self.fobj.write('\\n')\n            self.max_len = 0"}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\nsummarizes the text in terms of maximum coverage problem.", "response": "def summarize(text, char_limit, sentence_filter=None, debug=False):\n    '''\n    select sentences in terms of maximum coverage problem\n\n    Args:\n      text: text to be summarized (unicode string)\n      char_limit: summary length (the number of characters)\n\n    Returns:\n      list of extracted sentences\n\n    Reference:\n      Hiroya Takamura, Manabu Okumura.\n      Text summarization model based on maximum coverage problem and its\n      variant. (section 3)\n      http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.222.6945\n    '''\n    debug_info = {}\n\n    sents = list(tools.sent_splitter_ja(text))\n    words_list = [\n        # pulp variables should be utf-8 encoded\n        w.encode('utf-8') for s in sents for w in tools.word_segmenter_ja(s)\n    ]\n\n    tf = collections.Counter()\n    for words in words_list:\n        for w in words:\n            tf[w] += 1.0\n\n    if sentence_filter is not None:\n        valid_indices = [i for i, s in enumerate(sents) if sentence_filter(s)]\n        sents = [sents[i] for i in valid_indices]\n        words_list = [words_list[i] for i in valid_indices]\n\n    sent_ids = [str(i) for i in range(len(sents))]  # sentence id\n    sent_id2len = dict((id_, len(s)) for id_, s in zip(sent_ids, sents))  # c\n\n    word_contain = dict()  # a\n    for id_, words in zip(sent_ids, words_list):\n        word_contain[id_] = collections.defaultdict(lambda: 0)\n        for w in words:\n            word_contain[id_][w] = 1\n\n    prob = pulp.LpProblem('summarize', pulp.LpMaximize)\n\n    # x\n    sent_vars = pulp.LpVariable.dicts('sents', sent_ids, 0, 1, pulp.LpBinary)\n    # z\n    word_vars = pulp.LpVariable.dicts('words', tf.keys(), 0, 1, pulp.LpBinary)\n\n    # first, set objective function: sum(w*z)\n    prob += pulp.lpSum([tf[w] * word_vars[w] for w in tf])\n\n 
   # next, add constraints\n    # limit summary length: sum(c*x) <= K\n    prob += pulp.lpSum(\n        [sent_id2len[id_] * sent_vars[id_] for id_ in sent_ids]\n    ) <= char_limit, 'lengthRequirement'\n    # for each term, sum(a*x) <= z\n    for w in tf:\n        prob += pulp.lpSum(\n            [word_contain[id_][w] * sent_vars[id_] for id_ in sent_ids]\n        ) >= word_vars[w], 'z:{}'.format(w)\n\n    prob.solve()\n    # print(\"Status:\", pulp.LpStatus[prob.status])\n\n    sent_indices = []\n    for v in prob.variables():\n        # print v.name, \"=\", v.varValue\n        if v.name.startswith('sents') and v.varValue == 1:\n            sent_indices.append(int(v.name.split('_')[-1]))\n\n    return [sents[i] for i in sent_indices], debug_info"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\ncompute the lexical centrality score of a list of sentences.", "response": "def lexrank(sentences, continuous=False, sim_threshold=0.1, alpha=0.9,\n            use_divrank=False, divrank_alpha=0.25):\n    '''\n    compute centrality score of sentences.\n\n    Args:\n      sentences: [u'\u3053\u3093\u306b\u3061\u306f\uff0e', u'\u79c1\u306e\u540d\u524d\u306f\u98ef\u6cbc\u3067\u3059\uff0e', ... ]\n      continuous: if True, apply continuous LexRank. (see reference)\n      sim_threshold: if continuous is False and smilarity is greater or\n        equal to sim_threshold, link the sentences.\n      alpha: the damping factor of PageRank and DivRank\n      divrank: if True, apply DivRank instead of PageRank\n      divrank_alpha: strength of self-link [0.0-1.0]\n        (it's not the damping factor, see divrank.py)\n\n    Returns: tuple\n      (\n        {\n          # sentence index -> score\n          0: 0.003,\n          1: 0.002,\n          ...\n        },\n        similarity_matrix\n      )\n    \n    Reference:\n      G\u00fcnes Erkan and Dragomir R. Radev.\n      LexRank: graph-based lexical centrality as salience in text\n      summarization. 
(section 3)\n      http://www.cs.cmu.edu/afs/cs/project/jair/pub/volume22/erkan04a-html/erkan04a.html\n    '''\n    # configure ranker\n    ranker_params = {'max_iter': 1000}\n    if use_divrank:\n        ranker = divrank_scipy\n        ranker_params['alpha'] = divrank_alpha\n        ranker_params['d'] = alpha\n    else:\n        ranker = networkx.pagerank_scipy\n        ranker_params['alpha'] = alpha\n\n    graph = networkx.DiGraph()\n\n    # sentence -> tf\n    sent_tf_list = []\n    for sent in sentences:\n        words = tools.word_segmenter_ja(sent)\n        tf = collections.Counter(words)\n        sent_tf_list.append(tf)\n\n    sent_vectorizer = DictVectorizer(sparse=True)\n    sent_vecs = sent_vectorizer.fit_transform(sent_tf_list)\n\n    # compute similarities between senteces\n    sim_mat = 1 - pairwise_distances(sent_vecs, sent_vecs, metric='cosine')\n\n    if continuous:\n        linked_rows, linked_cols = numpy.where(sim_mat > 0)\n    else:\n        linked_rows, linked_cols = numpy.where(sim_mat >= sim_threshold)\n\n    # create similarity graph\n    graph.add_nodes_from(range(sent_vecs.shape[0]))\n    for i, j in zip(linked_rows, linked_cols):\n        if i == j:\n            continue\n        weight = sim_mat[i,j] if continuous else 1.0\n        graph.add_edge(i, j, {'weight': weight})\n\n    scores = ranker(graph, **ranker_params)\n    return scores, sim_mat"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef summarize(text, sent_limit=None, char_limit=None, imp_require=None,\n              debug=False, **lexrank_params):\n    '''\n    Args:\n      text: text to be summarized (unicode string)\n      sent_limit: summary length (the number of sentences)\n      char_limit: summary length (the number of characters)\n      imp_require: cumulative LexRank score [0.0-1.0]\n\n    Returns:\n      list of extracted sentences\n    '''\n    debug_info = {}\n    sentences = list(tools.sent_splitter_ja(text))\n    scores, sim_mat = lexrank(sentences, **lexrank_params)\n    sum_scores = sum(scores.values())\n    acc_scores = 0.0\n    indexes = set()\n    num_sent, num_char = 0, 0\n    for i in sorted(scores, key=lambda i: scores[i], reverse=True):\n        num_sent += 1\n        num_char += len(sentences[i])\n        if sent_limit is not None and num_sent > sent_limit:\n            break\n        if char_limit is not None and num_char > char_limit:\n            break\n        if imp_require is not None and acc_scores / sum_scores >= imp_require:\n            break\n        indexes.add(i)\n        acc_scores += scores[i]\n\n    if len(indexes) > 0:\n        summary_sents = [sentences[i] for i in sorted(indexes)]\n    else:\n        summary_sents = sentences\n\n    if debug:\n        debug_info.update({\n            'sentences': sentences, 'scores': scores\n        })\n\n    return summary_sents, debug_info", "response": "Summarize the text with LexRank and return the extracted sentences together with debug information."}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\nreturns the summarizer for the given name.", "response": "def get_summarizer(self, name):\n        '''\n        import summarizers on-demand\n        '''\n        if name in self.summarizers:\n            pass\n        elif name == 'lexrank':\n            from . import lexrank\n            self.summarizers[name] = lexrank.summarize\n        elif name == 'mcp':\n            from . import mcp_summ\n            self.summarizers[name] = mcp_summ.summarize\n\n        return self.summarizers[name]"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef summarize(self, text=None, algo=u'lexrank', **summarizer_params):\n        '''\n        Args:\n          text: text to be summarized\n          algo: summarizaion algorithm\n              - 'lexrank' (default) graph-based\n              - 'clexrank' Continuous LexRank\n              - 'divrank' DivRank (Diverse Rank)\n              - 'mcp' select sentences in terms of maximum coverage problem\n\n          summarizer_params examples:\n            char_limit: summary length (the number of characters)\n            sent_limit: (not supported with mcp)\n              summary length (the number of sentences)\n            imp_require: (lexrank only)\n              cumulative LexRank score [0.0-1.0]\n        '''\n        try:  # TODO: generate more useful error message\n            # fix parameter type\n            for param, value in summarizer_params.items():\n                if value == '':\n                    del summarizer_params[param]\n                    continue\n                elif re.match(r'^\\d*.\\d+$', value):\n                    value = float(value)\n                elif re.match(r'^\\d+$', value):\n                    value = int(value)\n                elif value == 'true':\n                    value = True\n                elif value == 'false':\n                    value = False\n                summarizer_params[param] = value\n\n            if algo in ('lexrank', 'clexrank', 'divrank'):\n                summarizer = self.get_summarizer('lexrank')\n                if algo == 'clexrank':\n                    summarizer_params['continuous'] = True\n                if algo == 'divrank':\n                    summarizer_params['use_divrank'] = True\n            elif algo == 'mcp':\n                summarizer = self.get_summarizer('mcp')\n\n            summary, debug_info = summarizer(text, **summarizer_params)\n\n        except Exception, e:\n            return 
json.dumps({'error': str(e)}, ensure_ascii=False, indent=2)\n        else:\n            res = json.dumps(\n                tools.tree_encode({\n                    'summary': summary, 'debug_info': debug_info\n                }),\n                ensure_ascii=False, indent=2\n            )\n            return res", "response": "Summarize the text with the selected algorithm (lexrank, clexrank, divrank or mcp) and return the summary and debug info as a JSON string; on failure, return a JSON string containing the error message."}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\ntakes a unicode string that contains multiple Japanese sentences and returns a generator that yields the sentences, splitting at the given delimiter characters unless they occur inside unclosed parentheses.", "response": "def sent_splitter_ja(text, delimiters=set(u'\u3002\uff0e\uff1f\uff01\\n\\r'),\n                     parenthesis=u'\uff08\uff09\u300c\u300d\u300e\u300f\u201c\u201d'):\n    '''\n    Args:\n      text: unicode string that contains multiple Japanese sentences.\n      delimiters: set() of sentence delimiter characters.\n      parenthesis: to be checked its correspondence.\n    Returns:\n      generator that yields sentences.\n    '''\n    paren_chars = set(parenthesis)\n    close2open = dict(zip(parenthesis[1::2], parenthesis[0::2]))\n    pstack = []\n    buff = []\n\n    for i, c in enumerate(text):\n        c_next = text[i+1] if i+1 < len(text) else None\n        # check correspondence of parenthesis\n        if c in paren_chars:\n            if c in close2open:  # close\n                if len(pstack) > 0 and pstack[-1] == close2open[c]:\n                    pstack.pop()\n            else:  # open\n                pstack.append(c)\n\n        buff.append(c)\n        if c in delimiters:\n            if len(pstack) == 0 and c_next not in delimiters:\n                yield ''.join(buff)\n                buff = []\n\n    if len(buff) > 0:\n        yield ''.join(buff)"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef divrank(G, alpha=0.25, d=0.85, personalization=None,\n            max_iter=100, tol=1.0e-6, nstart=None, weight='weight',\n            dangling=None):\n    '''\n    Returns the DivRank (Diverse Rank) of the nodes in the graph.\n    This code is based on networkx.pagerank.\n\n    Args: (diff from pagerank)\n      alpha: controls strength of self-link [0.0-1.0]\n      d: the damping factor\n\n    Reference:\n      Qiaozhu Mei and Jian Guo and Dragomir Radev,\n      DivRank: the Interplay of Prestige and Diversity in Information Networks,\n      http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.174.7982\n    '''\n\n    if len(G) == 0:\n        return {}\n\n    if not G.is_directed():\n        D = G.to_directed()\n    else:\n        D = G\n\n    # Create a copy in (right) stochastic form\n    W = nx.stochastic_graph(D, weight=weight)\n    N = W.number_of_nodes()\n\n    # self-link (DivRank)\n    for n in W.nodes_iter():\n        for n_ in W.nodes_iter():\n            if n != n_ :\n                if n_ in W[n]:\n                    W[n][n_][weight] *= alpha\n            else:\n                if n_ not in W[n]:\n                    W.add_edge(n, n_)\n                W[n][n_][weight] = 1.0 - alpha\n\n    # Choose fixed starting vector if not given\n    if nstart is None:\n        x = dict.fromkeys(W, 1.0 / N)\n    else:\n        # Normalized nstart vector\n        s = float(sum(nstart.values()))\n        x = dict((k, v / s) for k, v in nstart.items())\n\n    if personalization is None:\n        # Assign uniform personalization vector if not given\n        p = dict.fromkeys(W, 1.0 / N)\n    else:\n        missing = set(G) - set(personalization)\n        if missing:\n            raise NetworkXError('Personalization dictionary '\n                                'must have a value for every node. 
'\n                                'Missing nodes %s' % missing)\n        s = float(sum(personalization.values()))\n        p = dict((k, v / s) for k, v in personalization.items())\n\n    if dangling is None:\n        # Use personalization vector if dangling vector not specified\n        dangling_weights = p\n    else:\n        missing = set(G) - set(dangling)\n        if missing:\n            raise NetworkXError('Dangling node dictionary '\n                                'must have a value for every node. '\n                                'Missing nodes %s' % missing)\n        s = float(sum(dangling.values()))\n        dangling_weights = dict((k, v/s) for k, v in dangling.items())\n    dangling_nodes = [n for n in W if W.out_degree(n, weight=weight) == 0.0]\n\n    # power iteration: make up to max_iter iterations\n    for _ in range(max_iter):\n        xlast = x\n        x = dict.fromkeys(xlast.keys(), 0)\n        danglesum = d * sum(xlast[n] for n in dangling_nodes)\n        for n in x:\n            D_t = sum(W[n][nbr][weight] * xlast[nbr] for nbr in W[n])\n            for nbr in W[n]:\n                #x[nbr] += d * xlast[n] * W[n][nbr][weight]\n                x[nbr] += (\n                    d * (W[n][nbr][weight] * xlast[nbr] / D_t) * xlast[n]\n                )\n            x[n] += danglesum * dangling_weights[n] + (1.0 - d) * p[n]\n\n        # check convergence, l1 norm\n        err = sum([abs(x[n] - xlast[n]) for n in x])\n        if err < N*tol:\n            return x\n    raise NetworkXError('divrank: power iteration failed to converge '\n                        'in %d iterations.' % max_iter)", "response": "Returns the DivRank of the nodes in the graph G."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef divrank_scipy(G, alpha=0.25, d=0.85, personalization=None,\n                  max_iter=100, tol=1.0e-6, nstart=None, weight='weight',\n                  dangling=None):\n    '''\n    Returns the DivRank (Diverse Rank) of the nodes in the graph.\n    This code is based on networkx.pagerank_scipy\n    '''\n    import scipy.sparse\n\n    N = len(G)\n    if N == 0:\n        return {}\n\n    nodelist = G.nodes()\n    M = nx.to_scipy_sparse_matrix(G, nodelist=nodelist, weight=weight,\n                                  dtype=float)\n    S = scipy.array(M.sum(axis=1)).flatten()\n    S[S != 0] = 1.0 / S[S != 0]\n    Q = scipy.sparse.spdiags(S.T, 0, *M.shape, format='csr')\n    M = Q * M\n\n    # self-link (DivRank)\n    M = scipy.sparse.lil_matrix(M)\n    M.setdiag(0.0)\n    M = alpha * M\n    M.setdiag(1.0 - alpha)\n    #print M.sum(axis=1)\n\n    # initial vector\n    x = scipy.repeat(1.0 / N, N)\n\n    # Personalization vector\n    if personalization is None:\n        p = scipy.repeat(1.0 / N, N)\n    else:\n        missing = set(nodelist) - set(personalization)\n        if missing:\n            raise NetworkXError('Personalization vector dictionary '\n                                'must have a value for every node. '\n                                'Missing nodes %s' % missing)\n        p = scipy.array([personalization[n] for n in nodelist],\n                        dtype=float)\n        p = p / p.sum()\n\n    # Dangling nodes\n    if dangling is None:\n        dangling_weights = p\n    else:\n        missing = set(nodelist) - set(dangling)\n\n\n        if missing:\n            raise NetworkXError('Dangling node dictionary '\n                                'must have a value for every node. 
'\n                                'Missing nodes %s' % missing)\n        # Convert the dangling dictionary into an array in nodelist order\n        dangling_weights = scipy.array([dangling[n] for n in nodelist],\n                                       dtype=float)\n        dangling_weights /= dangling_weights.sum()\n    is_dangling = scipy.where(S == 0)[0]\n\n    # power iteration: make up to max_iter iterations\n    for _ in range(max_iter):\n        xlast = x\n        D_t =  M * x\n        x = (\n            d * (x / D_t * M * x + sum(x[is_dangling]) * dangling_weights)\n            + (1.0 - d) * p\n        )\n        # check convergence, l1 norm\n        err = scipy.absolute(x - xlast).sum()\n        if err < N * tol:\n            return dict(zip(nodelist, map(float, x)))\n\n    raise NetworkXError('divrank_scipy: power iteration failed to converge '\n                        'in %d iterations.' % max_iter)", "response": "This function is based on pagerank_scipy. It returns the DivRank of the nodes in the graph."}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nreturn an error code between 0 and 99.", "response": "def code_mapping(level, msg, default=99):\n    \"\"\"Return an error code between 0 and 99.\"\"\"\n    try:\n        return code_mappings_by_level[level][msg]\n    except KeyError:\n        pass\n    # Following assumes any variable messages take the format\n    # of 'Fixed text \"variable text\".' only:\n    # e.g. 'Unknown directive type \"req\".'\n    # ---> 'Unknown directive type'\n    # e.g. 'Unknown interpreted text role \"need\".'\n    # ---> 'Unknown interpreted text role'\n    if msg.count('\"') == 2 and ' \"' in msg and msg.endswith('\".'):\n        txt = msg[: msg.index(' \"')]\n        return code_mappings_by_level[level].get(txt, default)\n    return default"}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nremove the quotes delimiting a docstring.", "response": "def dequote_docstring(text):\n    \"\"\"Remove the quotes delimiting a docstring.\"\"\"\n    # TODO: Process escaped characters unless raw mode?\n    text = text.strip()\n    if len(text) > 6 and text[:3] == text[-3:] == '\"\"\"':\n        # Standard case, \"\"\"...\"\"\"\n        return text[3:-3]\n    if len(text) > 7 and text[:4] in ('u\"\"\"', 'r\"\"\"') and text[-3:] == '\"\"\"':\n        # Unicode, u\"\"\"...\"\"\", or raw r\"\"\"...\"\"\"\n        return text[4:-3]\n    # Other flake8 tools will report atypical quotes:\n    if len(text) > 6 and text[:3] == text[-3:] == \"'''\":\n        return text[3:-3]\n    if len(text) > 7 and text[:4] in (\"u'''\", \"r'''\") and text[-3:] == \"'''\":\n        return text[4:-3]\n    if len(text) > 2 and text[0] == text[-1] == '\"':\n        return text[1:-1]\n    if len(text) > 3 and text[:2] in ('u\"', 'r\"') and text[-1] == '\"':\n        return text[2:-1]\n    if len(text) > 2 and text[0] == text[-1] == \"'\":\n        return text[1:-1]\n    if len(text) > 3 and text[:2] in (\"u'\", \"r'\") and text[-1] == \"'\":\n        return text[2:-1]\n    raise ValueError(\"Bad quotes!\")"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\nreturn True iff this function should be considered public.", "response": "def is_public(self):\n        \"\"\"Return True iff this function should be considered public.\"\"\"\n        if self.all is not None:\n            return self.name in self.all\n        else:\n            return not self.name.startswith(\"_\")"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef is_public(self):\n        # Check if we are a setter/deleter method, and mark as private if so.\n        for decorator in self.decorators:\n            # Given 'foo', match 'foo.bar' but not 'foobar' or 'sfoo'\n            if re.compile(r\"^{}\\.\".format(self.name)).match(decorator.name):\n                return False\n        name_is_public = (\n            not self.name.startswith(\"_\")\n            or self.name in VARIADIC_MAGIC_METHODS\n            or self.is_magic\n        )\n        return self.parent.is_public and name_is_public", "response": "Return True iff this method should be considered public."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\nreturning True iff this class should be considered public.", "response": "def is_public(self):\n        \"\"\"Return True iff this class should be considered public.\"\"\"\n        return (\n            not self.name.startswith(\"_\")\n            and self.parent.is_class\n            and self.parent.is_public\n        )"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef move(self):\n        previous = self.current\n        current = self._next_from_generator()\n        self.current = None if current is None else Token(*current)\n        self.line = self.current.start[0] if self.current else self.line\n        self.got_logical_newline = previous.kind in self.LOGICAL_NEWLINES\n        return previous", "response": "Moves the cursor to the next token."}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nparses the given file-like object and returns its Module object.", "response": "def parse(self, filelike, filename):\n        \"\"\"Parse the given file-like object and return its Module object.\"\"\"\n        self.log = log\n        self.source = filelike.readlines()\n        src = \"\".join(self.source)\n        # This may raise a SyntaxError:\n        compile(src, filename, \"exec\")\n        self.stream = TokenStream(StringIO(src))\n        self.filename = filename\n        self.all = None\n        self.future_imports = set()\n        self._accumulated_decorators = []\n        return self.parse_module()"}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nconsuming one token and verifying that it is of the expected kind.", "response": "def consume(self, kind):\n        \"\"\"Consume one token and verify it is of the expected kind.\"\"\"\n        next_token = self.stream.move()\n        assert next_token.kind == kind"}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\nskips tokens in the stream until a certain token kind is reached.", "response": "def leapfrog(self, kind, value=None):\n        \"\"\"Skip tokens in the stream until a certain token kind is reached.\n\n        If `value` is specified, tokens whose values are different will also\n        be skipped.\n        \"\"\"\n        while self.current is not None:\n            if self.current.kind == kind and (\n                value is None or self.current.value == value\n            ):\n                self.consume(kind)\n                return\n            self.stream.move()"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef parse_docstring(self):\n        self.log.debug(\n            \"parsing docstring, token is %r (%s)\", self.current.kind, self.current.value\n        )\n        while self.current.kind in (tk.COMMENT, tk.NEWLINE, tk.NL):\n            self.stream.move()\n            self.log.debug(\n                \"parsing docstring, token is %r (%s)\",\n                self.current.kind,\n                self.current.value,\n            )\n        if self.current.kind == tk.STRING:\n            docstring = self.current.value\n            self.stream.move()\n            return docstring\n        return None", "response": "Parse a single docstring and return its value."}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nparsing multiple definitions and yield them.", "response": "def parse_definitions(self, class_, all=False):\n        \"\"\"Parse multiple definitions and yield them.\"\"\"\n        while self.current is not None:\n            self.log.debug(\n                \"parsing definition list, current token is %r (%s)\",\n                self.current.kind,\n                self.current.value,\n            )\n            self.log.debug(\"got_newline: %s\", self.stream.got_logical_newline)\n            if all and self.current.value == \"__all__\":\n                self.parse_all()\n            elif (\n                self.current.kind == tk.OP\n                and self.current.value == \"@\"\n                and self.stream.got_logical_newline\n            ):\n                self.consume(tk.OP)\n                self.parse_decorators()\n            elif self.current.value in [\"def\", \"class\"]:\n                yield self.parse_definition(class_._nest(self.current.value))\n            elif self.current.kind == tk.INDENT:\n                self.consume(tk.INDENT)\n                for definition in self.parse_definitions(class_):\n                    yield definition\n            elif self.current.kind == tk.DEDENT:\n                self.consume(tk.DEDENT)\n                return\n            elif self.current.value == \"from\":\n                self.parse_from_import_statement()\n            else:\n                self.stream.move()"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef parse_all(self):\n        assert self.current.value == \"__all__\"\n        self.consume(tk.NAME)\n        if self.current.value != \"=\":\n            raise AllError(\"Could not evaluate contents of __all__. \")\n        self.consume(tk.OP)\n        if self.current.value not in \"([\":\n            raise AllError(\"Could not evaluate contents of __all__. \")\n        self.consume(tk.OP)\n\n        self.all = []\n        all_content = \"(\"\n        while self.current.kind != tk.OP or self.current.value not in \")]\":\n            if self.current.kind in (tk.NL, tk.COMMENT):\n                pass\n            elif self.current.kind == tk.STRING or self.current.value == \",\":\n                all_content += self.current.value\n            else:\n                raise AllError(\n                    \"Unexpected token kind in  __all__: {!r}. \".format(\n                        self.current.kind\n                    )\n                )\n            self.stream.move()\n        self.consume(tk.OP)\n        all_content += \")\"\n        try:\n            self.all = eval(all_content, {})\n        except BaseException as e:\n            raise AllError(\n                \"Could not evaluate contents of __all__.\"\n                \"\\bThe value was {}. The exception was:\\n{}\".format(all_content, e)\n            )", "response": "Parse the __all__ definition in a module."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef parse_module(self):\n        self.log.debug(\"parsing module.\")\n        start = self.line\n        docstring = self.parse_docstring()\n        children = list(self.parse_definitions(Module, all=True))\n        assert self.current is None, self.current\n        end = self.line\n        cls = Module\n        if self.filename.endswith(\"__init__.py\"):\n            cls = Package\n        module = cls(\n            self.filename,\n            self.source,\n            start,\n            end,\n            [],\n            docstring,\n            children,\n            None,\n            self.all,\n            None,\n            \"\",\n        )\n        for child in module.children:\n            child.parent = module\n        module.future_imports = self.future_imports\n        self.log.debug(\"finished parsing module.\")\n        return module", "response": "Parse a module and its children and return a Module object."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef check_current(self, kind=None, value=None):\n        msg = textwrap.dedent(\n            \"\"\"\n        Unexpected token at line {self.line}:\n        In file: {self.filename}\n        Got kind {self.current.kind!r}\n        Got value {self.current.value}\n        \"\"\".format(\n                self=self\n            )\n        )\n        kind_valid = self.current.kind == kind if kind else True\n        value_valid = self.current.value == value if value else True\n        assert kind_valid and value_valid, msg", "response": "Verify the current token is of type kind and equals value."}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nparse a from x import y statement.", "response": "def parse_from_import_statement(self):\n        \"\"\"Parse a 'from x import y' statement.\n\n        The purpose is to find __future__ statements.\n        \"\"\"\n        self.log.debug(\"parsing from/import statement.\")\n        is_future_import = self._parse_from_import_source()\n        self._parse_from_import_names(is_future_import)"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\nparsing the y part in a from x import y statement.", "response": "def _parse_from_import_names(self, is_future_import):\n        \"\"\"Parse the 'y' part in a 'from x import y' statement.\"\"\"\n        if self.current.value == \"(\":\n            self.consume(tk.OP)\n            expected_end_kinds = (tk.OP,)\n        else:\n            expected_end_kinds = (tk.NEWLINE, tk.ENDMARKER)\n        while self.current.kind not in expected_end_kinds and not (\n            self.current.kind == tk.OP and self.current.value == \";\"\n        ):\n            if self.current.kind != tk.NAME:\n                self.stream.move()\n                continue\n            self.log.debug(\n                \"parsing import, token is %r (%s)\",\n                self.current.kind,\n                self.current.value,\n            )\n            if is_future_import:\n                self.log.debug(\"found future import: %s\", self.current.value)\n                self.future_imports.add(self.current.value)\n            self.consume(tk.NAME)\n            self.log.debug(\n                \"parsing import, token is %r (%s)\",\n                self.current.kind,\n                self.current.value,\n            )\n            if self.current.kind == tk.NAME and self.current.value == \"as\":\n                self.consume(tk.NAME)  # as\n                if self.current.kind == tk.NAME:\n                    self.consume(tk.NAME)  # new name, irrelevant\n            if self.current.value == \",\":\n                self.consume(tk.OP)\n            self.log.debug(\n                \"parsing import, token is %r (%s)\",\n                self.current.kind,\n                self.current.value,\n            )"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef run(self):\n        # Is there any reason not to call load_source here?\n        if self.err is not None:\n            assert self.source is None\n            msg = \"%s%03i %s\" % (\n                rst_prefix,\n                rst_fail_load,\n                \"Failed to load file: %s\" % self.err,\n            )\n            yield 0, 0, msg, type(self)\n            module = []\n        try:\n            module = parse(StringIO(self.source), self.filename)\n        except SyntaxError as err:\n            msg = \"%s%03i %s\" % (\n                rst_prefix,\n                rst_fail_parse,\n                \"Failed to parse file: %s\" % err,\n            )\n            yield 0, 0, msg, type(self)\n            module = []\n        except AllError:\n            msg = \"%s%03i %s\" % (\n                rst_prefix,\n                rst_fail_all,\n                \"Failed to parse __all__ entry.\",\n            )\n            yield 0, 0, msg, type(self)\n            module = []\n        for definition in module:\n            if not definition.docstring:\n                # People can use flake8-docstrings to report missing\n                # docstrings\n                continue\n            try:\n                # Note we use the PEP257 trim algorithm to remove the\n                # leading whitespace from each line - this avoids false\n                # positive severe error \"Unexpected section title.\"\n                unindented = trim(dequote_docstring(definition.docstring))\n                # Off load RST validation to reStructuredText-lint\n                # which calls docutils internally.\n                # TODO: Should we pass the Python filename as filepath?\n                rst_errors = list(rst_lint.lint(unindented))\n            except Exception as err:\n                # e.g. 
UnicodeDecodeError\n                msg = \"%s%03i %s\" % (\n                    rst_prefix,\n                    rst_fail_lint,\n                    \"Failed to lint docstring: %s - %s\" % (definition.name, err),\n                )\n                yield definition.start, 0, msg, type(self)\n                continue\n            for rst_error in rst_errors:\n                # TODO - make this a configuration option?\n                if rst_error.level <= 1:\n                    continue\n                # Levels:\n                #\n                # 0 - debug   --> we don't receive these\n                # 1 - info    --> RST1## codes\n                # 2 - warning --> RST2## codes\n                # 3 - error   --> RST3## codes\n                # 4 - severe  --> RST4## codes\n                #\n                # Map the string to a unique code:\n                msg = rst_error.message.split(\"\\n\", 1)[0]\n                code = code_mapping(rst_error.level, msg)\n                assert code < 100, code\n                code += 100 * rst_error.level\n                msg = \"%s%03i %s\" % (rst_prefix, code, msg)\n\n                # This will return the line number by combining the\n                # start of the docstring with the offset within it.\n                # We don't know the column number, leaving as zero.\n                yield definition.start + rst_error.line, 0, msg, type(self)", "response": "Run the checker: parse the file, lint each docstring as reStructuredText, and yield a (line, column, message, type) tuple for every load, parse, or RST validation problem found."}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\nloads the source for the specified file.", "response": "def load_source(self):\n        \"\"\"Load the source for the specified file.\"\"\"\n        if self.filename in self.STDIN_NAMES:\n            self.filename = \"stdin\"\n            if sys.version_info[0] < 3:\n                self.source = sys.stdin.read()\n            else:\n                self.source = TextIOWrapper(sys.stdin.buffer, errors=\"ignore\").read()\n        else:\n            # Could be a Python 2.7 StringIO with no context manager, sigh.\n            # with tokenize_open(self.filename) as fd:\n            #     self.source = fd.read()\n            handle = tokenize_open(self.filename)\n            self.source = handle.read()\n            handle.close()"}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\nconverts a CIE Lab to RGB components.", "response": "def lab_to_rgb(l, a, b):\n    \n    \"\"\" Converts CIE Lab to RGB components.\n    \n    First we have to convert to XYZ color space.\n    Conversion involves using a white point,\n    in this case D65 which represents daylight illumination.\n    \n    Algorithms adopted from:\n    http://www.easyrgb.com/math.php\n    \n    \"\"\"\n    \n    y = (l+16) / 116.0\n    x = a/500.0 + y\n    z = y - b/200.0\n    v = [x,y,z]\n    for i in range(3):\n        if pow(v[i],3) > 0.008856: \n            v[i] = pow(v[i],3)\n        else: \n            v[i] = (v[i]-16/116.0) / 7.787\n\n    # Observer = 2, Illuminant = D65\n    x = v[0] * 95.047/100\n    y = v[1] * 100.0/100\n    z = v[2] * 108.883/100\n\n    r = x * 3.2406 + y *-1.5372 + z *-0.4986\n    g = x *-0.9689 + y * 1.8758 + z * 0.0415\n    b = x * 0.0557 + y *-0.2040 + z * 1.0570\n    v = [r,g,b]\n    for i in range(3):\n        if v[i] > 0.0031308:\n            v[i] = 1.055 * pow(v[i], 1/2.4) - 0.055\n        else:\n            v[i] = 12.92 * v[i]\n\n    #r, g, b = v[0]*255, v[1]*255, v[2]*255\n    r, g, b = v[0], v[1], v[2]\n    return r, g, b"}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nreturn the darkest swatch.", "response": "def _darkest(self):\n        \n        \"\"\" Returns the darkest swatch.\n        \n        Knowing the contract between a light and a dark swatch\n        can help us decide how to display readable typography.\n        \n        \"\"\"\n        \n        rgb, n = (1.0, 1.0, 1.0), 3.0\n        for r,g,b in self:\n            if r+g+b < n:\n                rgb, n = (r,g,b), r+g+b\n        \n        return rgb"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nparsing a theme from XML.", "response": "def parse_theme(self, xml):\n        \n        \"\"\" Parses a theme from XML returned by Kuler.\n        \n        Gets the theme's id, label and swatches.\n        All of the swatches are converted to RGB.\n        If we have a full description for a theme id in cache,\n        parse that to get tags associated with the theme.\n        \n        \"\"\"\n\n        kt = KulerTheme()        \n        kt.author = xml.getElementsByTagName(\"author\")[0]\n        kt.author = kt.author.childNodes[1].childNodes[0].nodeValue\n        kt.id = int(self.parse_tag(xml, \"id\"))\n        kt.label = self.parse_tag(xml, \"label\")\n        mode = self.parse_tag(xml, \"mode\")\n        \n        for swatch in xml.getElementsByTagName(\"swatch\"):\n            \n            c1 = float(self.parse_tag(swatch, \"c1\"))\n            c2 = float(self.parse_tag(swatch, \"c2\"))\n            c3 = float(self.parse_tag(swatch, \"c3\"))\n            c4 = float(self.parse_tag(swatch, \"c4\"))\n            \n            if mode == \"rgb\":\n                kt.append((c1,c2,c3))\n            if mode == \"cmyk\":   \n                kt.append(cmyk_to_rgb(c1,c2,c3,c4))\n            if mode == \"hsv\":\n                kt.append(colorsys.hsv_to_rgb(c1,c2,c3))\n            if mode == \"hex\":\n                kt.append(hex_to_rgb(c1))\n            if mode == \"lab\":\n                kt.append(lab_to_rgb(c1,c2,c3))\n        \n        # If we have the full theme in cache,\n        # parse tags from it.\n        if self._cache.exists(self.id_string + str(kt.id)):\n            xml = self._cache.read(self.id_string + str(kt.id))\n            xml = minidom.parseString(xml)\n        for tags in xml.getElementsByTagName(\"tag\"):\n            tags = self.parse_tag(tags, \"label\")\n            tags = tags.split(\" \")\n            kt.tags.extend(tags)\n        \n        return kt"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef create_listening_socket(host, port, handler):\n    sock = socket.socket()\n    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)\n\n    sock.bind((host, port))\n    sock.listen(1)\n\n    GObject.io_add_watch(sock, GObject.IO_IN, handler)\n    return sock", "response": "Create a listening socket bound to (host, port) with SO_REUSEADDR set, and register handler with GObject to be called when the socket becomes readable (an incoming connection)."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef listener(self, sock, *args):\n        '''Asynchronous connection listener. Starts a handler for each connection.'''\n        conn, addr = sock.accept()\n        f = conn.makefile(conn)\n        self.shell = ShoebotCmd(self.bot, stdin=f, stdout=f, intro=INTRO)\n\n        print(_(\"Connected\"))\n        GObject.io_add_watch(conn, GObject.IO_IN, self.handler)\n        if self.shell.intro:\n            self.shell.stdout.write(str(self.shell.intro)+\"\\n\")\n            self.shell.stdout.flush()\n        return True", "response": "Asynchronous connection listener. Starts a handler for each connection."}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nmark a command as trusted so that it cannot be run remotely", "response": "def trusted_cmd(f):\n    \"\"\"\n    Trusted commands cannot be run remotely\n\n    :param f:\n    :return:\n    \"\"\"\n    def run_cmd(self, line):\n        if self.trusted:\n            f(self, line)\n        else:\n            print(\"Sorry cannot do %s here.\" % f.__name__[3:])\n\n    global trusted_cmds\n    trusted_cmds.add(f.__name__)\n    run_cmd.__doc__ = f.__doc__\n    return run_cmd"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\nprinting the response of a single line of the text.", "response": "def print_response(self, input='', keep=False, *args, **kwargs):\n        \"\"\"\n        print response, if cookie is set then print that each line\n        :param args:\n        :param keep: if True more output is to come\n        :param cookie: set a custom cookie,\n                       if set to 'None' then self.cookie will be used.\n                       if set to 'False' disables cookie output entirely\n        :return:\n        \"\"\"\n        cookie = kwargs.get('cookie')\n        if cookie is None:\n            cookie = self.cookie or ''\n        status = kwargs.get('status')\n        lines = input.splitlines()\n        if status and not lines:\n            lines = ['']\n\n        if cookie:\n            output_template = '{cookie} {status}{cookie_char}{line}'\n        else:\n            output_template = '{line}'\n\n        for i, line in enumerate(lines):\n            if i != len(lines) - 1 or keep is True:\n                cookie_char = '>'\n            else:\n                # last line\n                cookie_char = ':'\n\n            print(output_template.format(\n                cookie_char=cookie_char,\n                cookie=cookie,\n                status=status or '',\n                line=line.strip()), file=self.stdout)"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef do_escape_nl(self, arg):\n        if arg.lower() == 'off':\n            self.escape_nl = False\n        else:\n            self.escape_nl = True", "response": "Enable or disable escaping of newlines in responses: 'off' disables it, any other argument enables it."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef do_prompt(self, arg):\n        if arg.lower() == 'off':\n            self.response_prompt = ''\n            self.prompt = ''\n            return\n        elif arg.lower() == 'on':\n            self.prompt = PROMPT\n            self.response_prompt = RESPONSE_PROMPT\n        self.print_response('prompt: %s' % self.prompt, '\\n', 'response: %s' % self.response_prompt)", "response": "Enable ('on') or disable ('off') the prompt and the response prompt."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nsetting the framerate of the current session", "response": "def do_speed(self, speed):\n        \"\"\"\n        Set the framerate (FPS); with no argument, report the current framerate.\n        \"\"\"\n        if speed:\n            try:\n                self.bot._speed = float(speed)\n            except ValueError:\n                self.print_response('%s is not a valid framerate' % speed)\n                return\n        self.print_response('Speed: %s FPS' % self.bot._speed)"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\nrestarting the bot.", "response": "def do_restart(self, line):\n        \"\"\"\n        Attempt to restart the bot.\n        \"\"\"\n        self.bot._frame = 0\n        self.bot._namespace.clear()\n        self.bot._namespace.update(self.bot._initial_namespace)"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef do_pause(self, line):\n        # along with stuff in socketserver and shell\n        if self.pause_speed is None:\n            self.pause_speed = self.bot._speed\n            self.bot._speed = 0\n            self.print_response('Paused')\n        else:\n            self.bot._speed = self.pause_speed\n            self.pause_speed = None\n            self.print_response('Playing')", "response": "Toggle pause: pause the bot if it is playing, otherwise resume playback at the previous speed."}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\nresumes playback if the bot is paused.", "response": "def do_play(self, line):\n        \"\"\"\n        Resume playback if bot is paused\n        \"\"\"\n        if self.pause_speed is not None:\n            self.bot._speed = self.pause_speed\n            self.pause_speed = None\n        self.print_response(\"Play\")"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\ngoing to specific frame", "response": "def do_goto(self, line):\n        \"\"\"\n        Go to specific frame\n        :param line:\n        :return:\n        \"\"\"\n        self.print_response(\"Go to frame %s\" % line)\n        self.bot._frame = int(line)"}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nrewinds the current frame", "response": "def do_rewind(self, line):\n        \"\"\"\n        rewind\n        \"\"\"\n        self.print_response(\"Rewinding from frame %s to 0\" % self.bot._frame)\n        self.bot._frame = 0"}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nlist bot variables and values", "response": "def do_vars(self, line):\n        \"\"\"\n        List bot variables and values\n        \"\"\"\n        if self.bot._vars:\n            max_name_len = max([len(name) for name in self.bot._vars])\n            for i, (name, v) in enumerate(self.bot._vars.items()):\n                keep = i < len(self.bot._vars) - 1\n                self.print_response(\"%s = %s\" % (name.ljust(max_name_len), v.value), keep=keep)\n        else:\n            self.print_response(\"No vars\")"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef do_load_base64(self, line):\n        cookie = self.cookie\n        executor = self.bot._executor\n\n        def source_good():\n            self.print_response(status=RESPONSE_CODE_OK, cookie=cookie)\n            executor.clear_callbacks()\n\n        def source_bad(tb):\n            if called_good:\n                # good and bad callbacks shouldn't both be called\n                raise ValueError('Good AND Bad callbacks called !')\n            self.print_response(status=RESPONSE_REVERTED, keep=True, cookie=cookie)\n            self.print_response(tb.replace('\\n', '\\\\n'), cookie=cookie)\n            executor.clear_callbacks()\n\n        called_good = False\n        source = str(base64.b64decode(line))\n        # Test compile\n        publish_event(SOURCE_CHANGED_EVENT, data=source, extra_channels=\"shoebot.source\")\n        self.bot._executor.load_edited_source(source, good_cb=source_good, bad_cb=source_bad)", "response": "Load base64 encoded source file and send new code to Editors."}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\nexits shell and shoebot", "response": "def do_exit(self, line):\n        \"\"\"\n        Exit shell and shoebot\n        \"\"\"\n        if self.trusted:\n            publish_event(QUIT_EVENT)\n        self.print_response('Bye.\\n')\n        return True"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\nmaking the current window fullscreen", "response": "def do_fullscreen(self, line):\n        \"\"\"\n        Make the current window fullscreen\n        \"\"\"\n        self.bot.canvas.sink.trigger_fullscreen_action(True)\n        print(self.response_prompt, file=self.stdout)"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef do_windowed(self, line):\n        self.bot.canvas.sink.trigger_fullscreen_action(False)\n        print(self.response_prompt, file=self.stdout)", "response": "Leave fullscreen and return the current window to windowed mode."}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nexit shell and shoebot Alias for exit.", "response": "def do_EOF(self, line):\n        \"\"\"\n        Exit shell and shoebot\n\n        Alias for exit.\n        \"\"\"\n        print(self.response_prompt, file=self.stdout)\n        return self.do_exit(line)"}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nshow help on all commands.", "response": "def do_help(self, arg):\n        \"\"\"\n        Show help on all commands.\n        \"\"\"\n        print(self.response_prompt, file=self.stdout)\n        return cmd.Cmd.do_help(self, arg)"}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\nallows commands to have a last parameter of 'cookie=somevalue' TODO somevalue will be prepended onto any output lines so that editors can distinguish output from certain kinds of events they have sent. :param line: :return:", "response": "def precmd(self, line):\n        \"\"\"\n        Allow commands to have a last parameter of 'cookie=somevalue'\n\n        TODO somevalue will be prepended onto any output lines so\n        that editors can distinguish output from certain kinds\n        of events they have sent.\n\n        :param line:\n        :return:\n        \"\"\"\n        args = shlex.split(line or \"\")\n        if args and 'cookie=' in args[-1]:\n            cookie_index = line.index('cookie=')\n            cookie = line[cookie_index + 7:]\n            line = line[:cookie_index].strip()\n            self.cookie = cookie\n        if line.startswith('#'):\n            return ''\n        elif '=' in line:\n            # allow  somevar=somevalue\n\n            # first check if we really mean a command\n            cmdname = line.partition(\" \")[0]\n            if hasattr(self, \"do_%s\" % cmdname):\n                return line\n\n            if not line.startswith(\"set \"):\n                return \"set \" + line\n            else:\n                return line\n        if len(args) and args[0] in self.shortcuts:\n            return \"%s %s\" % (self.shortcuts[args[0]], \" \".join(args[1:]))\n        else:\n            return line"}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\ndraws a daisy at x y", "response": "def drawdaisy(x, y, color='#fefefe'):\n    \"\"\"\n    Draw a daisy at x, y\n    \"\"\"\n    # save location, size etc\n    _ctx.push()\n    \n    # save fill and stroke\n    _fill =_ctx.fill()\n    _stroke = _ctx.stroke()\n\n    sc = (1.0 / _ctx.HEIGHT) * float(y * 0.5) * 4.0\n\n    # draw stalk\n    _ctx.strokewidth(sc * 2.0)\n    _ctx.stroke('#3B240B')\n    \n    _ctx.line(x + (sin(x * 0.1) * 10.0), y + 80, x + sin(_ctx.FRAME * 0.1), y)\n\n    # draw flower\n    _ctx.translate(-20, 0)\n    _ctx.scale(sc)\n\n    # draw petals\n    _ctx.fill(color)\n    _ctx.nostroke()\n    for angle in xrange(0, 360, 45):\n        _ctx.rotate(degrees=45)\n        _ctx.rect(x, y, 40, 8, 1)\n\n    # draw centre\n    _ctx.fill('#F7FE2E')\n    _ctx.ellipse(x + 15, y, 10, 10)\n\n    # restore fill and stroke\n    _ctx.fill(_fill)\n    _ctx.stroke(_stroke)\n\n    # restore location, size etc\n    _ctx.pop()"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef flatten_fft(scale=1.0):\n    _len = len(audio.spectrogram)\n    for i, v in enumerate(audio.spectrogram):\n        yield scale * (i * v) / _len", "response": "Generator that yields each value of audio.spectrogram multiplied by its index and by scale, then divided by the spectrogram length, so that higher-index bins are weighted more heavily."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef scaled_fft(fft, scale=1.0):\n    data = np.zeros(len(fft))\n    for i, v in enumerate(fft):\n        data[i] = scale * (i * v) / NUM_SAMPLES\n\n    return data", "response": "Returns a scaled version of the FFT."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\ngrab the contents of doc and return it", "response": "def get_source(self, doc):\n        \"\"\"\n        Grab contents of 'doc' and return it\n\n        :param doc: The active document\n        :return:\n        \"\"\"\n        start_iter = doc.get_start_iter()\n        end_iter = doc.get_end_iter()\n        source = doc.get_text(start_iter, end_iter, False)\n        return source"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef _create_view(self, name=\"shoebot-output\"):\n        view = gtk.TextView()\n        view.set_editable(False)\n\n        fontdesc = pango.FontDescription(\"Monospace\")\n        view.modify_font(fontdesc)\n        view.set_name(name)\n\n        buff = view.get_buffer()\n        buff.create_tag('error', foreground='red')\n        return view", "response": "Create the gtk.TextView used for shell output."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\ncreate the Gtk.TextView inside a Gtk.ScrolledWindow and return both.", "response": "def _create_view(self, name=\"shoebot-output\"):\n        \"\"\"\n        Create the gtk.TextView inside a Gtk.ScrolledWindow\n        :return: container, text_view\n        \"\"\"\n        text_view = Gtk.TextView()\n        text_view.set_editable(False)\n\n        fontdesc = Pango.FontDescription(\"Monospace\")\n        text_view.modify_font(fontdesc)\n        text_view.set_name(name)\n\n        buff = text_view.get_buffer()\n        buff.create_tag('error', foreground='red')\n\n        container = Gtk.ScrolledWindow()\n        container.add(text_view)\n        container.show_all()\n        return container, text_view"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef openAnything(source, searchpaths=None):\r\n\r\n    if hasattr(source, \"read\"):\r\n        return source\r\n    \r\n    if source == \"-\":\r\n        import sys\r\n        return sys.stdin\r\n\r\n    # try to open with urllib (if source is http, ftp, or file URL)\r\n    import urllib\r\n    try:\r\n        return urllib.urlopen(source)\r\n    except (IOError, OSError):\r\n        pass\r\n    \r\n    # try to open with native open function (if source is pathname)\r\n    for path in searchpaths or ['.']:\r\n        try:\r\n            return open(os.path.join(path, source))\r\n        except (IOError, OSError):\r\n            pass\r\n\r\n\r\n    # treat source as string\r\n    import StringIO\r\n    return StringIO.StringIO(str(source))", "response": "Open a file or file - like object and return it as a stream."}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\nloads XML input source return parsed XML document", "response": "def _load(self, source, searchpaths=None):\r\n        \"\"\"load XML input source, return parsed XML document\r\n\r\n        - a URL of a remote XML file (\"http://diveintopython.org/kant.xml\")\r\n        - a filename of a local XML file (\"~/diveintopython/common/py/kant.xml\")\r\n        - standard input (\"-\")\r\n        - the actual XML document, as a string\r\n\r\n        :param searchpaths: optional searchpaths if file is used.\r\n        \"\"\"\r\n        sock = openAnything(source, searchpaths=searchpaths)\r\n        xmldoc = minidom.parse(sock).documentElement\r\n        sock.close()\r\n        return xmldoc"}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nguess default source of the current grammar", "response": "def getDefaultSource(self):\r\n        \"\"\"guess default source of the current grammar\r\n        \r\n        The default source will be one of the <ref>s that is not\r\n        cross-referenced.  This sounds complicated but it's not.\r\n        Example: The default source for kant.xml is\r\n        \"<xref id='section'/>\", because 'section' is the one <ref>\r\n        that is not <xref>'d anywhere in the grammar.\r\n        In most grammars, the default source will produce the\r\n        longest (and most interesting) output.\r\n        \"\"\"\r\n        xrefs = {}\r\n        for xref in self.grammar.getElementsByTagName(\"xref\"):\r\n            xrefs[xref.attributes[\"id\"].value] = 1\r\n        xrefs = xrefs.keys()\r\n        standaloneXrefs = [e for e in self.refs.keys() if e not in xrefs]\r\n        if not standaloneXrefs:\r\n            raise NoSourceError, \"can't guess source, and no source specified\"\r\n        return '<xref id=\"%s\"/>' % random.choice(standaloneXrefs)"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\nresetting output buffer re - parse entire source file and return output", "response": "def refresh(self):\r\n        \"\"\"reset output buffer, re-parse entire source file, and return output\r\n        \r\n        Since parsing involves a good deal of randomness, this is an\r\n        easy way to get new output without having to reload a grammar file\r\n        each time.\r\n        \"\"\"\r\n        self.reset()\r\n        self.parse(self.source)\r\n        return self.output()"}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nchooses a random child element of a node", "response": "def randomChildElement(self, node):\r\n        \"\"\"choose a random child element of a node\r\n        \r\n        This is a utility method used by do_xref and do_choice.\r\n        \"\"\"\r\n        choices = [e for e in node.childNodes\r\n                   if e.nodeType == e.ELEMENT_NODE]\r\n        chosen = random.choice(choices)\r\n        if _debug:\r\n            sys.stderr.write('%s available choices: %s\\n' % \\\r\n                (len(choices), [e.toxml() for e in choices]))\r\n            sys.stderr.write('Chosen: %s\\n' % chosen.toxml())\r\n        return chosen"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\nparsing a single XML node", "response": "def parse(self, node):\r\n        \"\"\"parse a single XML node\r\n        \r\n        A parsed XML document (from minidom.parse) is a tree of nodes\r\n        of various types.  Each node is represented by an instance of the\r\n        corresponding Python class (Element for a tag, Text for\r\n        text data, Document for the top-level document).  The following\r\n        statement constructs the name of a class method based on the type\r\n        of node we're parsing (\"parse_Element\" for an Element node,\r\n        \"parse_Text\" for a Text node, etc.) and then calls the method.\r\n        \"\"\"\r\n        parseMethod = getattr(self, \"parse_%s\" % node.__class__.__name__)\r\n        parseMethod(node)"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef parse_Text(self, node):\r\n        text = node.data\r\n        if self.capitalizeNextWord:\r\n            self.pieces.append(text[0].upper())\r\n            self.pieces.append(text[1:])\r\n            self.capitalizeNextWord = 0\r\n        else:\r\n            self.pieces.append(text)", "response": "Parse a text node, capitalizing its first character if capitalizeNextWord is set."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef parse_Element(self, node):\r\n        handlerMethod = getattr(self, \"do_%s\" % node.tagName)\r\n        handlerMethod(node)", "response": "Parse an element node by dispatching to the handler method named after its tag (do_<tagName>)."}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nhandle an xref tag", "response": "def do_xref(self, node):\r\n        \"\"\"handle <xref id='...'> tag\r\n        \r\n        An <xref id='...'> tag is a cross-reference to a <ref id='...'>\r\n        tag.  <xref id='sentence'/> evaluates to a randomly chosen child of\r\n        <ref id='sentence'>.\r\n        \"\"\"\r\n        id = node.attributes[\"id\"].value\r\n        self.parse(self.randomChildElement(self.refs[id]))"}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\nhandles the p tag", "response": "def do_p(self, node):\r\n        \"\"\"handle <p> tag\r\n        \r\n        The <p> tag is the core of the grammar.  It can contain almost\r\n        anything: freeform text, <choice> tags, <xref> tags, even other\r\n        <p> tags.  If a \"class='sentence'\" attribute is found, a flag\r\n        is set and the next word will be capitalized.  If a \"chance='X'\"\r\n        attribute is found, there is an X% chance that the tag will be\r\n        evaluated (and therefore a (100-X)% chance that it will be\r\n        completely ignored)\r\n        \"\"\"\r\n        keys = node.attributes.keys()\r\n        if \"class\" in keys:\r\n            if node.attributes[\"class\"].value == \"sentence\":\r\n                self.capitalizeNextWord = 1\r\n        if \"chance\" in keys:\r\n            chance = int(node.attributes[\"chance\"].value)\r\n            doit = (chance > random.randrange(100))\r\n        else:\r\n            doit = 1\r\n        if doit:\r\n            for child in node.childNodes: self.parse(child)"}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\nreplaces HTML special characters by readable characters.", "response": "def replace_entities(ustring, placeholder=\" \"):\n\n    \"\"\"Replaces HTML special characters by readable characters.\n\n    As taken from Leif K-Brooks algorithm on:\n    http://groups-beta.google.com/group/comp.lang.python\n    \n    \"\"\"\n\n    def _repl_func(match):\n        try:\n            if match.group(1): # Numeric character reference\n                return unichr( int(match.group(2)) ) \n            else:\n                try: return cp1252[ unichr(int(match.group(3))) ].strip()\n                except: return unichr( name2codepoint[match.group(3)] )\n        except:\n            return placeholder\n\n    # Force to Unicode.\n    if not isinstance(ustring, unicode):\n        ustring = UnicodeDammit(ustring).unicode\n    \n    # Don't want some weird unicode character here\n    # that truncate_spaces() doesn't know of:\n    ustring = ustring.replace(\"&nbsp;\", \" \")\n    \n    # The ^> makes sure nothing inside a tag (i.e. href with query arguments) gets processed.\n    _entity_re = re.compile(r'&(?:(#)(\\d+)|([^;^> ]+));') \n    return _entity_re.sub(_repl_func, ustring)"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\nopening a url and return a connection to it.", "response": "def open(url, wait=10):\n\n    \"\"\" Returns a connection to a url which you can read().\n\n    When the wait amount is exceeded, raises a URLTimeout.\n    When an error occurs, raises a URLError.\n    404 errors specifically return a HTTP404NotFound.\n\n    \"\"\"\n\n    # If the url is a URLParser, get any POST parameters.\n    post = None\n    if isinstance(url, URLParser) and url.method == \"post\":\n        post = urllib.urlencode(url.query)\n\n    # If the url is a URLParser (or a YahooResult or something),\n    # use its string representation.\n    url = str(url)\n\n    # Use urllib instead of urllib2 for local files.\n    if os.path.exists(url):\n        return urllib.urlopen(url)\n\n    else:\n        socket.setdefaulttimeout(wait)\n        try:\n            #connection = urllib2.urlopen(url, post)\n            request = urllib2.Request(url, post, {\"User-Agent\": USER_AGENT, \"Referer\": REFERER})\n            if PROXY:\n                p = urllib2.ProxyHandler({PROXY[1]: PROXY[0]})\n                o = urllib2.build_opener(p, urllib2.HTTPHandler)\n                urllib2.install_opener(o)\n            connection = urllib2.urlopen(request)\n        except urllib2.HTTPError, e:\n            if e.code == 401: raise HTTP401Authentication\n            if e.code == 403: raise HTTP403Forbidden\n            if e.code == 404: raise HTTP404NotFound\n            raise HTTPError\n        except urllib2.URLError, e:\n            if e.reason[0] == 36: raise URLTimeout\n            raise URLError\n\n    return connection"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef not_found(url, wait=10):\n\n    \"\"\" Returns True when the url generates a \"404 Not Found\" error.\n    \"\"\"\n\n    try: connection = open(url, wait)\n    except HTTP404NotFound:\n        return True\n    except:\n        return False\n\n    return False", "response": "Returns True when the url generates a 404 Not Found error."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\ndetermining if the given URL is a MIME - type of one of the given types.", "response": "def is_type(url, types=[], wait=10):\n\n    \"\"\" Determine the MIME-type of the document behind the url.\n\n    MIME is more reliable than simply checking the document extension.\n    Returns True when the MIME-type starts with anything in the list of types.\n\n    \"\"\"\n\n    # Types can also be a single string for convenience.\n    if isinstance(types, str):\n        types = [types]\n\n    try: connection = open(url, wait)\n    except:\n        return False\n\n    type = connection.info()[\"Content-Type\"]\n    for t in types:\n        if type.startswith(t): return True\n\n    return False"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef requirements(debug=True, with_examples=True, with_pgi=None):\n    reqs = list(BASE_REQUIREMENTS)\n    if with_pgi is None:\n        with_pgi = is_jython\n\n    if debug:\n        print(\"setup options: \")\n        print(\"with_pgi:      \", \"yes\" if with_pgi else \"no\")\n        print(\"with_examples: \", \"yes\" if with_examples else \"no\")\n    if with_pgi:\n        reqs.append(\"pgi\")\n        if debug:\n            print(\"warning, as of April 2019 typography does not work with pgi\")\n    else:\n        reqs.append(PYGOBJECT)\n\n    if with_examples:\n        reqs.extend(EXAMPLE_REQUIREMENTS)\n\n    if debug:\n        print(\"\")\n        print(\"\")\n        for req in reqs:\n            print(req)\n    return reqs", "response": "Build and return the list of install requirements based on the given flags."}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\ndraw an image from path at x, y and resize it to width, height dimensions.", "response": "def image(self, path, x, y, width=None, height=None, alpha=1.0, data=None, draw=True, **kwargs):\n        '''Draws an image from path at x, y and resizes it to width, height dimensions.\n        '''\n        return self.Image(path, x, y, width, height, alpha, data, **kwargs)"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef rect(self, x, y, width, height, roundness=0.0, draw=True, **kwargs):\n        '''\n        Draw a rectangle from x, y of width, height.\n\n        :param startx: top left x-coordinate\n        :param starty: top left y-coordinate\n\n        :param width: height  Size of rectangle.\n        :roundness: Corner roundness defaults to 0.0 (a right-angle).\n        :draw: If True draws immediately.\n        :fill: Optionally pass a fill color.\n\n        :return: path representing the rectangle.\n\n        '''\n        path = self.BezierPath(**kwargs)\n        path.rect(x, y, width, height, roundness, self.rectmode)\n        if draw:\n            path.draw()\n        return path", "response": "Draw a rectangle from x y width height."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef rectmode(self, mode=None):\n        '''\n        Set the current rectmode.\n\n        :param mode: CORNER, CENTER, CORNERS\n        :return: rectmode if mode is None or valid.\n        '''\n        if mode in (self.CORNER, self.CENTER, self.CORNERS):\n            self.rectmode = mode\n            return self.rectmode\n        elif mode is None:\n            return self.rectmode\n        else:\n            raise ShoebotError(_(\"rectmode: invalid input\"))", "response": "Set the current rectmode."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef ellipsemode(self, mode=None):\n        '''\n        Set the current ellipse drawing mode.\n\n        :param mode: CORNER, CENTER, CORNERS\n        :return: ellipsemode if mode is None or valid.\n        '''\n        if mode in (self.CORNER, self.CENTER, self.CORNERS):\n            self.ellipsemode = mode\n            return self.ellipsemode\n        elif mode is None:\n            return self.ellipsemode\n        else:\n            raise ShoebotError(_(\"ellipsemode: invalid input\"))", "response": "Set the current ellipse drawing mode."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef circle(self, x, y, diameter, draw=True, **kwargs):\n        '''Draw a circle\n        :param x: x-coordinate of the top left corner\n        :param y: y-coordinate of the top left corner\n        :param diameter: Diameter of circle.\n        :param draw: Draw immediately (defaults to True, set to False to inhibit drawing)\n        :return: Path object representing circle\n        '''\n        return self.ellipse(x, y, diameter, diameter, draw, **kwargs)", "response": "Draw a circle with the specified diameter"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef arrow(self, x, y, width, type=NORMAL, draw=True, **kwargs):\n        '''Draw an arrow.\n\n        Arrows can be two types: NORMAL or FORTYFIVE.\n\n        :param x: top left x-coordinate\n        :param y: top left y-coordinate\n        :param width: width of arrow\n        :param type:  NORMAL or FORTYFIVE\n        :draw:  If True draws arrow immediately\n\n        :return: Path object representing the arrow.\n        '''\n        # Taken from Nodebox\n        path = self.BezierPath(**kwargs)\n        if type == self.NORMAL:\n            head = width * .4\n            tail = width * .2\n            path.moveto(x, y)\n            path.lineto(x - head, y + head)\n            path.lineto(x - head, y + tail)\n            path.lineto(x - width, y + tail)\n            path.lineto(x - width, y - tail)\n            path.lineto(x - head, y - tail)\n            path.lineto(x - head, y - head)\n            path.lineto(x, y)\n        elif type == self.FORTYFIVE:\n            head = .3\n            tail = 1 + head\n            path.moveto(x, y)\n            path.lineto(x, y + width * (1 - head))\n            path.lineto(x - width * head, y + width)\n            path.lineto(x - width * head, y + width * tail * .4)\n            path.lineto(x - width * tail * .6, y + width)\n            path.lineto(x - width, y + width * tail * .6)\n            path.lineto(x - width * tail * .4, y + width * head)\n            path.lineto(x - width, y + width * head)\n            path.lineto(x - width * (1 - head), y)\n            path.lineto(x, y)\n        else:\n            raise NameError(_(\"arrow: available types for arrow() are NORMAL and FORTYFIVE\\n\"))\n        if draw:\n            path.draw()\n        return path", "response": "Draw an arrow.\n\n        Arrows can be two types: NORMAL or FORTYFIVE.\n\n        :param x: top left x-coordinate\n        :param y: top left 
y-coordinate\n        :param width: width of arrow\n        :param type:  NORMAL or FORTYFIVE\n        :draw:  If True draws arrow immediately\n\n        :return: Path object representing the arrow."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef drawimage(self, image, x=None, y=None):\n        if x is None:\n            x = image.x\n        if y is None:\n            y = image.y\n        self.image(image.path, x, y, data=image.data)", "response": "Draw a cached image at (x, y), defaulting to the image's own position."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nmove relatively to the last point.", "response": "def relmoveto(self, x, y):\n        '''Move relatively to the last point.'''\n        if self._path is None:\n            raise ShoebotError(_(\"No current path. Use beginpath() first.\"))\n        self._path.relmoveto(x, y)"}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\ndraw a line using relative coordinates.", "response": "def rellineto(self, x, y):\n        '''Draw a line using relative coordinates.'''\n        if self._path is None:\n            raise ShoebotError(_(\"No current path. Use beginpath() first.\"))\n        self._path.rellineto(x, y)"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef relcurveto(self, h1x, h1y, h2x, h2y, x, y):\n        '''Draws a curve relatively to the last point.\n        '''\n        if self._path is None:\n            raise ShoebotError(_(\"No current path. Use beginpath() first.\"))\n        self._path.relcurveto(h1x, h1y, h2x, h2y, x, y)", "response": "Draws a curve relatively to the last point."}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\nconstructs a path between the given list of points and determines the curvature of the bezier curve.", "response": "def findpath(self, points, curvature=1.0):\n\n        \"\"\"Constructs a path between the given list of points.\n\n        Interpolates the list of points and determines\n        a smooth bezier path betweem them.\n\n        The curvature parameter offers some control on\n        how separate segments are stitched together:\n        from straight angles to smooth curves.\n        Curvature is only useful if the path has more than  three points.\n        \"\"\"\n\n        # The list of points consists of Point objects,\n        # but it shouldn't crash on something straightforward\n        # as someone supplying a list of (x,y)-tuples.\n\n        for i, pt in enumerate(points):\n            if type(pt) == TupleType:\n                points[i] = Point(pt[0], pt[1])\n\n        if len(points) == 0:\n            return None\n        if len(points) == 1:\n            path = self.BezierPath(None)\n            path.moveto(points[0].x, points[0].y)\n            return path\n        if len(points) == 2:\n            path = self.BezierPath(None)\n            path.moveto(points[0].x, points[0].y)\n            path.lineto(points[1].x, points[1].y)\n            return path\n\n        # Zero curvature means straight lines.\n\n        curvature = max(0, min(1, curvature))\n        if curvature == 0:\n            path = self.BezierPath(None)\n            path.moveto(points[0].x, points[0].y)\n            for i in range(len(points)):\n                path.lineto(points[i].x, points[i].y)\n            return path\n\n        curvature = 4 + (1.0 - curvature) * 40\n\n        dx = {0: 0, len(points) - 1: 0}\n        dy = {0: 0, len(points) - 1: 0}\n        bi = {1: -0.25}\n        ax = {1: (points[2].x - points[0].x - dx[0]) / 4}\n        ay = {1: (points[2].y - points[0].y - dy[0]) 
/ 4}\n\n        for i in range(2, len(points) - 1):\n            bi[i] = -1 / (curvature + bi[i - 1])\n            ax[i] = -(points[i + 1].x - points[i - 1].x - ax[i - 1]) * bi[i]\n            ay[i] = -(points[i + 1].y - points[i - 1].y - ay[i - 1]) * bi[i]\n\n        r = range(1, len(points) - 1)\n        r.reverse()\n        for i in r:\n            dx[i] = ax[i] + dx[i + 1] * bi[i]\n            dy[i] = ay[i] + dy[i + 1] * bi[i]\n\n        path = self.BezierPath(None)\n        path.moveto(points[0].x, points[0].y)\n        for i in range(len(points) - 1):\n            path.curveto(points[i].x + dx[i],\n                         points[i].y + dy[i],\n                         points[i + 1].x - dx[i + 1],\n                         points[i + 1].y - dy[i + 1],\n                         points[i + 1].x,\n                         points[i + 1].y)\n\n        return path"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef transform(self, mode=None):\n        '''\n        Set the current transform mode.\n\n        :param mode: CENTER or CORNER'''\n        if mode:\n            self._canvas.mode = mode\n        return self._canvas.mode", "response": "Set the current transform mode."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef translate(self, xt, yt, mode=None):\n        '''\n        Translate the current position by (xt, yt) and\n        optionally set the transform mode.\n\n        :param xt: Amount to move horizontally\n        :param yt: Amount to move vertically\n        :param mode: Set the transform mode to CENTER or CORNER\n        '''\n        self._canvas.translate(xt, yt)\n        if mode:\n            self._canvas.mode = mode", "response": "Translate the current position by (xt, yt) and optionally set the transform mode to CENTER or CORNER."}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\nsets the scale at which to draw objects, where 1.0 is their natural size.", "response": "def scale(self, x=1, y=None):\n        '''\n        Set a scale at which to draw objects.\n\n        1.0 draws objects at their natural size\n\n        :param x: Scale on the horizontal plane\n        :param y: Scale on the vertical plane\n        '''\n        if not y:\n            y = x\n        if x == 0:\n            # Cairo borks on zero values\n            x = 1\n        if y == 0:\n            y = 1\n        self._canvas.scale(x, y)"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef fill(self, *args):\n        '''Sets a fill color, applying it to new paths.\n\n        :param args: color in supported format\n        '''\n        if args is not None:\n            self._canvas.fillcolor = self.color(*args)\n        return self._canvas.fillcolor", "response": "Sets a fill color applying it to new paths."}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nsetting a stroke color applying it to new paths.", "response": "def stroke(self, *args):\n        '''Set a stroke color, applying it to new paths.\n\n        :param args: color in supported format\n        '''\n        if args is not None:\n            self._canvas.strokecolor = self.color(*args)\n        return self._canvas.strokecolor"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\nstopping applying strokes to new paths.", "response": "def nostroke(self):\n        ''' Stop applying strokes to new paths.\n\n        :return: stroke color before nostroke was called.\n        '''\n        c = self._canvas.strokecolor\n        self._canvas.strokecolor = None\n        return c"}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nsetting the stroke width.", "response": "def strokewidth(self, w=None):\n        '''Set the stroke width.\n\n        :param w: Stroke width.\n        :return: If no width was specified then current width is returned.\n        '''\n        if w is not None:\n            self._canvas.strokewidth = w\n        else:\n            return self._canvas.strokewidth"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nsetting the font to be used with new text instances.", "response": "def font(self, fontpath=None, fontsize=None):\n        '''Set the font to be used with new text instances.\n\n        :param fontpath: path to truetype or opentype font.\n        :param fontsize: size of font\n\n        :return: current fontpath (if fontpath param not set)\n        Accepts TrueType and OpenType files. Depends on FreeType being\n        installed.'''\n        if fontpath is not None:\n            self._canvas.fontfile = fontpath\n        else:\n            return self._canvas.fontfile\n        if fontsize is not None:\n            self._canvas.fontsize = fontsize"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef fontsize(self, fontsize=None):\n        '''\n        Set or return size of current font.\n\n        :param fontsize: Size of font.\n        :return: Size of font (if fontsize was not specified)\n        '''\n        if fontsize is not None:\n            self._canvas.fontsize = fontsize\n        else:\n            return self._canvas.fontsize", "response": "Set or return size of current font."}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\ndraw a string of text according to current font settings.", "response": "def text(self, txt, x, y, width=None, height=1000000, outline=False, draw=True, **kwargs):\n        '''\n        Draws a string of text according to current font settings.\n\n        :param txt: Text to output\n        :param x: x-coordinate of the top left corner\n        :param y: y-coordinate of the top left corner\n        :param width: text width\n        :param height: text height\n        :param outline: If True draws outline text (defaults to False)\n        :param draw: Set to False to inhibit immediate drawing (defaults to True)\n        :return: Path object representing the text.\n        '''\n        txt = self.Text(txt, x, y, width, height, outline=outline, ctx=None, **kwargs)\n        if outline:\n            path = txt.path\n            if draw:\n                path.draw()\n            return path\n        else:\n            return txt"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef textheight(self, txt, width=None):\n        '''Returns the height of a string of text according to the current\n        font settings.\n\n        :param txt: string to measure\n        :param width: width of a line of text in a block\n        '''\n        w = width\n        return self.textmetrics(txt, width=w)[1]", "response": "Returns the height of a string of text according to the current\n        font settings."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef graph_traffic(s, node, alpha=1.0):\n    \n    \"\"\" Visualization of traffic-intensive nodes (based on their centrality).\n    \"\"\"\n    \n    r = node.__class__(None).r\n    r += (node.weight+0.5) * r * 5\n    s._ctx.nostroke()\n    if s.traffic:\n        s._ctx.fill(\n            s.traffic.r, \n            s.traffic.g, \n            s.traffic.b, \n            s.traffic.a * alpha\n        )\n        s._ctx.oval(node.x-r, node.y-r, r*2, r*2)", "response": "Visualizes traffic-intensive nodes (based on their centrality)."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef node_label(s, node, alpha=1.0):\n\n    \"\"\" Visualization of a node's id.\n    \"\"\"\n\n    if s.text:\n        #s._ctx.lineheight(1)    \n        s._ctx.font(s.font)\n        s._ctx.fontsize(s.fontsize)\n        s._ctx.nostroke()\n        s._ctx.fill(\n            s.text.r, \n            s.text.g, \n            s.text.b, \n            s.text.a * alpha\n        )\n\n        # Cache an outlined label text and translate it.\n        # This enhances the speed and avoids wiggling text.\n        try: p = node._textpath\n        except: \n            txt = node.label\n            try: txt = unicode(txt)\n            except:\n                try: txt = txt.decode(\"utf-8\")\n                except:\n                    pass\n            # Abbreviation.\n            #root = node.graph.root\n            #if txt != root and txt[-len(root):] == root: \n            #    txt = txt[:len(txt)-len(root)]+root[0]+\".\"\n            dx, dy = 0, 0\n            if s.align == 2: #CENTER\n                dx = -s._ctx.textwidth(txt, s.textwidth) / 2\n                dy =  s._ctx.textheight(txt) / 2\n            node._textpath = s._ctx.textpath(txt, dx, dy, width=s.textwidth)\n            p = node._textpath\n        \n        if s.depth:\n            try: __colors.shadow(dx=2, dy=4, blur=5, alpha=0.3*alpha)\n            except: pass\n        \n        s._ctx.push()\n        s._ctx.translate(node.x, node.y)\n        s._ctx.scale(alpha)\n        s._ctx.drawpath(p.copy())\n        s._ctx.pop()", "response": "Visualizes a node's label."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nvisualize the edges in a network.", "response": "def edges(s, edges, alpha=1.0, weighted=False, directed=False):\n    \n    \"\"\" Visualization of the edges in a network.\n    \"\"\"\n    \n    p = s._ctx.BezierPath()\n    \n    if directed and s.stroke: \n        pd = s._ctx.BezierPath()           \n    if weighted and s.fill: \n        pw = [s._ctx.BezierPath() for i in range(11)]\n    \n    # Draw the edges in a single BezierPath for speed.\n    # Weighted edges are divided into ten BezierPaths,\n    # depending on their weight rounded between 0 and 10.\n    if len(edges) == 0: return\n    for e in edges:\n        try:  s2 = e.node1.graph.styles[e.node1.style]\n        except: s2 = s\n        if s2.edge:\n            s2.edge(s2, p, e, alpha)\n            if directed and s.stroke:\n                s2.edge_arrow(s2, pd, e, radius=10)\n            if weighted and s.fill:\n                s2.edge(s2, pw[int(e.weight*10)], e, alpha)                \n\n    s._ctx.autoclosepath(False)\n    s._ctx.nofill()\n    s._ctx.nostroke()\n\n    \n\n    # All weighted edges use the default fill.\n    if weighted and s.fill:\n        r = e.node1.__class__(None).r\n        s._ctx.stroke(\n            s.fill.r,\n            s.fill.g,\n            s.fill.b,\n            s.fill.a * 0.65 * alpha\n        )\n        for w in range(1, len(pw)):\n            s._ctx.strokewidth(r*w*0.1)\n            s._ctx.drawpath(pw[w].copy())        \n\n    # All edges use the default stroke.\n    if s.stroke: \n        s._ctx.strokewidth(s.strokewidth)\n\t\n        s._ctx.stroke(\n            s.stroke.r, \n            s.stroke.g, \n            s.stroke.b, \n            s.stroke.a * 0.65 * alpha\n        )\n    \n    s._ctx.drawpath(p.copy())\n    \n    if directed and s.stroke:\n        #clr = s._ctx.stroke().copy()\n\tclr=s._ctx.color(\n            s.stroke.r, \n            s.stroke.g, \n            s.stroke.b, \n            
s.stroke.a * 0.65 * alpha\n        )\n\t\t\n        clr.a *= 1.3\n        \n\ts._ctx.stroke(clr)\n        \n    \ts._ctx.drawpath(pd.copy())\n    \n    for e in edges:\n        try:  s2 = self.styles[e.node1.style]\n        except: s2 = s\n        if s2.edge_label:\n            s2.edge_label(s2, e, alpha)"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef edge(s, path, edge, alpha=1.0):\n    \n    \"\"\" Visualization of a single edge between two nodes.\n    \"\"\"\n    \n    path.moveto(edge.node1.x, edge.node1.y)\n    if edge.node2.style == BACK:\n        path.curveto(\n            edge.node1.x,\n            edge.node2.y,\n            edge.node2.x,\n            edge.node2.y,\n            edge.node2.x,\n            edge.node2.y,\n        )        \n    else:\n        path.lineto(\n            edge.node2.x, \n            edge.node2.y\n        )", "response": "Visualization of a single edge between two nodes."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef edge_label(s, edge, alpha=1.0):\n\n    \"\"\" Visualization of the label accompanying an edge.\n    \"\"\"\n\n    if s.text and edge.label != \"\":       \n        s._ctx.nostroke()\n        s._ctx.fill(\n            s.text.r, \n            s.text.g, \n            s.text.b, \n            s.text.a * alpha*0.75\n        )\n        s._ctx.lineheight(1)    \n        s._ctx.font(s.font)\n        s._ctx.fontsize(s.fontsize*0.75)\n        \n        # Cache an outlined label text and translate it.\n        # This enhances the speed and avoids wiggling text.\n        try: p = edge._textpath\n        except:\n            try: txt = unicode(edge.label)\n            except:\n                try: txt = edge.label.decode(\"utf-8\")\n                except:\n                    pass\n            edge._textpath = s._ctx.textpath(txt, s._ctx.textwidth(\" \"), 0, width=s.textwidth)\n            p = edge._textpath\n        \n        # Position the label centrally along the edge line.\n        a  = degrees( atan2(edge.node2.y-edge.node1.y, edge.node2.x-edge.node1.x) )\n        d  = sqrt((edge.node2.x-edge.node1.x)**2 +(edge.node2.y-edge.node1.y)**2)\n        d  = abs(d-s._ctx.textwidth(edge.label)) * 0.5\n        \n        s._ctx.push()\n        s._ctx.transform(CORNER)\n        s._ctx.translate(edge.node1.x, edge.node1.y)\n        s._ctx.rotate(-a)\n        s._ctx.translate(d, s.fontsize*1.0)\n        s._ctx.scale(alpha)\n        \n        # Flip labels on the left hand side so they are legible.\n        if 90 < a%360 < 270:\n            s._ctx.translate(s._ctx.textwidth(edge.label), -s.fontsize*2.0)\n            s._ctx.transform(CENTER)\n            s._ctx.rotate(180)\n            s._ctx.transform(CORNER)\n        \n        s._ctx.drawpath(p.copy())\n        s._ctx.pop()", "response": "Visualizes the label accompanying an edge."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef create(self, stylename, **kwargs):\n        if stylename == \"default\":    \n            self[stylename] = style(stylename, self._ctx, **kwargs)\n            return self[stylename]\n        k = kwargs.get(\"template\", \"default\")\n        s = self[stylename] = self[k].copy(stylename)\n        for attr in kwargs:\n            if s.__dict__.has_key(attr):\n                s.__dict__[attr] = kwargs[attr]\n        return s", "response": "Creates a new style that inherits from a template style (the default style unless another template is given) and returns it."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\nreturn a copy of all styles and the styleguide.", "response": "def copy(self, graph):\n        \"\"\" Returns a copy of all styles and a copy of the styleguide.\n        \"\"\"\n        s = styles(graph)\n        s.guide = self.guide.copy(graph)\n        dict.__init__(s, [(v.name, v.copy()) for v in self.values()])\n        return s"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef apply(self):\n        sorted = self.order + self.keys()\n        unique = []; [unique.append(x) for x in sorted if x not in unique]\n        for node in self.graph.nodes:\n            for s in unique:\n                if self.has_key(s) and self[s](self.graph, node): \n                    node.style = s", "response": "Check the rules for each node in the graph and apply the style."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nreturning a copy of the styleguide for the given graph.", "response": "def copy(self, graph):\n        \"\"\" Returns a copy of the styleguide for the given graph.\n        \"\"\"\n        g = styleguide(graph)\n        g.order = self.order\n        dict.__init__(g, [(k, v) for k, v in self.iteritems()])\n        return g"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\ndecoding a JSON document from s and returning a 2-tuple of the Python representation and the index in s where the document ended.", "response": "def raw_decode(self, s, **kw):\n        \"\"\"\n        Decode a JSON document from ``s`` (a ``str`` or ``unicode`` beginning\n        with a JSON document) and return a 2-tuple of the Python\n        representation and the index in ``s`` where the document ended.\n\n        This can be used to decode a JSON document from a string that may\n        have extraneous data at the end.\n        \"\"\"\n        kw.setdefault('context', self)\n        try:\n            obj, end = next(self._scanner.iterscan(s, **kw))\n        except StopIteration:\n            raise ValueError(\"No JSON object could be decoded\")\n        return obj, end"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef open_socket(self):\n        self.socket = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)\n        self.socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)\n        self.socket.setblocking(0)\n        self.socket.bind((self.host, self.port))", "response": "Opens the socket and binds to the given host and port."}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nloads all possible TUIO profiles and returns a dictionary with the profile addresses as keys and an instance of a profile as the value", "response": "def load_profiles(self):\n        \"\"\"\n        Loads all possible TUIO profiles and returns a dictionary with the\n        profile addresses as keys and an instance of a profile as the value \n        \"\"\"\n        _profiles = {}\n        for name, klass in inspect.getmembers(profiles):\n            if inspect.isclass(klass) and name.endswith('Profile') and name != 'TuioProfile':\n                # Adding profile to the self.profiles dictionary\n                profile = klass()\n                _profiles[profile.address] = profile\n                # setting convenient variable to access objects of profile\n                try:\n                    setattr(self, profile.list_label, profile.objs)\n                except AttributeError:\n                    continue\n                # Mapping callback method to every profile\n                self.manager.add(self.callback, profile.address)\n        return _profiles"}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\ntells the connection manager to receive the next 1024 bytes of messages to analyze.", "response": "def update(self):\n        \"\"\"\n        Tells the connection manager to receive the next 1024 bytes of messages\n        to analyze.\n        \"\"\"\n        try:\n            self.manager.handle(self.socket.recv(1024))\n        except socket.error:\n            pass"}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\nis called by the CallbackManager when a new message is received.", "response": "def callback(self, *incoming):\n        \"\"\"\n        Gets called by the CallbackManager if a new message was received \n        \"\"\"\n        message = incoming[0]\n        if message:\n            address, command = message[0], message[2]\n            profile = self.get_profile(address)\n            if profile is not None:\n                try:\n                    getattr(profile, command)(self, message)\n                except AttributeError:\n                    pass"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef copytree(src, dst, symlinks=False, ignore=None):\n    # http://stackoverflow.com/questions/1868714/how-do-i-copy-an-entire-directory-of-files-into-an-existing-directory-using-pyth\n    if not os.path.exists(dst):\n        os.makedirs(dst)\n        shutil.copystat(src, dst)\n    lst = os.listdir(src)\n    if ignore:\n        excl = ignore(src, lst)\n        lst = [x for x in lst if x not in excl]\n    for item in lst:\n        s = os.path.join(src, item)\n        d = os.path.join(dst, item)\n        if symlinks and os.path.islink(s):\n            if os.path.lexists(d):\n                os.remove(d)\n            os.symlink(os.readlink(s), d)\n            try:\n                st = os.lstat(s)\n                mode = stat.S_IMODE(st.st_mode)\n                os.lchmod(d, mode)\n            except:\n                pass  # lchmod not available\n        elif os.path.isdir(s):\n            copytree(s, d, symlinks, ignore)\n        else:\n            shutil.copy2(s, d)", "response": "Recursively copies a directory tree; unlike shutil.copytree, it works even if the destination folder already exists."}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nserializing obj to a JSON formatted string.", "response": "def dumps(obj, skipkeys=False, ensure_ascii=True, check_circular=True,\n        allow_nan=True, cls=None, indent=None, separators=None,\n        encoding='utf-8', default=None, **kw):\n    \"\"\"\n    Serialize ``obj`` to a JSON formatted ``str``.\n\n    If ``skipkeys`` is ``True`` then ``dict`` keys that are not basic types\n    (``str``, ``unicode``, ``int``, ``long``, ``float``, ``bool``, ``None``) \n    will be skipped instead of raising a ``TypeError``.\n\n    If ``ensure_ascii`` is ``False``, then the return value will be a\n    ``unicode`` instance subject to normal Python ``str`` to ``unicode``\n    coercion rules instead of being escaped to an ASCII ``str``.\n\n    If ``check_circular`` is ``False``, then the circular reference check\n    for container types will be skipped and a circular reference will\n    result in an ``OverflowError`` (or worse).\n\n    If ``allow_nan`` is ``False``, then it will be a ``ValueError`` to\n    serialize out of range ``float`` values (``nan``, ``inf``, ``-inf``) in\n    strict compliance of the JSON specification, instead of using the\n    JavaScript equivalents (``NaN``, ``Infinity``, ``-Infinity``).\n\n    If ``indent`` is a non-negative integer, then JSON array elements and\n    object members will be pretty-printed with that indent level. An indent\n    level of 0 will only insert newlines. ``None`` is the most compact\n    representation.\n\n    If ``separators`` is an ``(item_separator, dict_separator)`` tuple\n    then it will be used instead of the default ``(', ', ': ')`` separators.\n    ``(',', ':')`` is the most compact JSON representation.\n\n    ``encoding`` is the character encoding for str instances, default is UTF-8.\n\n    ``default(obj)`` is a function that should return a serializable version\n    of obj or raise TypeError. 
The default simply raises TypeError.\n\n    To use a custom ``JSONEncoder`` subclass (e.g. one that overrides the\n    ``.default()`` method to serialize additional types), specify it with\n    the ``cls`` kwarg.\n    \"\"\"\n    # cached encoder\n    if (skipkeys is False and ensure_ascii is True and\n        check_circular is True and allow_nan is True and\n        cls is None and indent is None and separators is None and\n        encoding == 'utf-8' and default is None and not kw):\n        return _default_encoder.encode(obj)\n    if cls is None:\n        cls = JSONEncoder\n    return cls(\n        skipkeys=skipkeys, ensure_ascii=ensure_ascii,\n        check_circular=check_circular, allow_nan=allow_nan, indent=indent,\n        separators=separators, encoding=encoding, default=default,\n        **kw).encode(obj)"}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nreturn a Google web query formatted as a GoogleSearch list object.", "response": "def search(q, start=0, wait=10, asynchronous=False, cached=False):\n    \n    \"\"\" Returns a Google web query formatted as a GoogleSearch list object.\n    \"\"\"\n    \n    service = GOOGLE_SEARCH\n    return GoogleSearch(q, start, service, \"\", wait, asynchronous, cached)"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef search_images(q, start=0, size=\"\", wait=10, asynchronous=False, cached=False):\n    \n    \"\"\" Returns a Google images query formatted as a GoogleSearch list object.\n    \"\"\"\n    \n    service = GOOGLE_IMAGES\n    return GoogleSearch(q, start, service, size, wait, asynchronous, cached)", "response": "Returns a Google images query formatted as a GoogleSearch list object."}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\nreturns a Google news query formatted as a GoogleSearch list object.", "response": "def search_news(q, start=0, wait=10, asynchronous=False, cached=False):\n    \n    \"\"\" Returns a Google news query formatted as a GoogleSearch list object.\n    \"\"\"\n    \n    service = GOOGLE_NEWS\n    return GoogleSearch(q, start, service, \"\", wait, asynchronous, cached)"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef search_blogs(q, start=0, wait=10, asynchronous=False, cached=False):\n    \n    \"\"\" Returns a Google blogs query formatted as a GoogleSearch list object.\n    \"\"\"\n    \n    service = GOOGLE_BLOGS\n    return GoogleSearch(q, start, service, \"\", wait, asynchronous, cached)", "response": "Returns a Google blogs query formatted as a GoogleSearch list object."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef _parse(self, str):\n        \n        \"\"\" Parses the text data from an XML element defined by tag.\n        \"\"\"\n        \n        str = replace_entities(str)\n        str = strip_tags(str)\n        str = collapse_spaces(str)\n        return str", "response": "Parses the text data from an XML element defined by tag."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef hash(self, id):\n\n        \"\"\" Creates a unique filename in the cache for the id.\n        \"\"\"\n\n        h = md5(id).hexdigest()\n        return os.path.join(self.path, h+self.type)", "response": "Creates a unique filename in the cache for the id."}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nreturning the age of the cache entry in days.", "response": "def age(self, id):\n\n        \"\"\" Returns the age of the cache entry, in days.\n        \"\"\"\n\n        path = self.hash(id)\n        if os.path.exists(path):\n            modified = datetime.datetime.fromtimestamp(os.stat(path)[8])\n            age = datetime.datetime.today() - modified\n            return age.days\n        else:\n            return 0"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef angle(x0, y0, x1, y1):\r\n    return degrees(atan2(y1-y0, x1-x0))", "response": "Returns the angle, in degrees, of the line from (x0, y0) to (x1, y1)."}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\nreturns the distance between two points.", "response": "def distance(x0, y0, x1, y1):\r\n    \"\"\" Returns the distance between two points.\r\n    \"\"\"\r\n    return sqrt(pow(x1-x0, 2) + pow(y1-y0, 2))"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\nreturning the location of a point at a given distance and angle from origin (x0, y0).", "response": "def coordinates(x0, y0, distance, angle):\r\n    \"\"\" Returns the location of a point by rotating around origin (x0,y0).\r\n    \"\"\"\r\n    return (x0 + cos(radians(angle)) * distance,\r\n            y0 + sin(radians(angle)) * distance)"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\nrotate the point (x, y) around origin (x0, y0) by angle degrees.", "response": "def rotate(x, y, x0, y0, angle):\r\n    \"\"\" Returns the coordinates of (x,y) rotated around origin (x0,y0).\r\n    \"\"\"\r\n    x, y = x - x0, y - y0\r\n    a, b = cos(radians(angle)), sin(radians(angle))\r\n    return (x * a - y * b + x0,\r\n            y * a + x * b + y0)"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nreturning the reflection of a point through origin (x0, y0).", "response": "def reflect(x, y, x0, y0, d=1.0, a=180):\r\n    \"\"\" Returns the reflection of a point through origin (x0,y0).\r\n    \"\"\"\r\n    return coordinates(x0, y0, d * distance(x0, y0, x, y),\r\n                       a + angle(x0, y0, x, y))"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\nreturning the linear interpolation between a and b for time t between 0. 0 - 1. 0.", "response": "def lerp(a, b, t):\r\n    \"\"\" Returns the linear interpolation between a and b for time t between 0.0-1.0.\r\n        For example: lerp(100, 200, 0.5) => 150.\r\n    \"\"\"\r\n    if t < 0.0:\r\n        return a\r\n    if t > 1.0:\r\n        return b\r\n    return a + (b - a) * t"}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nreturns a smooth transition between 0. 0 and 1. 0 using Hermite interpolation.", "response": "def smoothstep(a, b, x):\r\n    \"\"\" Returns a smooth transition between 0.0 and 1.0 using Hermite interpolation (cubic spline),\r\n        where x is a number between a and b. The return value will ease (slow down) as x nears a or b.\r\n        For x smaller than a, returns 0.0. For x bigger than b, returns 1.0.\r\n    \"\"\"\r\n    if x < a:\r\n        return 0.0\r\n    if x >= b:\r\n        return 1.0\r\n    x = float(x - a) / (b - a)\r\n    return x * x * (3 - 2 * x)"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef line_line_intersection(x1, y1, x2, y2, x3, y3, x4, y4, infinite=False):\r\n    # Based on: P. Bourke, http://local.wasp.uwa.edu.au/~pbourke/geometry/lineline2d/\r\n    ua = (x4 - x3) * (y1 - y3) - (y4 - y3) * (x1 - x3)\r\n    ub = (x2 - x1) * (y1 - y3) - (y2 - y1) * (x1 - x3)\r\n    d = (y4 - y3) * (x2 - x1) - (x4 - x3) * (y2 - y1)\r\n    if d == 0:\r\n        if ua == ub == 0:\r\n            # The lines are coincident\r\n            return []\r\n        else:\r\n            # The lines are parallel.\r\n            return []\r\n    ua /= float(d)\r\n    ub /= float(d)\r\n    if not infinite and not (0 <= ua <= 1 and 0 <= ub <= 1):\r\n        # Intersection point is not within both line segments.\r\n        return None, None\r\n    return [(x1 + ua * (x2 - x1),\r\n             y1 + ua * (y2 - y1))]", "response": "Determines the intersection point of two lines or two finite line segments."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef circle_line_intersection(cx, cy, radius, x1, y1, x2, y2, infinite=False):\r\n    # Based on: http://www.vb-helper.com/howto_net_line_circle_intersections.html\r\n    dx = x2 - x1\r\n    dy = y2 - y1\r\n    A = dx * dx + dy * dy\r\n    B = 2 * (dx * (x1 - cx) + dy * (y1 - cy))\r\n    C = pow(x1 - cx, 2) + pow(y1 - cy, 2) - radius * radius\r\n    det = B * B - 4 * A * C\r\n    if A <= 0.0000001 or det < 0:\r\n        return []\r\n    elif det == 0:\r\n        # One point of intersection.\r\n        t = -B / (2 * A)\r\n        return [(x1 + t * dx, y1 + t * dy)]\r\n    else:\r\n        # Two points of intersection.\r\n        # A point of intersection lies on the line segment if 0 <= t <= 1,\r\n        # and on an extension of the segment otherwise.\r\n        points = []\r\n        det2 = sqrt(det)\r\n        t1 = (-B + det2) / (2 * A)\r\n        t2 = (-B - det2) / (2 * A)\r\n        if infinite or 0 <= t1 <= 1:\r\n            points.append((x1 + t1 * dx, y1 + t1 * dy))\r\n        if infinite or 0 <= t2 <= 1:\r\n            points.append((x1 + t2 * dx, y1 + t2 * dy))\r\n        return points", "response": "Returns a list of points where the circle and the line intersect."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\nreturning True if the point is inside the polygon.", "response": "def point_in_polygon(points, x, y):\r\n    \"\"\" Ray casting algorithm.\r\n        Determines how many times a horizontal ray starting from the point\r\n        intersects with the sides of the polygon.\r\n        If it is an even number of times, the point is outside, if odd, inside.\r\n        The algorithm does not always report correctly when the point is very close to the boundary.\r\n        The polygon is passed as a list of (x,y)-tuples.\r\n    \"\"\"\r\n    odd = False\r\n    n = len(points)\r\n    for i in range(n):\r\n        j = i < n - 1 and i + 1 or 0\r\n        x0, y0 = points[i][0], points[i][1]\r\n        x1, y1 = points[j][0], points[j][1]\r\n        if (y0 < y and y1 >= y) or (y1 < y and y0 >= y):\r\n            if x0 + (y - y0) / (y1 - y0) * (x1 - x0) < x:\r\n                odd = not odd\r\n    return odd"}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nreturns the 3x3 matrix multiplication of A and B.", "response": "def _mmult(self, a, b):\r\n        \"\"\" Returns the 3x3 matrix multiplication of A and B.\r\n            Note that scale(), translate(), rotate() work with premultiplication,\r\n            e.g. the matrix A followed by B = BA and not AB.\r\n        \"\"\"\r\n        # No need to optimize (C version is just as fast).\r\n        return [\r\n            a[0] * b[0] + a[1] * b[3],\r\n            a[0] * b[1] + a[1] * b[4],\r\n            0,\r\n            a[3] * b[0] + a[4] * b[3],\r\n            a[3] * b[1] + a[4] * b[4],\r\n            0,\r\n            a[6] * b[0] + a[7] * b[3] + b[6],\r\n            a[6] * b[1] + a[7] * b[4] + b[7],\r\n            1\r\n        ]"}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\ninverts the matrix in place, so that multiplying the original matrix by the result produces the identity matrix.", "response": "def invert(self):\r\n        \"\"\" Multiplying a matrix by its inverse produces the identity matrix.\r\n        \"\"\"\r\n        m = self.matrix\r\n        d = m[0] * m[4] - m[1] * m[3]\r\n        self.matrix = [\r\n             m[4] / d, -m[1] / d, 0,\r\n             -m[3] / d,  m[0] / d, 0,\r\n             (m[3] * m[7] - m[4] * m[6]) / d,\r\n             -(m[0] * m[7] - m[1] * m[6]) / d,\r\n             1\r\n        ]"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\nreturning the new coordinates of x y after transformation.", "response": "def transform_point(self, x, y):\r\n        \"\"\" Returns the new coordinates of (x,y) after transformation.\r\n        \"\"\"\r\n        m = self.matrix\r\n        return (x*m[0]+y*m[3]+m[6], x*m[1]+y*m[4]+m[7])"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef transform_path(self, path):\r\n        p = path.__class__() # Create a new BezierPath.\r\n        for pt in path:\r\n            if pt.cmd == \"close\":\r\n                p.closepath()\r\n            elif pt.cmd == \"moveto\":\r\n                p.moveto(*self.apply(pt.x, pt.y))\r\n            elif pt.cmd == \"lineto\":\r\n                p.lineto(*self.apply(pt.x, pt.y))\r\n            elif pt.cmd == \"curveto\":\r\n                vx1, vy1 = self.apply(pt.ctrl1.x, pt.ctrl1.y)\r\n                vx2, vy2 = self.apply(pt.ctrl2.x, pt.ctrl2.y)\r\n                x, y = self.apply(pt.x, pt.y)\r\n                p.curveto(vx1, vy1, vx2, vy2, x, y)\r\n        return p", "response": "Returns a new BezierPath object with the transformation applied."}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\nreturns True if a part of the two bounds overlaps.", "response": "def intersects(self, b):\r\n        \"\"\" Return True if a part of the two bounds overlaps.\r\n        \"\"\"\r\n        return max(self.x, b.x) < min(self.x+self.width, b.x+b.width) \\\r\n           and max(self.y, b.y) < min(self.y+self.height, b.y+b.height)"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef intersection(self, b):\r\n        if not self.intersects(b):\r\n            return None\r\n        mx, my = max(self.x, b.x), max(self.y, b.y)\r\n        return Bounds(mx, my,\r\n            min(self.x+self.width, b.x+b.width) - mx,\r\n            min(self.y+self.height, b.y+b.height) - my)", "response": "Returns a new Bounds covering the area where the two bounds overlap, or None if they do not intersect."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef union(self, b):\r\n        mx, my = min(self.x, b.x), min(self.y, b.y)\r\n        return Bounds(mx, my,\r\n            max(self.x+self.width, b.x+b.width) - mx,\r\n            max(self.y+self.height, b.y+b.height) - my)", "response": "Returns a new Bounds that encompasses both bounds."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef contains(self, *a):\r\n        if len(a) == 2: a = [Point(a[0], a[1])]\r\n        if len(a) == 1:\r\n            a = a[0]\r\n            if isinstance(a, Point):\r\n                return a.x >= self.x and a.x <= self.x+self.width \\\r\n                   and a.y >= self.y and a.y <= self.y+self.height\r\n            if isinstance(a, Bounds):\r\n                return a.x >= self.x and a.x+a.width <= self.x+self.width \\\r\n                   and a.y >= self.y and a.y+a.height <= self.y+self.height", "response": "Returns True if the given point or rectangle falls within the bounds."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nprint an error message and the help message, then quit.", "response": "def error(message):\n    '''Prints an error message, the help message and quits'''\n    global parser\n    print (_(\"Error: \") + message)\n    print ()\n    parser.print_help()\n    sys.exit()"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef ellipse(self, x, y, width, height, draw=True, **kwargs):\n        '''Draws an ellipse starting from (x,y)'''\n        path = self.BezierPath(**kwargs)\n        path.ellipse(x,y,width,height)\n        if draw:\n            path.draw()\n        return path", "response": "Draws an ellipse with the given width and height, starting from (x, y), and returns its path."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\ndrawing a line from x1 y1 to x2 y2", "response": "def line(self, x1, y1, x2, y2, draw=True):\n        '''Draws a line from (x1,y1) to (x2,y2)'''\n        p = self._path\n        self.newpath()\n        self.moveto(x1,y1)\n        self.lineto(x2,y2)\n        self.endpath(draw=draw)\n        self._path = p\n        return p"}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nset the current colormode and, optionally, the current color range.", "response": "def colormode(self, mode=None, crange=None):\n        '''Sets the current colormode (can be RGB or HSB) and, optionally,\n        the color range.\n\n        If called without arguments, it returns the current colormode.\n        '''\n        if mode is not None:\n            if mode == \"rgb\":\n                self.color_mode = Bot.RGB\n            elif mode == \"hsb\":\n                self.color_mode = Bot.HSB\n            else:\n                raise NameError(_(\"Only RGB and HSB colormodes are supported.\"))\n        if crange is not None:\n            self.color_range = crange\n        return self.color_mode"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef fill(self,*args):\n        '''Sets a fill color, applying it to new paths.'''\n        self._fillcolor = self.color(*args)\n        return self._fillcolor", "response": "Sets a fill color, applying it to new paths."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef stroke(self,*args):\n        '''Set a stroke color, applying it to new paths.'''\n        self._strokecolor = self.color(*args)\n        return self._strokecolor", "response": "Sets a stroke color, applying it to new paths."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\ndraw an outlined path of the input text.", "response": "def textpath(self, txt, x, y, width=None, height=1000000, enableRendering=False, draw=True, **kwargs):\n        '''\n        Draws an outlined path of the input text\n        '''\n        # draw is an explicit keyword (default True) so the check below\n        # does not reference an undefined name.\n        txt = self.Text(txt, x, y, width, height, **kwargs)\n        path = txt.path\n        if draw:\n            path.draw()\n        return path"}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\nreturns the width and height of a string of text as a tuple", "response": "def textmetrics(self, txt, width=None, height=None, **kwargs):\n        '''Returns the width and height of a string of text as a tuple\n        (according to current font settings).\n        '''\n        # for now only returns width and height (as per Nodebox behaviour)\n        # but maybe we could use the other data from cairo\n        \n        # we send enableRendering=False to prevent the actual rendering process, only the path generation is enabled\n        # not the most efficient way, but it generates accurate results\n        txt = self.Text(txt, 0, 0, width, height, enableRendering=False, **kwargs)\n        return txt.metrics"}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\ndraws a flat LINETO segment with the given parameters.", "response": "def draw_cornu_flat(x0, y0, t0, t1, s0, c0, flip, cs, ss, cmd):\n    \n    \"\"\" Raph Levien's code draws fast LINETO segments.\n    \"\"\"\n    \n    for j in range(0, 100):\n        t = j * .01\n        s, c = eval_cornu(t0 + t * (t1 - t0))\n        s *= flip\n        s -= s0\n        c -= c0\n        #print '%', c, s\n        x = c * cs - s * ss\n        y = s * cs + c * ss\n        print_pt(x0 + x, y0 + y, cmd)\n        cmd = 'lineto'\n    return cmd"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef draw_cornu_bezier(x0, y0, t0, t1, s0, c0, flip, cs, ss, cmd, scale, rot):\n\n    \"\"\" Mark Meyer's code draws elegant CURVETO segments.\n    \"\"\"\n\n    s = None\n    for j in range(0, 5):\n        # travel along the function two points at a time (at time t and t2)\n        # the first time through we'll need to get both points\n        # after that we only need the second point because the old second point\n        # becomes the new first point\n        t = j * .2\n        t2 = t+ .2\n        \n        curvetime = t0 + t * (t1 - t0)\n        curvetime2 = t0 + t2 * (t1 - t0)\n        Dt = (curvetime2 - curvetime) * scale\n        \n        if not s:\n            # get first point\n            # avoid calling this again: the next time though x,y will equal x3, y3\n            s, c = eval_cornu(curvetime)\n            s *= flip\n            s -= s0\n            c -= c0\n            # calculate derivative of fresnel function at point to get tangent slope\n            # just take the integrand of the fresnel function\n            dx1 =  cos(pow(curvetime, 2) + (flip * rot))  \n            dy1 =  flip * sin(pow(curvetime, 2) + (flip *rot))\n            # x,y = first point on function\n            x = ((c * cs - s * ss) +x0)\n            y = ((s * cs + c * ss) + y0)\n\n        #evaluate the fresnel further along the function to look ahead to the next point\n        s2,c2 = eval_cornu(curvetime2) \n        s2 *= flip\n        s2 -= s0\n        c2 -= c0\n\n        dx2 = cos(pow(curvetime2, 2) + (flip * rot)) \n        dy2 = flip * sin(pow(curvetime2, 2) + (flip * rot))\n        # x3, y3 = second point on function\n        x3 = ((c2 * cs - s2 * ss)+x0)\n        y3 = ((s2 * cs + c2 * ss)+y0)\n\n        # calculate control points\n        x1 = (x + ((Dt/3.0) * dx1))\n        y1 = (y + ((Dt/3.0) * dy1))   \n        x2 = (x3 - ((Dt/3.0) * dx2))\n        y2 
= (y3 - ((Dt/3.0) * dy2))\n\n        if cmd == 'moveto':\n            print_pt(x, y, cmd)\n            cmd = 'curveto'\n        print_crv(x1, y1, x2, y2, x3, y3)\n                    \n        dx1, dy1 = dx2, dy2\n        x,y = x3, y3\n        \n    return cmd", "response": "Draw elegant CURVETO segments."}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nreturning the results of a Yahoo web query as a YahooSearch list object.", "response": "def search(q, start=1, count=10, context=None, wait=10, asynchronous=False, cached=False):\n    \n    \"\"\" Returns a Yahoo web query formatted as a YahooSearch list object.\n    \"\"\"\n    \n    service = YAHOO_SEARCH\n    return YahooSearch(q, start, count, service, context, wait, asynchronous, cached)"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nreturn the results of a Yahoo images query as a YahooSearch list object.", "response": "def search_images(q, start=1, count=10, wait=10, asynchronous=False, cached=False):\n    \n    \"\"\" Returns a Yahoo images query formatted as a YahooSearch list object.\n    \"\"\"\n    \n    service = YAHOO_IMAGES\n    return YahooSearch(q, start, count, service, None, wait, asynchronous, cached)"}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nreturn the results of a Yahoo news query as a YahooSearch list object.", "response": "def search_news(q, start=1, count=10, wait=10, asynchronous=False, cached=False):\n    \n    \"\"\" Returns a Yahoo news query formatted as a YahooSearch list object.\n    \"\"\"\n    \n    service = YAHOO_NEWS\n    return YahooSearch(q, start, count, service, None, wait, asynchronous, cached)"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef suggest_spelling(q, wait=10, asynchronous=False, cached=False):\n    \n    \"\"\" Returns list of suggested spelling corrections for the given query.\n    \"\"\"\n    \n    return YahooSpelling(q, wait, asynchronous, cached)", "response": "Returns a list of suggested spelling corrections for the given query."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef sort(words, context=\"\", strict=True, relative=True, service=YAHOO_SEARCH,\n         wait=10, asynchronous=False, cached=False):\n    \n    \"\"\"Performs a Yahoo sort on the given list.\n    \n    Sorts the items in the list according to \n    the result count Yahoo yields on an item.\n    \n    Setting a context sorts the items according\n    to their relation to this context;\n    for example sorting [red, green, blue] by \"love\"\n    yields red as the highest result,\n    likely because red is the color commonly associated with love.\n    \n    \"\"\"\n    \n    from functools import cmp_to_key\n    \n    results = []\n    for word in words:\n        q = word + \" \" + context\n        # str.strip() returns a new string, so the result must be reassigned.\n        q = q.strip()\n        if strict: q = \"\\\"\"+q+\"\\\"\"\n        r = YahooSearch(q, 1, 1, service, context, wait, asynchronous, cached)\n        results.append(r)\n        \n    # list.sort() no longer accepts a comparison function in Python 3.\n    results.sort(key=cmp_to_key(YahooResults.__cmp__))\n    results.reverse()\n    \n    if relative and len(results) > 0:\n        sum = 0.000000000000000001\n        for r in results: sum += r.total\n        for r in results: r.total /= float(sum)\n    \n    results = [(r.query, r.total) for r in results]\n    return results", "response": "Sorts the given words by their Yahoo result count, highest first, optionally in relation to a context."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef _parse(self, e, tag):\n        \n        \"\"\" Parses the text data from an XML element defined by tag.\n        \"\"\"\n        \n        tags = e.getElementsByTagName(tag)\n        children = tags[0].childNodes\n        if len(children) != 1: return None\n        assert children[0].nodeType == xml.dom.minidom.Element.TEXT_NODE\n        \n        s = children[0].nodeValue\n        s = format_data(s)\n        s = replace_entities(s)\n        \n        return s", "response": "Parses the text data from an XML element defined by tag."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef layer(self, img, x=0, y=0, name=\"\"):\n    \n        \"\"\"Creates a new layer from file, Layer, PIL Image.\n    \n        If img is an image file or PIL Image object,\n        Creates a new layer with the given image file.\n        The image is positioned on the canvas at x, y.\n        \n        If img is a Layer,\n        uses that layer's x and y position and name.\n    \n        \"\"\"\n\n        if isinstance(img, Image.Image):\n            img = img.convert(\"RGBA\")\n            self.layers.append(Layer(self, img, x, y, name))\n            return len(self.layers)-1\n        if isinstance(img, Layer):\n            img.canvas = self\n            self.layers.append(img)\n            return len(self.layers)-1\n        if isinstance(img, str):\n            img = Image.open(img)\n            img = img.convert(\"RGBA\")\n            self.layers.append(Layer(self, img, x, y, name))\n            return len(self.layers)-1", "response": "Creates a new layer from an image file, a PIL Image object, or an existing Layer, and returns its index in the layers list."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef fill(self, rgb, x=0, y=0, w=None, h=None, name=\"\"):\n    \n        \"\"\"Creates a new fill layer.\n    \n        Creates a new layer filled with the given rgb color.\n        For example, fill((255,0,0)) creates a red fill.\n        The layer fills the entire canvas by default.\n    \n        \"\"\" \n    \n        if w == None: w = self.w - x\n        if h == None: h = self.h - y\n        img = Image.new(\"RGBA\", (w,h), rgb)\n        self.layer(img, x, y, name)", "response": "Creates a new layer filled with the given RGB color; it covers the entire canvas by default."}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\ncreates a gradient layer.", "response": "def gradient(self, style=LINEAR, w=1.0, h=1.0, name=\"\"):\n    \n        \"\"\"Creates a gradient layer.\n    \n        Creates a gradient layer, that is usually used\n        together with the mask() function.\n    \n        All the image functions work on gradients,\n        so they can easily be flipped, rotated, scaled, inverted,\n        made brighter or darker, ...\n    \n        Styles for gradients are LINEAR, RADIAL and DIAMOND.\n    \n        \"\"\"\n    \n        # Float sizes are interpreted as fractions of the canvas size.\n        w0 = self.w\n        h0 = self.h\n        if isinstance(w, float): w *= w0\n        if isinstance(h, float): h *= h0\n    \n        img = Image.new(\"L\", (int(w),int(h)), 255)\n        draw = ImageDraw.Draw(img)\n    \n        if style == LINEAR:\n            for i in range(int(w)):\n                k = 255.0 * i/w\n                draw.rectangle((i, 0, i, h), fill=int(k))\n            \n        if style == RADIAL:\n            r = min(w,h)/2\n            for i in range(int(r)):\n                k = 255 - 255.0 * i/r\n                draw.ellipse((w/2-r+i, h/2-r+i, w/2+r-i, h/2+r-i), fill=int(k))\n            \n        if style == DIAMOND:\n            r = max(w,h)\n            for i in range(int(r)):\n                x = int(i*w/r*0.5)\n                y = int(i*h/r*0.5)\n                k = 255.0 * i/r\n                draw.rectangle((x, y, w-x, h-y), outline=int(k))\n    \n        img = img.convert(\"RGBA\")\n        # Pass the caller's name through instead of discarding it.\n        self.layer(img, 0, 0, name=name)"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef merge(self, layers):\n        \n        \"\"\"Flattens the given layers on the canvas.\n        \n        Merges the given layers with the indices in the list\n        on the bottom layer in the list.\n        The other layers are discarded.\n        \n        \"\"\"\n        \n        layers.sort()\n        if layers[0] == 0: del layers[0]\n        self.flatten(layers)", "response": "Flattens the layers with the given indices onto the bottom layer in the list and discards the others."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef flatten(self, layers=[]):\n    \n        \"\"\"Flattens all layers according to their blend modes.\n    \n        Merges all layers to the canvas,\n        using the blend mode and opacity defined for each layer.\n        Once flattened, the stack of layers is emptied except\n        for the transparent background (bottom layer).\n    \n        \"\"\"\n        \n        #When the layers argument is omitted,\n        #flattens all the layers on the canvas.\n        #When given, merges the indexed layers.\n        \n        #Layers that fall outside of the canvas are cropped:\n        #this should be fixed by merging to a transparent background\n        #large enough to hold all the given layers' data\n        #(=time consuming).\n        \n        if layers == []: \n            layers = range(1, len(self.layers))\n        background = self.layers._get_bg()\n        background.name = \"Background\"\n        \n        for i in layers:\n\n            layer = self.layers[i]\n        \n            #Determine which portion of the canvas\n            #needs to be updated with the overlaying layer.\n        \n            x = max(0, layer.x)\n            y = max(0, layer.y)\n            w = min(background.w, layer.x+layer.w)\n            h = min(background.h, layer.y+layer.h)\n        \n            base = background.img.crop((x, y, w, h))\n\n            #Determine which piece of the layer\n            #falls within the canvas.\n\n            x = max(0, -layer.x)\n            y = max(0, -layer.y)\n            w -= layer.x\n            h -= layer.y\n\n            blend = layer.img.crop((x, y, w, h))\n        \n            #Buffer layer blend modes:\n            #the base below is a flattened version\n            #of all the layers below this one,\n            #on which to merge this blended layer.\n        \n            if layer.blend == NORMAL:\n                buffer = blend\n      
      if layer.blend == MULTIPLY:\n                buffer = ImageChops.multiply(base, blend)\n            if layer.blend == SCREEN:\n                buffer = ImageChops.screen(base, blend)\n            if layer.blend == OVERLAY:\n                buffer = Blend().overlay(base, blend)\n            if layer.blend == HUE:\n                buffer = Blend().hue(base, blend)\n            if layer.blend == COLOR:\n                buffer = Blend().color(base, blend)\n            \n            #Buffer a merge between the base and blend\n            #according to the blend's alpha channel:\n            #the base shines through where the blend is less opaque.\n        \n            #Merging the first layer to the transparent canvas\n            #works slightly different than the other layers.\n        \n            alpha = buffer.split()[3]\n            if i == 1:\n                buffer = Image.composite(base, buffer, base.split()[3])\n            else:\n                buffer = Image.composite(buffer, base, alpha)\n        \n            #The alpha channel becomes a composite of\n            #this layer and the base:\n            #the base's (optional) tranparent background\n            #is retained in arrays where the blend layer\n            #is transparent as well.\n        \n            alpha = ImageChops.lighter(alpha, base.split()[3])\n            buffer.putalpha(alpha)\n        \n            #Apply the layer's opacity,\n            #merging the buffer to the base with\n            #the given layer opacity.\n        \n            base = Image.blend(base, buffer, layer.alpha)\n\n            #Merge the base to the flattened canvas.\n\n            x = max(0, layer.x)\n            y = max(0, layer.y)\n            background.img.paste(base, (x,y))\n    \n        layers.reverse()\n        for i in layers: del self.layers[i]\n    \n        img = Image.new(\"RGBA\", (self.w,self.h), (255,255,255,0))\n        self.layers._set_bg(Layer(self, img, 0, 0, name=\"_bg\"))\n        \n  
      if len(self.layers) == 1:\n            self.layers.append(background)\n        else:\n            self.layers.insert(layers[-1], background)", "response": "Flattens all layers according to their blend modes and opacity defined for each layer."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\nexporting the flattened canvas to a file.", "response": "def export(self, filename):\n    \n        \"\"\"Exports the flattened canvas.\n    \n        Flattens the canvas.\n        PNG retains the alpha channel information.\n        Other possibilities are JPEG and GIF.\n    \n        \"\"\"\n\n        self.flatten()\n        self.layers[1].img.save(filename)\n        return filename"}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\nplaces the flattened canvas in NodeBox by way of a temporary PNG file.", "response": "def draw(self, x, y):\n        \n        \"\"\"Places the flattened canvas in NodeBox.\n        \n        Exports to a temporary PNG file.\n        Draws the PNG in NodeBox using the image() command.\n        Removes the temporary file.\n        \n        \"\"\"\n        \n        try:\n            from time import time\n            from hashlib import md5\n            from os import unlink\n            # Derive a unique temporary filename from the current time.\n            m = md5()\n            m.update(str(time()).encode(\"utf-8\"))\n            filename = \"photobot\" + str(m.hexdigest()) + \".png\"\n            self.export(filename)\n            _ctx.image(filename, x, y)\n            unlink(filename)\n        \n        except:\n            pass"}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\nreturns the index of this layer in the canvas.layers list, or None when not found.", "response": "def index(self):\n        \n        \"\"\"Returns this layer's index in the canvas.layers[].\n        \n        Searches for the position of this layer in the canvas'\n        layers list; returns None when not found.\n        \n        \"\"\"\n        \n        # Returning from inside the loop also handles an empty layers list.\n        for i, layer in enumerate(self.canvas.layers):\n            if layer == self:\n                return i\n        return None"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\nreturn a copy of the layer.", "response": "def copy(self):\n        \n        \"\"\"Returns a copy of the layer.\n        \n        This is different from the duplicate() method,\n        which duplicates the layer as a new layer on the canvas.\n        The copy() method returns a copy of the layer\n        that can be added to a different canvas.\n        \n        \"\"\"\n        \n        layer = Layer(None, self.img.copy(), self.x, self.y, self.name)\n        layer.w = self.w\n        layer.h = self.h\n        layer.alpha = self.alpha\n        layer.blend = self.blend\n        \n        return layer"}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nremoves this layer from the canvas.", "response": "def delete(self):\n        \n        \"\"\"Removes this layer from the canvas.\n              \n        \"\"\"\n        \n        i = self.index()\n        if i != None: del self.canvas.layers[i]"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef up(self):\n        \n        \"\"\"Moves the layer up in the stacking order.\n        \n        \"\"\"\n        \n        i = self.index()\n        if i != None:\n            del self.canvas.layers[i]\n            i = min(len(self.canvas.layers), i+1)\n            self.canvas.layers.insert(i, self)", "response": "Moves the layer up in the stacking order."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef down(self):\n        \n        \"\"\"Moves the layer down in the stacking order.\n        \n        \"\"\"\n        \n        i = self.index()\n        if i != None:\n            del self.canvas.layers[i]\n            i = max(0, i-1)\n            self.canvas.layers.insert(i, self)", "response": "Moves the layer down in the stacking order."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef select(self, path, feather=True):\n    \n        \"\"\"Applies the polygonal lasso tool on a layer.\n    \n        The path paramater is a list of points,\n        either [x1, y1, x2, y2, x3, y3, ...]\n        or [(x1,y1), (x2,y2), (x3,y3), ...]\n    \n        The parts of the layer that fall outside\n        this polygonal area are cut.\n        \n        The selection is not anti-aliased,\n        but the feather parameter creates soft edges.\n    \n        \"\"\"\n    \n        w, h = self.img.size\n        mask = Image.new(\"L\", (w,h), 0)\n        draw = ImageDraw.Draw(mask)\n        \n        draw = ImageDraw.Draw(mask)\n        draw.polygon(path, fill=255)\n\n        if feather:\n            mask = mask.filter(ImageFilter.SMOOTH_MORE)\n            mask = mask.filter(ImageFilter.SMOOTH_MORE)\n            \n        mask = ImageChops.darker(mask, self.img.split()[3])\n        self.img.putalpha(mask)", "response": "Applies the polygonal lasso tool on a layer."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef mask(self):\n    \n        \"\"\"Masks the layer below with this layer.\n    \n        Commits the current layer to the alpha channel of \n        the previous layer. Primarily, mask() is useful when \n        using gradient layers as masks on images below. \n    \n        For example:\n        canvas.layer(\"image.jpg\")\n        canvas.gradient()\n        canvas.layer(2).flip()\n        canvas.layer(2).mask()\n    \n        Adds a white-to-black linear gradient to\n        the alpha channel of image.jpg, \n        making it evolve from opaque on \n        the left to transparent on the right.\n    \n        \"\"\"\n\n        if len(self.canvas.layers) < 2: return\n        i = self.index()\n        if i == 0: return\n        \n        layer = self.canvas.layers[i-1]\n    \n        alpha = Image.new(\"L\", layer.img.size, 0)\n    \n        #Make a composite of the mask layer in grayscale\n        #and its own alpha channel.\n    \n        mask = self.canvas.layers[i]        \n        flat = ImageChops.darker(mask.img.convert(\"L\"), mask.img.split()[3])\n        alpha.paste(flat, (mask.x,mask.y))\n        alpha = ImageChops.darker(alpha, layer.img.split()[3])\n        layer.img.putalpha(alpha)\n    \n        self.delete()", "response": "Masks the layer below with this layer."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef duplicate(self):\n    \n        \"\"\"Creates a copy of the current layer.\n    \n        This copy becomes the top layer on the canvas.\n    \n        \"\"\"\n    \n        i = self.canvas.layer(self.img.copy(), self.x, self.y, self.name)\n        clone = self.canvas.layers[i]\n        clone.alpha = self.alpha\n        clone.blend = self.blend", "response": "Creates a copy of the current layer."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef brightness(self, value=1.0):\n\n        \"\"\"Increases or decreases the brightness in the layer.\n    \n        The given value is a percentage to increase\n        or decrease the image brightness,\n        for example 0.8 means brightness at 80%.\n    \n        \"\"\"\n     \n        b = ImageEnhance.Brightness(self.img) \n        self.img = b.enhance(value)", "response": "Increases or decreases the brightness in the layer."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef contrast(self, value=1.0):\n    \n        \"\"\"Increases or decreases the contrast in the layer.\n    \n        The given value is a percentage to increase\n        or decrease the image contrast,\n        for example 1.2 means contrast at 120%.\n    \n        \"\"\"\n\n        c = ImageEnhance.Contrast(self.img) \n        self.img = c.enhance(value)", "response": "Increases or decreases the contrast in the layer."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef desaturate(self):\n    \n        \"\"\"Desaturates the layer, making it grayscale.\n    \n        Instantly removes all color information from the layer,\n        while maintaining its alpha channel.\n    \n        \"\"\"\n    \n        alpha = self.img.split()[3]\n        self.img = self.img.convert(\"L\")\n        self.img = self.img.convert(\"RGBA\")\n        self.img.putalpha(alpha)", "response": "Desaturates the layer, making it grayscale."}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nposition the layer at the given coordinates.", "response": "def translate(self, x, y):\n    \n        \"\"\"Positions the layer at the given coordinates.\n    \n        The x and y parameters define where to position \n        the top left corner of the layer,\n        measured from the top left of the canvas.\n    \n        \"\"\"\n    \n        self.x = x\n        self.y = y"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef scale(self, w=1.0, h=1.0):\n    \n        \"\"\"Resizes the layer to the given width and height.\n    \n        When width w or height h is a floating-point number,\n        scales percentual, \n        otherwise scales to the given size in pixels.\n    \n        \"\"\"\n\n        from types import FloatType\n        w0, h0 = self.img.size\n        if type(w) == FloatType: w = int(w*w0)\n        if type(h) == FloatType: h = int(h*h0)\n    \n        self.img = self.img.resize((w,h), INTERPOLATION)\n        self.w = w\n        self.h = h", "response": "Resizes the layer to the given width and height."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\ndistort the image by translating the four corners of the layer to the given coordinates.", "response": "def distort(self, x1=0,y1=0, x2=0,y2=0, x3=0,y3=0, x4=0,y4=0):\n    \n        \"\"\"Distorts the layer.\n        \n        Distorts the layer by translating \n        the four corners of its bounding box to the given coordinates:\n        upper left (x1,y1), upper right (x2,y2),\n        lower right (x3,y3) and lower left (x4,y4).\n        \n        \"\"\"\n    \n        w, h = self.img.size\n        quad = (-x1,-y1, -x4,h-y4, w-x3,h-y3, w-x2,-y2)\n        self.img = self.img.transform(self.img.size, Image.QUAD, quad, INTERPOLATION)"}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nrotates the image by given angle.", "response": "def rotate(self, angle):\n    \n        \"\"\"Rotates the layer.\n    \n        Rotates the layer by given angle.\n        Positive numbers rotate counter-clockwise,\n        negative numbers rotate clockwise.\n    \n        Rotate commands are executed instantly,\n        so many subsequent rotates will distort the image.\n    \n        \"\"\"\n    \n        #When a layer rotates, its corners will fall outside\n        #of its defined width and height.\n        #Thus, its bounding box needs to be expanded.\n    \n        #Calculate the diagonal width, and angle from the layer center.\n        #This way we can use the layers's corners \n        #to calculate the bounding box.\n    \n        from math import sqrt, pow, sin, cos, degrees, radians, asin\n        w0, h0 = self.img.size\n        d = sqrt(pow(w0,2) + pow(h0,2))\n        d_angle = degrees(asin((w0*0.5) / (d*0.5)))\n    \n        angle = angle % 360\n        if angle > 90 and angle <= 270: d_angle += 180\n    \n        w = sin(radians(d_angle + angle)) * d\n        w = max(w, sin(radians(d_angle - angle)) * d)\n        w = int(abs(w))\n    \n        h = cos(radians(d_angle + angle)) * d\n        h = max(h, cos(radians(d_angle - angle)) * d)\n        h = int(abs(h))\n    \n        dx = int((w-w0) / 2)\n        dy = int((h-h0) / 2)\n        d = int(d)\n\n        #The rotation box's background color\n        #is the mean pixel value of the rotating image.\n        #This is the best option to avoid borders around\n        #the rotated image.\n\n        bg = ImageStat.Stat(self.img).mean\n        bg = (int(bg[0]), int(bg[1]), int(bg[2]), 0)\n\n        box = Image.new(\"RGBA\", (d,d), bg)\n        box.paste(self.img, ((d-w0)/2, (d-h0)/2))\n        box = box.rotate(angle, INTERPOLATION)\n        box = box.crop(((d-w)/2+2, (d-h)/2, d-(d-w)/2, d-(d-h)/2))\n        self.img = 
box\n    \n        #Since rotate changes the bounding box size,\n        #update the layers' width, height, and position,\n        #so it rotates from the center.\n    \n        self.x += (self.w-w)/2\n        self.y += (self.h-h)/2\n        self.w = w\n        self.h = h"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef flip(self, axis=HORIZONTAL):\n    \n        \"\"\"Flips the layer, either HORIZONTAL or VERTICAL.\n    \n        \"\"\"\n\n        if axis == HORIZONTAL:\n            self.img = self.img.transpose(Image.FLIP_LEFT_RIGHT)\n        if axis == VERTICAL:\n            self.img = self.img.transpose(Image.FLIP_TOP_BOTTOM)", "response": "Flips the layer either HORIZONTAL or VERTICAL."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef sharpen(self, value=1.0):\n\n        \"\"\"Increases or decreases the sharpness in the layer.\n    \n        The given value is a percentage to increase\n        or decrease the image sharpness,\n        for example 0.8 means sharpness at 80%.\n    \n        \"\"\"\n     \n        s = ImageEnhance.Sharpness(self.img) \n        self.img = s.enhance(value)", "response": "Increases or decreases the sharpness in the layer."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef levels(self):\n        \n        \"\"\"Returns a histogram for each RGBA channel.\n        \n        Returns a 4-tuple of lists, r, g, b, and a.\n        Each list has 256 items, a count for each pixel value.\n                \n        \"\"\"\n        \n        h = self.img.histogram()\n        r = h[0:256]\n        g = h[256:512]\n        b = h[512:768]\n        a = h[768:1024]\n        \n        return r, g, b, a", "response": "Returns a histogram for each RGBA channel as a 4-tuple of lists (r, g, b, a), each holding a count per pixel value."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef overlay(self, img1, img2):\n    \n        \"\"\"Applies the overlay blend mode.\n    \n        Overlays image img2 on image img1.\n        The overlay pixel combines multiply and screen:\n        it multiplies dark pixels values and screen light values.\n        Returns a composite image with the alpha channel retained.\n    \n        \"\"\"\n    \n        p1 = list(img1.getdata())\n        p2 = list(img2.getdata())\n    \n        for i in range(len(p1)):\n        \n            p3 = ()\n            for j in range(len(p1[i])):\n\n                a = p1[i][j] / 255.0\n                b = p2[i][j] / 255.0\n            \n                #When overlaying the alpha channels,\n                #take the alpha of the most transparent layer.\n            \n                if j == 3:\n                    #d = (a+b)*0.5\n                    #d = a\n                    d = min(a,b)\n                elif a > 0.5: \n                    d = 2*(a+b-a*b)-1\n                else: \n                    d = 2*a*b            \n                p3 += (int(d*255),)\n        \n            p1[i] = p3\n        \n        img = Image.new(\"RGBA\", img1.size, 255)\n        img.putdata(p1)\n        return img", "response": "Applies the overlay blend mode on image img1 and img2."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef hue(self, img1, img2):\n    \n        \"\"\"Applies the hue blend mode.\n    \n        Hues image img1 with image img2.\n        The hue filter replaces the hues of pixels in img1\n        with the hues of pixels in img2.\n        Returns a composite image with the alpha channel retained.\n    \n        \"\"\"\n\n        import colorsys\n\n        p1 = list(img1.getdata())\n        p2 = list(img2.getdata())\n        for i in range(len(p1)):\n        \n            r1, g1, b1, a1 = p1[i]\n            r1 = r1 / 255.0\n            g1 = g1 / 255.0\n            b1 = b1 / 255.0\n        \n            h1, s1, v1 = colorsys.rgb_to_hsv(r1, g1, b1)\n        \n            r2, g2, b2, a2 = p2[i]\n            r2 = r2 / 255.0\n            g2 = g2 / 255.0\n            b2 = b2 / 255.0\n            h2, s2, v2 = colorsys.rgb_to_hsv(r2, g2, b2)\n        \n            r3, g3, b3 = colorsys.hsv_to_rgb(h2, s1, v1)\n        \n            r3 = int(r3*255)\n            g3 = int(g3*255)\n            b3 = int(b3*255)\n            p1[i] = (r3, g3, b3, a1)\n    \n        img = Image.new(\"RGBA\", img1.size, 255)\n        img.putdata(p1)\n        return img", "response": "Applies the hue blend mode."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef _load_namespace(self, namespace, filename=None):\n        from shoebot import data\n        for name in dir(data):\n            namespace[name] = getattr(data, name)\n\n        for name in dir(self):\n            if name[0] != '_':\n                namespace[name] = getattr(self, name)\n\n        namespace['_ctx'] = self  # Used in older nodebox scripts.\n        namespace['__file__'] = filename", "response": "Load the bot namespace with the names defined in shoebot.data and the public attributes of the bot."}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\nreturns False if the bot should quit", "response": "def _should_run(self, iteration, max_iterations):\n        ''' Return False if bot should quit '''\n        if iteration == 0:\n            # First frame always runs\n            return True\n        if max_iterations:\n            if iteration < max_iterations:\n                return True\n        elif max_iterations is None:\n            if self._dynamic:\n                return True\n            else:\n                return False\n            return True\n        if not self._dynamic:\n            return False\n\n        return False"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nlimit to framerate for the current time.", "response": "def _frame_limit(self, start_time):\n        \"\"\"\n        Limit to framerate, should be called after\n        rendering has completed\n\n        :param start_time: When execution started\n        \"\"\"\n        if self._speed:\n            completion_time = time()\n            exc_time = completion_time - start_time\n            sleep_for = (1.0 / abs(self._speed)) - exc_time\n            if sleep_for > 0:\n                sleep(sleep_for)"}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nrun a single frame of the bot.", "response": "def _run_frame(self, executor, limit=False, iteration=0):\n        \"\"\" Run single frame of the bot\n\n        :param source_or_code: path to code to run, or actual code.\n        :param limit: Time a frame should take to run (float - seconds)\n        \"\"\"\n        #\n        # Gets a bit complex here...\n        #\n        # Nodebox (which we are trying to be compatible with) supports two\n        # kinds of bot 'dynamic' which has a 'draw' function and non dynamic\n        # which doesn't have one.\n        #\n        # Dynamic bots:\n        #\n        # First run:\n        # run body and 'setup' if it exists, then 'draw'\n        #\n        # Later runs:\n        # run 'draw'\n        #\n        # Non Dynamic bots:\n        #\n        # Just have a 'body' and run once...\n        #\n        # UNLESS...  a 'var' is changed, then run it again.\n        #\n        #\n        # Livecoding:\n        #\n        # Code can be 'known_good' or 'tenous' (when it has been edited).\n        #\n        # If code is tenous and an exception occurs, attempt to roll\n        # everything back.\n        #\n        # Livecoding and vars\n        #\n        # If vars are added / removed or renamed then attempt to update\n        # the GUI\n\n        start_time = time()\n        if iteration != 0 and self._speed != 0:\n            self._canvas.reset_canvas()\n        self._set_dynamic_vars()\n        if iteration == 0:\n            # First frame\n            executor.run()\n            # run setup and draw\n            # (assume user hasn't live edited already)\n            executor.ns['setup']()\n            executor.ns['draw']()\n            self._canvas.flush(self._frame)\n        else:\n            # Subsequent frames\n            if self._dynamic:\n                if self._speed != 0:  # speed 0 is paused, so do nothing\n                  
  with executor.run_context() as (known_good, source, ns):\n                        # Code in main block may redefine 'draw'\n                        if not known_good:\n                            executor.reload_functions()\n                            with VarListener.batch(self._vars, self._oldvars, ns):\n                                self._oldvars.clear()\n\n                                # Re-run the function body - ideally this would only\n                                # happen if the body had actually changed\n                                # - Or perhaps if the line included a variable declaration\n                                exec source in ns\n\n                        ns['draw']()\n                        self._canvas.flush(self._frame)\n            else:\n                # Non \"dynamic\" bots\n                #\n                # TODO - This part is overly complex, before live-coding it\n                #        was just exec source in ns ... have to see if it\n                #        can be simplified again.\n                #\n                with executor.run_context() as (known_good, source, ns):\n                    if not known_good:\n                        executor.reload_functions()\n                        with VarListener.batch(self._vars, self._oldvars, ns):\n                            self._oldvars.clear()\n\n                            # Re-run the function body - ideally this would only\n                            # happen if the body had actually changed\n                            # - Or perhaps if the line included a variable declaration\n                            exec source in ns\n                    else:\n                        exec source in ns\n\n                    self._canvas.flush(self._frame)\n        if limit:\n            self._frame_limit(start_time)\n\n        # Can set speed to go backwards using the shell if you really want\n        # or pause by setting speed == 0\n        if self._speed > 0:\n       
     self._frame += 1\n        elif self._speed < 0:\n            self._frame -= 1"}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nruns a single nodebox script in current surface s context.", "response": "def run(self, inputcode, iterations=None, run_forever=False, frame_limiter=False, verbose=False,\n            break_on_error=False):\n        '''\n        Executes the contents of a Nodebox/Shoebot script\n        in current surface's context.\n\n        :param inputcode: Path to shoebot source or string containing source\n        :param iterations: None or Maximum amount of frames to run\n        :param run_forever: If True then run until user quits the bot\n        :param frame_limiter: If True then sleep between frames to respect speed() command.\n        '''\n        source = None\n        filename = None\n\n        if os.path.isfile(inputcode):\n            source = open(inputcode).read()\n            filename = inputcode\n        elif isinstance(inputcode, basestring):\n            filename = 'shoebot_code'\n            source = inputcode\n\n        self._load_namespace(self._namespace, filename)\n        self._executor = executor = LiveExecution(source, ns=self._namespace, filename=filename)\n\n        try:\n            if not iterations:\n                if run_forever:\n                    iterations = None\n                else:\n                    iterations = 1\n            iteration = 0\n\n            event = None\n\n            while iteration != iterations and not event_is(event, QUIT_EVENT):\n                # do the magic\n\n                # First iteration\n                self._run_frame(executor, limit=frame_limiter, iteration=iteration)\n                if iteration == 0:\n                    self._initial_namespace = copy.copy(self._namespace)  # Stored so script can be rewound\n                iteration += 1\n\n                # Subsequent iterations\n                while self._should_run(iteration, iterations) and event is None:\n                    iteration += 1\n            
        self._run_frame(executor, limit=frame_limiter, iteration=iteration)\n                    event = next_event()\n                    if not event:\n                        self._canvas.sink.main_iteration()  # update GUI, may generate events..\n\n                while run_forever:\n                    #\n                    # Running in GUI, bot has finished\n                    # Either -\n                    #   receive quit event and quit\n                    #   receive any other event and loop (e.g. if var changed or source edited)\n                    #\n                    while event is None:\n                        self._canvas.sink.main_iteration()\n                        event = next_event(block=True, timeout=0.05)\n                        if not event:\n                            self._canvas.sink.main_iteration()  # update GUI, may generate events..\n\n                    if event.type == QUIT_EVENT:\n                        break\n                    elif event.type == SOURCE_CHANGED_EVENT:\n                        # Debounce SOURCE_CHANGED events -\n                        # gedit generates two events for changing a single character -\n                        # delete and then add\n                        while event and event.type == SOURCE_CHANGED_EVENT:\n                            event = next_event(block=True, timeout=0.001)\n                    elif event.type == SET_WINDOW_TITLE:\n                        self._canvas.sink.set_title(event.data)\n\n                    event = None  # this loop is a bit weird...\n                    break\n\n        except Exception as e:\n            # this makes KeyboardInterrupts still work\n            # if something goes wrong, print verbose system output\n            # maybe this is too verbose, but okay for now\n\n            import sys\n            if verbose:\n                errmsg = traceback.format_exc()\n            else:\n                errmsg = simple_traceback(e, executor.known_good or 
'')\n            print >> sys.stderr, errmsg\n            if break_on_error:\n                raise"}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nset a new accessible variable.", "response": "def _addvar(self, v):\n        ''' Sets a new accessible variable.\n\n        :param v: Variable.\n        '''\n        oldvar = self._oldvars.get(v.name)\n        if oldvar is not None:\n            if isinstance(oldvar, Variable):\n                if oldvar.compliesTo(v):\n                    v.value = oldvar.value\n            else:\n                # Set from commandline\n                v.value = v.sanitize(oldvar)\n        else:\n            for listener in VarListener.listeners:\n                listener.var_added(v)\n        self._vars[v.name] = v\n        self._namespace[v.name] = v.value\n        self._oldvars[v.name] = v\n        return v"}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nreceives a colour definition and returns a tuple.", "response": "def parse_color(v, color_range=1):\n    '''Receives a colour definition and returns a (r,g,b,a) tuple.\n\n    Accepts:\n    - v\n    - (v)\n    - (v,a)\n    - (r,g,b)\n    - (r,g,b,a)\n    - #RRGGBB\n    - RRGGBB\n    - #RRGGBBAA\n    - RRGGBBAA\n\n    Returns a (red, green, blue, alpha) tuple, with values ranging from\n    0 to 1.\n\n    The 'color_range' parameter sets the colour range in which the\n    colour data values are specified (except in hexstrings).\n    '''\n\n    # unpack one-element tuples, they show up sometimes\n    while isinstance(v, (tuple, list)) and len(v) == 1:\n        v = v[0]\n\n    if isinstance(v, (int, float)):\n        red = green = blue = v / color_range\n        alpha = 1.\n\n    elif isinstance(v, data.Color):\n        red, green, blue, alpha = v\n\n    elif isinstance(v, (tuple, list)):\n        # normalise values according to the supplied colour range\n        # for this we make a list with the normalised data\n        color = []\n        for index in range(0, len(v)):\n            color.append(v[index] / color_range)\n\n        if len(color) == 1:\n            red = green = blue = alpha = color[0]\n        elif len(color) == 2:\n            red = green = blue = color[0]\n            alpha = color[1]\n        elif len(color) == 3:\n            red = color[0]\n            green = color[1]\n            blue = color[2]\n            alpha = 1.\n        elif len(color) == 4:\n            red = color[0]\n            green = color[1]\n            blue = color[2]\n            alpha = color[3]\n\n    elif isinstance(v, basestring):\n        # got a hexstring: first remove hash character, if any\n        v = v.strip('#')\n        if len(v) == 6:\n            # RRGGBB\n            red = hex2dec(v[0:2]) / 255.\n            green = hex2dec(v[2:4]) / 255.\n            blue = 
hex2dec(v[4:6]) / 255.\n            alpha = 1.\n        elif len(v) == 8:\n            red = hex2dec(v[0:2]) / 255.\n            green = hex2dec(v[2:4]) / 255.\n            blue = hex2dec(v[4:6]) / 255.\n            alpha = hex2dec(v[6:8]) / 255.\n\n    return red, green, blue, alpha"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\nconverting a hex color string to RGB values.", "response": "def hex_to_rgb(hex):\n    \"\"\" Returns RGB values for a hex color string.\n    \"\"\"\n    hex = hex.lstrip(\"#\")\n    if len(hex) < 6:\n        hex += hex[-1] * (6 - len(hex))\n    if len(hex) == 6:\n        r, g, b = hex[0:2], hex[2:4], hex[4:]\n        r, g, b = [int(n, 16) / 255.0 for n in (r, g, b)]\n        a = 1.0\n    elif len(hex) == 8:\n        r, g, b, a = hex[0:2], hex[2:4], hex[4:6], hex[6:]\n        r, g, b, a = [int(n, 16) / 255.0 for n in (r, g, b, a)]\n    return r, g, b, a"}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nconverting from CMYK color to RGB.", "response": "def cmyk_to_rgb(c, m, y, k):\n    \"\"\" Cyan, magenta, yellow, black to red, green, blue.\n    ReportLab, http://www.koders.com/python/fid5C006F554616848C01AC7CB96C21426B69D2E5A9.aspx\n    Results will differ from the way NSColor converts color spaces.\n    \"\"\"\n\n    r = 1.0 - min(1.0, c + k)\n    g = 1.0 - min(1.0, m + k)\n    b = 1.0 - min(1.0, y + k)\n\n    return r, g, b"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef hsv_to_rgb(h, s, v):\n\n    if s == 0: return v, v, v\n\n    h = h / (60.0 / 360)\n    i = floor(h)\n    f = h - i\n    p = v * (1 - s)\n    q = v * (1 - s * f)\n    t = v * (1 - s * (1 - f))\n\n    if i == 0:\n        r, g, b = v, t, p\n    elif i == 1:\n        r, g, b = q, v, p\n    elif i == 2:\n        r, g, b = p, v, t\n    elif i == 3:\n        r, g, b = p, q, v\n    elif i == 4:\n        r, g, b = t, p, v\n    else:\n        r, g, b = v, p, q\n\n    return r, g, b", "response": "Hue saturation brightness to red green blue."}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nformats traceback showing line number and surrounding source.", "response": "def simple_traceback(ex, source):\n    \"\"\"\n    Format traceback, showing line number and surrounding source.\n    \"\"\"\n    exc_type, exc_value, exc_tb = sys.exc_info()\n    exc = traceback.format_exception(exc_type, exc_value, exc_tb)\n\n    source_arr = source.splitlines()\n\n    # Defaults...\n    exc_location = exc[-2]\n    for i, err in enumerate(exc):\n        if 'exec source in ns' in err:\n            exc_location = exc[i + 1]\n            break\n\n    # extract line number from traceback\n    fn = exc_location.split(',')[0][8:-1]\n    line_number = int(exc_location.split(',')[1].replace('line', '').strip())\n\n    # Build error messages\n\n    err_msgs = []\n\n    # code around the error\n    err_where = ' '.join(exc[i - 1].split(',')[1:]).strip()  # 'line 37 in blah\"\n    err_msgs.append('Error in the Shoebot script at %s:' % err_where)\n    for i in xrange(max(0, line_number - 5), line_number):\n        if fn == \"<string>\":\n            line = source_arr[i]\n        else:\n            line = linecache.getline(fn, i + 1)\n        err_msgs.append('%s: %s' % (i + 1, line.rstrip()))\n    err_msgs.append('  %s^ %s' % (len(str(i)) * ' ', exc[-1].rstrip()))\n\n    err_msgs.append('')\n    # traceback\n    err_msgs.append(exc[0].rstrip())\n    for err in exc[3:]:\n        err_msgs.append(err.rstrip())\n\n    return '\\n'.join(err_msgs)"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef connect(self, name):\n        \n        \"\"\"Generic database.\n        \n        Opens the SQLite database with the given name.\n        The .db extension is automatically appended to the name.\n        For each table in the database an attribute is created,\n        and assigned a Table object.\n        \n        You can do: database.table or database[table].\n        \n        \"\"\"\n        \n        # Remove a trailing \".db\" suffix; rstrip(\".db\") would strip characters, not the suffix.\n        if name.endswith(\".db\"): name = name[:-3]\n        self._name = name\n        self._con = sqlite.connect(self._name + \".db\")\n        self._cur = self._con.cursor()\n \n        self._tables = []\n        self._cur.execute(\"select name from sqlite_master where type='table'\")\n        for r in self._cur: self._tables.append(r[0])\n        \n        self._indices = []\n        self._cur.execute(\"select name from sqlite_master where type='index'\")\n        for r in self._cur: self._indices.append(r[0])\n        \n        for t in self._tables:\n            self._cur.execute(\"pragma table_info(\"+t+\")\")\n            fields = []\n            key = \"\"\n            for r in self._cur:\n                fields.append(r[1])\n                if r[2] == \"integer\": key = r[1]\n            setattr(self, t, Table(self, t, key, fields))", "response": "Opens the SQLite database with the given name and creates a Table attribute for each table in it."}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\ncreates an SQLite database file with the given name.", "response": "def create(self, name, overwrite=True):\n        \n        \"\"\"Creates an SQLite database file.\n        \n        Creates an SQLite database with the given name.\n        The .db file extension is added automatically.\n        Overwrites any existing database by default.\n        \n        \"\"\"\n        \n        # Remove a trailing \".db\" suffix; rstrip(\".db\") would strip characters, not the suffix.\n        if name.endswith(\".db\"): name = name[:-3]\n        self._name = name\n        from os import unlink\n        if overwrite: \n            try: unlink(self._name + \".db\")\n            except: pass       \n        self._con = sqlite.connect(self._name + \".db\")\n        self._cur = self._con.cursor()"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\ncreate a new table with the given name containing the list of fields.", "response": "def create_table(self, name, fields=[], key=\"id\"):\n        \n        \"\"\"Creates a new table.\n        \n        Creates a table with the given name,\n        containing the list of given fields.\n        Since SQLite uses manifest typing, no data type need be supplied.\n        The primary key is \"id\" by default,\n        an integer that can be set or otherwise autoincrements.\n        \n        \"\"\"\n        \n        for f in fields: \n            if f == key: fields.remove(key)\n        sql  = \"create table \"+name+\" \"\n        sql += \"(\"+key+\" integer primary key\"\n        for f in fields: sql += \", \"+f+\" varchar(255)\"\n        sql += \")\"\n        self._cur.execute(sql)\n        self._con.commit()\n        self.index(name, key, unique=True)\n        self.connect(self._name)"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef create_index(self, table, field, unique=False, ascending=True):\n        \n        \"\"\"Creates a table index.\n        \n        Creates an index on the given table,\n        on the given field with unique values enforced or not,\n        in ascending or descending order.\n        \n        \"\"\"\n        \n        if unique: u = \"unique \"\n        else: u = \"\"\n        if ascending: a = \"asc\"\n        else: a = \"desc\"\n        sql  = \"create \"+u+\"index index_\"+table+\"_\"+field+\" \"\n        sql += \"on \"+table+\"(\"+field+\" \"+a+\")\"\n        self._cur.execute(sql)\n        self._con.commit()", "response": "Creates an index on the given field of the given table, optionally enforcing unique values, in ascending or descending order."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef close(self):\n        \n        \"\"\"Commits any pending transactions and closes the database.\n        \"\"\"\n        \n        self._con.commit()\n        self._cur.close()\n        self._con.close()", "response": "Commits any pending transactions and closes the database."}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\nexecutes a raw SQL statement on the database and returns a list of the records that were found in the database.", "response": "def sql(self, sql):\n        \n        \"\"\" Executes a raw SQL statement on the database.\n        \"\"\"\n        \n        self._cur.execute(sql)\n        if sql.lower().find(\"select\") >= 0:\n            matches = []\n            for r in self._cur: matches.append(r)\n            return matches"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef find(self, q, operator=\"=\", fields=\"*\", key=None):\n        \n        \"\"\"A simple SQL SELECT query.\n        \n        Retrieves all rows from the table \n        where the given query value is found in the given column (primary key if None).\n        A different comparison operator (e.g. >, <, like) can be set.\n        The wildcard character is * and automatically sets the operator to \"like\".\n        Optionally, the fields argument can be a list of column names to select.\n        Returns a list of row tuples containing fields.\n        \n        \"\"\"\n        \n        if key == None: key = self._key\n        if fields != \"*\": fields = \", \".join(fields)\n        try: q = unicode(q)\n        except: pass\n        if q != \"*\" and (q[0] == \"*\" or q[-1] == \"*\"):\n            if q[0]  == \"*\": q = \"%\"+q.lstrip(\"*\")\n            if q[-1] == \"*\": q = q.rstrip(\"*\")+\"%\"\n            operator = \"like\"\n        \n        if q != \"*\":\n            sql = \"select \"+fields+\" from \"+self._name+\" where \"+key+\" \"+operator+\" ?\"\n            self._db._cur.execute(sql, (q,))\n        else:\n            sql = \"select \"+fields+\" from \"+self._name\n            self._db._cur.execute(sql)\n        \n        # You need to watch out here when bad unicode data \n        # has entered the database: pysqlite will throw an OperationalError.\n        # http://lists.initd.org/pipermail/pysqlite/2006-April/000488.html\n        matches = []\n        for r in self._db._cur: matches.append(r)\n        return matches", "response": "A simple SQL SELECT query."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef append(self, *args, **kw):\n        \n        \"\"\"Adds a new row to a table.\n        \n        Adds a row to the given table.\n        The column names and their corresponding values\n        must either be supplied as a dictionary of {fields:values},\n        or a series of keyword arguments of field=value style.\n        \n        \"\"\"\n\n        if args and kw: \n            return\n        if args and type(args[0]) == dict:\n            fields = [k for k in args[0]]\n            v = [args[0][k] for k in args[0]]\n        if kw:\n            fields = [k for k in kw]\n            v = [kw[k] for k in kw]\n        \n        q = \", \".join([\"?\" for x in fields])\n        sql  = \"insert into \"+self._name+\" (\"+\", \".join(fields)+\") \"\n        sql += \"values (\"+q+\")\"\n        self._db._cur.execute(sql, v)\n        self._db._i += 1\n        if self._db._i >= self._db._commit:\n            self._db._i = 0\n            self._db._con.commit()", "response": "Adds a new row to the given table."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef edit(self, id, *args, **kw):\n        \n        \"\"\" Edits the row with given id.\n        \"\"\"\n        \n        if args and kw: \n            return\n        if args and type(args[0]) == dict:\n            fields = [k for k in args[0]]\n            v = [args[0][k] for k in args[0]]\n        if kw:\n            fields = [k for k in kw]\n            v = [kw[k] for k in kw]\n        \n        sql  = \"update \"+self._name+\" set \"+\"=?, \".join(fields)+\"=? where \"+self._key+\"=\"+unicode(id)\n        self._db._cur.execute(sql, v)\n        self._db._i += 1\n        if self._db._i >= self._db._commit:\n            self._db._i = 0\n            self._db._con.commit()", "response": "Edits the row with given id."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef remove(self, id, operator=\"=\", key=None):\n\n        \"\"\" Deletes the row with given id.\n        \"\"\"\n\n        if key == None: key = self._key\n        try: id = unicode(id)\n        except: pass        \n        sql = \"delete from \"+self._name+\" where \"+key+\" \"+operator+\" ?\"\n        self._db._cur.execute(sql, (id,))", "response": "Deletes the row with given id."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\ngetting the next available event or None", "response": "def next_event(block=False, timeout=None):\n    \"\"\"\n    Get the next available event or None\n\n    :param block:\n    :param timeout:\n    :return: None or (event, data)\n    \"\"\"\n    try:\n        return next(channel.listen(block=block, timeout=timeout))['data']\n    except StopIteration:\n        return None"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef publish_event(event_t, data=None, extra_channels=None, wait=None):\n    event = Event(event_t, data)\n    pubsub.publish(\"shoebot\", event)\n    for channel_name in extra_channels or []:\n        pubsub.publish(channel_name, event)\n    if wait is not None:\n        channel = pubsub.subscribe(wait)\n        channel.listen(wait)", "response": "Publish an event to any subscribers."}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\nsets the mode of the call transform.", "response": "def _set_mode(self, mode):\n        '''\n        Sets call_transform_mode to point to the\n        center_transform or corner_transform\n        '''\n        if mode == CENTER:\n            self._call_transform_mode = self._center_transform\n        elif mode == CORNER:\n            self._call_transform_mode = self._corner_transform\n        else:\n            raise ValueError('mode must be CENTER or CORNER')"}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nreturn a matrix that transforms the image from the center to the given transform.", "response": "def _center_transform(self, transform):\n        ''''\n        Works like setupTransform of a version of java nodebox\n        http://dev.nodebox.net/browser/nodebox-java/branches/rewrite/src/java/net/nodebox/graphics/Grob.java\n        '''\n        dx, dy = self._get_center()\n        t = cairo.Matrix()\n        t.translate(dx, dy)\n        t = transform * t\n        t.translate(-dx, -dy)\n        return t"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef inheritFromContext(self, ignore=()):\n        for canvas_attr, grob_attr in STATES.items():\n            if canvas_attr in ignore:\n                continue\n            setattr(self, grob_attr, getattr(self._bot._canvas, canvas_attr))", "response": "Copies each drawing state attribute listed in STATES from the bot's canvas onto this object, skipping attributes named in ignore."}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nloads the source code into the execution environment and call the bad_cb function if the source is not None.", "response": "def load_edited_source(self, source, good_cb=None, bad_cb=None, filename=None):\n        \"\"\"\n        Load changed code into the execution environment.\n\n        Until the code is executed correctly, it will be\n        in the 'tenuous' state.\n        \"\"\"\n        with LiveExecution.lock:\n            self.good_cb = good_cb\n            self.bad_cb = bad_cb\n            try:\n                # text compile\n                compile(source + '\\n\\n', filename or self.filename, \"exec\")\n                self.edited_source = source\n            except Exception as e:\n                if bad_cb:\n                    self.edited_source = None\n                    tb = traceback.format_exc()\n                    self.call_bad_cb(tb)\n                return\n            if filename is not None:\n                self.filename = filename"}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\nreloads functions in namespace with functions from edited_source.", "response": "def reload_functions(self):\n        \"\"\"\n        Replace functions in namespace with functions from edited_source.\n        \"\"\"\n        with LiveExecution.lock:\n            if self.edited_source:\n                tree = ast.parse(self.edited_source)\n                for f in [n for n in ast.walk(tree) if isinstance(n, ast.FunctionDef)]:\n                    self.ns[f.name].__code__ = meta.decompiler.compile_func(f, self.filename, self.ns).__code__"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\nrunning the tenuous command.", "response": "def run_tenuous(self):\n        \"\"\"\n        Run edited source, if no exceptions occur then it\n        graduates to known good.\n        \"\"\"\n        with LiveExecution.lock:\n            ns_snapshot = copy.copy(self.ns)\n            try:\n                source = self.edited_source\n                self.edited_source = None\n                self.do_exec(source, ns_snapshot)\n                self.known_good = source\n                self.call_good_cb()\n                return True, None\n            except Exception as ex:\n                tb = traceback.format_exc()\n                self.call_bad_cb(tb)\n                self.ns.clear()\n                self.ns.update(ns_snapshot)\n                return False, ex"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nrunning the current process.", "response": "def run(self):\n        \"\"\"\n        Attempt to run the tenuous (edited) source; fall back to the known good source.\n        \"\"\"\n        with LiveExecution.lock:\n            if self.edited_source:\n                success, ex = self.run_tenuous()\n                if success:\n                    return\n\n            self.do_exec(self.known_good, self.ns)"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef call_bad_cb(self, tb):\n        with LiveExecution.lock:\n            if self.bad_cb and not self.bad_cb(tb):\n                self.bad_cb = None", "response": "Call bad_cb with the traceback tb and keep the callback only if it returns True."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\ncalling the good_cb function if it returns True then keep it", "response": "def call_good_cb(self):\n        \"\"\"\n        If good_cb returns True then keep it\n        :return:\n        \"\"\"\n        with LiveExecution.lock:\n            if self.good_cb and not self.good_cb():\n                self.good_cb = None"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef cohesion(self, d=100):\n        \n        \"\"\" Boids move towards the flock's centre of mass.\n        \n        The centre of mass is the average position of all boids,\n        not including itself (the \"perceived centre\").\n        \n        \"\"\"\n        \n        vx = vy = vz = 0\n        for b in self.boids:\n            if b != self:\n                vx, vy, vz = vx+b.x, vy+b.y, vz+b.z\n                \n        n = len(self.boids)-1\n        vx, vy, vz = vx/n, vy/n, vz/n\n        \n        return (vx-self.x)/d, (vy-self.y)/d, (vz-self.z)/d", "response": "Returns a velocity vector steering the boid towards the flock's centre of mass, computed from all other boids and scaled down by d."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef separation(self, r=10):\n        \n        \"\"\" Boids keep a small distance from other boids.\n        \n        Ensures that boids don't collide into each other,\n        in a smoothly accelerated motion.\n        \n        \"\"\"\n        \n        vx = vy = vz = 0\n        for b in self.boids:\n            if b != self:\n                if abs(self.x-b.x) < r: vx += (self.x-b.x)\n                if abs(self.y-b.y) < r: vy += (self.y-b.y)\n                if abs(self.z-b.z) < r: vz += (self.z-b.z)\n                \n        return vx, vy, vz", "response": "Returns a velocity vector that pushes the boid away from any other boid closer than r along each axis."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef alignment(self, d=5):\n        \n        \"\"\" Boids match velocity with other boids.\n        \"\"\"\n        \n        vx = vy = vz = 0\n        for b in self.boids:\n           if b != self:\n               vx, vy, vz = vx+b.vx, vy+b.vy, vz+b.vz\n        \n        n = len(self.boids)-1\n        vx, vy, vz = vx/n, vy/n, vz/n\n        \n        return (vx-self.vx)/d, (vy-self.vy)/d, (vz-self.vz)/d", "response": "Returns a velocity vector that nudges the boid's velocity towards the average velocity of the other boids, scaled down by d."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef limit(self, max=30):\n        \n        \"\"\" The speed limit for a boid.\n        \n        Boids can momentarily go very fast,\n        something that is impossible for real animals.\n        \n        \"\"\"\n        \n        if abs(self.vx) > max: \n            self.vx = self.vx/abs(self.vx)*max\n        if abs(self.vy) > max: \n            self.vy = self.vy/abs(self.vy)*max\n        if abs(self.vz) > max: \n            self.vz = self.vz/abs(self.vz)*max", "response": "Clamps each velocity component (vx, vy, vz) of the boid to a magnitude of at most max, preserving its sign."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef _angle(self):\n        \n        \"\"\" Returns the angle towards which the boid is steering.\n        \"\"\"\n        \n        from math import atan, pi, degrees\n        a = degrees(atan(self.vy/self.vx)) + 360\n        if self.vx < 0: a += 180\n\n        return a", "response": "Returns the angle, in degrees, towards which the boid is steering."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef goal(self, x, y, z, d=50.0):\n        \n        \"\"\" Tendency towards a particular place.\n        \"\"\"\n        \n        return (x-self.x)/d, (y-self.y)/d, (z-self.z)/d", "response": "Returns a velocity vector pulling the boid towards the point (x, y, z), scaled down by d."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef constrain(self):\n        \n        \"\"\" Cages the flock inside the x, y, w, h area.\n        \n        The actual cage is a bit larger,\n        so boids don't seem to bounce off invisible walls\n        (they are rather \"encouraged\" to stay in the area).\n        \n        If a boid touches the ground level,\n        it may decide to perch there for a while.\n        \n        \"\"\"\n        \n        dx = self.w * 0.1\n        dy = self.h * 0.1 \n        \n        for b in self:\n            \n            if b.x < self.x-dx: b.vx += _ctx.random(dx)\n            if b.y < self.y-dy: b.vy += _ctx.random(dy)\n            if b.x > self.x+self.w+dx: b.vx -= _ctx.random(dx)\n            if b.y > self.y+self.h+dy: b.vy -= _ctx.random(dy)\n            if b.z < 0: b.vz += 10\n            if b.z > 100: b.vz -= 10\n            \n            if b.y > self._perch_y and _ctx.random() < self._perch:\n                b.y = self._perch_y\n                b.vy = -abs(b.vy) * 0.2\n                b.is_perching = True\n                try:\n                    b._perch_t = self._perch_t()\n                except:\n                    b._perch_t = self._perch_t", "response": "Keeps the flock inside a slightly enlarged x, y, w, h area by steering stray boids back, and lets boids at ground level perch with a given probability."}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\nupdates the list of boids with the next motion frame.", "response": "def update(self, \n               shuffled=True, \n               cohesion=100, \n               separation=10, \n               alignment=5, \n               goal=20,\n               limit=30):\n        \n        \"\"\" Calculates the next motion frame for the flock.\n        \"\"\"\n        \n        # Shuffling the list of boids ensures fluid movement.\n        # If you need the boids to retain their position in the list\n        # each update, set the shuffled parameter to False.\n        from random import shuffle\n        if shuffled: shuffle(self)\n        \n        m1 = 1.0 # cohesion\n        m2 = 1.0 # separation\n        m3 = 1.0 # alignment\n        m4 = 1.0 # goal\n        \n        # The flock scatters randomly with a Boids.scatter chance.\n        # This means their cohesion (m1) is reversed,\n        # and their joint alignment (m3) is dimished,\n        # causing boids to oscillate in confusion.\n        # Setting Boids.scatter(chance=0) ensures they never scatter.\n        if not self.scattered and _ctx.random() < self._scatter:\n            self.scattered = True\n        if self.scattered:\n            m1 = -m1\n            m3 *= 0.25\n            self._scatter_i += 1\n        if self._scatter_i >= self._scatter_t:\n            self.scattered = False\n            self._scatter_i = 0\n\n        # A flock can have a goal defined with Boids.goal(x,y,z),\n        # a place of interest to flock around.\n        if not self.has_goal:\n            m4 = 0\n        if self.flee:\n            m4 = -m4\n        \n        for b in self:\n            \n            # A boid that is perching will continue to do so\n            # until Boid._perch_t reaches zero.\n            if b.is_perching:\n                if b._perch_t > 0:\n                    b._perch_t -= 1\n                    continue\n    
            else:\n                    b.is_perching = False\n            \n            vx1, vy1, vz1 = b.cohesion(cohesion)\n            vx2, vy2, vz2 = b.separation(separation)\n            vx3, vy3, vz3 = b.alignment(alignment)\n            vx4, vy4, vz4 = b.goal(self._gx, self._gy, self._gz, goal)\n            \n            b.vx += m1*vx1 + m2*vx2 + m3*vx3 + m4*vx4\n            b.vy += m1*vy1 + m2*vy2 + m3*vy3 + m4*vy4\n            b.vz += m1*vz1 + m2*vz2 + m3*vz3 + m4*vz4\n            \n            b.limit(limit)\n        \n            b.x += b.vx\n            b.y += b.vy\n            b.z += b.vz\n        \n        self.constrain()"}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nyielding the match and end_idx for each match", "response": "def iterscan(self, string, idx=0, context=None):\n        \"\"\"\n        Yield match, end_idx for each match\n        \"\"\"\n        match = self.scanner.scanner(string, idx).match\n        actions = self.actions\n        lastend = idx\n        end = len(string)\n        while True:\n            m = match()\n            if m is None:\n                break\n            matchbegin, matchend = m.span()\n            if lastend == matchend:\n                break\n            action = actions[m.lastindex]\n            if action is not None:\n                rval, next_pos = action(m, context)\n                if next_pos is not None and next_pos != matchend:\n                    # \"fast forward\" the scanner\n                    matchend = next_pos\n                    match = self.scanner.scanner(string, matchend).match\n                yield rval, matchend\n            lastend = matchend"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\nreturn a copy of the layout for the given graph.", "response": "def copy(self, graph):\n        \n        \"\"\" Returns a copy of the layout for the given graph.\n        \"\"\"\n        \n        l = self.__class__(graph, self.n)\n        l.i = 0\n        return l"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef create(iterations=1000, distance=1.0, layout=LAYOUT_SPRING, depth=True):\n    \n    \"\"\" Returns a new graph with predefined styling.\n    \"\"\"\n\n    #global _ctx\n   \n        \n    _ctx.colormode(_ctx.RGB)\n    g = graph(iterations, distance, layout)\n   \n    \n    # Styles for different types of nodes.\n    s = style.style\n    g.styles.append(s(style.LIGHT    , _ctx, fill   = _ctx.color(0.0, 0.0, 0.0, 0.20)))\n    g.styles.append(s(style.DARK     , _ctx, fill   = _ctx.color(0.3, 0.5, 0.7, 0.75)))\n    g.styles.append(s(style.BACK     , _ctx, fill   = _ctx.color(0.5, 0.8, 0.0, 0.50)))\n    g.styles.append(s(style.IMPORTANT, _ctx, fill   = _ctx.color(0.3, 0.6, 0.8, 0.75)))\n    g.styles.append(s(style.HIGHLIGHT, _ctx, stroke = _ctx.color(1.0, 0.0, 0.5), strokewidth=1.5))\n    g.styles.append(s(style.MARKED   , _ctx))\n    g.styles.append(s(style.ROOT     , _ctx, text   = _ctx.color(1.0, 0.0, 0.4, 1.00), \n                                             stroke = _ctx.color(0.8, 0.8, 0.8, 0.60),\n                                             strokewidth = 1.5, \n                                             fontsize    = 16, \n                                             textwidth   = 150))\n\n    # Important nodes get a double stroke.\n    def important_node(s, node, alpha=1.0):\n        style.style(None, _ctx).node(s, node, alpha)\n        r = node.r * 1.4\n        _ctx.nofill()\n        _ctx.oval(node.x-r, node.y-r, r*2, r*2)  \n\n    # Marked nodes have an inner dot.\n    def marked_node(s, node, alpha=1.0):\n        style.style(None, _ctx).node(s, node, alpha)\n        r = node.r * 0.3\n        _ctx.fill(s.stroke)\n        _ctx.oval(node.x-r, node.y-r, r*2, r*2)\n    \n    g.styles.important.node = important_node\n    g.styles.marked.node = marked_node \n    \n    g.styles.depth = depth\n\n    # Styling guidelines. 
All nodes have the default style, except:\n    # 1) a node directly connected to the root gets the LIGHT style.\n    # 2) a node with more than 4 edges gets the DARK style.\n    # 3) a node with a weight of 0.75-1.0 gets the IMPORTANT style.\n    # 4) the graph.root node gets the ROOT style.\n    # 5) the node last clicked gets the BACK style.    \n    g.styles.guide.append(style.LIGHT     , lambda graph, node: graph.root in node.links)\n    g.styles.guide.append(style.DARK      , lambda graph, node: len(node.links) > 4)\n    g.styles.guide.append(style.IMPORTANT , lambda graph, node: node.weight > 0.75)\n    g.styles.guide.append(style.ROOT      , lambda graph, node: node == graph.root)\n    g.styles.guide.append(style.BACK      , lambda graph, node: node == graph.events.clicked)\n    \n    # An additional rule applies every node's weight to its radius.\n    def balance(graph, node): \n        node.r = node.r*0.75 + node.r*node.weight*0.75\n    g.styles.guide.append(\"balance\", balance)\n    \n    # An additional rule that keeps leaf nodes closely clustered.\n    def cluster(graph, node):\n        if len(node.links) == 1: \n            node.links.edge(node.links[0]).length *= 0.5\n    g.styles.guide.append(\"cluster\", cluster)\n    \n    g.styles.guide.order = [\n        style.LIGHT, style.DARK, style.IMPORTANT, style.ROOT, style.BACK, \"balance\", \"nurse\"\n    ]\n\n    return g", "response": "Creates a new graph with the given number of iterations and distance."}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nreturns True if given node can be reached over traversable edges.", "response": "def can_reach(self, node, traversable=lambda node, edge: True):\n        \n        \"\"\" Returns True if given node can be reached over traversable edges.\n        To enforce edge direction, use a node==edge.node1 traversable.\n        \"\"\"\n        \n        if isinstance(node, str):\n            node = self.graph[node]\n        for n in self.graph.nodes:\n            n._visited = False\n        return proximity.depth_first_search(self,\n            visit=lambda n: node == n,\n            traversable=traversable\n            )"}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\ncreates a copy of the graph.", "response": "def copy(self, empty=False):\n        \n        \"\"\" Create a copy of the graph (by default with nodes and edges).\n        \"\"\"\n        \n        g = graph(self.layout.n, self.distance, self.layout.type)\n        g.layout = self.layout.copy(g)\n        g.styles = self.styles.copy(g)\n        g.events = self.events.copy(g)\n\n        if not empty:\n            for n in self.nodes:\n                g.add_node(n.id, n.r, n.style, n.category, n.label, (n == self.root), n.__dict__)\n            for e in self.edges:\n                g.add_edge(e.node1.id, e.node2.id, e.weight, e.length, e.label, e.__dict__)\n        \n        return g"}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nremove nodes and edges and reset the layout.", "response": "def clear(self):\n        \n        \"\"\" Remove nodes and edges and reset the layout.\n        \"\"\"\n        \n        dict.clear(self)\n        self.nodes = []\n        self.edges = []\n        self.root  = None\n        \n        self.layout.i = 0\n        self.alpha = 0"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\nadd a node from id and return the node object.", "response": "def add_node(self, id, radius=8, style=style.DEFAULT, category=\"\", label=None, root=False,\n                 properties={}):\n        \n        \"\"\" Add node from id and return the node object.\n        \"\"\"\n        \n        if id in self: \n            return self[id]\n            \n        if not isinstance(style, str) and \"name\" in style.__dict__:\n            style = style.name\n        \n        n = node(self, id, radius, style, category, label, properties)\n        self[n.id] = n\n        self.nodes.append(n)\n        if root: self.root = n\n            \n        return n"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef add_edge(self, id1, id2, weight=0.0, length=1.0, label=\"\", properties={}):\n        \n        \"\"\" Add weighted (0.0-1.0) edge between nodes, creating them if necessary.\n        The weight represents the importance of the connection (not the cost).\n        \"\"\"\n        \n        if id1 == id2: return None\n        \n        if not self.has_key(id1): self.add_node(id1)\n        if not self.has_key(id2): self.add_node(id2)\n        n1 = self[id1]\n        n2 = self[id2]\n        \n        # If a->b already exists, don't re-create it.\n        # However, b->a may still pass.\n        if n1 in n2.links:\n            if n2.links.edge(n1).node1 == n1:\n                return self.edge(id1, id2)\n\n        weight = max(0.0, min(weight, 1.0))\n\n        e = edge(n1, n2, weight, length, label, properties)\n        self.edges.append(e)    \n        n1.links.append(n2, e)\n        n2.links.append(n1, e)\n\n        return e", "response": "Add an edge between two nodes."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef remove_node(self, id):\n        \n        \"\"\" Remove node with given id.\n        \"\"\"\n \n        if id in self:\n            n = self[id]\n            self.nodes.remove(n)\n            del self[id]\n            \n            # Remove all edges involving id and all links to it.\n            for e in list(self.edges):\n                if n in (e.node1, e.node2):\n                    if n in e.node1.links: \n                        e.node1.links.remove(n)\n                    if n in e.node2.links: \n                        e.node2.links.remove(n)\n                    self.edges.remove(e)", "response": "Removes the node with the given id, along with all edges and links that involve it."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef remove_edge(self, id1, id2):\n        \n        \"\"\" Remove edges between nodes with given id's.\n        \"\"\"\n        \n        for e in list(self.edges):\n            if id1 in (e.node1.id, e.node2.id) and \\\n               id2 in (e.node1.id, e.node2.id):\n                e.node1.links.remove(e.node2)\n                e.node2.links.remove(e.node1)\n                self.edges.remove(e)", "response": "Removes edges between nodes with given ids."}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nreturn the edge between the nodes with given id1 and id2.", "response": "def edge(self, id1, id2):\n        \"\"\" Returns the edge between the nodes with given id1 and id2.\n        \"\"\"\n        if id1 in self and \\\n           id2 in self and \\\n           self[id2] in self[id1].links:\n            return self[id1].links.edge(id2)\n        return None"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef update(self, iterations=10):\n        \n        \"\"\" Iterates the graph layout and updates node positions.\n        \"\"\"    \n        \n        # The graph fades in when initially constructed.\n        self.alpha += 0.05\n        self.alpha = min(self.alpha, 1.0)\n\n        # Iterates over the graph's layout.\n        # Each step the graph's bounds are recalculated\n        # and a number of iterations are processed,\n        # more and more as the layout progresses.\n        if self.layout.i == 0:\n            self.layout.prepare()\n            self.layout.i += 1\n        elif self.layout.i == 1:\n            self.layout.iterate()\n        elif self.layout.i < self.layout.n:\n            n = min(iterations, self.layout.i // 10 + 1)\n            for i in range(n): \n                self.layout.iterate()\n        \n        # Calculate the absolute center of the graph.\n        min_, max = self.layout.bounds\n        self.x = _ctx.WIDTH - max.x*self.d - min_.x*self.d\n        self.y = _ctx.HEIGHT - max.y*self.d - min_.y*self.d\n        self.x /= 2\n        self.y /= 2\n            \n        return not self.layout.done", "response": "Advances the graph layout by a number of iterations, recalculates the graph's center, and returns True while the layout is not yet done."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef offset(self, node):\n        x = self.x + node.x - _ctx.WIDTH/2\n        y = self.y + node.y - _ctx.HEIGHT/2\n        return x, y", "response": "Returns the node's (x, y) position translated by the graph's center and measured relative to the middle of the canvas."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef draw(self, dx=0, dy=0, weighted=False, directed=False, highlight=[], traffic=None):\n        \n        \"\"\" Layout the graph incrementally.\n        \n        The graph is drawn at the center of the canvas.\n        The weighted and directed parameters visualize edge weight and direction.\n        The highlight specifies list of connected nodes. \n        The path will be colored according to the \"highlight\" style.\n        Clicking and dragging events are monitored.\n        \n        \"\"\"\n        \n        self.update()\n\n        # Draw the graph background.\n        s = self.styles.default\n        s.graph_background(s)\n\n        # Center the graph on the canvas.\n        _ctx.push()\n        _ctx.translate(self.x+dx, self.y+dy)\n \n        # Indicate betweenness centrality.\n        if traffic:\n            if isinstance(traffic, bool): \n                traffic = 5\n            for n in self.nodes_by_betweenness()[:traffic]:\n                try: s = self.styles[n.style]\n                except: s = self.styles.default\n                if s.graph_traffic:\n                    s.graph_traffic(s, n, self.alpha)        \n\n        # Draw the edges and their labels.\n        s = self.styles.default\n        if s.edges:\n            s.edges(s, self.edges, self.alpha, weighted, directed)\n        \n        # Draw each node in the graph.\n        # Apply individual style to each node (or default).        
\n        for n in self.nodes:\n            try:  s = self.styles[n.style]\n            except: s = self.styles.default\n            if s.node:\n                s.node(s, n, self.alpha)\n        \n        # Highlight the given shortest path.\n        try: s = self.styles.highlight\n        except: s = self.styles.default\n        if s.path:\n            s.path(s, self, highlight)\n\n        # Draw node id's as labels on each node.\n        for n in self.nodes:\n            try:  s = self.styles[n.style]\n            except: s = self.styles.default\n            if s.node_label:\n                s.node_label(s, n, self.alpha)\n        \n        # Events for clicked and dragged nodes.\n        # Nodes will resist being dragged by attraction and repulsion,\n        # put the event listener on top to get more direct feedback.\n        #self.events.update()\n        \n        _ctx.pop()", "response": "Layout the graph incrementally.\n        \n        The graph is drawn at the center of the canvas.\n        The weighted and directed parameters visualize edge weight and direction.\n        The highlight specifies list of connected nodes. \n        The path will be colored according to the \"highlight\" style.\n        Clicking and dragging events are monitored."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\nremove all nodes with less or equal links than depth.", "response": "def prune(self, depth=0):\n        \"\"\" Removes all nodes with less or equal links than depth.\n        \"\"\"\n        for n in list(self.nodes):\n            if len(n.links) <= depth:\n                self.remove_node(n.id)"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef betweenness_centrality(self, normalized=True):\n        bc = proximity.brandes_betweenness_centrality(self, normalized)\n        for id, w in bc.iteritems(): self[id]._betweenness = w\n        return bc", "response": "Calculates betweenness centrality and returns a node id -> weight dictionary. Each node's betweenness weight is updated in the process."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef eigenvector_centrality(self, normalized=True, reversed=True, rating={},\n                               start=None, iterations=100, tolerance=0.0001):\n        \"\"\" Calculates eigenvector centrality and returns an node id -> weight dictionary.\n        Node eigenvalue weights are updated in the process.\n        \"\"\"\n        ec = proximity.eigenvector_centrality(\n            self, normalized, reversed, rating, start, iterations, tolerance\n        )\n        for id, w in ec.iteritems(): self[id]._eigenvalue = w\n        return ec", "response": "Calculates eigenvector centrality and returns a node id -> weight dictionary. Node eigenvalue weights are updated in the process."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nreturn a list of nodes sorted by betweenness centrality.", "response": "def nodes_by_betweenness(self, treshold=0.0):\n        \"\"\" Returns nodes sorted by betweenness centrality.\n        Nodes with a lot of passing traffic will be at the front of the list.\n        \"\"\"\n        nodes = [(n.betweenness, n) for n in self.nodes if n.betweenness > treshold]\n        nodes.sort(); nodes.reverse()\n        return [n for w, n in nodes]"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef nodes_by_eigenvalue(self, treshold=0.0):\n        nodes = [(n.eigenvalue, n) for n in self.nodes if n.eigenvalue > treshold]\n        nodes.sort(); nodes.reverse()\n        return [n for w, n in nodes]", "response": "Returns a list of nodes sorted by eigenvector centrality."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef nodes_by_category(self, category):\n        return [n for n in self.nodes if n.category == category]", "response": "Returns a list of nodes with the given category attribute."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef crown(self, depth=2):\n        nodes = []\n        for node in self.leaves: nodes += node.flatten(depth-1)\n        return cluster.unique(nodes)", "response": "Returns a list of unique nodes: the leaves, the nodes connected to the leaves, and so on, up to the given depth."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef _density(self):\n        return 2.0*len(self.edges) / (len(self.nodes) * (len(self.nodes)-1))", "response": "Returns the density of the graph: the number of edges divided by the number of possible edges, 2*E / (N*(N-1))."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef load(self, id):\n        \n        \"\"\" Rebuilds the graph around the given node id.\n        \"\"\"\n        \n        self.clear()\n    \n        # Root node.\n        self.add_node(id, root=True)\n    \n        # Directly connected nodes have priority.\n        for w, id2 in self.get_links(id):\n            self.add_edge(id, id2, weight=w)\n            if len(self) > self.max: \n                break\n\n        # Now get all the other nodes in the cluster.\n        for w, id2, links in self.get_cluster(id):\n            for id3 in links:\n                self.add_edge(id3, id2, weight=w)\n                self.add_edge(id, id3, weight=w)\n            #if len(links) == 0:\n            #    self.add_edge(id, id2)\n            if len(self) > self.max: \n                break    \n\n        # Provide a backlink to the previous root.\n        if self.event.clicked: \n            g.add_node(self.event.clicked)", "response": "Clears the graph and rebuilds it around the given node id: adds the root node, then its direct links, then the rest of its cluster, stopping once the graph exceeds its maximum size."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\ncomputing a cubic Bezier approximation of an elliptical arc.", "response": "def bezier_arc(x1, y1, x2, y2, start_angle=0, extent=90):\n    \n    \"\"\" Compute a cubic Bezier approximation of an elliptical arc.\n\n    (x1, y1) and (x2, y2) are the corners of the enclosing rectangle.\n    The coordinate system has coordinates that increase to the right and down.\n    Angles, measured in degress, start with 0 to the right (the positive X axis) \n    and increase counter-clockwise.\n    The arc extends from start_angle to start_angle+extent.\n    I.e. start_angle=0 and extent=180 yields an openside-down semi-circle.\n\n    The resulting coordinates are of the form (x1,y1, x2,y2, x3,y3, x4,y4)\n    such that the curve goes from (x1, y1) to (x4, y4) with (x2, y2) and\n    (x3, y3) as their respective Bezier control points.\n    \"\"\"\n\n    x1,y1, x2,y2 = min(x1,x2), max(y1,y2), max(x1,x2), min(y1,y2)\n\n    if abs(extent) <= 90:\n        frag_angle = float(extent)\n        nfrag = 1\n    else:\n        nfrag = int(ceil(abs(extent)/90.))\n        if nfrag == 0:\n            warnings.warn('Invalid value for extent: %r' % extent)\n            return []\n        frag_angle = float(extent) / nfrag\n\n    x_cen = (x1+x2)/2.\n    y_cen = (y1+y2)/2.\n    rx = (x2-x1)/2.\n    ry = (y2-y1)/2.\n    half_angle = radians(frag_angle) / 2\n    kappa = abs(4. / 3. * (1. 
- cos(half_angle)) / sin(half_angle))\n\n    if frag_angle < 0:\n        sign = -1\n    else:\n        sign = 1\n\n    point_list = []\n\n    for i in range(nfrag):\n        theta0 = radians(start_angle + i*frag_angle)\n        theta1 = radians(start_angle + (i+1)*frag_angle)\n        c0 = cos(theta0)\n        c1 = cos(theta1)\n        s0 = sin(theta0)\n        s1 = sin(theta1)\n        if frag_angle > 0:\n            signed_kappa = -kappa\n        else:\n            signed_kappa = kappa\n        point_list.append((x_cen + rx * c0,\n                          y_cen - ry * s0,\n                          x_cen + rx * (c0 + signed_kappa * s0),\n                          y_cen - ry * (s0 - signed_kappa * c0),\n                          x_cen + rx * (c1 - signed_kappa * s1),\n                          y_cen - ry * (s1 + signed_kappa * c1),\n                          x_cen + rx * c1,\n                          y_cen - ry * s1))\n\n    return point_list"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\nreturn the angle in degrees between two vectors.", "response": "def angle(x1, y1, x2, y2):\n    \"\"\" The angle in degrees between two vectors.\n    \"\"\"\n    sign = 1.0\n    usign = (x1*y2 - y1*x2)\n    if usign < 0:\n        sign = -1.0\n    num = x1*x2 + y1*y2\n    den = hypot(x1,y1) * hypot(x2,y2)\n    ratio = min(max(num/den, -1.0), 1.0)\n    return sign * degrees(acos(ratio))"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef transform_from_local(xp, yp, cphi, sphi, mx, my):\n    x = xp * cphi - yp * sphi + mx\n    y = xp * sphi + yp * cphi + my\n    return (x,y)", "response": "Transform from the local frame to absolute space."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef _create_view(self, name=\"shoebot-output\"):\n        view = Gtk.TextView()\n        view.set_editable(False)\n\n        fontdesc = Pango.FontDescription(\"Monospace\")\n        view.modify_font(fontdesc)\n        view.set_name(name)\n\n        buff = view.get_buffer()\n        buff.create_tag('error', foreground='red')\n        return view", "response": "Creates the Gtk.TextView used for shell output."}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nsetting the bot on this object and its sink.", "response": "def set_bot(self, bot):\n        ''' Bot must be set before running '''\n        self.bot = bot\n        self.sink.set_bot(bot)"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef settings(self, **kwargs):\n        '''\n        Pass a load of settings into the canvas\n        '''\n        for k, v in kwargs.items():\n            setattr(self, k, v)", "response": "Sets each keyword argument as an attribute on the canvas."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef size_or_default(self):\n        '''\n        If size is not set, otherwise set size to DEFAULT_SIZE\n        and return it.\n\n        This means, only the first call to size() is valid.\n        '''\n        if not self.size:\n            self.size = self.DEFAULT_SIZE\n        return self.size", "response": "Returns the size, first setting it to DEFAULT_SIZE if it has not been set yet."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\nset the size of the object.", "response": "def set_size(self, size):\n        '''\n        Size is only set the first time it is called\n\n        Size that is set is returned\n        '''\n        if self.size is None:\n            self.size = size\n            return size\n        else:\n            return self.size"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef snapshot(self, target, defer=True, file_number=None):\n        '''\n        Ask the drawqueue to output to target.\n\n        target can be anything supported by the combination\n        of canvas implementation and drawqueue implementation.\n\n        If target is not supported then an exception is thrown.\n        '''\n        output_func = self.output_closure(target, file_number)\n        if defer:\n            self._drawqueue.append(output_func)\n        else:\n            self._drawqueue.append_immediate(output_func)", "response": "Ask the drawqueue to output to target."}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\nflushes the drawqueue to the sink.", "response": "def flush(self, frame):\n        '''\n        Passes the drawqueue to the sink for rendering\n        '''\n        self.sink.render(self.size_or_default(), frame, self._drawqueue)\n        self.reset_drawqueue()"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef overlap(self, x1, y1, x2, y2, r=5):\n        \n        \"\"\" Returns True when point 1 and point 2 overlap.\n        \n        There is an r threshold in which point 1 and point 2\n        are considered to overlap.\n        \n        \"\"\"\n        \n        if abs(x2-x1) < r and abs(y2-y1) < r:\n            return True\n        else:\n            return False", "response": "Returns True when point 1 and point 2 overlap, i.e. when they lie within r of each other on both axes."}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\nreflects the point x y through origin x0 y0.", "response": "def reflect(self, x0, y0, x, y):\n        \n        \"\"\" Reflects the point x, y through origin x0, y0.\n        \"\"\"\n                \n        rx = x0 - (x-x0)\n        ry = y0 - (y-y0)\n        return rx, ry"}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\ncalculate the angle between two points.", "response": "def angle(self, x0, y0, x1, y1):\n        \n        \"\"\" Calculates the angle between two points.\n        \"\"\"\n    \n        a = degrees( atan((y1-y0) / (x1-x0+0.00001)) ) + 360\n        if x1-x0 < 0: a += 180\n        return a"}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\ncalculates the coordinates of a point from the origin.", "response": "def coordinates(self, x0, y0, distance, angle):\n        \n        \"\"\" Calculates the coordinates of a point from the origin.\n        \"\"\"\n        \n        x = x0 + cos(radians(angle)) * distance\n        y = y0 + sin(radians(angle)) * distance\n        return Point(x, y)"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef contains_point(self, x, y, d=2):\n        \n        \"\"\" Returns true when x, y is on the path stroke outline.\n        \"\"\"\n        \n        if self.path != None and len(self.path) > 1 \\\n        and self.path.contains(x, y):\n            # If all points around the mouse are also part of the path,\n            # this means we are somewhere INSIDE the path.\n            # Only points near the edge (i.e. on the outline stroke)\n            # should propagate.\n            if not self.path.contains(x+d, y) \\\n            or not self.path.contains(x, y+d) \\\n            or not self.path.contains(x-d, y) \\\n            or not self.path.contains(x, y-d) \\\n            or not self.path.contains(x+d, y+d) \\\n            or not self.path.contains(x-d, y-d) \\\n            or not self.path.contains(x+d, y-d) \\\n            or not self.path.contains(x-d, y+d):\n                return True\n\n        return False", "response": "Returns True when the point (x, y) is on the path's stroke outline."}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\ninserts a point on the path at the mouse location.", "response": "def insert_point(self, x, y):\n        \n        \"\"\" Inserts a point on the path at the mouse location.\n        \n        We first need to check if the mouse location is on the path.\n        Inserting point is time intensive and experimental.\n        \n        \"\"\"\n        \n        try: \n            bezier = _ctx.ximport(\"bezier\")\n        except:\n            from nodebox.graphics import bezier\n        \n        # Do a number of checks distributed along the path.\n        # Keep the one closest to the actual mouse location.\n        # Ten checks works fast but leads to imprecision in sharp corners\n        # and curves closely located next to each other.\n        # I prefer the slower but more stable approach.\n        n = 100\n        closest = None\n        dx0 = float(\"inf\") \n        dy0 = float(\"inf\")\n        for i in range(n):\n            t = float(i)/n\n            pt = self.path.point(t)\n            dx = abs(pt.x-x)\n            dy = abs(pt.y-y)\n            if dx+dy <= dx0+dy0:\n                dx0 = dx\n                dy0 = dy\n                closest = t\n\n        # Next, scan the area around the approximation.\n        # If the closest point is located at 0.2 on the path,\n        # we need to scan between 0.1 and 0.3 for a better\n        # approximation. If 1.5 was the best guess, scan\n        # 1.40, 1.41 ... 1.59 and so on.\n        # Each decimal precision takes 20 iterations.                
\n        decimals = [3,4]\n        for d in decimals:\n            d = 1.0/pow(10,d)\n            \n            for i in range(20):\n                t = closest-d + float(i)*d*0.1\n                if t < 0.0: t = 1.0+t\n                if t > 1.0: t = t-1.0\n                pt = self.path.point(t)\n                dx = abs(pt.x-x)\n                dy = abs(pt.y-y)\n                if dx <= dx0 and dy <= dy0:\n                    dx0 = dx\n                    dy0 = dy\n                    closest_precise = t\n            \n            closest = closest_precise   \n\n        # Update the points list with the inserted point.\n        p = bezier.insert_point(self.path, closest_precise)\n        i, t, pt = bezier._locate(self.path, closest_precise)\n        i += 1\n        pt = PathElement()\n        pt.cmd = p[i].cmd\n        pt.x = p[i].x\n        pt.y = p[i].y\n        pt.ctrl1 = Point(p[i].ctrl1.x, p[i].ctrl1.y)\n        pt.ctrl2 = Point(p[i].ctrl2.x, p[i].ctrl2.y)\n        pt.freehand = False\n        self._points.insert(i, pt)\n        self._points[i-1].ctrl1 = Point(p[i-1].ctrl1.x, p[i-1].ctrl1.y)\n        self._points[i+1].ctrl1 = Point(p[i+1].ctrl1.x, p[i+1].ctrl1.y)\n        self._points[i+1].ctrl2 = Point(p[i+1].ctrl2.x, p[i+1].ctrl2.y)"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef update(self):\n        \n        \"\"\" Update runs each frame to check for mouse interaction.\n        \n        Alters the path by allowing the user to add new points,\n        drag point handles and move their location.\n        Updates are automatically stored as SVG\n        in the given filename.\n        \n        \"\"\"\n        \n        x, y = mouse()\n        if self.show_grid:\n            x, y = self.grid.snap(x, y)\n        \n        if _ctx._ns[\"mousedown\"] \\\n        and not self.freehand:\n            \n            self._dirty = True\n            \n            # Handle buttons first.\n            # When pressing down on a button, all other action halts.\n            # Buttons appear near a point being edited.\n            # Once clicked, actions are resolved.\n            if self.edit != None \\\n            and not self.drag_point \\\n            and not self.drag_handle1 \\\n            and not self.drag_handle2:\n                pt = self._points[self.edit]\n                dx = pt.x+self.btn_x\n                dy = pt.y+self.btn_y\n                # The delete button\n                if self.overlap(dx, dy, x, y, r=self.btn_r):\n                    self.delete = self.edit\n                    return\n                # The moveto button,\n                # active on the last point in the path.\n                dx += self.btn_r*2 + 2\n                if self.edit == len(self._points) -1 and \\\n                   self.overlap(dx, dy, x, y, r=self.btn_r):\n                    self.moveto = self.edit\n                    return\n                    \n            if self.insert:\n                self.inserting = True\n                return\n            \n            # When not dragging a point or the handle of a point,\n            # i.e. 
the mousebutton was released and then pressed again,\n            # check to see if a point on the path is pressed.\n            # When this point is not the last new point,\n            # enter edit mode.\n            if not self.drag_point and \\\n               not self.drag_handle1 and \\\n               not self.drag_handle2:\n                self.editing = False\n                indices = range(len(self._points))\n                indices.reverse()\n                for i in indices:\n                    pt = self._points[i]\n                    if pt != self.new \\\n                    and self.overlap(x, y, pt.x, pt.y) \\\n                    and self.new == None:\n                        # Don't select a point if in fact\n                        # it is at the same location of the first handle \n                        # of the point we are currently editing.\n                        if self.edit == i+1 \\\n                        and self.overlap(self._points[i+1].ctrl1.x,\n                                         self._points[i+1].ctrl1.y, x, y):\n                            continue\n                        else:\n                            self.edit = i\n                            self.editing = True\n                            break\n            \n            # When the mouse button is down,\n            # edit mode continues as long as\n            # a point or handle is dragged.\n            # Else, stop editing and switch to add-mode\n            # (the user is clicking somewhere on the canvas).\n            if not self.editing:\n                if self.edit != None:\n                    pt = self._points[self.edit]\n                    if self.overlap(pt.ctrl1.x, pt.ctrl1.y, x, y) or \\\n                       self.overlap(pt.ctrl2.x, pt.ctrl2.y, x, y):\n                        self.editing = True\n                    else:\n                        self.edit = None\n                    \n            # When not in edit mode, there are two 
options.\n            # Either no new point is defined and the user is\n            # clicking somewhere on the canvas (add a new point)\n            # or the user is dragging the handle of the new point.\n            # Adding a new point is a fluid click-to-locate and\n            # drag-to-curve action.\n            if self.edit == None:\n                if self.new == None:\n                    # A special case is when the used clicked\n                    # the moveto button on the last point in the path.\n                    # This indicates a gap (i.e. MOVETO) in the path.\n                    self.new = PathElement()\n                    if self.moveto == True \\\n                    or len(self._points) == 0:\n                        cmd = MOVETO\n                        self.moveto = None\n                        self.last_moveto = self.new\n                    else:\n                        cmd = CURVETO\n                    self.new.cmd = cmd\n                    self.new.x = x\n                    self.new.y = y\n                    self.new.ctrl1 = Point(x, y)\n                    self.new.ctrl2 = Point(x, y)\n                    self.new.freehand = False\n                    # Don't forget to map the point's ctrl1 handle\n                    # to the ctrl2 handle of the previous point.\n                    # This makes for smooth, continuous paths.\n                    if len(self._points) > 0:\n                        prev = self._points[-1]\n                        rx, ry = self.reflect(prev.x, prev.y, prev.ctrl2.x, prev.ctrl2.y)\n                        self.new.ctrl1 = Point(rx, ry)\n                    self._points.append(self.new)\n                else:\n                    # Illustrator-like behavior:\n                    # when the handle is dragged downwards,\n                    # the path bulges upwards.\n                    rx, ry = self.reflect(self.new.x, self.new.y, x, y)\n                    self.new.ctrl2 = Point(rx, ry)\n            
\n            # Edit mode\n            elif self.new == None:\n            \n                pt = self._points[self.edit]\n            \n                # The user is pressing the mouse on a point,\n                # enter drag-point mode.\n                if self.overlap(pt.x, pt.y, x, y) \\\n                and not self.drag_handle1 \\\n                and not self.drag_handle2 \\\n                and not self.new != None:\n                    self.drag_point = True\n                    self.drag_handle1 = False\n                    self.drag_handle2 = False\n\n                # The user is pressing the mouse on a point's handle,\n                # enter drag-handle mode.\n                if self.overlap(pt.ctrl1.x, pt.ctrl1.y, x, y) \\\n                and pt.cmd == CURVETO \\\n                and not self.drag_point \\\n                and not self.drag_handle2:\n                    self.drag_point = False\n                    self.drag_handle1 = True\n                    self.drag_handle2 = False\n                if self.overlap(pt.ctrl2.x, pt.ctrl2.y, x, y) \\\n                and pt.cmd == CURVETO \\\n                and not self.drag_point \\\n                and not self.drag_handle1:\n                    self.drag_point = False\n                    self.drag_handle1 = False\n                    self.drag_handle2 = True\n                \n                # In drag-point mode,\n                # the point is located at the mouse coordinates.\n                # The handles move relatively to the new location\n                # (e.g. 
they are retained, the path does not distort).\n                # Modify the ctrl1 handle of the next point as well.\n                if self.drag_point == True:\n                    dx = x - pt.x\n                    dy = y - pt.y\n                    pt.x = x\n                    pt.y = y\n                    pt.ctrl2.x += dx\n                    pt.ctrl2.y += dy\n                    if self.edit < len(self._points)-1:\n                        rx, ry = self.reflect(pt.x, pt.y, x, y)\n                        next = self._points[self.edit+1]\n                        next.ctrl1.x += dx\n                        next.ctrl1.y += dy\n\n                # In drag-handle mode,\n                # set the path's handle to the mouse location.\n                # Rotate the handle of the next or previous point\n                # to keep paths smooth - unless the user is pressing \"x\".\n                if self.drag_handle1 == True:\n                    pt.ctrl1 = Point(x, y)\n                    if self.edit > 0 \\\n                    and self.last_key != \"x\":\n                        prev = self._points[self.edit-1]\n                        d = self.distance(prev.x, prev.y, prev.ctrl2.x, prev.ctrl2.y)\n                        a = self.angle(prev.x, prev.y, pt.ctrl1.x, pt.ctrl1.y)\n                        prev.ctrl2 = self.coordinates(prev.x, prev.y, d, a+180)                        \n                if self.drag_handle2 == True:   \n                    pt.ctrl2 = Point(x, y)\n                    if self.edit < len(self._points)-1 \\\n                    and self.last_key != \"x\":\n                        next = self._points[self.edit+1]\n                        d = self.distance(pt.x, pt.y, next.ctrl1.x, next.ctrl1.y)\n                        a = self.angle(pt.x, pt.y, pt.ctrl2.x, pt.ctrl2.y)\n                        next.ctrl1 = self.coordinates(pt.x, pt.y, d, a+180)\n        \n        elif not self.freehand:\n            \n            # The mouse button is released\n     
       # so we are not dragging anything around.\n            self.new = None\n            self.drag_point = False\n            self.drag_handle1 = False\n            self.drag_handle2 = False\n            \n            # The delete button for a point was clicked.\n            if self.delete != None and len(self._points) > 0:\n                i = self.delete\n                cmd = self._points[i].cmd\n                del self._points[i]\n                if 0 < i < len(self._points):\n                    prev = self._points[i-1]\n                    rx, ry = self.reflect(prev.x, prev.y, prev.ctrl2.x, prev.ctrl2.y)\n                    self._points[i].ctrl1 = Point(rx, ry)\n                # Also delete all the freehand points\n                # prior to this point.\n                start_i = i\n                while i > 1:\n                    i -= 1\n                    pt = self._points[i]\n                    if pt.freehand:\n                        del self._points[i]\n                    elif i < start_i-1 and pt.freehand == False:\n                        if pt.cmd == MOVETO:\n                            del self._points[i]\n                        break\n                # When you delete a MOVETO point,\n                # the last moveto (the one where the dashed line points to)\n                # needs to be updated.\n                if len(self._points) > 0 \\\n                and (cmd == MOVETO or i == 0):\n                    self.last_moveto = self._points[0]\n                    for pt in self._points:\n                        if pt.cmd == MOVETO:\n                            self.last_moveto = pt\n                self.delete = None\n                self.edit = None\n\n            # The moveto button for the last point\n            # in the path was clicked.\n            elif isinstance(self.moveto, int):\n                self.moveto = True\n                self.edit = None\n            \n            # We are not editing a node and\n            # the 
mouse is hovering over the path outline stroke:\n            # it is possible to insert a point here.\n            elif self.edit == None \\\n            and self.contains_point(x, y, d=2):\n                self.insert = True\n            else:\n                self.insert = False\n            \n            # Commit insert of new point.\n            if self.inserting \\\n            and self.contains_point(x, y, d=2): \n                self.insert_point(x, y)\n                self.insert = False\n            self.inserting = False\n            \n            # No modifications are being made right now\n            # and the SVG file needs to be updated.\n            if self._dirty == True:\n                self.export_svg()\n                self._dirty = False\n        \n        # Keyboard interaction.\n        if _ctx._ns[\"keydown\"]:\n            self.last_key = _ctx._ns[\"key\"]\n            self.last_keycode = _ctx._ns[\"keycode\"]\n        if not _ctx._ns[\"keydown\"] and self.last_key != None:\n            # If the TAB-key is pressed,\n            # switch the magnetic grid either on or off.\n            if self.last_keycode == KEY_TAB:\n                self.show_grid = not self.show_grid\n            # When \"f\" is pressed, switch freehand mode.\n            if self.last_key == \"f\":\n                self.edit = None\n                self.freehand = not self.freehand\n                if self.freehand:\n                    self.msg = \"freehand\"\n                else:\n                    self.msg = \"curves\"\n            # When ESC is pressed exit edit mode.\n            if self.last_keycode == KEY_ESC:\n                self.edit = None\n            # When BACKSPACE is pressed, delete current point.\n            if self.last_keycode == _ctx.KEY_BACKSPACE \\\n            and self.edit != None:\n                self.delete = self.edit\n            self.last_key = None\n            self.last_code = None\n        \n        # Using the keypad you can scroll 
the screen.\n        if _ctx._ns[\"keydown\"]:\n            dx = 0\n            dy = 0\n            keycode = _ctx._ns[\"keycode\"]\n            if keycode == _ctx.KEY_LEFT:\n                dx = -10\n            elif keycode == _ctx.KEY_RIGHT:\n                dx = 10\n            if keycode == _ctx.KEY_UP:\n                dy = -10\n            elif keycode == _ctx.KEY_DOWN:\n                dy = 10\n            if dx != 0 or dy != 0:\n                for pt in self._points:\n                    pt.x += dx\n                    pt.y += dy\n                    pt.ctrl1.x += dx\n                    pt.ctrl1.y += dy\n                    pt.ctrl2.x += dx\n                    pt.ctrl2.y += dy", "response": "Updates the path based on the mouse interaction."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef draw(self):\n        \n        \"\"\" Draws the editable path and interface elements.\n        \"\"\"\n                \n        # Enable interaction.\n        self.update()\n        x, y = mouse()\n        \n        # Snap to grid when enabled.\n        # The grid is enabled with the TAB key.\n        if self.show_grid:\n            self.grid.draw()\n            x, y = self.grid.snap(x, y)\n        \n        _ctx.strokewidth(self.strokewidth)\n        if self.freehand:\n            self.draw_freehand()\n        \n        r = 4\n        _ctx.nofill()\n        if len(self._points) > 0:\n            \n            first = True            \n            for i in range(len(self._points)):\n                \n                # Construct the path.\n                pt = self._points[i]\n                if first:\n                    _ctx.beginpath(pt.x, pt.y)\n                    first = False\n                else:\n                    if pt.cmd == CLOSE:\n                        _ctx.closepath()\n                    elif pt.cmd == MOVETO:\n                        _ctx.moveto(pt.x, pt.y)\n                    elif pt.cmd == LINETO:\n                        _ctx.lineto(pt.x, pt.y)\n                    elif pt.cmd == CURVETO:\n                        _ctx.curveto(pt.ctrl1.x, pt.ctrl1.y, \n                                     pt.ctrl2.x, pt.ctrl2.y, \n                                     pt.x, pt.y)\n                # In add- or edit-mode,\n                # display the current point's handles.\n                if ((i == self.edit and self.new == None) \\\n                or pt == self.new) \\\n                and pt.cmd == CURVETO \\\n                and not pt.freehand:\n                    _ctx.stroke(self.handle_color)\n                    _ctx.nofill()\n                    _ctx.oval(pt.x-r, pt.y-r, r*2, r*2)\n                    _ctx.stroke(self.handle_color)\n                    
_ctx.line(pt.ctrl2.x, pt.ctrl2.y, pt.x, pt.y)\n                    _ctx.fill(self.handle_color)\n                # Display the new point's handle being dragged.\n                if pt == self.new \\\n                and not pt.freehand:\n                    rx, ry = self.reflect(pt.x, pt.y, pt.ctrl2.x, pt.ctrl2.y)\n                    _ctx.stroke(self.handle_color)\n                    _ctx.line(rx, ry, pt.x, pt.y)\n                    _ctx.nostroke()\n                    _ctx.fill(self.handle_color)\n                    _ctx.oval(rx-r/2, ry-r/2, r, r)\n                # Display handles for point being edited.\n                if i == self.edit \\\n                and self.new == None \\\n                and pt.cmd == CURVETO \\\n                and not pt.freehand:\n                    _ctx.oval(pt.ctrl2.x-r/2, pt.ctrl2.y-r/2, r, r)\n                    if i > 0:\n                        prev = self._points[i-1]\n                        _ctx.line(pt.ctrl1.x, pt.ctrl1.y, prev.x, prev.y)\n                        _ctx.oval(pt.ctrl1.x-r/2, pt.ctrl1.y-r/2, r, r)\n                    if i > 0 and self._points[i-1].cmd != MOVETO:\n                        _ctx.line(prev.ctrl2.x, prev.ctrl2.y, prev.x, prev.y)\n                    if i < len(self._points)-1:\n                        next = self._points[i+1]\n                        if next.cmd == CURVETO:\n                            _ctx.line(next.ctrl1.x, next.ctrl1.y, pt.x, pt.y)\n                \n                # When hovering over a point,\n                # highlight it.\n                elif self.overlap(x, y, pt.x, pt.y) \\\n                and not pt.freehand:\n                    self.insert = False # quit insert mode\n                    _ctx.nofill()\n                    _ctx.stroke(self.handle_color)\n                    _ctx.oval(pt.x-r, pt.y-r, r*2, r*2)\n                \n                # Provide visual coordinates\n                # for points being dragged, moved or hovered.\n                
_ctx.fontsize(9)\n                _ctx.fill(self.handle_color)\n                txt = \" (\"+str(int(pt.x))+\", \"+str(int(pt.y))+\")\"\n                if (i == self.edit and self.new == None) \\\n                or pt == self.new \\\n                and not pt.freehand:\n                    _ctx.text(txt, pt.x+r, pt.y+2)                                       \n                elif self.overlap(x, y, pt.x, pt.y) \\\n                and not pt.freehand:\n                    _ctx.text(txt, pt.x+r, pt.y+2)\n\n                # Draw a circle for each point\n                # in the path.\n                if not pt.freehand:\n                    if pt.cmd != MOVETO:\n                        _ctx.fill(self.path_color)\n                        _ctx.nostroke()\n                    else:\n                        _ctx.stroke(self.path_color)\n                        _ctx.nofill()\n                    _ctx.oval(pt.x-r/2, pt.y-r/2, r, r)\n                \n            # Draw the current path,\n            # update the path property.\n            _ctx.stroke(self.path_color)\n            _ctx.fill(self.path_fill)\n            _ctx.autoclosepath(False)    \n            p = _ctx.endpath()\n            self.path = p\n            \n            # Possible to insert a point here.\n            if self.insert:\n                _ctx.stroke(self.handle_color)\n                _ctx.nofill()\n                _ctx.oval(x-r*0.8, y-r*0.8, r*1.6, r*1.6)\n                \n            # When not editing a node,\n            # prospect how the curve will continue\n            # when adding a new point.\n            if self.edit == None \\\n            and self.new == None \\\n            and self.moveto != True \\\n            and not self.freehand:\n                _ctx.nofill()\n                _ctx.stroke(self.new_color)\n                rx, ry = self.reflect(pt.x, pt.y, pt.ctrl2.x, pt.ctrl2.y)\n                _ctx.beginpath(pt.x, pt.y)\n                _ctx.curveto(rx, ry, x, y, x, y)\n    
            _ctx.endpath()\n\n                # A dashed line indicates what\n                # a CLOSETO would look like.\n                if self.last_moveto != None:\n                    start = self.last_moveto\n                else:\n                    start = self._points[0]\n                p = _ctx.line(x, y, start.x, start.y, draw=False)\n                try: p._nsBezierPath.setLineDash_count_phase_([2,4], 2, 50)\n                except:\n                    pass\n                _ctx.drawpath(p)\n        \n            # When doing a MOVETO,\n            # show the new point hovering at the mouse location.\n            elif self.edit == None \\\n            and self.new == None \\\n            and self.moveto != None:\n                _ctx.stroke(self.new_color)\n                _ctx.nofill()\n                _ctx.oval(x-r*0.8, y-r*0.8, r*1.6, r*1.6)\n            \n            # Draws button for a point being edited.\n            # The first button deletes the point.\n            # The second button, which appears only on the last point\n            # in the path, tells the editor to perform a MOVETO\n            # before adding a new point.\n            if self.edit != None:\n                pt = self._points[self.edit]\n                x = pt.x + self.btn_x\n                y = pt.y + self.btn_y\n                r = self.btn_r\n                _ctx.nostroke()\n                _ctx.fill(0,0,0,0.2)\n                _ctx.fill(self.handle_color)\n                _ctx.oval(x-r, y-r, r*2, r*2)\n                _ctx.fill(1)\n                _ctx.rotate(45)\n                _ctx.rect(x-r+2, y-0.625, r+1, 1.25)\n                _ctx.rotate(-90)\n                _ctx.rect(x-r+2, y-0.625, r+1, 1.25)\n                _ctx.reset()\n                if self.edit == len(self._points)-1:\n                    _ctx.fill(self.handle_color)\n                    _ctx.oval(x+r*2+2-r, y-r, r*2, r*2)\n                    _ctx.fill(1)\n                    _ctx.rect(x+r*2+2-2.25, 
y-r+3, 1.5, r-1)\n                    _ctx.rect(x+r*2+2+0.75, y-r+3, 1.5, r-1)\n        \n        # Handle onscreen notifications.\n        # Any text in msg is displayed in a box in the center\n        # and slowly fades away, after which msg is cleared.    \n        if self.msg != \"\":\n            self.msg_alpha -= 0.1\n            _ctx.nostroke()\n            _ctx.fill(0,0,0, self.msg_alpha)\n            _ctx.fontsize(18)\n            _ctx.lineheight(1)\n            w = _ctx.textwidth(self.msg)\n            _ctx.rect(_ctx.WIDTH/2-w/2-9, _ctx.HEIGHT/2-27, w+18, 36, roundness=0.4)\n            _ctx.fill(1,1,1, 0.8)\n            _ctx.align(CENTER) \n            _ctx.text(self.msg, 0, _ctx.HEIGHT/2, width=_ctx.WIDTH)\n        if self.msg_alpha <= 0.0:\n            self.msg = \"\"\n            self.msg_alpha = 1.0", "response": "Draw the editable path and interface elements."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef draw_freehand(self):\n        \n        \"\"\" Freehand sketching.\n        \"\"\"\n        \n        if _ctx._ns[\"mousedown\"]:\n            \n            x, y = mouse()\n            if self.show_grid:\n                x, y = self.grid.snap(x, y)\n            \n            if self.freehand_move == True:\n                cmd = MOVETO\n                self.freehand_move = False\n            else:\n                cmd = LINETO\n            \n            # Add a new LINETO to the path,\n            # except when starting to draw,\n            # then a MOVETO is added to the path.\n            pt = PathElement()\n            if cmd != MOVETO:\n                pt.freehand = True # Used when mixed with curve drawing.\n            else:\n                pt.freehand = False\n            pt.cmd = cmd\n            pt.x = x\n            pt.y = y\n            pt.ctrl1 = Point(x,y)\n            pt.ctrl2 = Point(x,y)\n            self._points.append(pt)\n            \n            # Draw the current location of the cursor.\n            r = 4\n            _ctx.nofill()\n            _ctx.stroke(self.handle_color)\n            _ctx.oval(pt.x-r, pt.y-r, r*2, r*2)\n            _ctx.fontsize(9)\n            _ctx.fill(self.handle_color)\n            _ctx.text(\" (\"+str(int(pt.x))+\", \"+str(int(pt.y))+\")\", pt.x+r, pt.y)\n        \n            self._dirty = True\n        \n        else:\n\n            # Export the updated drawing,\n            # remember to do a MOVETO on the next interaction.\n            self.freehand_move = True\n            if self._dirty:\n                self._points[-1].freehand = False\n                self.export_svg()\n                self._dirty = False", "response": "Handles freehand sketching: while the mouse button is down, appends a point to the path (a MOVETO when a stroke starts, a LINETO otherwise) and draws the cursor position; once the button is released, exports the updated drawing as SVG."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nexport the path as SVG.", "response": "def export_svg(self):\n        \n        \"\"\" Exports the path as SVG.\n        \n        Uses the filename given when creating this object.\n        The file is automatically updated to reflect\n        changes to the path.\n        \n        \"\"\"\n        \n        d = \"\"\n        if len(self._points) > 0:\n            d += \"M \"+str(self._points[0].x)+\" \"+str(self._points[0].y)+\" \"\n            for pt in self._points:\n                if pt.cmd == MOVETO:\n                    d += \"M \"+str(pt.x)+\" \"+str(pt.y)+\" \"\n                elif pt.cmd == LINETO:\n                    d += \"L \"+str(pt.x)+\" \"+str(pt.y)+\" \"\n                elif pt.cmd == CURVETO:\n                    d += \"C \"\n                    d += str(pt.ctrl1.x)+\" \"+str(pt.ctrl1.y)+\" \"\n                    d += str(pt.ctrl2.x)+\" \"+str(pt.ctrl2.y)+\" \"\n                    d += str(pt.x)+\" \"+str(pt.y)+\" \"\n        \n        c = \"rgb(\"\n        c += str(int(self.path_color.r*255)) + \",\"\n        c += str(int(self.path_color.g*255)) + \",\"\n        c += str(int(self.path_color.b*255)) + \")\"\n        \n        s  = '<?xml version=\"1.0\"?>\\n'\n        s += '<svg width=\"'+str(_ctx.WIDTH)+'pt\" height=\"'+str(_ctx.HEIGHT)+'pt\">\\n'\n        s += '<g>\\n'\n        s += '<path d=\"'+d+'\" fill=\"none\" stroke=\"'+c+'\" stroke-width=\"'+str(self.strokewidth)+'\" />\\n'\n        s += '</g>\\n'\n        s += '</svg>\\n'\n        \n        f = open(self.file+\".svg\", \"w\")\n        f.write(s)\n        f.close()"}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\ndownload the image from Flickr and return the path property.", "response": "def download(self, size=SIZE_XLARGE, thumbnail=False, wait=60, asynchronous=False):\n        \n        \"\"\" Downloads this image to cache.\n        \n        Calling the download() method instantiates an asynchronous URLAccumulator\n        that will fetch the image's URL from Flickr.\n        A second process then downloads the file at the retrieved URL.\n        \n        Once it is done downloading, this image will have its path property\n        set to an image file in the cache.\n        \n        \"\"\"\n        \n        if thumbnail == True: size = SIZE_THUMBNAIL # backwards compatibility\n        self._size = disambiguate_size(size)\n        self._wait = wait\n        self._asynchronous = asynchronous\n\n        url  = \"http://api.flickr.com/services/rest/?method=flickr.photos.getSizes\"\n        url += \"&photo_id=\" + self.id\n        url += \"&api_key=\" + API_KEY\n        URLAccumulator.__init__(self, url, wait, asynchronous, \"flickr\", \".xml\", 2)\n\n        if not asynchronous:\n            return self.path"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nhandling right mouse button clicks", "response": "def gtk_mouse_button_down(self, widget, event):\n        ''' Handle right mouse button clicks '''\n        if self.menu_enabled and event.button == 3:\n            menu = self.uimanager.get_widget('/Save as')\n            menu.popup(None, None, None, None, event.button, event.time)\n        else:\n            super(ShoebotWindow, self).gtk_mouse_button_down(widget, event)"}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nshow the variables window.", "response": "def show_variables_window(self):\n        \"\"\"\n        Show the variables window.\n        \"\"\"\n        if self.var_window is None and self.bot._vars:\n            self.var_window = VarWindow(self, self.bot, '%s variables' % (self.title or 'Shoebot'))\n            self.var_window.window.connect(\"destroy\", self.var_window_closed)"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\nhide the variables window", "response": "def hide_variables_window(self):\n        \"\"\"\n        Hide the variables window\n        \"\"\"\n        if self.var_window is not None:\n            self.var_window.window.destroy()\n            self.var_window = None"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef var_window_closed(self, widget):\n        # TODO - Clean up the menu handling stuff its a bit spagetti right now\n        self.action_group.get_action('vars').set_active(False)\n        self.show_vars = False\n        self.var_window = None", "response": "Called if user clicked close button on var window"}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\ntells the canvas to perform a snapshot when it's finished rendering", "response": "def schedule_snapshot(self, format):\n        \"\"\"\n        Tell the canvas to perform a snapshot when it's finished rendering\n        :param format:\n        :return:\n        \"\"\"\n        bot = self.bot\n        canvas = self.bot.canvas\n        script = bot._namespace['__file__']\n        if script:\n            filename = os.path.splitext(script)[0] + '.' + format\n        else:\n            filename = 'output.' + format\n\n        f = canvas.output_closure(filename, self.bot._frame)\n        self.scheduled_snapshots.append(f)"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef trigger_fullscreen_action(self, fullscreen):\n        action = self.action_group.get_action('fullscreen')\n        action.set_active(fullscreen)", "response": "Toggle fullscreen from outside the GUI by setting the state of the 'fullscreen' action, which causes the GUI to update and run its associated actions."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef do_window_close(self, widget, data=None):\n        publish_event(QUIT_EVENT)\n\n        if self.has_server:\n            self.sock.close()\n\n        self.hide_variables_window()\n\n        self.destroy()\n        self.window_open = False", "response": "This function is called when the user closes the window."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef do_toggle_fullscreen(self, action):\n        is_fullscreen = action.get_active()\n        if is_fullscreen:\n            self.fullscreen()\n        else:\n            self.unfullscreen()", "response": "Called when the fullscreen action is toggled; enters fullscreen if the action is active and leaves it otherwise."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef do_toggle_play(self, action):\n        # TODO - move this into bot controller\n        # along with stuff in socketserver and shell\n        if self.pause_speed is None and not action.get_active():\n            self.pause_speed = self.bot._speed\n            self.bot._speed = 0\n        else:\n            self.bot._speed = self.pause_speed\n            self.pause_speed = None", "response": "Widget Action to toggle play / pause."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef do_toggle_variables(self, action):\n        self.show_vars = action.get_active()\n        if self.show_vars:\n            self.show_variables_window()\n        else:\n            self.hide_variables_window()", "response": "Called when the variables action is toggled; shows the variables window if the action is active and hides it otherwise."}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\nsets the default values. Called at __init__ and at the end of run.", "response": "def _set_initial_defaults(self):\n        '''Set the default values. Called at __init__ and at the end of run(),\n        so that new draw loop iterations don't take up values left over by the\n        previous one.'''\n        DEFAULT_WIDTH, DEFAULT_HEIGHT = self._canvas.DEFAULT_SIZE\n        self.WIDTH = self._namespace.get('WIDTH', DEFAULT_WIDTH)\n        self.HEIGHT = self._namespace.get('HEIGHT', DEFAULT_HEIGHT)\n        if 'WIDTH' in self._namespace or 'HEIGHT' in self._namespace:\n            self.size(w=self._namespace.get('WIDTH'), h=self._namespace.get('HEIGHT'))\n\n        self._transformmode = Bot.CENTER\n\n        self._canvas.settings(\n            fontfile=\"assets/notcouriersans.ttf\",\n            fontsize=16,\n            align=Bot.LEFT,\n            lineheight=1,\n            fillcolor=self.color(.2),\n            strokecolor=None,\n            strokewidth=1.0,\n            background=self.color(1, 1, 1))"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef _key_pressed(self, key, keycode):\n        '''GUI callback for key pressed'''\n        self._namespace['key'] = key\n        self._namespace['keycode'] = keycode\n        self._namespace['keydown'] = True", "response": "GUI callback for key pressed"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef _makeInstance(self, clazz, args, kwargs):\n        '''Creates an instance of a class defined in this document.\n           This method sets the context of the object to the current context.'''\n        inst = clazz(self, *args, **kwargs)\n        return inst", "response": "Creates an instance of a class defined in this document."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef _makeColorableInstance(self, clazz, args, kwargs):\n        kwargs = dict(kwargs)\n\n        fill = kwargs.get('fill', self._canvas.fillcolor)\n        if not isinstance(fill, Color):\n            fill = Color(fill, mode='rgb', color_range=1)\n        kwargs['fill'] = fill\n\n        stroke = kwargs.get('stroke', self._canvas.strokecolor)\n        if not isinstance(stroke, Color):\n            stroke = Color(stroke, mode='rgb', color_range=1)\n        kwargs['stroke'] = stroke\n\n        kwargs['strokewidth'] = kwargs.get('strokewidth', self._canvas.strokewidth)\n        inst = clazz(self, *args, **kwargs)\n        return inst", "response": "Creates an instance of the given class, taking fill, stroke and strokewidth from the canvas defaults when they are not passed and converting fill and stroke to Color objects."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef color(self, *args):\n        '''\n        :param args: color in a supported format.\n\n        :return: Color object containing the color.\n        '''\n        return self.Color(mode=self.color_mode, color_range=self.color_range, *args)", "response": "Color method that returns a Color object containing the color."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nreturn an iterator that contains coordinate tuples.", "response": "def grid(self, cols, rows, colSize=1, rowSize=1, shuffled=False):\n        \"\"\"Returns an iterator that contains coordinate tuples.\n        The grid can be used to quickly create grid-like structures.\n        A common way to use them is:\n            for x, y in grid(10,10,12,12):\n                rect(x,y, 10,10)\n        \"\"\"\n        # Taken ipsis verbis from Nodebox\n        from random import shuffle\n        # Build lists, since shuffle() needs a mutable sequence in Python 3.\n        rowRange = list(range(int(rows)))\n        colRange = list(range(int(cols)))\n        if (shuffled):\n            shuffle(rowRange)\n            shuffle(colRange)\n        for y in rowRange:\n            for x in colRange:\n                yield (x * colSize, y * rowSize)"}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nsave the contents of current surface into a file or a Cairo surface or context.", "response": "def snapshot(self, target=None, defer=None, autonumber=False):\n        '''Save the contents of current surface into a file or cairo surface/context\n\n        :param target: Can be a filename or a Cairo surface.\n        :param defer: If true, buffering/threading may be employed however output will not be immediate.\n        :param autonumber: If true then a number will be appended to the filename.\n        '''\n        if autonumber:\n            file_number = self._frame\n        else:\n            file_number = None\n\n        if isinstance(target, cairo.Surface):\n            # snapshot to Cairo surface\n            if defer is None:\n                self._canvas.snapshot(target, defer)\n                defer = False\n            ctx = cairo.Context(target)\n            # this used to be self._canvas.snapshot, but I couldn't make it work.\n            # self._canvas.snapshot(target, defer)\n            # TODO: check if this breaks when taking more than 1 snapshot\n            self._canvas._drawqueue.render(ctx)\n            return\n        elif target is None:\n            # If nothing specified, use a default filename from the script name\n            script_file = self._namespace.get('__file__')\n            if script_file:\n                target = os.path.splitext(script_file)[0] + '.svg'\n                file_number = True\n\n        if target:\n            # snapshot to file, target is a filename\n            if defer is None:\n                defer = True\n            self._canvas.snapshot(target, defer=defer, file_number=file_number)\n        else:\n            raise ShoebotError('No image saved')"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef show(self, format='png', as_data=False):\n        '''Returns an Image object of the current surface. Used for displaying\n        output in Jupyter notebooks. Adapted from the cairo-jupyter project.'''\n\n        from io import BytesIO\n\n        b = BytesIO()\n\n        if format == 'png':\n            from IPython.display import Image\n            surface = cairo.ImageSurface(cairo.FORMAT_ARGB32, self.WIDTH, self.HEIGHT)\n            self.snapshot(surface)\n            surface.write_to_png(b)\n            b.seek(0)\n            data = b.read()\n            if as_data:\n                return data\n            else:\n                return Image(data)\n        elif format == 'svg':\n            from IPython.display import SVG\n            surface = cairo.SVGSurface(b, self.WIDTH, self.HEIGHT)\n            surface.finish()\n            b.seek(0)\n            data = b.read()\n            if as_data:\n                return data\n            else:\n                return SVG(data)", "response": "Returns an IPython display object (Image for 'png', SVG for 'svg') of the current surface, or the raw bytes if as_data is true. Used for displaying output in Jupyter notebooks."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\nimporting Nodebox libraries. The libraries get _ctx, which provides them with the Nodebox API.", "response": "def ximport(self, libName):\n        '''\n        Import Nodebox libraries.\n\n        The libraries get _ctx, which provides\n        them with the nodebox API.\n\n        :param libName: Library name to import\n        '''\n        # from Nodebox\n        lib = __import__(libName)\n        self._namespace[libName] = lib\n        lib._ctx = self\n        return lib"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\nsetting the canvas size w h", "response": "def size(self, w=None, h=None):\n        '''Set the canvas size\n\n        Only the first call will actually be effective.\n\n        :param w: Width\n        :param h: height\n        '''\n\n        if not w:\n            w = self._canvas.width\n        if not h:\n            h = self._canvas.height\n        if not w and not h:\n            return (self._canvas.width, self._canvas.height)\n\n        # FIXME: Updating in all these places seems a bit hacky\n        w, h = self._canvas.set_size((w, h))\n        self._namespace['WIDTH'] = w\n        self._namespace['HEIGHT'] = h\n        self.WIDTH = w  # Added to make evolution example work\n        self.HEIGHT = h"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nsetting animation framerate. :param framerate: Frames per second to run bot. :return: Current framerate of animation.", "response": "def speed(self, framerate=None):\n        '''Set animation framerate.\n\n        :param framerate: Frames per second to run bot.\n        :return: Current framerate of animation.\n        '''\n        if framerate is not None:\n            self._speed = framerate\n            self._dynamic = True\n        else:\n            return self._speed"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef set_callbacks(self, **kwargs):\n        ''' Set callbacks for input events '''\n        for name in self.SUPPORTED_CALLBACKS:\n            func = kwargs.get(name, getattr(self, name))\n            setattr(self, name, func)", "response": "Set callbacks for input events"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef complement(clr):\n    clr = color(clr)\n    colors = colorlist(clr)\n    colors.append(clr.complement)\n\n    return colors", "response": "Returns the color and its complement in a list."}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nreturn a list of complementary colors.", "response": "def complementary(clr):\n    \"\"\"\n    Returns a list of complementary colors.\n\n    The complement is the color 180 degrees across\n    the artistic RYB color wheel.\n    The list contains darker and softer contrasting\n    and complementing colors.\n    \"\"\"\n    clr = color(clr)\n    colors = colorlist(clr)\n\n    # A contrasting color: much darker or lighter than the original.\n    c = clr.copy()\n    if clr.brightness > 0.4:\n        c.brightness = 0.1 + c.brightness * 0.25\n    else:\n        c.brightness = 1.0 - c.brightness * 0.25\n    colors.append(c)\n\n    # A soft supporting color: lighter and less saturated.\n    c = clr.copy()\n    c.brightness = 0.3 + c.brightness\n    c.saturation = 0.1 + c.saturation * 0.3\n    colors.append(c)\n\n    # A contrasting complement: very dark or very light.\n    clr = clr.complement\n    c = clr.copy()\n    if clr.brightness > 0.3:\n        c.brightness = 0.1 + clr.brightness * 0.25\n    else:\n        c.brightness = 1.0 - c.brightness * 0.25\n    colors.append(c)\n\n    # The complement and a light supporting variant.\n    colors.append(clr)\n\n    c = clr.copy()\n    c.brightness = 0.3 + c.brightness\n    c.saturation = 0.1 + c.saturation * 0.25\n    colors.append(c)\n\n    return colors"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef split_complementary(clr):\n    clr = color(clr)\n    colors = colorlist(clr)\n    clr = clr.complement\n    colors.append(clr.rotate_ryb(-30).lighten(0.1))\n    colors.append(clr.rotate_ryb(30).lighten(0.1))\n\n    return colors", "response": "Returns a list with the split complement of the color."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nreturn the left half of the split complement.", "response": "def left_complement(clr):\n    \"\"\"\n    Returns the left half of the split complement.\n\n    A list is returned with the same darker and softer colors\n    as in the complementary list, but using the hue of the\n    left split complement instead of the complement itself.\n    \"\"\"\n    left = split_complementary(clr)[1]\n    colors = complementary(clr)\n    colors[3].h = left.h\n    colors[4].h = left.h\n    colors[5].h = left.h\n\n    colors = colorlist(\n        colors[0], colors[2], colors[1], colors[3], colors[4], colors[5]\n    )\n\n    return colors"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef right_complement(clr):\n    right = split_complementary(clr)[2]\n    colors = complementary(clr)\n    colors[3].h = right.h\n    colors[4].h = right.h\n    colors[5].h = right.h\n\n    colors = colorlist(\n        colors[0], colors[2], colors[1], colors[5], colors[4], colors[3]\n    )\n\n    return colors", "response": "Returns the right half of the split complement."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nreturning an analogous color list for a given color.", "response": "def analogous(clr, angle=10, contrast=0.25):\n    \"\"\"\n    Returns colors that are next to each other on the wheel.\n\n    These yield natural color schemes (like shades of water or sky).\n    The angle determines how far the colors are apart,\n    making it bigger will introduce more variation.\n    The contrast determines the darkness/lightness of\n    the analogue colors in respect to the given colors.\n    \"\"\"\n    contrast = max(0, min(contrast, 1.0))\n\n    clr = color(clr)\n    colors = colorlist(clr)\n\n    for i, j in [(1, 2.2), (2, 1), (-1, -0.5), (-2, 1)]:\n        c = clr.rotate_ryb(angle * i)\n        t = 0.44 - j * 0.1\n        if clr.brightness - contrast * j < t:\n            c.brightness = t\n        else:\n            c.brightness = clr.brightness - contrast * j\n        c.saturation -= 0.05\n        colors.append(c)\n\n    return colors"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef monochrome(clr):\n    def _wrap(x, min, threshold, plus):\n        if x - min < threshold:\n            return x + plus\n        else:\n            return x - min\n\n    colors = colorlist(clr)\n\n    c = clr.copy()\n    c.brightness = _wrap(clr.brightness, 0.5, 0.2, 0.3)\n    c.saturation = _wrap(clr.saturation, 0.3, 0.1, 0.3)\n    colors.append(c)\n\n    c = clr.copy()\n    c.brightness = _wrap(clr.brightness, 0.2, 0.2, 0.6)\n    colors.append(c)\n\n    c = clr.copy()\n    c.brightness = max(0.2, clr.brightness + (1 - clr.brightness) * 0.2)\n    c.saturation = _wrap(clr.saturation, 0.3, 0.1, 0.3)\n    colors.append(c)\n\n    c = clr.copy()\n    c.brightness = _wrap(clr.brightness, 0.5, 0.2, 0.3)\n    colors.append(c)\n\n    return colors", "response": "Returns a list of colors in the same hue with varying brightness and saturation."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nreturning a triad of colors.", "response": "def triad(clr, angle=120):\n    \"\"\"\n    Returns a triad of colors.\n\n    The triad is made up of this color and two other colors\n    that together make up an equilateral triangle on\n    the artistic color wheel.\n    \"\"\"\n    clr = color(clr)\n    colors = colorlist(clr)\n    colors.append(clr.rotate_ryb(angle).lighten(0.1))\n    colors.append(clr.rotate_ryb(-angle).lighten(0.1))\n\n    return colors"}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nreturns a tetrad of colors.", "response": "def tetrad(clr, angle=90):\n    \"\"\"\n    Returns a tetrad of colors.\n\n    The tetrad is made up of this color and three other colors\n    that together make up a cross on the artistic color wheel.\n    \"\"\"\n    clr = color(clr)\n    colors = colorlist(clr)\n\n    # Lighten dark colors and darken bright ones.\n    c = clr.rotate_ryb(angle)\n    if clr.brightness < 0.5:\n        c.brightness += 0.2\n    else:\n        c.brightness -= 0.2\n    colors.append(c)\n\n    c = clr.rotate_ryb(angle * 2)\n    if clr.brightness < 0.5:\n        c.brightness += 0.1\n    else:\n        c.brightness -= 0.1\n    colors.append(c)\n\n    colors.append(clr.rotate_ryb(angle * 3).lighten(0.1))\n\n    return colors"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef compound(clr, flip=False):\n    def _wrap(x, min, threshold, plus):\n        if x - min < threshold:\n            return x + plus\n        else:\n            return x - min\n\n    d = 1\n    if flip: d = -1\n\n    clr = color(clr)\n    colors = colorlist(clr)\n\n    c = clr.rotate_ryb(30 * d)\n    c.brightness = _wrap(clr.brightness, 0.25, 0.6, 0.25)\n    colors.append(c)\n\n    c = clr.rotate_ryb(30 * d)\n    c.saturation = _wrap(clr.saturation, 0.4, 0.1, 0.4)\n    c.brightness = _wrap(clr.brightness, 0.4, 0.2, 0.4)\n    colors.append(c)\n\n    c = clr.rotate_ryb(160 * d)\n    c.saturation = _wrap(clr.saturation, 0.25, 0.1, 0.25)\n    c.brightness = max(0.2, clr.brightness)\n    colors.append(c)\n\n    c = clr.rotate_ryb(150 * d)\n    c.saturation = _wrap(clr.saturation, 0.1, 0.8, 0.1)\n    c.brightness = _wrap(clr.brightness, 0.3, 0.6, 0.3)\n    colors.append(c)\n\n    c = clr.rotate_ryb(150 * d)\n    c.saturation = _wrap(clr.saturation, 0.1, 0.8, 0.1)\n    c.brightness = _wrap(clr.brightness, 0.4, 0.2, 0.4)\n    # colors.append(c)\n\n    return colors", "response": "Returns a compound color scheme: the given color plus four variations rotated on the RYB color wheel with adjusted saturation and brightness."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef outline(path, colors, precision=0.4, continuous=True):\n    # The count of points in a given path/contour.\n    def _point_count(path, precision):\n        return max(int(path.length * precision * 0.5), 10)\n\n    # The total count of points in the path.\n    n = sum([_point_count(contour, precision) for contour in path.contours])\n\n    # For a continuous gradient,\n    # we need to calculate a subrange in the list of colors\n    # for each contour to draw colors from.\n    contour_i = 0\n    contour_n = len(path.contours) - 1\n    if contour_n == 0: continuous = False\n\n    i = 0\n    for contour in path.contours:\n\n        if not continuous: i = 0\n\n        # The number of points for each contour.\n        j = _point_count(contour, precision)\n\n        first = True\n        for pt in contour.points(j):\n            if first:\n                first = False\n            else:\n                if not continuous:\n                    # If we have a list of 100 colors and 50 points,\n                    # point i maps to color i*2.\n                    clr = float(i) / j * len(colors)\n                else:\n                    # In a continuous gradient of 100 colors,\n                    # the 2nd contour in a path with 10 contours\n                    # draws colors between 10-20\n                    clr = float(i) / n * len(colors) - 1 * contour_i / contour_n\n                _ctx.stroke(colors[int(clr)])\n                _ctx.line(x0, y0, pt.x, pt.y)\n            x0 = pt.x\n            y0 = pt.y\n            i += 1\n\n        pt = contour.point(0.9999999)  # Fix in pathmatics!\n        _ctx.line(x0, y0, pt.x, pt.y)\n        contour_i += 1", "response": "Outline a path with a list of colors."}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nguesses the shade and hue name of a color.", "response": "def guess_name(clr):\n    \"\"\"\n    Guesses the shade and hue name of a color.\n\n    If the given color is named in the named_colors list, return that name.\n    Otherwise guess its nearest hue and shade range.\n    \"\"\"\n    clr = Color(clr)\n\n    if clr.is_transparent: return \"transparent\"\n    if clr.is_black: return \"black\"\n    if clr.is_white: return \"white\"\n\n    for name in named_colors:\n        try:\n            r, g, b = named_colors[name]\n        except (TypeError, ValueError):\n            # Skip entries that are not plain (r, g, b) triples.\n            continue\n        if r == clr.r and g == clr.g and b == clr.b:\n            return name\n\n    for shade in shades:\n        if clr in shade:\n            return shade.name + \" \" + clr.nearest_hue()\n\n    return clr.nearest_hue()"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef shader(x, y, dx, dy, radius=300, angle=0, spread=90):\n    if angle != None:\n        radius *= 2\n\n    # Get the distance and angle between point and light source.\n    d = sqrt((dx - x) ** 2 + (dy - y) ** 2)\n    a = degrees(atan2(dy - y, dx - x)) + 180\n\n    # If no angle is defined,\n    # light is emitted evenly in all directions\n    # and carries as far as the defined radius\n    # (e.g. like a radial gradient).\n    if d <= radius:\n        d1 = 1.0 * d / radius\n    else:\n        d1 = 1.0\n    if angle is None:\n        return 1 - d1\n\n    # Normalize the light's direction and spread\n    # between 0 and 360.\n    angle = 360 - angle % 360\n    spread = max(0, min(spread, 360))\n    if spread == 0:\n        return 0.0\n\n    # Objects that fall within the spreaded direction\n    # of the light are illuminated.\n    d = abs(a - angle)\n    if d <= spread / 2:\n        d2 = d / spread + d1\n    else:\n        d2 = 1.0\n\n    # Wrapping from 0 to 360:\n    # a light source with a direction of 10 degrees\n    # and a spread of 45 degrees illuminates\n    # objects between 0 and 35 degrees and 350 and 360 degrees.\n    if 360 - angle <= spread / 2:\n        d = abs(360 - angle + a)\n        if d <= spread / 2:\n            d2 = d / spread + d1\n    # Wrapping from 360 to 0.\n    if angle < spread / 2:\n        d = abs(360 + angle - a)\n        if d <= spread / 2:\n            d2 = d / spread + d1\n\n    return 1 - max(0, min(d2, 1))", "response": "Returns the brightness (0.0-1.0) of the point (x, y) when lit by a light source at (dx, dy). The radius sets how far the light carries; angle and spread set the direction and width of the beam, and an angle of None gives an even radial light."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nsearching the theme cache for a specific color and range.", "response": "def search_engine(query, top=5, service=\"google\", license=None,\n                  cache=os.path.join(DEFAULT_CACHE, \"google\")):\n    \"\"\"\n    Return a color aggregate from colors and ranges parsed from the web.\n    T. De Smedt, http://nodebox.net/code/index.php/Prism\n    \"\"\"\n    # Check if we have cached information first.\n    try:\n        a = theme(query, cache=cache)\n        return a\n    except:\n        pass\n\n    if service == \"google\":\n        from web import google\n        search_engine = google\n    if service == \"yahoo\":\n        from web import yahoo\n        search_engine = yahoo\n        if license:\n            yahoo.license_key = license\n\n    # Sort all the primary hues (plus black and white) for q.\n    sorted_colors = search_engine.sort(\n        [h for h in primary_hues] + [\"black\", \"white\"],\n        context=query, strict=True, cached=True\n    )\n\n    # Sort all the shades (bright, hard, ...) for q.\n    sorted_shades = search_engine.sort(\n        [str(s) for s in shades],\n        context=query, strict=True, cached=True\n    )\n\n    # Reforms '\"black death\"' to 'black'.\n    f = lambda x: x.strip(\"\\\"\").split()[0]\n\n    # Take the top most relevant hues.\n    n2 = sum([w for h, w in sorted_colors[:top]])\n    sorted_colors = [(color(f(h)), w / n2) for h, w in sorted_colors[:top]]\n\n    # Take the three most relevant shades.\n    n2 = sum([w for s, w in sorted_shades[:3]])\n    sorted_shades = [(shade(f(s)), w / n2) for s, w in sorted_shades[:3]]\n\n    a = theme(cache=cache)\n    a.name = query\n    for clr, w1 in sorted_colors:\n        for rng, w2 in sorted_shades:\n            a.add_range(rng, clr, w1 * w2)\n\n    a._save()\n    return a"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef morguefile(query, n=10, top=10):\n\n    from web import morguefile\n    images = morguefile.search(query)[:top]\n    path = choice(images).download(thumbnail=True, wait=10)\n\n    return ColorList(path, n, name=query)", "response": "Returns a list of colors drawn from a morgueFile image."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef str_to_rgb(self, str):\n\n        \"\"\" Returns RGB values based on a descriptive string.\n\n        If the given str is a named color, return its RGB values.\n        Otherwise, return a random named color that has str\n        in its name, or a random named color which name appears in str.\n\n        Specific suffixes (-ish, -ed, -y and -like) are recognised\n        as well, for example, if you need a random variation of \"red\"\n        you can use reddish (or greenish, yellowy, etc.)\n\n        \"\"\"\n\n        str = str.lower()\n        for ch in \"_- \":\n            str = str.replace(ch, \"\")\n\n        # if str in named_hues:\n        #    clr = color(named_hues[str], 1, 1, mode=\"hsb\")\n        #    return clr.r, clr.g, clr.b\n\n        if str in named_colors:\n            return named_colors[str]\n\n        for suffix in [\"ish\", \"ed\", \"y\", \"like\"]:\n            str = re.sub(\"(.*?)\" + suffix + \"$\", \"\\\\1\", str)\n        str = re.sub(\"(.*?)dd$\", \"\\\\1d\", str)\n\n        matches = []\n        for name in named_colors:\n            if name in str or str in name:\n                matches.append(named_colors[name])\n        if len(matches) > 0:\n            return choice(matches)\n\n        return named_colors[\"transparent\"]", "response": "Returns the RGB values for a descriptive string: an exact named color if one matches, otherwise a random named color whose name partially matches the string (after stripping suffixes such as -ish or -like), or transparent if nothing matches."}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nrotating the color wheel by the given angle.", "response": "def rotate_ryb(self, angle=180):\n\n        \"\"\" Returns a color rotated on the artistic RYB color wheel.\n\n        An artistic color wheel has slightly different opposites\n        (e.g. purple-yellow instead of purple-lime).\n        It is mathematically incorrect but generally assumed\n        to provide better complementary colors.\n\n        http://en.wikipedia.org/wiki/RYB_color_model\n\n        \"\"\"\n\n        h = self.h * 360\n        angle = angle % 360\n\n        # Approximation of Itten's RYB color wheel.\n        # In HSB, colors hues range from 0-360.\n        # However, on the artistic color wheel these are not evenly distributed.\n        # The second tuple value contains the actual distribution.\n        wheel = [\n            (0, 0), (15, 8),\n            (30, 17), (45, 26),\n            (60, 34), (75, 41),\n            (90, 48), (105, 54),\n            (120, 60), (135, 81),\n            (150, 103), (165, 123),\n            (180, 138), (195, 155),\n            (210, 171), (225, 187),\n            (240, 204), (255, 219),\n            (270, 234), (285, 251),\n            (300, 267), (315, 282),\n            (330, 298), (345, 329),\n            (360, 0)\n        ]\n\n        # Given a hue, find out under what angle it is\n        # located on the artistic color wheel.\n        for i in _range(len(wheel) - 1):\n            x0, y0 = wheel[i]\n            x1, y1 = wheel[i + 1]\n            if y1 < y0:\n                y1 += 360\n            if y0 <= h <= y1:\n                a = 1.0 * x0 + (x1 - x0) * (h - y0) / (y1 - y0)\n                break\n\n        # And the user-given angle (e.g. 
complement).\n        a = (a + angle) % 360\n\n        # For the given angle, find out what hue is\n        # located there on the artistic color wheel.\n        for i in _range(len(wheel) - 1):\n            x0, y0 = wheel[i]\n            x1, y1 = wheel[i + 1]\n            if y1 < y0:\n                y1 += 360\n            if x0 <= a <= x1:\n                h = 1.0 * y0 + (y1 - y0) * (a - x0) / (x1 - x0)\n                break\n\n        h = h % 360\n        return Color(h / 360, self.s, self.brightness, self.a, mode=\"hsb\", name=\"\")"}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\nreturns the name of the nearest named hue.", "response": "def nearest_hue(self, primary=False):\n\n        \"\"\" Returns the name of the nearest named hue.\n\n        For example,\n        if you supply an indigo color (a color between blue and violet),\n        the return value is \"violet\". If primary is set  to True,\n        the return value is \"purple\".\n\n        Primary colors leave out the fuzzy lime, teal,\n        cyan, azure and violet hues.\n\n        \"\"\"\n        if self.is_black:\n            return \"black\"\n        elif self.is_white:\n            return \"white\"\n        elif self.is_grey:\n            return \"grey\"\n\n        if primary:\n            hues = primary_hues\n        else:\n            hues = named_hues.keys()\n        nearest, d = \"\", 1.0\n        for hue in hues:\n            if abs(self.hue - named_hues[hue]) % 1 < d:\n                nearest, d = hue, abs(self.hue - named_hues[hue]) % 1\n\n        return nearest"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef blend(self, clr, factor=0.5):\n        r = self.r * (1 - factor) + clr.r * factor\n        g = self.g * (1 - factor) + clr.g * factor\n        b = self.b * (1 - factor) + clr.b * factor\n        a = self.a * (1 - factor) + clr.a * factor\n        return Color(r, g, b, a, mode=\"rgb\")", "response": "Blend this color with another color."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef distance(self, clr):\n        coord = lambda a, d: (cos(radians(a)) * d, sin(radians(a)) * d)\n        x0, y0 = coord(self.h * 360, self.s)\n        x1, y1 = coord(clr.h * 360, clr.s)\n        z0 = self.brightness\n        z1 = clr.brightness\n        d = sqrt((x1 - x0) ** 2 + (y1 - y0) ** 2 + (z1 - z0) ** 2)\n        return d", "response": "Returns the Euclidean distance between two colors."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef image_to_rgb(self, path, n=10):\n        from PIL import Image\n        img = Image.open(path)\n        p = img.getdata()\n        f = lambda p: choice(p)\n\n        for i in _range(n):\n            rgba = f(p)\n            rgba = _list(rgba)\n            if len(rgba) == 3:\n                rgba.append(255)\n            r, g, b, a = [v / 255.0 for v in rgba]\n            clr = color(r, g, b, a, mode=\"rgb\")\n            self.append(clr)", "response": "Appends n colors to the list, each sampled from a randomly chosen pixel of the image at the given path."}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nreturning the list of colors that have the given word in their context.", "response": "def context_to_rgb(self, str):\n        \"\"\" Returns the colors that have the given word in their context.\n\n        For example, the word \"anger\" appears\n        in black, orange and red contexts,\n        so the list will contain those three colors.\n\n        \"\"\"\n        matches = []\n        for clr in context:\n            tags = context[clr]\n            for tag in tags:\n                if tag.startswith(str) \\\n                        or str.startswith(tag):\n                    matches.append(clr)\n                    break\n\n        matches = [color(name) for name in matches]\n        return matches"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nreturning the intersection of each color's context.", "response": "def _context(self):\n        \"\"\"\n        Returns the intersection of each color's context.\n\n        Get the nearest named hue of each color,\n        and finds overlapping tags in each hue's colors.\n        For example, a list containing yellow, deeppink and olive\n        yields: femininity, friendship, happiness, joy.\n\n        \"\"\"\n        tags1 = None\n        for clr in self:\n            overlap = []\n            if clr.is_black:\n                name = \"black\"\n            elif clr.is_white:\n                name = \"white\"\n            elif clr.is_grey:\n                name = \"grey\"\n            else:\n                name = clr.nearest_hue(primary=True)\n            if name == \"orange\" and clr.brightness < 0.6:\n                name = \"brown\"\n            tags2 = context[name]\n            if tags1 is None:\n                tags1 = tags2\n            else:\n                for tag in tags2:\n                    if tag in tags1:\n                        if tag not in overlap:\n                            overlap.append(tag)\n                tags1 = overlap\n\n        overlap.sort()\n        return overlap"}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nreturns a deep copy of the list.", "response": "def copy(self):\n\n        \"\"\" Returns a deep copy of the list.\n        \"\"\"\n\n        return ColorList(\n            [color(clr.r, clr.g, clr.b, clr.a, mode=\"rgb\") for clr in self],\n            name=self.name,\n            tags=self.tags\n        )"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\nreturning the darkest color from the list.", "response": "def _darkest(self):\n        \"\"\"\n        Returns the darkest color from the list.\n\n        Knowing the contrast between a light and a dark swatch\n        can help us decide how to display readable typography.\n\n        \"\"\"\n        min, n = (1.0, 1.0, 1.0), 3.0\n        for clr in self:\n            if clr.r + clr.g + clr.b < n:\n                min, n = clr, clr.r + clr.g + clr.b\n\n        return min"}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\nreturns one average color for the colors in the list.", "response": "def _average(self):\n        \"\"\"\n        Returns one average color for the colors in the list.\n        \"\"\"\n        r, g, b, a = 0, 0, 0, 0\n        for clr in self:\n            r += clr.r\n            g += clr.g\n            b += clr.b\n            a += clr.alpha\n\n        r /= len(self)\n        g /= len(self)\n        b /= len(self)\n        a /= len(self)\n\n        return color(r, g, b, a, mode=\"rgb\")"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef sort_by_distance(self, reversed=False):\n        if len(self) == 0: return ColorList()\n\n        # Find the darkest color in the list.\n        root = self[0]\n        for clr in self[1:]:\n            if clr.brightness < root.brightness:\n                root = clr\n\n        # Remove the darkest color from the stack,\n        # put it in the sorted list as starting element.\n        stack = [clr for clr in self]\n        stack.remove(root)\n        sorted = [root]\n\n        # Now find the color in the stack closest to that color.\n        # Take this color from the stack and add it to the sorted list.\n        # Now find the color closest to that color, etc.\n        while len(stack) > 1:\n            closest, distance = stack[0], stack[0].distance(sorted[-1])\n            for clr in stack[1:]:\n                d = clr.distance(sorted[-1])\n                if d < distance:\n                    closest, distance = clr, d\n            stack.remove(closest)\n            sorted.append(closest)\n        sorted.append(stack[0])\n\n        if reversed: _list.reverse(sorted)\n        return ColorList(sorted)", "response": "Returns a ColorList that starts at the darkest color and follows each color with its nearest remaining neighbor, so that the distance between adjacent colors stays small."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nreturning a sorted copy of the color list according to the given comparison.", "response": "def _sorted_copy(self, comparison, reversed=False):\n        \"\"\"\n        Returns a sorted copy with the colors arranged according to the given comparison.\n        \"\"\"\n        sorted = self.copy()\n        _list.sort(sorted, comparison)\n        if reversed:\n            _list.reverse(sorted)\n        return sorted"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef cluster_sort(self, cmp1=\"hue\", cmp2=\"brightness\", reversed=False, n=12):\n        sorted = self.sort(cmp1)\n        clusters = ColorList()\n\n        d = 1.0\n        i = 0\n        for j in _range(len(sorted)):\n            if getattr(sorted[j], cmp1) < d:\n                clusters.extend(sorted[i:j].sort(cmp2))\n                d -= 1.0 / n\n                i = j\n        clusters.extend(sorted[i:].sort(cmp2))\n        if reversed: _list.reverse(clusters)\n        return clusters", "response": "Sorts the list by cmp1 then cuts it into n pieces which are sorted by cmp2."}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nreturning a reversed copy of the list.", "response": "def reverse(self):\n        \"\"\"\n        Returns a reversed copy of the list.\n        \"\"\"\n        colors = ColorList.copy(self)\n        _list.reverse(colors)\n        return colors"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\nreturning a list that is a repetition of the given list.", "response": "def repeat(self, n=2, oscillate=False, callback=None):\n        \"\"\"\n        Returns a list that is a repetition of the given list.\n\n        When oscillate is True,\n        moves from the end back to the beginning,\n        and then from the beginning to the end, and so on.\n        \"\"\"\n        colorlist = ColorList()\n        colors = ColorList.copy(self)\n        for i in _range(n):\n            colorlist.extend(colors)\n            if oscillate: colors = colors.reverse()\n            if callback: colors = callback(colors)\n\n        return colorlist"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef swarm(self, x, y, r=100):\n        sc = _ctx.stroke(0, 0, 0, 0)\n        sw = _ctx.strokewidth(0)\n\n        _ctx.push()\n        _ctx.transform(_ctx.CORNER)\n        _ctx.translate(x, y)\n\n        for i in _range(r * 3):\n            clr = choice(self).copy()\n            clr.alpha -= 0.5 * random()\n            _ctx.fill(clr)\n            clr = choice(self)\n            _ctx.stroke(clr)\n            _ctx.strokewidth(10 * random())\n\n            _ctx.rotate(360 * random())\n\n            r2 = r * 0.5 * random()\n            _ctx.oval(r * random(), 0, r2, r2)\n        _ctx.pop()\n\n        _ctx.strokewidth(sw)\n        if sc is None:\n            _ctx.nostroke()\n        else:\n            _ctx.stroke(sc)", "response": "Fancy random ovals for all the colors in the list."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef _interpolate(self, colors, n=100):\n\n        \"\"\" Returns intermediary colors for given list of colors.\n        \"\"\"\n\n        gradient = []\n        for i in _range(n):\n            l = len(colors) - 1\n            x = int(1.0 * i / n * l)\n            x = min(x + 0, l)\n            y = min(x + 1, l)\n\n            base = 1.0 * n / l * x\n            d = (i - base) / (1.0 * n / l)\n            r = colors[x].r * (1 - d) + colors[y].r * d\n            g = colors[x].g * (1 - d) + colors[y].g * d\n            b = colors[x].b * (1 - d) + colors[y].b * d\n            a = colors[x].a * (1 - d) + colors[y].a * d\n\n            gradient.append(color(r, g, b, a, mode=\"rgb\"))\n\n        gradient.append(colors[-1])\n        return gradient", "response": "Interpolates the given list of colors."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef _cache(self):\n        n = self.steps\n\n        # Only one color in base list.\n        if len(self._colors) == 1:\n            ColorList.__init__(self, [self._colors[0] for i in _range(n)])\n            return\n\n        # Expand the base list so we can chop more accurately.\n        colors = self._interpolate(self._colors, 40)\n\n        # Chop into left half and right half.\n        # Make sure their ending and beginning match colors.\n        left = colors[:len(colors) // 2]\n        right = colors[len(colors) // 2:]\n        left.append(right[0])\n        right.insert(0, left[-1])\n\n        # Calculate left and right gradient proportionally to spread.\n        gradient = self._interpolate(left, int(n * self.spread))[:-1]\n        gradient.extend(\n            self._interpolate(right, n - int(n * self.spread))[1:]\n        )\n\n        if self.spread > 1: gradient = gradient[:n]\n        if self.spread < 0: gradient = gradient[-n:]\n        ColorList.__init__(self, gradient)", "response": "Fills the gradient with self.steps colors interpolated from the base colors; self.spread sets the share of steps given to the first half of the gradient."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef copy(self, clr=None, d=0.0):\n        cr = ColorRange()\n        cr.name = self.name\n\n        cr.h = deepcopy(self.h)\n        cr.s = deepcopy(self.s)\n        cr.b = deepcopy(self.b)\n        cr.a = deepcopy(self.a)\n\n        cr.grayscale = self.grayscale\n        if not self.grayscale:\n            cr.black = self.black.copy()\n            cr.white = self.white.copy()\n\n        if clr != None:\n            cr.h, cr.a = clr.h + d * (random() * 2 - 1), clr.a\n\n        return cr", "response": "Returns a copy of the range."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\nreturn a random color from the color range.", "response": "def color(self, clr=None, d=0.035):\n        \"\"\"\n        Returns a color with random values in the defined h, s b, a ranges.\n\n        If a color is given, use that color's hue and alpha,\n        and generate its saturation and brightness from the shade.\n        The hue is varied with the given d.\n\n        In this way you could have a \"warm\" color range\n        that returns all kinds of warm colors.\n        When a red color is given as parameter it would generate\n        all kinds of warm red colors.\n        \"\"\"\n        # Revert to grayscale for black, white and grey hues.\n        if clr != None and not isinstance(clr, Color):\n            clr = color(clr)\n        if clr != None and not self.grayscale:\n            if clr.is_black: return self.black.color(clr, d)\n            if clr.is_white: return self.white.color(clr, d)\n            if clr.is_grey: return choice(\n                (self.black.color(clr, d), self.white.color(clr, d))\n            )\n\n        h, s, b, a = self.h, self.s, self.b, self.a\n        if clr != None:\n            h, a = clr.h + d * (random() * 2 - 1), clr.a\n\n        hsba = []\n        for v in [h, s, b, a]:\n            if isinstance(v, _list):\n                min, max = choice(v)\n            elif isinstance(v, tuple):\n                min, max = v\n            else:\n                min, max = v, v\n            hsba.append(min + (max - min) * random())\n\n        h, s, b, a = hsba\n        return color(h, s, b, a, mode=\"hsb\")"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\nreturn True if the given color is part of this color range.", "response": "def contains(self, clr):\n        \"\"\"\n        Returns True if the given color is part of this color range.\n\n        Check whether each h, s, b, a component of the color\n        falls within the defined range for that component.\n\n        If the given color is grayscale,\n        checks against the definitions for black and white.\n        \"\"\"\n        if not isinstance(clr, Color):\n            return False\n\n        if not isinstance(clr, _list):\n            clr = [clr]\n\n        for clr in clr:\n\n            if clr.is_grey and not self.grayscale:\n                return (self.black.contains(clr) or \\\n                        self.white.contains(clr))\n\n            for r, v in [(self.h, clr.h), (self.s, clr.s), (self.b, clr.brightness), (self.a, clr.a)]:\n                if isinstance(r, _list):\n                    pass\n                elif isinstance(r, tuple):\n                    r = [r]\n                else:\n                    r = [(r, r)]\n                for min, max in r:\n                    if not (min <= v <= max):\n                        return False\n\n        return True"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef _weight_by_hue(self):\n        grouped = {}\n        weights = []\n        for clr, rng, weight in self.ranges:\n            h = clr.nearest_hue(primary=False)\n            if h in grouped:\n                ranges, total_weight = grouped[h]\n                ranges.append((clr, rng, weight))\n                total_weight += weight\n                grouped[h] = (ranges, total_weight)\n            else:\n                grouped[h] = ([(clr, rng, weight)], weight)\n\n        # Calculate the normalized (0.0-1.0) weight for each hue,\n        # and transform the dictionary to a list.\n        s = 1.0 * sum([w for r, w in grouped.values()])\n        grouped = [(grouped[h][1], grouped[h][1] / s, h, grouped[h][0]) for h in grouped]\n        grouped.sort()\n        grouped.reverse()\n\n        return grouped", "response": "Returns a list of (total weight, normalized weight, hue, ranges) tuples, one per nearest hue, sorted by descending weight."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef _xml(self):\n        grouped = self._weight_by_hue()\n\n        xml = \"<colors query=\\\"\" + self.name + \"\\\" tags=\\\"\" + \", \".join(self.tags) + \"\\\">\\n\\n\"\n        for total_weight, normalized_weight, hue, ranges in grouped:\n            if hue == self.blue: hue = \"blue\"\n            clr = color(hue)\n            xml += \"\\t<color name=\\\"\" + clr.name + \"\\\" weight=\\\"\" + str(normalized_weight) + \"\\\">\\n \"\n            xml += \"\\t\\t<rgb r=\\\"\" + str(clr.r) + \"\\\" g=\\\"\" + str(clr.g) + \"\\\" \"\n            xml += \"b=\\\"\" + str(clr.b) + \"\\\" a=\\\"\" + str(clr.a) + \"\\\" />\\n \"\n            for clr, rng, wgt in ranges:\n                xml += \"\\t\\t<shade name=\\\"\" + str(rng) + \"\\\" weight=\\\"\" + str(wgt / total_weight) + \"\\\" />\\n \"\n            xml = xml.rstrip(\" \") + \"\\t</color>\\n\\n\"\n        xml += \"</colors>\"\n\n        return xml", "response": "Returns the color information as XML."}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\nsaves the color information in the cache as XML.", "response": "def _save(self):\n        \"\"\"\n        Saves the color information in the cache as XML.\n        \"\"\"\n        if not os.path.exists(self.cache):\n            os.makedirs(self.cache)\n\n        path = os.path.join(self.cache, self.name + \".xml\")\n        f = open(path, \"w\")\n        f.write(self.xml)\n        f.close()"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\nloading a theme from aggregated web data.", "response": "def _load(self, top=5, blue=\"blue\", archive=None, member=None):\n        \"\"\"\n        Loads a theme from aggregated web data.\n\n        The data must be old-style Prism XML: <color>s consisting of <shade>s.\n        Colors named \"blue\" will be overridden with the blue parameter.\n\n        archive can be a file like object (e.g. a ZipFile)\n        and will be used along with 'member' if specified.\n        \"\"\"\n        if archive is None:\n            path = os.path.join(self.cache, self.name + \".xml\")\n            xml = open(path).read()\n        else:\n            assert member is not None\n            xml = archive.read(member)\n        dom = parseString(xml).documentElement\n\n        attr = lambda e, a: e.attributes[a].value\n\n        for e in dom.getElementsByTagName(\"color\")[:top]:\n            w = float(attr(e, \"weight\"))\n            try:\n                rgb = e.getElementsByTagName(\"rgb\")[0]\n                clr = color(\n                    float(attr(rgb, \"r\")),\n                    float(attr(rgb, \"g\")),\n                    float(attr(rgb, \"b\")),\n                    float(attr(rgb, \"a\")),\n                    mode=\"rgb\"\n                )\n                try:\n                    clr.name = attr(e, \"name\")\n                    if clr.name == \"blue\": clr = color(blue)\n                except:\n                    pass\n            except:\n                name = attr(e, \"name\")\n                if name == \"blue\": name = blue\n                clr = color(name)\n\n            for s in e.getElementsByTagName(\"shade\"):\n                self.ranges.append((\n                    clr,\n                    shade(attr(s, \"name\")),\n                    w * float(attr(s, \"weight\"))\n                ))"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef color(self, d=0.035):\n        s = sum([w for clr, rng, w in self.ranges])\n        r = random()\n        for clr, rng, weight in self.ranges:\n            if weight / s >= r: break\n            r -= weight / s\n\n        return rng(clr, d)", "response": "Returns a random color within the theme."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\nreturn a number of random colors from the theme.", "response": "def colors(self, n=10, d=0.035):\n        \"\"\"\n        Returns a number of random colors from the theme.\n        \"\"\"\n        s = sum([w for clr, rng, w in self.ranges])\n        colors = colorlist()\n        for i in _range(n):\n            r = random()\n            for clr, rng, weight in self.ranges:\n                if weight / s >= r: break\n                r -= weight / s\n            colors.append(rng(clr, d))\n\n        return colors"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef swatch(self, x, y, w=35, h=35, padding=4, roundness=0, n=12, d=0.035, grouped=None):\n        if grouped is None:  # should be True or False\n            grouped = self.group_swatches\n\n        # If we dont't need to make groups,\n        # just display an individual column for each weight\n        # in the (color, range, weight) tuples.\n        if not grouped:\n            s = sum([wgt for clr, rng, wgt in self.ranges])\n            for clr, rng, wgt in self.ranges:\n                cols = max(1, int(wgt / s * n))\n                for i in _range(cols):\n                    rng.colors(clr, n=n, d=d).swatch(x, y, w, h, padding=padding, roundness=roundness)\n                    x += w + padding\n\n            return x, y + n * (h + padding)\n\n        # When grouped, combine hues and display them\n        # in batches of rows, then moving on to the next hue.\n        grouped = self._weight_by_hue()\n        for total_weight, normalized_weight, hue, ranges in grouped:\n            dy = y\n            rc = 0\n            for clr, rng, weight in ranges:\n                dx = x\n                cols = int(normalized_weight * n)\n                cols = max(1, min(cols, n - len(grouped)))\n                if clr.name == \"black\": rng = rng.black\n                if clr.name == \"white\": rng = rng.white\n                for i in _range(cols):\n                    rows = int(weight / total_weight * n)\n                    rows = max(1, rows)\n                    # Each column should add up to n rows,\n                    # if not due to rounding errors, add a row at the bottom.\n                    if (clr, rng, weight) == ranges[-1] and rc + rows < n: rows += 1\n                    rng.colors(clr, n=rows, d=d).swatch(dx, dy, w, h, padding=padding, roundness=roundness)\n                    dx += w + padding\n                dy += (w + padding) * rows  # + padding\n             
   rc = rows\n            x += (w + padding) * cols + padding\n\n        return x, dy", "response": "Draw a weighted swatch with approximately n columns and rows and padding."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef objs(self):\n        for obj in self.objects.values():\n            if obj.sessionid in self.sessions:\n                yield obj", "response": "Returns a generator of tracked objects which are recognized with this profile and are in the current session."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef _append_element(self, render_func, pe):\n        '''\n        Append a render function and the parameters to pass\n        an equivalent PathElement, or the PathElement itself.\n        '''\n        self._render_funcs.append(render_func)\n        self._elements.append(pe)", "response": "Append a render function together with either the parameters to pass to an equivalent PathElement or the PathElement itself."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef _get_bounds(self):\n        '''\n        Return cached bounds of this Grob.\n        If bounds are not cached, render to a meta surface, and\n        keep the meta surface and bounds cached.\n        '''\n        if self._bounds:\n            return self._bounds\n\n        record_surface = cairo.RecordingSurface(cairo.CONTENT_COLOR_ALPHA, (-1, -1, 1, 1))\n        dummy_ctx = cairo.Context(record_surface)\n        self._traverse(dummy_ctx)\n\n        self._bounds = dummy_ctx.path_extents()\n        return self._bounds", "response": "Get the bounds of this Grob."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\nreturn True if the grob contains the given point.", "response": "def contains(self, x, y):\n        '''\n        Return True if the point (x, y) lies inside the fill area of this Grob.\n        The path is rendered to a recording surface for the hit test.\n        '''\n        record_surface = cairo.RecordingSurface(cairo.CONTENT_COLOR_ALPHA, (-1, -1, 1, 1))\n        dummy_ctx = cairo.Context(record_surface)\n        self._traverse(dummy_ctx)\n\n        in_fill = dummy_ctx.in_fill(x, y)\n        return in_fill"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef _get_center(self):\n        '''\n        Return the cached center point of this Grob.\n        If the center is not cached, compute it from the bounds\n        and cache it.\n        '''\n        if self._center:\n            return self._center\n\n        # get the center point\n        (x1, y1, x2, y2) = self._get_bounds()\n        x = (x1 + x2) / 2\n        y = (y1 + y2) / 2\n\n        center = self._center = x, y\n        # TODO Cache function that draws using the RecordingSurface\n        # Save the context or surface (without the bounding box strokecolor)\n        # to optimise drawing\n        return center", "response": "Get the center point of the current grob."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef _render_closure(self):\n        '''Use a closure so that draw attributes can be saved'''\n        fillcolor = self.fill\n        strokecolor = self.stroke\n        strokewidth = self.strokewidth\n\n        def _render(cairo_ctx):\n            '''\n            At the moment this is based on cairo.\n\n            TODO: Need to work out how to move the cairo specific\n                  bits somewhere else.\n            '''\n            # Go to initial point (CORNER or CENTER):\n            transform = self._call_transform_mode(self._transform)\n\n            if fillcolor is None and strokecolor is None:\n                # Fixes _bug_FillStrokeNofillNostroke.bot\n                return\n\n            cairo_ctx.set_matrix(transform)\n            # Run the path commands on the cairo context:\n            self._traverse(cairo_ctx)\n            # Matrix affects stroke, so we need to reset it:\n            cairo_ctx.set_matrix(cairo.Matrix())\n\n            if fillcolor is not None and strokecolor is not None:\n                if strokecolor[3] < 1:\n                    # Draw onto intermediate surface so that stroke\n                    # does not overlay fill\n                    cairo_ctx.push_group()\n\n                    cairo_ctx.set_source_rgba(*fillcolor)\n                    cairo_ctx.fill_preserve()\n\n                    e = cairo_ctx.stroke_extents()\n                    cairo_ctx.set_source_rgba(*strokecolor)\n                    cairo_ctx.set_operator(cairo.OPERATOR_SOURCE)\n                    cairo_ctx.set_line_width(strokewidth)\n                    cairo_ctx.stroke()\n\n                    cairo_ctx.pop_group_to_source()\n                    cairo_ctx.paint()\n                else:\n                    # Fast path if no alpha in stroke\n                    cairo_ctx.set_source_rgba(*fillcolor)\n                    cairo_ctx.fill_preserve()\n\n                    
cairo_ctx.set_source_rgba(*strokecolor)\n                    cairo_ctx.set_line_width(strokewidth)\n                    cairo_ctx.stroke()\n            elif fillcolor is not None:\n                cairo_ctx.set_source_rgba(*fillcolor)\n                cairo_ctx.fill()\n            elif strokecolor is not None:\n                cairo_ctx.set_source_rgba(*strokecolor)\n                cairo_ctx.set_line_width(strokewidth)\n                cairo_ctx.stroke()\n\n        return _render", "response": "Render the image on the context."}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\nlocates t on a specific segment in the path.", "response": "def _locate(self, t, segments=None):\n        \"\"\" Locates t on a specific segment in the path.\n            Returns (index, t, PathElement)\n            A path is a combination of lines and curves (segments).\n            The returned index indicates the start of the segment that contains point t.\n            The returned t is the absolute time on that segment,\n            in contrast to the relative t on the whole of the path.\n            The returned point is the last MOVETO, any subsequent CLOSETO after i closes to that point.\n            When you supply the list of segment lengths yourself, as returned from length(path, segmented=True),\n            point() works about thirty times faster in a for-loop since it doesn't need to recalculate\n            the length during each iteration.\n        \"\"\"\n        # Originally from nodebox-gl\n        if segments is None:\n            segments = self._segment_lengths(relative=True)\n        if len(segments) == 0:\n            raise PathError(\"The given path is empty\")\n        for i, el in enumerate(self._get_elements()):\n            if i == 0 or el.cmd == MOVETO:\n                closeto = Point(el.x, el.y)\n            if t <= segments[i] or i == len(segments) - 1:\n                break\n            else:\n                t -= segments[i]\n        try:\n            t /= segments[i]\n        except ZeroDivisionError:\n            pass\n        if i == len(segments) - 1 and segments[i] == 0:\n            i -= 1\n        return (i, t, closeto)"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef point(self, t, segments=None):\n        # Originally from nodebox-gl\n        if len(self._elements) == 0:\n            raise PathError(\"The given path is empty\")\n\n        if self._segments is None:\n            self._segments = self._get_length(segmented=True, precision=10)\n\n        i, t, closeto = self._locate(t, segments=self._segments)\n        x0, y0 = self[i].x, self[i].y\n        p1 = self[i + 1]\n        if p1.cmd == CLOSE:\n            x, y = self._linepoint(t, x0, y0, closeto.x, closeto.y)\n            return PathElement(LINETO, x, y)\n        elif p1.cmd in (LINETO, MOVETO):\n            x1, y1 = p1.x, p1.y\n            x, y = self._linepoint(t, x0, y0, x1, y1)\n            return PathElement(LINETO, x, y)\n        elif p1.cmd == CURVETO:\n            # Note: the handles need to be interpreted differenty than in a BezierPath.\n            # In a BezierPath, ctrl1 is how the curve started, and ctrl2 how it arrives in this point.\n            # Here, ctrl1 is how the curve arrives, and ctrl2 how it continues to the next point.\n            x3, y3, x1, y1, x2, y2 = p1.x, p1.y, p1.ctrl1.x, p1.ctrl1.y, p1.ctrl2.x, p1.ctrl2.y\n            x, y, c1x, c1y, c2x, c2y = self._curvepoint(t, x0, y0, x1, y1, x2, y2, x3, y3)\n            return PathElement(CURVETO, c1x, c1y, c2x, c2y, x, y)\n        else:\n            raise PathError(\"Unknown cmd '%s' for p1 %s\" % (p1.cmd, p1))", "response": "Returns the coordinates for a point on the path."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef points(self, amount=100, start=0.0, end=1.0, segments=None):\n        # Originally from nodebox-gl\n        if len(self._elements) == 0:\n            raise PathError(\"The given path is empty\")\n        n = end - start\n        d = n\n        if amount > 1:\n            # The delta value is divided by amount-1, because we also want the last point (t=1.0)\n            # If we don't use amount-1, we fall one point short of the end.\n            # If amount=4, we want the point at t 0.0, 0.33, 0.66 and 1.0.\n            # If amount=2, we want the point at t 0.0 and 1.0.\n            d = float(n) / (amount - 1)\n        for i in range(int(amount)):\n            yield self.point(start + d * i, segments)", "response": "Returns a generator that yields amount points along the path, evenly spaced in t between start and end."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef _linepoint(self, t, x0, y0, x1, y1):\n        # Originally from nodebox-gl\n        out_x = x0 + t * (x1 - x0)\n        out_y = y0 + t * (y1 - y0)\n        return (out_x, out_y)", "response": "Returns the coordinates of the point at t on the line from (x0, y0) to (x1, y1)."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nreturn the length of the line.", "response": "def _linelength(self, x0, y0, x1, y1):\n        \"\"\" Returns the length of the line.\n        \"\"\"\n        # Originally from nodebox-gl\n        a = pow(abs(x0 - x1), 2)\n        b = pow(abs(y0 - y1), 2)\n        return sqrt(a + b)"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef _curvepoint(self, t, x0, y0, x1, y1, x2, y2, x3, y3, handles=False):\n        # Originally from nodebox-gl\n        mint = 1 - t\n        x01 = x0 * mint + x1 * t\n        y01 = y0 * mint + y1 * t\n        x12 = x1 * mint + x2 * t\n        y12 = y1 * mint + y2 * t\n        x23 = x2 * mint + x3 * t\n        y23 = y2 * mint + y3 * t\n        out_c1x = x01 * mint + x12 * t\n        out_c1y = y01 * mint + y12 * t\n        out_c2x = x12 * mint + x23 * t\n        out_c2y = y12 * mint + y23 * t\n        out_x = out_c1x * mint + out_c2x * t\n        out_y = out_c1y * mint + out_c2y * t\n        if not handles:\n            return (out_x, out_y, out_c1x, out_c1y, out_c2x, out_c2y)\n        else:\n            return (out_x, out_y, out_c1x, out_c1y, out_c2x, out_c2y, x01, y01, x23, y23)", "response": "Returns the coordinates of a curve point at t."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef _segment_lengths(self, relative=False, n=20):\n        # From nodebox_gl\n        lengths = []\n        first = True\n        for el in self._get_elements():\n            if first is True:\n                close_x, close_y = el.x, el.y\n                first = False\n            elif el.cmd == MOVETO:\n                close_x, close_y = el.x, el.y\n                lengths.append(0.0)\n            elif el.cmd == CLOSE:\n                lengths.append(self._linelength(x0, y0, close_x, close_y))\n            elif el.cmd == LINETO:\n                lengths.append(self._linelength(x0, y0, el.x, el.y))\n            elif el.cmd == CURVETO:\n                x3, y3, x1, y1, x2, y2 = el.x, el.y, el.c1x, el.c1y, el.c2x, el.c2y\n                # (el.c1x, el.c1y, el.c2x, el.c2y, el.x, el.y)\n                lengths.append(self._curvelength(x0, y0, x1, y1, x2, y2, x3, y3, n))\n            if el.cmd != CLOSE:\n                x0 = el.x\n                y0 = el.y\n        if relative:\n            length = sum(lengths)\n            try:\n                # Relative segment lengths' sum is 1.0.\n                return [l / length for l in lengths]\n            except ZeroDivisionError:\n                # If the length is zero, just return zero for all segments\n                return [0.0] * len(lengths)\n        else:\n            return lengths", "response": "Returns a list with the length of each segment in the path; with relative=True the lengths are normalized to sum to 1.0."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\nreturning the length of the path.", "response": "def _get_length(self, segmented=False, precision=10):\n        \"\"\" Returns the length of the path.\n            Calculates the length of each spline in the path, using n as a number of points to measure.\n            When segmented is True, returns a list containing the individual length of each spline\n            as values between 0.0 and 1.0, defining the relative length of each spline\n            in relation to the total path length.\n        \"\"\"\n        # Originally from nodebox-gl\n        if not segmented:\n            return sum(self._segment_lengths(n=precision), 0.0)\n        else:\n            return self._segment_lengths(relative=True, n=precision)"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef _get_elements(self):\n        '''\n        Yields all elements as PathElements\n        '''\n        for index, el in enumerate(self._elements):\n            if isinstance(el, tuple):\n                el = PathElement(*el)\n                self._elements[index] = el\n            yield el", "response": "Yields all elements as PathElements\n           "}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef depth_first_search(root, visit=lambda node: False, traversable=lambda node, edge: True):\n\n    \"\"\" Simple, multi-purpose depth-first search.\n    \n    Visits all the nodes connected to the root, depth-first.\n    The visit function is called on each node.\n    Recursion will stop if it returns True, and subsequently dfs() will return True.\n    The traversable function takes the current node and edge,\n    and returns True if we are allowed to follow this connection to the next node.\n    For example, the traversable for directed edges is as follows:\n    lambda node, edge: node == edge.node1\n    \n    Note: node._visited is expected to be False for all nodes.\n    \n    \"\"\"\n\n    stop = visit(root)\n    root._visited = True\n    for node in root.links:\n        if stop: return True\n        if not traversable(root, root.links.edge(node)): continue\n        if not node._visited:\n            stop = depth_first_search(node, visit, traversable)\n    return stop", "response": "Simple depth-first search that visits every node connected to the root and stops early when the visit function returns True."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef adjacency(graph, directed=False, reversed=False, stochastic=False, heuristic=None):\n    \n    \"\"\" An edge weight map indexed by node id's.\n    \n    A dictionary indexed by node id1's in which each value is a\n    dictionary of connected node id2's linking to the edge weight.\n    If directed, edges go from id1 to id2, but not the other way.\n    If stochastic, all the weights for the neighbors of a given node sum to 1.\n    A heuristic can be a function that takes two node id's and returns\n    an additional cost for movement between the two nodes.\n    \n    \"\"\"\n    \n    v = {}\n    for n in graph.nodes:\n        v[n.id] = {}\n    \n    for e in graph.edges:\n        \n        id1 = e.node1.id\n        id2 = e.node2.id\n        if reversed:\n            id1, id2 = id2, id1\n            \n        #if not v.has_key(id1): v[id1] = {}\n        #if not v.has_key(id2): v[id2] = {}\n        v[id1][id2] = 1.0 - e.weight*0.5\n        \n        if heuristic:\n            v[id1][id2] += heuristic(id1, id2)\n        \n        if not directed: \n            v[id2][id1] = v[id1][id2]\n        \n    if stochastic:\n        for id1 in v:\n            d = sum(v[id1].values())\n            for id2 in v[id1]: \n                v[id1][id2] /= d\n    \n    return v", "response": "Returns an edge weight map for the graph: a dictionary keyed by node id, each value mapping connected node ids to edge weights."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nreturn a dictionary of brandes betweenness centrality for nodes in the graph.", "response": "def brandes_betweenness_centrality(graph, normalized=True):\n\n    \"\"\" Betweenness centrality for nodes in the graph.\n    \n    Betweenness centrality is a measure of the number of shortests paths that pass through a node.\n    Nodes in high-density areas will get a good score.\n    \n    The algorithm is Brandes' betweenness centrality,\n    from NetworkX 0.35.1: Aric Hagberg, Dan Schult and Pieter Swart,\n    based on Dijkstra's algorithm for shortest paths modified from Eppstein.\n    https://networkx.lanl.gov/wiki\n    \n    \"\"\"\n\n    G = graph.keys()\n    W = adjacency(graph)\n    \n    betweenness = dict.fromkeys(G, 0.0) # b[v]=0 for v in G\n    for s in G: \n        S = [] \n        P = {} \n        for v in G: P[v] = [] \n        sigma = dict.fromkeys(G, 0) # sigma[v]=0 for v in G \n        D = {} \n        sigma[s] = 1\n        seen = { s: 0 }  \n        Q = [] # use Q as heap with (distance, node id) tuples \n        heapq.heappush(Q, (0, s, s)) \n        while Q:    \n            (dist, pred, v) = heapq.heappop(Q) \n            if v in D: continue # already searched this node\n            sigma[v] = sigma[v] + sigma[pred] # count paths \n            S.append(v) \n            D[v] = seen[v] \n            for w in graph[v].links:\n                \n                w = w.id\n                vw_dist = D[v] + W[v][w]\n                \n                if w not in D and (w not in seen or vw_dist < seen[w]): \n                    seen[w] = vw_dist \n                    heapq.heappush(Q, (vw_dist, v, w)) \n                    P[w] = [v] \n                elif vw_dist == seen[w]: # handle equal paths \n                    sigma[w] = sigma[w] + sigma[v] \n                    P[w].append(v)\n                    \n        delta = dict.fromkeys(G,0)  \n        while S: \n            w = S.pop() 
\n            for v in P[w]: \n                delta[v] = delta[v] + (float(sigma[v]) / float(sigma[w])) * (1.0 + delta[w]) \n            if w != s: \n                betweenness[w] = betweenness[w] + delta[w]\n\n    #-----------------------------------\n    # Normalize and return only after every source node has been processed.\n    if normalized:\n        # Normalize between 0.0 and 1.0.\n        m = max(betweenness.values())\n        if m == 0: m = 1\n    else:\n        m = 1\n        \n    betweenness = dict([(id, w/m) for id, w in betweenness.items()])\n    return betweenness"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef eigenvector_centrality(graph, normalized=True, reversed=True, rating={},\n                           start=None, iterations=100, tolerance=0.0001):\n\n    \"\"\" Eigenvector centrality for nodes in the graph (like Google's PageRank).\n    \n    Eigenvector centrality is a measure of the importance of a node in a directed network. \n    It rewards nodes with a high potential of (indirectly) connecting to high-scoring nodes.\n    Nodes with no incoming connections have a score of zero.\n    If you want to measure outgoing connections, reversed should be False.\n\n    The eigenvector calculation is done by the power iteration method.\n    It has no guarantee of convergence.\n    A starting vector for the power iteration can be given in the start dict.\n    \n    You can adjust the importance of a node with the rating dictionary,\n    which links node id's to a score.\n    \n    The algorithm is adapted from NetworkX, Aric Hagberg (hagberg@lanl.gov):\n    https://networkx.lanl.gov/attachment/ticket/119/eigenvector_centrality.py\n\n    \"\"\"\n\n    G = graph.keys()     \n    W = adjacency (graph, directed=True, reversed=reversed)\n\n    def _normalize(x):\n        s = sum(x.values())\n        if s != 0: s = 1.0 / s\n        for k in x: \n            x[k] *= s\n    \n    x = start\n    if x is None:\n        x = dict([(n, random()) for n in G])\n    _normalize(x)\n\n    # Power method: y = Ax multiplication.\n    for i in range(iterations):\n        x0 = x\n        x = dict.fromkeys(x0.keys(), 0)\n        for n in x:\n            for nbr in W[n]:\n                r = 1\n                if rating.has_key(n): r = rating[n]\n                x[n] += 0.01 + x0[nbr] * W[n][nbr] * r\n        _normalize(x)          \n        e = sum([abs(x[n]-x0[n]) for n in x])\n        if e < len(graph.nodes) * tolerance:\n            if normalized:\n                # 
Normalize between 0.0 and 1.0.\n                m = max(x.values())\n                if m == 0: m = 1\n                x = dict([(id, w/m) for id, w in x.items()])\n            return x\n\n    #raise NoConvergenceError\n    warn(\"node weight is 0 because eigenvector_centrality() did not converge.\", Warning)\n    return dict([(n, 0) for n in G])", "response": "Computes eigenvector centrality for nodes in a directed network using power iteration."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef examples_menu(root_dir=None, depth=0):\n    # pre 3.12 menus\n    examples_dir = ide_utils.get_example_dir()\n    if not examples_dir:\n        return \"\", [], []\n\n    root_dir = root_dir or examples_dir\n\n    file_tmpl = '<menuitem name=\"{name}\" action=\"{action}\"/>'\n    dir_tmpl = '<menu name=\"{name}\" action=\"{action}\">{menu}</menu>'\n\n    file_actions = []\n    submenu_actions = []\n\n    xml = \"\"\n\n    for fn in sorted(os.listdir(root_dir)):\n        path = os.path.join(root_dir, fn)\n        rel_path = path[len(examples_dir):]\n        if os.path.isdir(path):\n            action = 'ShoebotExampleMenu {0}'.format(rel_path)\n            label = fn.capitalize()\n\n            sm_xml, sm_file_actions, sm_menu_actions = examples_menu(os.path.join(root_dir, fn), depth+1)\n\n            submenu_actions.extend(sm_menu_actions)\n            file_actions.extend(sm_file_actions)\n            submenu_actions.append((action, label))\n            xml += dir_tmpl.format(name=fn, action=action, menu=sm_xml)\n        elif os.path.splitext(path)[1] in ['.bot', '.py'] and not fn.startswith('_'):\n            action = 'ShoebotExampleOpen {0}'.format(rel_path)\n            label = ide_utils.make_readable_filename(fn)\n\n            xml += file_tmpl.format(name=fn, action=action)\n            file_actions.append((action, label))\n\n    return xml, file_actions, submenu_actions", "response": "Returns the XML for the examples menu and its submenus, along with the lists of file actions and submenu actions."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef mk_examples_menu(text, root_dir=None, depth=0):\n    # 3.12+ menus\n    examples_dir = ide_utils.get_example_dir()\n    if not examples_dir:\n        return None, []\n\n    root_dir = root_dir or examples_dir    \n\n    file_actions = []\n    \n    menu = Gio.Menu.new()\n    base_item = Gio.MenuItem.new_submenu(text, menu)\n\n    for fn in sorted(os.listdir(root_dir)):\n        path = os.path.join(root_dir, fn)\n        rel_path = path[len(examples_dir):]\n        if os.path.isdir(path):\n            label = fn.capitalize()\n\n            item, sm_file_actions = mk_examples_menu(label, os.path.join(root_dir, fn))\n            menu.append_item(item)\n\n            file_actions.extend(sm_file_actions)\n        elif os.path.splitext(path)[1] in ['.bot', '.py'] and not fn.startswith('_'):\n            label = ide_utils.make_readable_filename(fn)\n\n            # the only way I could work out to attach the data to the menu item is in the name :/\n            action_name = \"win.open_example__%s\" % encode_relpath(rel_path)\n\n            menu.append(label, action_name)\n            file_actions.append(rel_path)\n\n    return base_item, file_actions", "response": "Create a menu with examples."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef get_child_by_name(parent, name):\n    # http://stackoverflow.com/questions/2072976/access-to-widget-in-gtk\n    def iterate_children(widget, name):\n        if widget.get_name() == name:\n            return widget\n        try:\n            for w in widget.get_children():\n                result = iterate_children(w, name)\n                if result is not None:\n                    return result\n                else:\n                    continue\n        except AttributeError:\n            pass\n    return iterate_children(parent, name)", "response": "Iterate through a gtk container parent and return the widget with the name name."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef venv_has_script(script):\n    def f(venv):\n        path=os.path.join(venv, 'bin', script)\n        if os.path.isfile(path):\n            return True\n    return f", "response": "returns function that checks if venv has script"}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\nchecks if a directory is a virtualenv.", "response": "def is_venv(directory, executable='python'):\n    \"\"\"\n    :param directory: base directory of python environment\n    \"\"\"\n    path=os.path.join(directory, 'bin', executable)\n    return os.path.isfile(path)"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef vw_envs(filter=None):\n    vw_root=os.path.abspath(os.path.expanduser(os.path.expandvars('~/.virtualenvs')))\n    venvs=[]\n    for directory in os.listdir(vw_root):\n        venv=os.path.join(vw_root, directory)\n        if os.path.isdir(os.path.join(venv)):\n            if filter and not filter(venv):\n                continue\n            venvs.append(venv)\n    return sorted(venvs)", "response": "Return a sorted list of Python environments in ~/.virtualenvs, optionally limited to those accepted by filter."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef sbot_executable():\n    gsettings=load_gsettings()\n    venv = gsettings.get_string('current-virtualenv')\n    if venv == 'Default':\n        sbot = which('sbot')\n    elif venv == 'System':\n        # find system python\n        env_venv = os.environ.get('VIRTUAL_ENV')\n        if not env_venv:\n            return which('sbot')\n\n        # First sbot in path that is not in current venv\n        for p in os.environ['PATH'].split(os.path.pathsep):\n            sbot='%s/sbot' % p\n            if not p.startswith(env_venv) and os.path.isfile(sbot):\n                return sbot\n    else:\n        sbot = os.path.join(venv, 'bin/sbot')\n        if not os.path.isfile(sbot):\n            print('Shoebot not found, reverting to System shoebot')\n            sbot = which('sbot')\n    return os.path.realpath(sbot)", "response": "Find the path of the sbot executable for the virtualenv selected in gsettings."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef _description(self):\n        \n        \"\"\" Returns the meta description in the page.\n        \"\"\"        \n\n        meta = self.find(\"meta\", {\"name\":\"description\"})\n        if isinstance(meta, dict) and \\\n           \"content\" in meta:\n            return meta[\"content\"]\n        else:\n            return u\"\"", "response": "Returns the meta description in the page."}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nreturn the meta keywords in the page.", "response": "def _keywords(self):\n        \n        \"\"\" Returns the meta keywords in the page.\n        \"\"\"\n        \n        meta = self.find(\"meta\", {\"name\":\"keywords\"})\n        if isinstance(meta, dict) and \\\n           \"content\" in meta:\n            keywords = [k.strip() for k in meta[\"content\"].split(\",\")]\n        else:\n            keywords = []\n            \n        return keywords"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef links(self, external=True):\n        \n        \"\"\" Retrieves links in the page.\n        \n        Returns a list of URLs.\n        By default, only external URLs are returned.\n        External URLs start with http:// and point to a different\n        domain than the one the page is on.\n        \n        \"\"\"\n        \n        domain = URLParser(self.url).domain\n        \n        links = []\n        for a in self(\"a\"):\n            for attribute, value in a.attrs:\n                if attribute == \"href\":\n                    if not external \\\n                    or (value.startswith(\"http://\") and value.find(\"http://\"+domain) < 0):\n                        links.append(value)\n                        \n        return links", "response": "Retrieves the links in the page; by default only external links are returned."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef sorted(list, cmp=None, reversed=False):\n    list = [x for x in list]\n    list.sort(cmp)\n    if reversed: list.reverse()\n    return list", "response": "Returns a sorted copy of the list."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef unique(list):\n    unique = []; [unique.append(x) for x in list if x not in unique]\n    return unique", "response": "Returns a copy of the list without duplicates."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef subgraph(graph, id, distance=1):\n    \n    \"\"\" Creates the subgraph of the flattened node with given id (or list of id's).\n    Finds all the edges between the nodes that make up the subgraph.\n    \"\"\"\n    \n    g = graph.copy(empty=True)\n    \n    if isinstance(id, (FunctionType, LambdaType)):\n        # id can also be a lambda or function that returns True or False\n        # for each node in the graph. We take the id's of nodes that pass.\n        id = [node.id for node in filter(id, graph.nodes)]\n    if not isinstance(id, (list, tuple)):\n        id = [id]\n    for id in id:\n        for n in flatten(graph[id], distance):\n            g.add_node(n.id, n.r, n.style, n.category, n.label, (n==graph.root), n.__dict__)\n        \n    for e in graph.edges:\n        if g.has_key(e.node1.id) and \\\n           g.has_key(e.node2.id):\n            g.add_edge(e.node1.id, e.node2.id, e.weight, e.length, e.label, e.__dict__)\n    \n    # Should we look for shortest paths between nodes here?\n    \n    return g", "response": "Creates a subgraph of the flattened node with given id."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\nreturn the largest possible clique for the given node with given id.", "response": "def clique(graph, id):\n    \n    \"\"\" Returns the largest possible clique for the node with given id.\n    \"\"\"\n    \n    clique = [id]\n    for n in graph.nodes:\n        friend = True\n        for id in clique:\n            if n.id == id or graph.edge(n.id, id) == None:\n                friend = False\n                break\n        if friend:\n            clique.append(n.id)\n    \n    return clique"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef cliques(graph, threshold=3):\n    \n    \"\"\" Returns all the cliques in the graph of at least the given size.\n    \"\"\"\n    \n    cliques = []\n    for n in graph.nodes:\n        c = clique(graph, n.id)\n        if len(c) >= threshold: \n            c.sort()\n            if c not in cliques:\n                cliques.append(c)\n    \n    return cliques", "response": "Returns all the cliques in the graph that contain at least threshold nodes."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef partition(graph):\n    \n    \"\"\" Splits unconnected subgraphs.\n    \n    For each node in the graph, make a list of its id and all directly connected id's.\n    If one of the nodes in this list intersects with a subgraph,\n    they are all part of that subgraph.\n    Otherwise, this list is part of a new subgraph.\n    Return a list of subgraphs sorted by size (biggest-first).\n    \n    \"\"\"\n    \n    g = []\n    for n in graph.nodes:\n        c = [n.id for n in flatten(n)]\n        f = False\n        for i in range(len(g)):\n            if len(intersection(g[i], c)) > 0:\n                g[i] = union(g[i], c)\n                f = True\n                break\n        if not f:\n            g.append(c)\n    \n    # If 1 is directly connected to 2 and 3,\n    # and 4 is directly connected to 5 and 6, these are separate subgraphs.\n    # If we later find that 7 is directly connected to 3 and 6,\n    # it will be attached to [1, 2, 3] yielding\n    # [1, 2, 3, 6, 7] and [4, 5, 6].\n    # These two subgraphs are connected and need to be merged.\n    merged = []\n    for i in range(len(g)):\n        merged.append(g[i])\n        for j in range(i+1, len(g)):\n            if len(intersection(g[i], g[j])) > 0:\n                merged[-1].extend(g[j])\n                g[j] = []\n    \n    g = merged\n    g = [graph.sub(g, distance=0) for g in g]\n    g.sort(lambda a, b: len(b) - len(a))             \n    \n    return g", "response": "Splits the graph into its unconnected subgraphs and returns them as a list sorted by size, biggest first."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef render(self, size, frame, drawqueue):\n        '''\n        Calls implementation to get a render context,\n        passes it to the drawqueues render function\n        then calls self.rendering_finished\n        '''\n        r_context = self.create_rcontext(size, frame)\n        drawqueue.render(r_context)\n        self.rendering_finished(size, frame, r_context)\n        return r_context", "response": "Creates a render context, renders the draw queue into it, calls rendering_finished, and returns the context."}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\ntries to interpret the next 8 bytes of the data as a 64 - bit signed integer.", "response": "def readLong(data):\n    \"\"\"Tries to interpret the next 8 bytes of the data\n    as a 64-bit signed integer.\"\"\"\n    high, low = struct.unpack(\">ll\", data[0:8])\n    big = (long(high) << 32) + low\n    rest = data[8:]\n    return (big, rest)"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef OSCBlob(next):\n\n    if type(next) == type(\"\"):\n        length = len(next)\n        padded = math.ceil((len(next)) / 4.0) * 4\n        binary = struct.pack(\">i%ds\" % (padded), length, next)\n        tag    = 'b'\n    else:\n        tag    = ''\n        binary = ''\n\n    return (tag, binary)", "response": "Converts a string into an OSC blob, returning a (typetag, data) tuple."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\nconverting some Python types to their OSC binary representations returning a ( typetag data ) tuple.", "response": "def OSCArgument(next):\n    \"\"\"Convert some Python types to their\n    OSC binary representations, returning a\n    (typetag, data) tuple.\"\"\"\n\n    if type(next) == type(\"\"):\n        OSCstringLength = math.ceil((len(next)+1) / 4.0) * 4\n        binary  = struct.pack(\">%ds\" % (OSCstringLength), next)\n        tag = \"s\"\n    elif type(next) == type(42.5):\n        binary  = struct.pack(\">f\", next)\n        tag = \"f\"\n    elif type(next) == type(13):\n        binary  = struct.pack(\">i\", next)\n        tag = \"i\"\n    else:\n        binary  = \"\"\n        tag = \"\"\n\n    return (tag, binary)"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef parseArgs(args):\n    parsed = []\n    for arg in args:\n        print arg\n        arg = arg.strip()\n        interpretation = None\n        try:\n            interpretation = float(arg)\n            if string.find(arg, \".\") == -1:\n                interpretation = int(interpretation)\n        except:\n            # Oh - it was a string.\n            interpretation = arg\n            pass\n        parsed.append(interpretation)\n    return parsed", "response": "Given a list of strings produces a list\n    where those strings have been parsed as floats or integers."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef decodeOSC(data):\n    table = {\"i\":readInt, \"f\":readFloat, \"s\":readString, \"b\":readBlob}\n    decoded = []\n    address,  rest = readString(data)\n    typetags = \"\"\n\n    if address == \"#bundle\":\n        time, rest = readLong(rest)\n#       decoded.append(address)\n#       decoded.append(time)\n        while len(rest)>0:\n            length, rest = readInt(rest)\n            decoded.append(decodeOSC(rest[:length]))\n            rest = rest[length:]\n\n    elif len(rest) > 0:\n        typetags, rest = readString(rest)\n        decoded.append(address)\n        decoded.append(typetags)\n        if typetags[0] == \",\":\n            for tag in typetags[1:]:\n                value, rest = table[tag](rest)\n                decoded.append(value)\n        else:\n            print \"Oops, typetag lacks the magic ,\"\n\n    return decoded", "response": "Converts a typetagged OSC message to a Python list."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef append(self, argument, typehint = None):\n\n        if typehint == 'b':\n            binary = OSCBlob(argument)\n        else:\n            binary = OSCArgument(argument)\n\n        self.typetags = self.typetags + binary[0]\n        self.rawAppend(binary[1])", "response": "Appends an argument to the message and records its typetag; a typehint of 'b' encodes the argument as a blob."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef getBinary(self):\n        address  = OSCArgument(self.address)[1]\n        typetags = OSCArgument(self.typetags)[1]\n        return address + typetags + self.message", "response": "Returns the binary message with typetags."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef handle(self, data, source = None):\n        decoded = decodeOSC(data)\n        self.dispatch(decoded, source)", "response": "Given data tries to call the callback with the\n        right address."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef dispatch(self, message, source = None):\n        msgtype = \"\"\n        try:\n            if type(message[0]) == str:\n                # got a single message\n                address = message[0]\n                self.callbacks[address](message)\n            elif type(message[0]) == list:\n                for msg in message:\n                    self.dispatch(msg)\n        except KeyError, key:\n            print 'address %s not found, %s: %s' % (address, key, message)\n            pprint.pprint(message)\n        except IndexError, e:\n            print '%s: %s' % (e, message)\n            pass\n        except None, e:\n            print \"Exception in\", address, \"callback :\", e\n\n        return", "response": "Sends decoded OSC data to the appropriate callback, recursing into nested message lists."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef add(self, callback, name):\n        if callback == None:\n            del self.callbacks[name]\n        else:\n            self.callbacks[name] = callback", "response": "Registers a callback under the given name, or removes the callback registered under that name if callback is None."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef find_example_dir():\n    # Replace %s with directory to check for shoebot menus.\n    code_stub = textwrap.dedent(\"\"\"\n    from pkg_resources import resource_filename, Requirement, DistributionNotFound\n    try:\n        print(resource_filename(Requirement.parse('shoebot'), '%s'))\n    except DistributionNotFound:\n        pass\n\n    \"\"\")\n\n\n    # Needs to run in same python env as shoebot (may be different to gedits)\n    code = code_stub % 'share/shoebot/examples'\n    cmd = [\"python\", \"-c\", code]\n    p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)\n    output, errors = p.communicate()\n    if errors:\n        print('Shoebot experienced errors searching for install and examples.')\n        print('Errors:\\n{0}'.format(errors.decode('utf-8')))\n        return None\n    else:\n        examples_dir = output.decode('utf-8').strip()\n        if os.path.isdir(examples_dir):\n            return examples_dir\n\n        # If user is running 'setup.py develop' then examples could be right here\n        #code = \"from pkg_resources import resource_filename, Requirement; print resource_filename(Requirement.parse('shoebot'), 'examples/')\"\n        code = code_stub % 'examples/'\n        cmd = [\"python\", \"-c\", code]\n        p = subprocess.Popen(cmd, stdout=subprocess.PIPE)\n        output, errors = p.communicate()\n        examples_dir = output.decode('utf-8').strip()\n        if os.path.isdir(examples_dir):\n            return examples_dir\n\n        if examples_dir:\n            print('Shoebot could not find examples at: {0}'.format(examples_dir))\n        else:\n            print('Shoebot could not find install dir and examples.')", "response": "Returns the Shoebot examples directory, located by querying pkg_resources in a subprocess, or None if it cannot be found."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef run(self):\n        try:\n            for line in iter(self._fd.readline, False):\n                if line is not None:\n                    if self._althandler:\n                        if self._althandler(line):\n                            # If the althandler returns True\n                            # then don't process this as usual\n                            continue\n                self._queue.put(line)\n                if not line:\n                    time.sleep(0.1)\n        except ValueError:  # This can happen if we are closed during readline - TODO - better fix.\n            if not self._fd.closed:\n                raise", "response": "Runs the reader thread: reads lines from the file descriptor and queues each one unless the alternate handler returns True for it."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\nchecking whether there is no more content to expect.", "response": "def eof(self):\n        \"\"\"\n        Check whether there is no more content to expect.\n        \"\"\"\n        return (not self.is_alive()) and self._queue.empty() or self._fd.closed"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef live_source_load(self, source):\n        source = source.rstrip('\\n')\n        if source != self.source:\n            self.source = source\n            b64_source = base64.b64encode(bytes(bytearray(source, \"ascii\")))\n            self.send_command(CMD_LOAD_BASE64, b64_source)", "response": "Send new source code to the bot"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\nsend command to the remote process.", "response": "def send_command(self, cmd, *args):\n        \"\"\"\n        :param cmd:\n        :param args:\n        :return:\n        \"\"\"\n        # Test in python 2 and 3 before modifying (gedit2 + 3)\n        if True:\n            # Create a CommandResponse using a cookie as a unique id\n            cookie = str(uuid.uuid4())\n            response = CommandResponse(cmd, cookie, None, info=[])\n            self.responses[cookie] = response\n            args = list(args) + [b'cookie=' + bytes(cookie, \"ascii\")]\n        if args:\n            bytes_args = []\n            for arg in args:\n                if isinstance(arg, bytes):\n                    bytes_args.append(arg)\n                else:\n                    bytes_args.append(bytearray(arg, \"ascii\"))\n            data = bytearray(cmd, \"ascii\") + b' ' + b' '.join(bytes_args) + b'\\n'\n        else:\n            data = bytearray(cmd, \"ascii\") + b'\\n'\n\n        self.process.stdin.write(data)\n        self.process.stdin.flush()"}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nclose outputs of process.", "response": "def close(self):\n        \"\"\"\n        Close outputs of process.\n        \"\"\"\n        self.process.stdout.close()\n        self.process.stderr.close()\n        self.running = False"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\nget responses to commands sent", "response": "def get_command_responses(self):\n        \"\"\"\n        Get responses to commands sent\n        \"\"\"\n        if not self.response_queue.empty():\n            yield None\n        while not self.response_queue.empty():\n            line = self.response_queue.get()\n            if line is not None:\n                yield line"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\nsort a list of options by preference", "response": "def sort_by_preference(options, prefer):\n    \"\"\"\n    :param options: List of options\n    :param prefer: Preferred options\n    :return:\n\n    Pass in a list of options, return options in 'prefer' first\n\n    >>> sort_by_preference([\"cairo\", \"cairocffi\"], [\"cairocffi\"])\n    [\"cairocffi\", \"cairo\"]\n    \"\"\"\n    if not prefer:\n        return options\n    return sorted(options, key=lambda x: (prefer + options).index(x))"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nimport each module in module_names, set a has_<name> boolean on the class for each one, and return the first implementation that imports successfully", "response": "def import_libs(self, module_names, impl_name):\n        \"\"\"\n        Loop through module_names,\n        add has_.... booleans to class\n        set ..._impl to first successful import\n\n        :param module_names:  list of module names to try importing\n        :param impl_name:  used in error output if no modules succeed\n        :return: name, module from first successful implementation\n        \"\"\"\n        for name in module_names:\n            try:\n                module = __import__(name)\n                has_module = True\n            except ImportError:\n                module = None\n                has_module = False\n            setattr(self, name, module)\n            setattr(self, 'has_%s' % name, has_module)\n\n        for name in module_names:\n            try:\n                return name, __import__(name)\n            except ImportError:\n                pass\n        raise ImportError('No %s Implementation found, tried: %s' % (impl_name, ' '.join(module_names)))"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef ensure_pycairo_context(self, ctx):\n        if self.cairocffi and isinstance(ctx, self.cairocffi.Context):\n            from shoebot.util.cairocffi.cairocffi_to_pycairo import _UNSAFE_cairocffi_context_to_pycairo\n            return _UNSAFE_cairocffi_context_to_pycairo(ctx)\n        else:\n            return ctx", "response": "Ensures that the given context is a pycairo context."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\ncreate a PangoCairo context from a cairo context, raising a helpful error if python-gi-cairo is missing.", "response": "def pangocairo_create_context(cr):\n    \"\"\"\n    If python-gi-cairo is not installed, using PangoCairo.create_context\n    dies with an unhelpful KeyError, check for that and output something\n    useful.\n    \"\"\"\n    # TODO move this to core.backend\n    try:\n        return PangoCairo.create_context(cr)\n    except KeyError as e:\n        if e.args == ('could not find foreign type Context',):\n            raise ShoebotInstallError(\"Error creating PangoCairo missing dependency: python-gi-cairo\")\n        else:\n            raise"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef _get_center(self):\n        '''Returns the center point of the path, disregarding transforms.\n        '''\n        w, h = self.layout.get_pixel_size()\n        x = (self.x + w / 2)\n        y = (self.y + h / 2)\n        return x, y", "response": "Returns the center point of the path disregarding transforms."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef is_list(str):\n \n    \"\"\" Determines if an item in a paragraph is a list.\n\n    If all of the lines in the markup start with a \"*\" or \"1.\" \n    this indicates a list as parsed by parse_paragraphs().\n    It can be drawn with draw_list().\n    \n    \"\"\" \n    \n    for chunk in str.split(\"\\n\"):\n        chunk = chunk.replace(\"\\t\", \"\")\n        if  not chunk.lstrip().startswith(\"*\") \\\n        and not re.search(r\"^([0-9]{1,3}\\. )\", chunk.lstrip()):\n            return False\n    \n    return True", "response": "Determines if an item in a paragraph is a list."}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\ndetermines if an item in a paragraph is a LaTeX math equation.", "response": "def is_math(str):\n    \n    \"\"\" Determines if an item in a paragraph is a LaTeX math equation.\n    \n    Math equations are wrapped in <math></math> tags.\n    They can be drawn as an image using draw_math().\n    \n    \"\"\"\n    \n    str = str.strip()\n    if str.startswith(\"<math>\") and str.endswith(\"</math>\"):\n        return True\n    else:\n        return False"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef draw_math(str, x, y, alpha=1.0):\n    \n    \"\"\" Uses mimetex to generate a GIF-image from the LaTeX equation.\n    \"\"\"\n    \n    try: from web import _ctx\n    except: pass\n    \n    str = re.sub(\"</{0,1}math>\", \"\", str.strip())\n    img = mimetex.gif(str)\n    w, h = _ctx.imagesize(img)\n    _ctx.image(img, x, y, alpha=alpha)\n    return w, h", "response": "Draws a LaTeX math string to the current context."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef draw_list(markup, x, y, w, padding=5, callback=None):\n    \n    \"\"\" Draws list markup with indentation in NodeBox.\n\n    Draw list markup at x, y coordinates\n    using indented bullets or numbers.\n    The callback is a command that takes a str and an int.\n    \n    \"\"\"\n    \n    try: from web import _ctx\n    except: pass\n\n    i = 1\n    for chunk in markup.split(\"\\n\"):\n        \n        if callback != None: \n            callback(chunk, i)\n        \n        m = re.search(\"^([0-9]{1,3}\\. )\", chunk.lstrip())\n        if m:\n            indent = re.search(\"[0-9]\", chunk).start()*padding*2\n            bullet = m.group(1)\n            dx = textwidth(\"000.\")\n            chunk = chunk.lstrip(m.group(1)+\"\\t\")\n        \n        if chunk.lstrip().startswith(\"*\"):\n            indent = chunk.find(\"*\")*padding*2\n            bullet = u\"\u2022\"\n            dx = textwidth(\"*\")\n            chunk = chunk.lstrip(\"* \\t\")\n        \n        _ctx.text(bullet, x+indent, y)\n        dx += padding + indent\n        _ctx.text(chunk, x+dx, y, width=w-dx)\n        y += _ctx.textheight(chunk, width=w-dx)\n        y += _ctx.textheight(\" \") * 0.25\n        i += 1", "response": "Draw list markup at x y coordinates using indented bullets or numbers."}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\ndraw a table in the NodeBox.", "response": "def draw_table(table, x, y, w, padding=5):\n    \n    \"\"\" This is a very poor algorithm to draw Wikipedia tables in NodeBox.\n    \"\"\"\n    \n    try: from web import _ctx\n    except: pass\n    \n    f = _ctx.fill()\n    _ctx.stroke(f)\n    h = _ctx.textheight(\" \") + padding*2\n    \n    row_y = y\n    \n    if table.title != \"\":\n        _ctx.fill(f)\n        _ctx.rect(x, row_y, w, h)\n        _ctx.fill(1)\n        _ctx.text(table.title, x+padding, row_y+_ctx.fontsize()+ padding)\n        row_y += h\n    \n    # A table of flags marking how long a cell \n    # from a previous row is still spanning in a column.\n    rowspans = [1 for i in range(10)]\n    previous_cell_w = 0\n    \n    for row in table:\n        \n        cell_x = x\n        \n        # The width of a cell is the total table width \n        # evenly divided by the number of cells.\n        # Previous rows' cells still spanning will push cells\n        # to the right and decrease their width.\n        cell_w  = 1.0 * w\n        cell_w -= previous_cell_w * len([n for n in rowspans if n > 1])\n        cell_w /= len(row)\n        \n        # The height of each cell is the highest cell in the row.\n        # The height depends on the amount of text in the cell.\n        cell_h = 0\n        for cell in row:\n            this_h = _ctx.textheight(cell, width=cell_w-padding*2) + padding*2\n            cell_h = max(cell_h, this_h)\n        \n        # Traverse each cell in this row.\n        i = 0\n        for cell in row:\n            \n            # If a previous row's cell is still spanning,\n            # push this cell to the right.\n            if rowspans[i] > 1:\n                rowspans[i] -= 1\n                cell_x += previous_cell_w\n                i += 1\n                \n            # Get the rowspan attribute for this cell.\n            m = 
re.search(\"rowspan=\\\"(.*?)\\\"\", cell.properties)\n            if m:\n                rowspan = int(m.group(1))\n                rowspans[i] = rowspan\n            else:\n                rowspan = 1\n\n            # Padded cell text.            \n            # Horizontal line above each cell.\n            # Vertical line before each cell.\n            _ctx.fill(f)\n            _ctx.text(cell, cell_x+padding, row_y+_ctx.fontsize()+padding, cell_w-padding*2)\n            _ctx.line(cell_x, row_y, cell_x+cell_w, row_y)\n            if cell_x > x:\n                _ctx.nofill()\n                _ctx.line(cell_x, row_y, cell_x, row_y+cell_h)\n                \n            cell_x += cell_w\n            i += 1\n            \n        # Move to next row.\n        row_y += cell_h\n        previous_cell_w = cell_w\n        \n    # Table's bounding rectangle.\n    _ctx.nofill()\n    _ctx.rect(x, y, w, row_y-y)"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nparsing the markup and sets the attributes of the object.", "response": "def parse(self, light=False):\n\n        \"\"\" Parses data from Wikipedia page markup.\n\n        The markup comes from Wikipedia's edit page.\n        We parse it here into objects containing plain text.\n        The light version parses only links to other articles, it's faster than a full parse.    \n        \n        \"\"\"\n\n        markup = self.markup\n        \n        self.disambiguation = self.parse_disambiguation(markup)\n        self.categories = self.parse_categories(markup)\n        self.links = self.parse_links(markup)\n        \n        if not light:\n        \n            # Conversion of HTML markup to Wikipedia markup.\n            markup = self.convert_pre(markup)\n            markup = self.convert_li(markup)\n            markup = self.convert_table(markup)\n            markup = replace_entities(markup)\n        \n            # Harvest references from the markup\n            # and replace them by footnotes.\n            markup = markup.replace(\"{{Cite\", \"{{cite\")\n            markup = re.sub(\"\\{\\{ {1,2}cite\", \"{{cite\", markup)\n            self.references, markup = self.parse_references(markup)\n\n            # Make sure there are no legend linebreaks in image links.\n            # Then harvest images and strip them from the markup.\n            markup = re.sub(\"\\n+(\\{\\{legend)\", \"\\\\1\", markup)\n            self.images, markup = self.parse_images(markup)\n            self.images.extend(self.parse_gallery_images(markup))\n            \n            self.paragraphs = self.parse_paragraphs(markup)\n            self.tables = self.parse_tables(markup)\n            self.translations = self.parse_translations(markup)\n            self.important = self.parse_important(markup)"}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nstripping Wikipedia markup from given text.", "response": "def plain(self, markup):\n        \n        \"\"\" Strips Wikipedia markup from given text.\n        \n        This creates a \"plain\" version of the markup,\n        stripping images and references and the like.\n        Does some commonsense maintenance as well,\n        like collapsing multiple spaces.\n        If you specified full_strip=False for WikipediaPage instance,\n        some markup is preserved as HTML (links, bold, italic).\n        \n        \"\"\"\n        \n        # Strip bold and italic.\n        if self.full_strip:\n            markup = markup.replace(\"'''\", \"\")\n            markup = markup.replace(\"''\", \"\")\n        else:\n            markup = re.sub(\"'''([^']*?)'''\", \"<b>\\\\1</b>\", markup)\n            markup = re.sub(\"''([^']*?)''\", \"<i>\\\\1</i>\", markup)\n        \n        # Strip image gallery sections.\n        markup = re.sub(self.re[\"gallery\"], \"\", markup)\n        \n        # Strip tables.\n        markup = re.sub(self.re[\"table\"], \"\", markup)\n        markup = markup.replace(\"||\", \"\")\n        markup = markup.replace(\"|}\", \"\")\n        \n        # Strip links, keeping the display alias.\n        # We'll strip the ending ]] later.\n        if self.full_strip:\n            markup = re.sub(r\"\\[\\[[^\\]]*?\\|\", \"\", markup)\n        else:\n            markup = re.sub(r\"\\[\\[([^]|]*|)\\]\\]\", '<a href=\"\\\\1\">\\\\1</a>', markup)\n            markup = re.sub(r\"\\[\\[([^]|]*|)\\|([^]]*)\\]\\]\", '<a href=\"\\\\1\">\\\\2</a>', markup)    \n\n        # Strip translations, users, etc.\n        markup = re.sub(self.re[\"translation\"], \"\", markup)\n        \n        # This math TeX is not supported:\n        markup = markup.replace(\"\\displaytyle\", \"\")\n        markup = markup.replace(\"\\textstyle\", \"\")\n        markup = markup.replace(\"\\scriptstyle\", 
\"\")\n        markup = markup.replace(\"\\scriptscriptstyle\", \"\")\n        \n        # Before stripping [ and ] brackets,\n        # make sure they are retained inside <math></math> equations.\n        markup = re.sub(\"(<math>.*?)\\[(.*?</math>)\", \"\\\\1MATH___OPEN\\\\2\", markup)\n        markup = re.sub(\"(<math>.*?)\\](.*?</math>)\", \"\\\\1MATH___CLOSE\\\\2\", markup)\n        markup = markup.replace(\"[\", \"\")\n        markup = markup.replace(\"]\", \"\")\n        markup = markup.replace(\"MATH___OPEN\", \"[\")\n        markup = markup.replace(\"MATH___CLOSE\", \"]\")\n        \n        # a) Strip references.\n        # b) Strip <ref></ref> tags.\n        # c) Strip <ref name=\"\" /> tags.\n        # d) Replace --REF--(12) by [12].\n        # e) Remove space between [12] and trailing punctuation .,\n        # f) Remove HTML comment <!-- -->\n        # g) Keep the Latin Extended-B template: {{latinx| }}\n        # h) Strip Middle-Earth references.\n        # i) Keep quotes: {{quote| }}\n        # j) Remove templates\n        markup = re.sub(self.re[\"reference\"], \"\", markup)                  # a\n        markup = re.sub(\"</{0,1}ref.*?>\", \"\", markup)                      # b\n        markup = re.sub(\"<ref name=\\\".*?\\\" {0,1}/>\", \"\", markup)           # c\n        markup = re.sub(self.ref+\"\\(([0-9]*?)\\)\", \"[\\\\1] \", markup)        # d\n        markup = re.sub(\"\\] ([,.\\\"\\?\\)])\", \"]\\\\1\", markup)                 # e\n        markup = re.sub(self.re[\"comment\"], \"\", markup)                    # f\n        markup = re.sub(\"\\{\\{latinx\\|(.*?)\\}\\}\", \"\\\\1\", markup)            # g\n        markup = re.sub(\"\\{\\{ME-ref.*?\\}\\}\", \"\", markup)                   # h\n        markup = re.sub(\"\\{\\{quote\\|(.*?)\\}\\}\", \"\\\"\\\\1\\\"\", markup)         # i\n        markup = re.sub(re.compile(\"\\{\\{.*?\\}\\}\", re.DOTALL), \"\", markup)  # j\n        markup = markup.replace(\"}}\", \"\")\n        \n        # 
Collapse multiple spaces between words,\n        # unless they appear in preformatted text.\n        markup = re.sub(\"<br.*?/{0,1}>\", \" \", markup)\n        markup = markup.split(\"\\n\")\n        for i in range(len(markup)):\n            if not markup[i].startswith(\" \"):\n                markup[i] = re.sub(r\"[ ]+\", \" \", markup[i])\n        markup = \"\\n\".join(markup)\n        markup = markup.replace(\" .\", \".\")\n        \n        # Strip all HTML except <math> tags.\n        if self.full_strip:\n            markup = strip_tags(markup, exclude=[\"math\"], linebreaks=True)\n        \n        markup = markup.strip()\n        return markup"}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nconverts the markup from pre to wikipedia", "response": "def convert_pre(self, markup):\n        \n        \"\"\" Substitutes <pre> to Wikipedia markup by adding a space at the start of a line.\n        \"\"\"\n        \n        for m in re.findall(self.re[\"preformatted\"], markup):\n            markup = markup.replace(m, m.replace(\"\\n\", \"\\n \"))\n            markup = re.sub(\"<pre.*?>\\n{0,}\", \"\", markup)\n            markup = re.sub(\"\\W{0,}</pre>\", \"\", markup)\n        \n        return markup"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef convert_li(self, markup):\n\n        \"\"\" Substitutes <li> content with Wikipedia markup.\n        \"\"\"\n        \n        for li in re.findall(\"<li;*?>\", markup):\n            markup = re.sub(li, \"\\n* \", markup)\n        markup = markup.replace(\"</li>\", \"\")\n            \n        return markup", "response": "Converts HTML <li> items to Wikipedia bullet-list markup and removes the closing </li> tags."}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\nconverts HTML table content to Wikipedia markup.", "response": "def convert_table(self, markup):\n        \n        \"\"\" Substitutes <table> content with Wikipedia markup.\n        \"\"\"\n        \n        for table in re.findall(self.re[\"html-table\"], markup):\n            wiki = table\n            wiki = re.sub(r\"<table(.*?)>\", \"{|\\\\1\", wiki)\n            wiki = re.sub(r\"<tr(.*?)>\", \"|-\\\\1\", wiki)\n            wiki = re.sub(r\"<td(.*?)>\", \"|\\\\1|\", wiki)\n            wiki = wiki.replace(\"</td>\", \"\\n\")\n            wiki = wiki.replace(\"</tr>\", \"\\n\")\n            wiki = wiki.replace(\"</table>\", \"\\n|}\")\n            markup = markup.replace(table, wiki)\n        \n        return markup"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef parse_links(self, markup):\n        \n        \"\"\" Returns a list of internal Wikipedia links in the markup.\n\n        # A Wikipedia link looks like:\n        # [[List of operating systems#Embedded | List of embedded operating systems]]\n        # It does not contain a colon, this indicates images, users, languages, etc.\n        \n        The return value is a list containing the first part of the link,\n        without the anchor.\n\n        \"\"\"\n        \n        links = []\n        m = re.findall(self.re[\"link\"], markup)\n        for link in m:\n            # We don't like [[{{{1|Universe (disambiguation)}}}]]\n            if link.find(\"{\") >= 0:\n                link = re.sub(\"\\{{1,3}[0-9]{0,2}\\|\", \"\", link)\n                link = link.replace(\"{\", \"\")\n                link = link.replace(\"}\", \"\")            \n            link = link.split(\"|\")\n            link[0] = link[0].split(\"#\")\n            page = link[0][0].strip()\n            #anchor = u\"\"\n            #display = u\"\"\n            #if len(link[0]) > 1: \n            #    anchor = link[0][1].strip()\n            #if len(link) > 1: \n            #    display = link[1].strip()\n            if not page in links:\n                links.append(page)\n                #links[page] = WikipediaLink(page, anchor, display)\n        \n        links.sort()\n        return links", "response": "Parses the markup and returns a list of internal Wikipedia links."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\nparse the markup and returns a list of images found in the markup.", "response": "def parse_images(self, markup, treshold=6):\n        \n        \"\"\" Returns a list of images found in the markup.\n        \n        An image has a pathname, a description in plain text\n        and a list of properties Wikipedia uses to size and place images.\n\n        # A Wikipedia image looks like:\n        # [[Image:Columbia Supercomputer - NASA Advanced Supercomputing Facility.jpg|right|thumb|\n        #   The [[NASA]] [[Columbia (supercomputer)|Columbia Supercomputer]].]]\n        # Parts are separated by \"|\".\n        # The first part is the image file, the last part can be a description.\n        # In between are display properties, like \"right\" or \"thumb\".\n        \n        \"\"\"\n        \n        images = []\n        m = re.findall(self.re[\"image\"], markup)\n        for p in m:\n            p = self.parse_balanced_image(p)\n            img = p.split(\"|\")\n            path = img[0].replace(\"[[Image:\", \"\").strip()\n            description = u\"\"\n            links = {}\n            properties = []\n            if len(img) > 1:\n                img = \"|\".join(img[1:])\n                links = self.parse_links(img)\n                properties = self.plain(img).split(\"|\")\n                description = u\"\"\n                # Best guess: an image description is normally\n                # longer than six characters, properties like\n                # \"thumb\" and \"right\" are less than six characters.\n                if len(properties[-1]) > treshold:\n                    description = properties[-1]\n                    properties = properties[:-1]\n            img = WikipediaImage(path, description, links, properties)\n            images.append(img)\n            markup = markup.replace(p, \"\")\n        \n        return images, markup.strip()"}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\ncorrects Wikipedia image markup.", "response": "def parse_balanced_image(self, markup):\n        \n        \"\"\" Corrects Wikipedia image markup.\n\n        Images have a description inside their link markup that \n        can contain link markup itself, make sure the outer \"[\" and \"]\" brackets \n        delimiting the image are balanced correctly (e.g. no [[ ]] ]]).\n\n        Called from parse_images().\n\n        \"\"\"\n\n        opened = 0\n        closed = 0\n        for i in range(len(markup)):\n            if markup[i] == \"[\": opened += 1\n            if markup[i] == \"]\": closed += 1\n            if opened == closed:\n                return markup[:i+1]\n                \n        return markup"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\nparsing the gallery images from the markup.", "response": "def parse_gallery_images(self, markup):\n        \n        \"\"\" Parses images from the <gallery></gallery> section.\n        \n        Images inside <gallery> tags do not have outer \"[[\" brackets.\n        Add these and then parse again.\n        \n        \"\"\"\n        \n        gallery = re.search(self.re[\"gallery\"], markup)\n        if gallery:\n            gallery = gallery.group(1)\n            gallery = gallery.replace(\"Image:\", \"[[Image:\")\n            gallery = gallery.replace(\"\\n\", \"]]\\n\")\n            images, markup = self.parse_images(gallery)\n            return images\n        \n        return []"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef parse_paragraph(self, markup):\n        \n        \"\"\" Creates a list from lines of text in a paragraph.\n        \n        Each line of text is a new item in the list,\n        except lists and preformatted chunks (<li> and <pre>),\n        these are kept together as a single chunk.\n        \n        Lists are formatted using parse_paragraph_list().\n        \n        Empty lines are stripped from the output.\n        Indentation (i.e. lines starting with \":\") is ignored.\n        \n        Called from parse_paragraphs() method.\n        \n        \"\"\"\n        \n        s = self.plain(markup)\n        # Add an extra linebreak between the last list item\n        # and the normal line following after it, so they don't stick together, e.g.\n        # **[[Alin Magic]], magic used in the videogame ''[[Rise of Nations: Rise of Legends]]''\n        # In '''popular culture''':\n        # * [[Magic (film)|''Magic'' (film)]], a 1978 film starring Anthony Hopkins and Ann-Margret\n        s = re.sub(re.compile(\"\\n([*#;].*?)\\n([^*#?])\", re.DOTALL), \"\\n\\\\1\\n\\n\\\\2\", s)\n        # This keeps list items with linebreaks \n        # between them nicely together.\n        s = re.sub(\"\\n{2,3}([*#;])\", \"\\n\\\\1\", s)\n        chunks = []\n        ch = \"\"\n        i = 1\n        for chunk in s.split(\"\\n\"):\n            if chunk.startswith(\":\"):\n                chunk = chunk.lstrip(\":\")\n            if len(chunk.strip()) > 1:\n                # Leave out taxoboxes and infoboxes.\n                if not chunk.startswith(\"|\"):\n                    ch += chunk + \"\\n\"\n            if ch.strip() != \"\":\n                if not re.search(\"^[ *#;]\", chunk):\n                    ch = self.parse_paragraph_list(ch)\n                    chunks.append(ch.rstrip())\n                    ch = \"\"\n\n        if ch.strip() != \"\":\n            ch = 
self.parse_paragraph_list(ch)\n            chunks.append(ch.strip())\n            \n        return chunks", "response": "Parses a paragraph into a list of items."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nparse a list of items from a Wikipedia file.", "response": "def parse_paragraph_list(self, markup, indent=\"\\t\"):\n        \n        \"\"\" Formats bullets and numbering of Wikipedia lists.\n        \n        List items are marked by \"*\", \"#\" or \";\" at the start of a line.\n        We treat \";\" the same as \"*\",\n        and replace \"#\" with real numbering (e.g. \"2.\").\n        Sublists (e.g. *** and ###) get indented by tabs.\n        \n        Called from parse_paragraphs() method.\n        \n        \"\"\"\n\n        def lastleft(ch, str):\n            n = 0\n            while n < len(str) and str[n] == ch: n += 1\n            return n        \n\n        tally = [1 for i in range(10)]\n        chunks = markup.split(\"\\n\")\n        for i in range(len(chunks)):\n            if chunks[i].startswith(\"#\"):\n                j = min(lastleft(\"#\", chunks[i]), len(tally)-1)\n                chunks[i] = indent*(j-1) + str(tally[j])+\". \" + chunks[i][j:]\n                chunks[i] = chunks[i].replace(\".  \", \". \")\n                tally[j] += 1\n                # Reset the numbering of sublists.\n                for k in range(j+1, len(tally)): \n                    tally[k] = 1\n            if chunks[i].startswith(\";\"):\n                chunks[i] = \"*\" + chunks[i][1:]\n            if chunks[i].startswith(\"*\"):\n                j = lastleft(\"*\", chunks[i])  \n                chunks[i] = indent*(j-1) + \"* \" + chunks[i][j:]\n                chunks[i] = chunks[i].replace(\"*  \", \"* \")\n        \n        return \"\\n\".join(chunks)"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef connect_paragraph(self, paragraph, paragraphs):\n        \n        \"\"\" Create parent/child links to other paragraphs.\n        \n        The paragraphs parameter is a list of all the paragraphs\n        parsed up till now.\n        \n        The parent is the previous paragraph whose depth is less.\n        The parent's children include this paragraph.\n        \n        Called from parse_paragraphs() method.\n        \n        \"\"\"\n\n        if paragraph.depth > 0:\n            # Walk backwards; range objects have no reverse() in Python 3.\n            for i in reversed(range(len(paragraphs))):\n                if paragraphs[i].depth == paragraph.depth-1:\n                    paragraph.parent = paragraphs[i]\n                    paragraphs[i].children.append(paragraph)\n                    break\n                    \n        return paragraph", "response": "Links a paragraph to its parent, the nearest preceding paragraph whose depth is one less, and returns it."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nparse the content of a specific paragraph into a list of items and items.", "response": "def parse_paragraph_references(self, markup):\n        \n        \"\"\" Updates references with content from specific paragraphs.\n        \n        The \"references\", \"notes\", \"external links\" paragraphs \n        are double-checked for references. Not all items in the list\n        might have been referenced inside the article, or the item\n        might contain more info than we initially parsed from it.\n        \n        Called from parse_paragraphs() method.\n        \n        \"\"\"\n        \n        for chunk in markup.split(\"\\n\"):\n            # We already parsed this, it contains the self.ref mark.\n            # See if we can strip more notes from it.\n            m = re.search(self.ref+\"\\(([0-9]*?)\\)\", chunk)\n            if m:\n                chunk = chunk.strip(\"* \")\n                chunk = chunk.replace(m.group(0), \"\")\n                chunk = self.plain(chunk)\n                i = int(m.group(1))\n                if chunk != \"\":\n                    self.references[i-1].note = chunk\n            # If it's not a citation we don't have this reference yet.\n            elif chunk.strip().startswith(\"*\") \\\n             and chunk.find(\"{{cite\") < 0:\n                chunk = chunk.strip(\"* \")\n                chunk = self.plain(chunk)\n                if chunk != \"\":\n                    r = WikipediaReference()\n                    r.note = chunk\n                    self.references.append(r)"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nparse the markup and returns a list of paragraphs.", "response": "def parse_paragraphs(self, markup):\n        \n        \"\"\" Returns a list of paragraphs in the markup.\n        \n        A paragraph has a title and multiple lines of plain text.\n        A paragraph might have parent and child paragraphs,\n        denoting subtitles or bigger chapters.\n        \n        A paragraph might have links to additional articles.\n        \n        Formats numbered lists by replacing # by 1.\n        Formats bulleted sublists like ** or *** with indentation.\n        \n        \"\"\"\n        \n        # Paragraphs to exclude.\n        refs = [\"references\", \"notes\", \"notes and references\", \"external links\", \"further reading\"]\n        exclude = [\"see also\", \"media\", \"gallery\", \"related topics\", \"lists\", \"gallery\", \"images\"]\n        exclude.extend(refs)\n        \n        paragraphs = []\n        paragraph = WikipediaParagraph(self.title)\n        paragraph_data = \"\"\n        for chunk in markup.split(\"\\n\"):\n            \n            # Strip each line of whitespace, \n            # unless it's a preformatted line (starts with a space).\n            if not chunk.startswith(\" \"):\n                chunk = chunk.strip()\n                \n            # A title wrapped in \"=\", \"==\", \"===\"...\n            # denotes a new paragraphs section.\n            if chunk.startswith(\"=\"):\n\n                if paragraph.title.lower() in refs \\\n                or (paragraph.parent and paragraph.parent.title.lower() in refs):\n                    self.parse_paragraph_references(paragraph_data)\n                paragraph.extend(self.parse_paragraph(paragraph_data))\n                paragraphs.append(paragraph)\n                \n                # Initialise a new paragraph.\n                # Create parent/child links to other paragraphs.\n                title = 
chunk.strip().strip(\"=\")\n                title = self.plain(title)\n                paragraph = WikipediaParagraph(title)\n                paragraph.depth = self.parse_paragraph_heading_depth(chunk)\n                if paragraph.title.lower() not in exclude:\n                    paragraph = self.connect_paragraph(paragraph, paragraphs)\n                paragraph_data = \"\"\n            \n            # Underneath a title might be links to in-depth articles,\n            # e.g. Main articles: Computer program and Computer programming\n            # which in wiki markup would be {{main|Computer program|Computer programming}}\n            # The second line corrects\" {{Main|Credit (finance)}} or {{Main|Usury}}\".\n            elif re.search(re.compile(\"^{{main\", re.I), chunk):\n                paragraph.main = [link.strip(\"} \") for link in chunk.split(\"|\")[1:]]\n                paragraph.main = [re.sub(re.compile(\"}}.*?{{main\", re.I), \"\", link) \n                                  for link in paragraph.main]\n                \n            # At the bottom might be links to related articles,\n            # e.g. 
See also: Abundance of the chemical elements\n            # which in wiki markup would be {{see also|Abundance of the chemical elements}}\n            elif re.search(re.compile(\"^{{see {0,1}also\", re.I), chunk):\n                paragraph.related = [link.strip(\"} \") for link in chunk.split(\"|\")[1:]]\n                \n            # Accumulate the data in this paragraph,\n            # we'll process it once a new paragraph starts.\n            else:\n                paragraph_data += chunk +\"\\n\"\n                \n        # Append the last paragraph.\n        if paragraph.title.lower() in refs \\\n        or (paragraph.parent and paragraph.parent.title.lower() in refs):\n            self.parse_paragraph_references(paragraph_data)\n        paragraph.extend(self.parse_paragraph(paragraph_data))\n        paragraphs.append(paragraph)\n\n        # The \"See also\" paragraph is an enumeration of links\n        # which we already parsed so don't show them.\n        # We also did references, and other paragraphs are not that relevant.        \n        paragraphs_exclude = []\n        for paragraph in paragraphs:\n            if paragraph.title.lower() not in exclude \\\n            and not (paragraph.parent and paragraph.parent.title.lower() in exclude):\n                paragraphs_exclude.append(paragraph)\n        \n        if len(paragraphs_exclude) == 1 and \\\n           len(paragraphs_exclude[0]) == 0:\n            return []\n        \n        return paragraphs_exclude"}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nparses a row of cells in a Wikipedia table.", "response": "def parse_table_row(self, markup, row):\n\n        \"\"\" Parses a row of cells in a Wikipedia table.\n        \n        Cells in the row are separated by \"||\".\n        A \"!\" indicates a row of heading columns.\n        Each cell can contain properties before a \"|\",\n        # e.g. align=\"right\" | Cell 2 (right aligned).       \n        \n        \"\"\"\n        \n        if row == None:\n            row = WikipediaTableRow()\n           \n        markup = markup.replace(\"!!\", \"||\")\n        for cell in markup.lstrip(\"|!\").split(\"||\"):\n            # The \"|\" after the properties can't be part of a link.\n            i = cell.find(\"|\")\n            j = cell.find(\"[[\")\n            if i>0 and (j<0 or i<j):\n                data = self.plain(cell[i+1:])\n                properties = cell[:i].strip()\n            else:\n                data = self.plain(cell)\n                properties = u\"\"\n            cell = WikipediaTableCell(data)\n            cell.properties = properties\n            row.append(cell)\n        \n        return row"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\ncreating a link from the table to paragraph and vice versa.", "response": "def connect_table(self, table, chunk, markup):\n\n        \"\"\" Creates a link from the table to paragraph and vice versa.\n        \n        Finds the first heading above the table in the markup.\n        This is the title of the paragraph the table belongs to.\n        \n        \"\"\"\n\n        k = markup.find(chunk)\n        i = markup.rfind(\"\\n=\", 0, k)\n        j = markup.find(\"\\n\", i+1)\n        paragraph_title = markup[i:j].strip().strip(\"= \")\n        for paragraph in self.paragraphs:\n            if paragraph.title == paragraph_title:\n                paragraph.tables.append(table)\n                table.paragraph = paragraph"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef parse_tables(self, markup):\n        \n        \"\"\" Returns a list of tables in the markup.\n\n        A Wikipedia table looks like:\n        {| border=\"1\"\n        |-\n        |Cell 1 (no modifier - not aligned)\n        |-\n        |align=\"right\" |Cell 2 (right aligned)\n        |-\n        |}\n\n        \"\"\"\n\n        tables = []\n        m = re.findall(self.re[\"table\"], markup)\n        for chunk in m:\n\n            table = WikipediaTable()\n            table.properties = chunk.split(\"\\n\")[0].strip(\"{|\").strip()\n            self.connect_table(table, chunk, markup)\n                  \n            # Tables start with \"{|\".\n            # On the same line can be properties, e.g. {| border=\"1\"\n            # The table heading starts with \"|+\".\n            # A new row in the table starts with \"|-\".\n            # The end of the table is marked with \"|}\".            
\n            row = None\n            for chunk in chunk.split(\"\\n\"):\n                chunk = chunk.strip()\n                if chunk.startswith(\"|+\"):\n                    title = self.plain(chunk.strip(\"|+\"))\n                    table.title = title\n                elif chunk.startswith(\"|-\"):\n                    if row: \n                        row.properties = chunk.strip(\"|-\").strip()\n                        table.append(row)\n                    row = None\n                elif chunk.startswith(\"|}\"):\n                    pass\n                elif chunk.startswith(\"|\") \\\n                  or chunk.startswith(\"!\"):\n                    row = self.parse_table_row(chunk, row)\n                        \n            # Append the last row.\n            if row: table.append(row)\n            if len(table) > 0:\n                tables.append(table)\n        \n        return tables", "response": "Parses the markup and returns a list of tables in the markup."}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nparsing the markup and returns a list of references found in the markup.", "response": "def parse_references(self, markup):\n\n        \"\"\" Returns a list of references found in the markup.\n        \n        References appear inline as <ref> footnotes, \n        http:// external links, or {{cite}} citations.\n        We replace it with (1)-style footnotes.\n        Additional references data is gathered in\n        parse_paragraph_references() when we parse paragraphs.\n        \n        References can also appear in image descriptions,\n        tables and taxoboxes, so they might not always pop up in a paragraph.\n        \n        The plain() method finally replaces (1) by [1].\n        \n        \"\"\"\n    \n        references = []\n        \n        # A Wikipedia reference note looks like:\n        # <ref>In 1946, [[ENIAC]] consumed an estimated 174 kW. \n        # By comparison, a typical personal computer may use around 400 W; \n        # over four hundred times less. 
{{Ref harvard|kempf1961|Kempf 1961|a}}</ref>\n        m = re.findall(self.re[\"reference\"], markup)\n        for reference in m:\n            reference = re.sub(\"<ref> {0,1}cite\", \"<ref>{{cite\", reference)\n            if not reference.strip().startswith(\"[http://\") and \\\n               not re.search(\"\\{\\{cite\", reference):\n                r = WikipediaReference()\n                r.note = self.plain(re.sub(\"</{0,1}ref.*?>\", \"\", reference))\n                if r.note != \"\":\n                    references.append(r)\n                    p = \" \"+self.ref+\"(\"+str(len(references))+\")\"\n                    markup = markup.replace(reference, p, 1)\n            else:\n                # References containing a citation or url \n                # are better handled by the next patterns.\n                pass\n        \n        # A Wikipedia citation looks like:\n        # {{cite journal\n        # | last = Einstein \n        # | first = Albert\n        # | authorlink = Albert Einstein\n        # | title = Sidelights on Relativity (Geometry and Experience) \n        # | publisher = P. 
Dutton., Co \n        # | date = 1923}}\n        m = re.findall(self.re[\"citation\"], markup)\n        for citation in m:\n            c = citation.replace(\"\\n\", \"\")\n            r = WikipediaReference()\n            for key in r.__dict__.keys():\n                value = re.search(\"\\| {0,1}\"+key+\"(.*?)[\\|}]\", c)\n                if value:\n                    value = value.group(1)\n                    value = value.replace(\"link\", \"\")\n                    value = value.strip().strip(\" =[]\")\n                    value = self.plain(value)\n                    setattr(r, key, value)\n            if r.first != \"\" and r.last != \"\":\n                r.author = r.first + \" \" + r.last\n            references.append(r)\n            p = \" \"+self.ref+\"(\"+str(len(references))+\")\"\n            markup = markup.replace(citation, p, 1)\n        \n        # A Wikipedia embedded url looks like:\n        # [http://www.pbs.org/wnet/hawking/html/home.html ''Stephen Hawking's Universe'']\n        m = re.findall(self.re[\"url\"], markup)\n        for url in m:\n            r = WikipediaReference()\n            i = url.find(\" \")\n            if i > 0:\n                r.url = url[:i].strip()\n                r.note = self.plain(url[i:])\n            else:\n                r.url = url.strip()\n            references.append(r)\n            p = r.note+\" \"+self.ref+\"(\"+str(len(references))+\")\"\n            markup = markup.replace(\"[\"+url+\"]\", p, 1)\n\n        # Since we parsed all citations first and then all notes and urls,\n        # the ordering will not be correct in the markup,\n        # e.g. 
(1) (11) (12) (2) (3).\n        sorted = []\n        m = re.findall(self.ref+\"\\(([0-9]*)\\)\", markup)\n        for i in m:\n            sorted.append(references[int(i)-1])\n            markup = markup.replace(\n                self.ref+\"(\"+i+\")\", \n                self.ref+\"**(\"+str(len(sorted))+\")\"\n                )\n        markup = markup.replace(self.ref+\"**\", self.ref)\n        for r in references:\n            if r not in sorted:\n                sorted.append(r)\n        references = sorted\n\n        return references, markup.strip()"}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\nparses the markup for the categories.", "response": "def parse_categories(self, markup):\n        \n        \"\"\" Returns a list of categories the page belongs to.\n\n        # A Wikipedia category link looks like:\n        # [[Category:Computing]]\n        # This indicates the page is included in the given category.\n        # If \"Category\" is preceded by \":\" this indicates a link to a category.\n        \n        \"\"\"\n        \n        categories = []\n        m = re.findall(self.re[\"category\"], markup)\n        for category in m:\n            category = category.split(\"|\")\n            page = category[0].strip()\n            display = u\"\"\n            if len(category) > 1: \n                display = category[1].strip()\n            #if not categories.has_key(page):\n            #    categories[page] = WikipediaLink(page, u\"\", display)\n            if not page in categories:\n                categories.append(page)\n                \n        return categories"}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nparse the markup and returns a dictionary of translations for the page title.", "response": "def parse_translations(self, markup):\n        \n        \"\"\" Returns a dictionary of translations for the page title.\n        \n        A Wikipedia language link looks like: [[af:Rekenaar]].\n        The parser will also fetch links like \"user:\" and \"media:\"\n        but these are stripped against the dictionary of\n        Wikipedia languages.\n        \n        You can get a translated page by searching Wikipedia\n        with the appropriate language code and supplying\n        the translated title as query.\n        \n        \"\"\"\n        \n        global languages\n        translations = {}\n        m = re.findall(self.re[\"translation\"], markup)\n        for language, translation in m:\n            if language in languages:\n                translations[language] = translation\n         \n        return translations"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef parse_disambiguation(self, markup):\n        \n        \"\"\" Gets the Wikipedia disambiguation page for this article.\n        \n        A Wikipedia disambiguation link refers to other pages\n        with the same title but of smaller significance,\n        e.g. {{dablink|For the IEEE magazine see [[Computer (magazine)]].}}\n        \n        \"\"\"\n        \n        m = re.search(self.re[\"disambiguation\"], markup)\n        if m:\n            return self.parse_links(m.group(1))\n        else:\n            return []", "response": "Parses the disambiguation page for this article."}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nparsing the markup and returns a list of words that appear in bold in the article.", "response": "def parse_important(self, markup):\n        \n        \"\"\" Returns a list of words that appear in bold in the article.\n        \n        Things like table titles are not added to the list,\n        these are probably bold because it makes the layout nice,\n        not necessarily because they are important.\n        \n        \"\"\"\n        \n        important = []\n        table_titles = [table.title for table in self.tables]\n        m = re.findall(self.re[\"bold\"], markup)\n        for bold in m:\n            bold = self.plain(bold)\n            if not bold in table_titles:\n                important.append(bold.lower())\n        \n        return important"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef sanitize(self, val):\n        if self.type == NUMBER:\n            try:\n                return clamp(self.min, self.max, float(val))\n            except ValueError:\n                return 0.0\n        elif self.type == TEXT:\n            try:\n                return unicode(str(val), \"utf_8\", \"replace\")\n            except:\n                return \"\"\n        elif self.type == BOOLEAN:\n            if unicode(val).lower() in (\"true\", \"1\", \"yes\"):\n                return True\n            else:\n                return False", "response": "Given a Variable and a value, cleans the value to match the variable's type (number, text or boolean)."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef compliesTo(self, v):\n        if self.type == v.type:\n            if self.type == NUMBER:\n                if self.value < self.min or self.value > self.max:\n                    return False\n            return True\n        return False", "response": "Return whether I am compatible with the given var."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef isList(l):\n    return hasattr(l, '__iter__') \\\n           or (type(l) in (types.ListType, types.TupleType))", "response": "Convenience method that returns True if l is a listlike object."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef isString(s):\n    try:\n        return isinstance(s, unicode) or isinstance(s, basestring)\n    except NameError:\n        return isinstance(s, str)", "response": "Convenience method that works with all 2.x versions of Python to determine whether or not an object is string-like."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef buildTagMap(default, *args):\n    built = {}\n    for portion in args:\n        if hasattr(portion, 'items'):\n            #It's a map. Merge it.\n            for k,v in portion.items():\n                built[k] = v\n        elif isList(portion):\n            #It's a list. Map each item to the default.\n            for k in portion:\n                built[k] = default\n        else:\n            #It's a scalar. Map it to the default.\n            built[portion] = default\n    return built", "response": "Turns a list of maps lists or scalars into a single map."}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\nsets up the initial relations between this element and other elements.", "response": "def setup(self, parent=None, previous=None):\n        \"\"\"Sets up the initial relations between this element and\n        other elements.\"\"\"\n        self.parent = parent\n        self.previous = previous\n        self.next = None\n        self.previousSibling = None\n        self.nextSibling = None\n        if self.parent and self.parent.contents:\n            self.previousSibling = self.parent.contents[-1]\n            self.previousSibling.nextSibling = self"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef extract(self):\n        if self.parent:\n            try:\n                self.parent.contents.remove(self)\n            except ValueError:\n                pass\n\n        #Find the two elements that would be next to each other if\n        #this element (and any children) hadn't been parsed. Connect\n        #the two.\n        lastChild = self._lastRecursiveChild()\n        nextElement = lastChild.next\n\n        if self.previous:\n            self.previous.next = nextElement\n        if nextElement:\n            nextElement.previous = self.previous\n        self.previous = None\n        lastChild.next = None\n\n        self.parent = None\n        if self.previousSibling:\n            self.previousSibling.nextSibling = self.nextSibling\n        if self.nextSibling:\n            self.nextSibling.previousSibling = self.previousSibling\n        self.previousSibling = self.nextSibling = None\n        return self", "response": "Destructively rips this element out of the tree."}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nfinds the last element beneath this object to be parsed.", "response": "def _lastRecursiveChild(self):\n        \"Finds the last element beneath this object to be parsed.\"\n        lastChild = self\n        while hasattr(lastChild, 'contents') and lastChild.contents:\n            lastChild = lastChild.contents[-1]\n        return lastChild"}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nreturn the first item that matches the given criteria and appears after this Tag in the document.", "response": "def findNext(self, name=None, attrs={}, text=None, **kwargs):\n        \"\"\"Returns the first item that matches the given criteria and\n        appears after this Tag in the document.\"\"\"\n        return self._findOne(self.findAllNext, name, attrs, text, **kwargs)"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef findAllNext(self, name=None, attrs={}, text=None, limit=None,\n                    **kwargs):\n        \"\"\"Returns all items that match the given criteria and appear\n        after this Tag in the document.\"\"\"\n        return self._findAll(name, attrs, text, limit, self.nextGenerator,\n                             **kwargs)", "response": "Returns all items that match the given criteria and appear\n        after this Tag in the document."}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\nreturns the closest sibling to this Tag that matches the given criteria and appears after this Tag in the document.", "response": "def findNextSibling(self, name=None, attrs={}, text=None, **kwargs):\n        \"\"\"Returns the closest sibling to this Tag that matches the\n        given criteria and appears after this Tag in the document.\"\"\"\n        return self._findOne(self.findNextSiblings, name, attrs, text,\n                             **kwargs)"}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\nreturns the next siblings of this Tag that match the given criteria and appear after this Tag in the document.", "response": "def findNextSiblings(self, name=None, attrs={}, text=None, limit=None,\n                         **kwargs):\n        \"\"\"Returns the siblings of this Tag that match the given\n        criteria and appear after this Tag in the document.\"\"\"\n        return self._findAll(name, attrs, text, limit,\n                             self.nextSiblingGenerator, **kwargs)"}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nreturn the first item that matches the given criteria and appears before this Tag in the document.", "response": "def findPrevious(self, name=None, attrs={}, text=None, **kwargs):\n        \"\"\"Returns the first item that matches the given criteria and\n        appears before this Tag in the document.\"\"\"\n        return self._findOne(self.findAllPrevious, name, attrs, text, **kwargs)"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef findAllPrevious(self, name=None, attrs={}, text=None, limit=None,\n                        **kwargs):\n        \"\"\"Returns all items that match the given criteria and appear\n        before this Tag in the document.\"\"\"\n        return self._findAll(name, attrs, text, limit, self.previousGenerator,\n                           **kwargs)", "response": "Returns all items that match the given criteria and appear\n        before this Tag in the document."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef findPreviousSibling(self, name=None, attrs={}, text=None, **kwargs):\n        return self._findOne(self.findPreviousSiblings, name, attrs, text,\n                             **kwargs)", "response": "Returns the closest sibling to this Tag that matches the\n        given criteria and appears before this Tag in the document."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\nreturn the siblings of this Tag that match the given criteria and appear before this Tag in the document.", "response": "def findPreviousSiblings(self, name=None, attrs={}, text=None,\n                             limit=None, **kwargs):\n        \"\"\"Returns the siblings of this Tag that match the given\n        criteria and appear before this Tag in the document.\"\"\"\n        return self._findAll(name, attrs, text, limit,\n                             self.previousSiblingGenerator, **kwargs)"}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nreturning the closest parent of this Tag that matches the given criteria.", "response": "def findParent(self, name=None, attrs={}, **kwargs):\n        \"\"\"Returns the closest parent of this Tag that matches the given\n        criteria.\"\"\"\n        # NOTE: We can't use _findOne because findParents takes a different\n        # set of arguments.\n        r = None\n        l = self.findParents(name, attrs, 1)\n        if l:\n            r = l[0]\n        return r"}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nreturn the parents of this Tag that match the given criteria.", "response": "def findParents(self, name=None, attrs={}, limit=None, **kwargs):\n        \"\"\"Returns the parents of this Tag that match the given\n        criteria.\"\"\"\n\n        return self._findAll(name, attrs, None, limit, self.parentGenerator,\n                             **kwargs)"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\niterating over a generator looking for things that match.", "response": "def _findAll(self, name, attrs, text, limit, generator, **kwargs):\n        \"Iterates over a generator looking for things that match.\"\n\n        if isinstance(name, SoupStrainer):\n            strainer = name\n        else:\n            # Build a SoupStrainer\n            strainer = SoupStrainer(name, attrs, text, **kwargs)\n        results = ResultSet(strainer)\n        g = generator()\n        while True:\n            try:\n                i = g.next()\n            except StopIteration:\n                break\n            if i:\n                found = strainer.search(i)\n                if found:\n                    results.append(found)\n                    if limit and len(results) >= limit:\n                        break\n        return results"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef toEncoding(self, s, encoding=None):\n        if isinstance(s, unicode):\n            if encoding:\n                s = s.encode(encoding)\n        elif isinstance(s, str):\n            if encoding:\n                s = s.encode(encoding)\n            else:\n                s = unicode(s)\n        else:\n            if encoding:\n                s  = self.toEncoding(str(s), encoding)\n            else:\n                s = unicode(s)\n        return s", "response": "Encodes an object to a string in some encoding or to Unicode."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef _invert(h):\n        \"Cheap function to invert a hash.\"\n        i = {}\n        for k,v in h.items():\n            i[v] = k\n        return i", "response": "Cheap function to invert a hash."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\nbe used in a call to re.sub to replace HTML, XML, and numeric entities with the appropriate Unicode characters.", "response": "def _convertEntities(self, match):\n        \"\"\"Used in a call to re.sub to replace HTML, XML, and numeric\n        entities with the appropriate Unicode characters. If HTML\n        entities are being converted, any unrecognized entities are\n        escaped.\"\"\"\n        x = match.group(1)\n        if self.convertHTMLEntities and x in name2codepoint:\n            return unichr(name2codepoint[x])\n        elif x in self.XML_ENTITIES_TO_SPECIAL_CHARS:\n            if self.convertXMLEntities:\n                return self.XML_ENTITIES_TO_SPECIAL_CHARS[x]\n            else:\n                return u'&%s;' % x\n        elif len(x) > 0 and x[0] == '#':\n            # Handle numeric entities\n            if len(x) > 1 and x[1] == 'x':\n                return unichr(int(x[2:], 16))\n            else:\n                return unichr(int(x[1:]))\n\n        elif self.escapeUnrecognizedEntities:\n            return u'&amp;%s;' % x\n        else:\n            return u'&%s;' % x"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef decompose(self):\n        contents = [i for i in self.contents]\n        for i in contents:\n            if isinstance(i, Tag):\n                i.decompose()\n            else:\n                i.extract()\n        self.extract()", "response": "Recursively destroys the contents of this tree."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\nrender the contents of this tag as a Unicode string in the given encoding.", "response": "def renderContents(self, encoding=DEFAULT_OUTPUT_ENCODING,\n                       prettyPrint=False, indentLevel=0):\n        \"\"\"Renders the contents of this tag as a string in the given\n        encoding. If encoding is None, returns a Unicode string..\"\"\"\n        s=[]\n        for c in self:\n            text = None\n            if isinstance(c, NavigableString):\n                text = c.__str__(encoding)\n            elif isinstance(c, Tag):\n                s.append(c.__str__(encoding, prettyPrint, indentLevel))\n            if text and prettyPrint:\n                text = text.strip()\n            if text:\n                if prettyPrint:\n                    s.append(\" \" * (indentLevel-1))\n                s.append(text)\n                if prettyPrint:\n                    s.append(\"\\n\")\n        return ''.join(s)"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\nreturning only the first child of this Tag matching the given criteria.", "response": "def find(self, name=None, attrs={}, recursive=True, text=None,\n             **kwargs):\n        \"\"\"Return only the first child of this Tag matching the given\n        criteria.\"\"\"\n        r = None\n        l = self.findAll(name, attrs, recursive, text, 1, **kwargs)\n        if l:\n            r = l[0]\n        return r"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef findAll(self, name=None, attrs={}, recursive=True, text=None,\n                limit=None, **kwargs):\n        \"\"\"Extracts a list of Tag objects that match the given\n        criteria.  You can specify the name of the Tag and any\n        attributes you want the Tag to have.\n\n        The value of a key-value pair in the 'attrs' map can be a\n        string, a list of strings, a regular expression object, or a\n        callable that takes a string and returns whether or not the\n        string matches for some custom definition of 'matches'. The\n        same is true of the tag name.\"\"\"\n        generator = self.recursiveChildGenerator\n        if not recursive:\n            generator = self.childGenerator\n        return self._findAll(name, attrs, text, limit, generator, **kwargs)", "response": "Returns a list of Tag objects that match the given criteria."}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\ninitializes a map representation of this tag's attributes, if not already initialized.", "response": "def _getAttrMap(self):\n        \"\"\"Initializes a map representation of this tag's attributes,\n        if not already initialized.\"\"\"\n        if not getattr(self, 'attrMap'):\n            self.attrMap = {}\n            for (key, value) in self.attrs:\n                self.attrMap[key] = value\n        return self.attrMap"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef convert_charref(self, name):\n        try:\n            n = int(name)\n        except ValueError:\n            return\n        if not 0 <= n <= 127 : # ASCII ends at 127, not 255\n            return\n        return self.convert_codepoint(n)", "response": "This method converts a character reference to a string."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\nreturn true iff the given string is the name of a self - closing tag according to this parser.", "response": "def isSelfClosingTag(self, name):\n        \"\"\"Returns true iff the given string is the name of a\n        self-closing tag according to this parser.\"\"\"\n        return self.SELF_CLOSING_TAGS.has_key(name) \\\n               or self.instanceSelfClosingTags.has_key(name)"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef _popToTag(self, name, inclusivePop=True):\n        #print \"Popping to %s\" % name\n        if name == self.ROOT_TAG_NAME:\n            return\n\n        numPops = 0\n        mostRecentTag = None\n        for i in range(len(self.tagStack)-1, 0, -1):\n            if name == self.tagStack[i].name:\n                numPops = len(self.tagStack)-i\n                break\n        if not inclusivePop:\n            numPops = numPops - 1\n\n        for i in range(0, numPops):\n            mostRecentTag = self.popTag()\n        return mostRecentTag", "response": "Pops the tag stack up to and including the most recent instance of the given tag."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef _toStringSubclass(self, text, subclass):\n        self.endData()\n        self.handle_data(text)\n        self.endData(subclass)", "response": "Adds a certain piece of text to the tree as a NavigableString\n        subclass."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nhandling a processing instruction as a ProcessingInstruction object, possibly one with a %SOUP-ENCODING% slot into which an encoding will be plugged later.", "response": "def handle_pi(self, text):\n        \"\"\"Handle a processing instruction as a ProcessingInstruction\n        object, possibly one with a %SOUP-ENCODING% slot into which an\n        encoding will be plugged later.\"\"\"\n        if text[:3] == \"xml\":\n            text = u\"xml version='1.0' encoding='%SOUP-ENCODING%'\"\n        self._toStringSubclass(text, ProcessingInstruction)"}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nhandle character references as data.", "response": "def handle_charref(self, ref):\n        \"Handle character references as data.\"\n        if self.convertEntities:\n            data = unichr(int(ref))\n        else:\n            data = '&#%s;' % ref\n        self.handle_data(data)"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef handle_entityref(self, ref):\n        data = None\n        if self.convertHTMLEntities:\n            try:\n                data = unichr(name2codepoint[ref])\n            except KeyError:\n                pass\n\n        if not data and self.convertXMLEntities:\n                data = self.XML_ENTITIES_TO_SPECIAL_CHARS.get(ref)\n\n        if not data and self.convertHTMLEntities and \\\n            not self.XML_ENTITIES_TO_SPECIAL_CHARS.get(ref):\n                # TODO: We've got a problem here. We're told this is\n                # an entity reference, but it's not an XML entity\n                # reference or an HTML entity reference. Nonetheless,\n                # the logical thing to do is to pass it through as an\n                # unrecognized entity reference.\n                #\n                # Except: when the input is \"&carol;\" this function\n                # will be called with input \"carol\". When the input is\n                # \"AT&T\", this function will be called with input\n                # \"T\". We have no way of knowing whether a semicolon\n                # was present originally, so we don't know whether\n                # this is an unknown entity or just a misplaced\n                # ampersand.\n                #\n                # The more common case is a misplaced ampersand, so I\n                # escape the ampersand and omit the trailing semicolon.\n                data = \"&amp;%s\" % ref\n        if not data:\n            # This case is different from the one above, because we\n            # haven't already gone through a supposedly comprehensive\n            # mapping of entities to Unicode characters. We might not\n            # have gone through any mapping at all. 
So the chances are\n            # very high that this is a real entity, and not a\n            # misplaced ampersand.\n            data = \"&%s;\" % ref\n        self.handle_data(data)", "response": "Handle entity references as data, possibly converting known HTML and XML entity references to the corresponding Unicode characters."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef parse_declaration(self, i):\n        j = None\n        if self.rawdata[i:i+9] == '<![CDATA[':\n             k = self.rawdata.find(']]>', i)\n             if k == -1:\n                 k = len(self.rawdata)\n             data = self.rawdata[i+9:k]\n             j = k+3\n             self._toStringSubclass(data, CData)\n        else:\n            try:\n                j = SGMLParser.parse_declaration(self, i)\n            except SGMLParseError:\n                toHandle = self.rawdata[i:]\n                self.handle_data(toHandle)\n                j = i + len(toHandle)\n        return j", "response": "Treat a bogus SGML declaration as raw data. Treat a CDATA\n        declaration as a CData object."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef _subMSChar(self, orig):\n        sub = self.MS_CHARS.get(orig)\n        if type(sub) == types.TupleType:\n            if self.smartQuotesTo == 'xml':\n                sub = '&#x%s;' % sub[1]\n            else:\n                sub = '&%s;' % sub[0]\n        return sub", "response": "Changes an MS smart quote character to an XML or HTML entity."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function that,\ngiven a string and its encoding, decodes the string into Unicode.", "response": "def _toUnicode(self, data, encoding):\n        '''Given a string and its encoding, decodes the string into Unicode.\n        %encoding is a string recognized by encodings.aliases'''\n\n        # strip Byte Order Mark (if present)\n        if (len(data) >= 4) and (data[:2] == '\\xfe\\xff') \\\n               and (data[2:4] != '\\x00\\x00'):\n            encoding = 'utf-16be'\n            data = data[2:]\n        elif (len(data) >= 4) and (data[:2] == '\\xff\\xfe') \\\n                 and (data[2:4] != '\\x00\\x00'):\n            encoding = 'utf-16le'\n            data = data[2:]\n        elif data[:3] == '\\xef\\xbb\\xbf':\n            encoding = 'utf-8'\n            data = data[3:]\n        elif data[:4] == '\\x00\\x00\\xfe\\xff':\n            encoding = 'utf-32be'\n            data = data[4:]\n        elif data[:4] == '\\xff\\xfe\\x00\\x00':\n            encoding = 'utf-32le'\n            data = data[4:]\n        newdata = unicode(data, encoding)\n        return newdata"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef _detectEncoding(self, xml_data, isHTML=False):\n        xml_encoding = sniffed_xml_encoding = None\n        try:\n            if xml_data[:4] == '\\x4c\\x6f\\xa7\\x94':\n                # EBCDIC\n                xml_data = self._ebcdic_to_ascii(xml_data)\n            elif xml_data[:4] == '\\x00\\x3c\\x00\\x3f':\n                # UTF-16BE\n                sniffed_xml_encoding = 'utf-16be'\n                xml_data = unicode(xml_data, 'utf-16be').encode('utf-8')\n            elif (len(xml_data) >= 4) and (xml_data[:2] == '\\xfe\\xff') \\\n                     and (xml_data[2:4] != '\\x00\\x00'):\n                # UTF-16BE with BOM\n                sniffed_xml_encoding = 'utf-16be'\n                xml_data = unicode(xml_data[2:], 'utf-16be').encode('utf-8')\n            elif xml_data[:4] == '\\x3c\\x00\\x3f\\x00':\n                # UTF-16LE\n                sniffed_xml_encoding = 'utf-16le'\n                xml_data = unicode(xml_data, 'utf-16le').encode('utf-8')\n            elif (len(xml_data) >= 4) and (xml_data[:2] == '\\xff\\xfe') and \\\n                     (xml_data[2:4] != '\\x00\\x00'):\n                # UTF-16LE with BOM\n                sniffed_xml_encoding = 'utf-16le'\n                xml_data = unicode(xml_data[2:], 'utf-16le').encode('utf-8')\n            elif xml_data[:4] == '\\x00\\x00\\x00\\x3c':\n                # UTF-32BE\n                sniffed_xml_encoding = 'utf-32be'\n                xml_data = unicode(xml_data, 'utf-32be').encode('utf-8')\n            elif xml_data[:4] == '\\x3c\\x00\\x00\\x00':\n                # UTF-32LE\n                sniffed_xml_encoding = 'utf-32le'\n                xml_data = unicode(xml_data, 'utf-32le').encode('utf-8')\n            elif xml_data[:4] == '\\x00\\x00\\xfe\\xff':\n                # UTF-32BE with BOM\n                sniffed_xml_encoding = 'utf-32be'\n                xml_data = 
unicode(xml_data[4:], 'utf-32be').encode('utf-8')\n            elif xml_data[:4] == '\\xff\\xfe\\x00\\x00':\n                # UTF-32LE with BOM\n                sniffed_xml_encoding = 'utf-32le'\n                xml_data = unicode(xml_data[4:], 'utf-32le').encode('utf-8')\n            elif xml_data[:3] == '\\xef\\xbb\\xbf':\n                # UTF-8 with BOM\n                sniffed_xml_encoding = 'utf-8'\n                xml_data = unicode(xml_data[3:], 'utf-8').encode('utf-8')\n            else:\n                sniffed_xml_encoding = 'ascii'\n                pass\n        except:\n            xml_encoding_match = None\n        xml_encoding_match = re.compile(\n            '^<\\?.*encoding=[\\'\"](.*?)[\\'\"].*\\?>').match(xml_data)\n        if not xml_encoding_match and isHTML:\n            regexp = re.compile('<\\s*meta[^>]+charset=([^>]*?)[;\\'\">]', re.I)\n            xml_encoding_match = regexp.search(xml_data)\n        if xml_encoding_match is not None:\n            xml_encoding = xml_encoding_match.groups()[0].lower()\n            if isHTML:\n                self.declaredHTMLEncoding = xml_encoding\n            if sniffed_xml_encoding and \\\n               (xml_encoding in ('iso-10646-ucs-2', 'ucs-2', 'csunicode',\n                                 'iso-10646-ucs-4', 'ucs-4', 'csucs4',\n                                 'utf-16', 'utf-32', 'utf_16', 'utf_32',\n                                 'utf16', 'u16')):\n                xml_encoding = sniffed_xml_encoding\n        return xml_data, xml_encoding, sniffed_xml_encoding", "response": "Detects the XML encoding of a document."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef shoebot_example(**shoebot_kwargs):\n\n    def decorator(f):\n        def run():\n            from shoebot import ShoebotInstallError  # https://github.com/shoebot/shoebot/issues/206\n            print(\"    Shoebot - %s:\" % f.__name__.replace(\"_\", \" \"))\n            try:\n                import shoebot\n                outputfile = \"/tmp/shoebot-%s.png\" % f.__name__\n                bot = shoebot.create_bot(outputfile=outputfile)\n                f(bot)\n                bot.finish()\n                print('        [passed] : %s' % outputfile)\n                print('')\n            except ShoebotInstallError as e:\n                print('        [failed]', e.args[0])\n                print('')\n            except Exception:\n                print('        [failed] - traceback:')\n                for line in traceback.format_exc().splitlines():\n                    print('    %s' % line)\n                print('')\n\n        return run\n\n    return decorator", "response": "Decorator to run some code in a single bot instance."}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nreturn the center point of the path disregarding transforms.", "response": "def _get_center(self):\n        '''Returns the center point of the path, disregarding transforms.\n        '''\n        x = (self.x + self.width / 2)\n        y = (self.y + self.height / 2)\n        return (x, y)"}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nscale the context based on the difference between bot size and widget size.", "response": "def scale_context_and_center(self, cr):\n        \"\"\"\n        Scale context based on difference between bot size and widget\n        \"\"\"\n        bot_width, bot_height = self.bot_size\n        if self.width != bot_width or self.height != bot_height:\n            # Scale up by largest dimension\n            if self.width < self.height:\n                scale_x = float(self.width) / float(bot_width)\n                scale_y = scale_x\n                cr.translate(0, (self.height - (bot_height * scale_y)) / 2.0)\n            elif self.width > self.height:\n                scale_y = float(self.height) / float(bot_height)\n                scale_x = scale_y\n                cr.translate((self.width - (bot_width * scale_x)) / 2.0, 0)\n            else:\n                scale_x = 1.0\n                scale_y = 1.0\n            cr.scale(scale_x, scale_y)\n            self.input_device.scale_x = scale_y\n            self.input_device.scale_y = scale_y"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef draw(self, widget, cr):\n        '''\n        Draw just the exposed part of the backing store, scaled to fit\n        '''\n        if self.bot_size is None:\n            # No bot to draw yet.\n            self.draw_default_image(cr)\n            return\n\n        cr = driver.ensure_pycairo_context(cr)\n        \n        surface = self.backing_store.surface\n        cr.set_source_surface(surface)\n        cr.paint()", "response": "Draw just the exposed part of the backing store, scaled to fit."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef create_rcontext(self, size, frame):\n        '''\n        Creates a recording surface for the bot to draw on\n\n        :param size: The width and height of bot\n        '''\n        self.frame = frame\n        width, height = size\n        meta_surface = cairo.RecordingSurface(cairo.CONTENT_COLOR_ALPHA, (0, 0, width, height))\n\n        ctx = cairo.Context(meta_surface)\n        return ctx", "response": "Creates a recording surface for the bot to draw on."}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\nupdates the backing store from a cairo context and schedules a redraw (expose event) :param size: width, height in pixels of bot :param frame: frame # that was drawn :param cairo_ctx: cairo context the bot was drawn on", "response": "def do_drawing(self, size, frame, cairo_ctx):\n        '''\n        Update the backing store from a cairo context and\n        schedule a redraw (expose event)\n\n        :param size: width, height in pixels of bot\n        :param frame: frame # that was drawn\n        :param cairo_ctx: cairo context the bot was drawn on\n        '''\n        if self.get_window() and not self.bot_size:\n            # Get initial size for window\n            self.set_size_request(*size)\n\n        self.bot_size = size\n        self.backing_store = BackingStore.get_backingstore(self.width, self.height)\n\n        cr = pycairo.Context(self.backing_store.surface)\n        if self.scale_fit:\n            self.scale_context_and_center(cr)\n\n        cairo_ctx = driver.ensure_pycairo_context(cairo_ctx)\n        cr.set_source_surface(cairo_ctx.get_target())\n        # Create the cairo context\n        cr.set_operator(cairo.OPERATOR_SOURCE)\n        cr.paint()\n\n        self.queue_draw()\n\n        while Gtk.events_pending():\n            Gtk.main_iteration_do(False)"}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nreturning a JSON string representation of a Python data structure.", "response": "def encode(self, o):\n        \"\"\"\n        Return a JSON string representation of a Python data structure.\n\n        >>> JSONEncoder().encode({\"foo\": [\"bar\", \"baz\"]})\n        '{\"foo\": [\"bar\", \"baz\"]}'\n        \"\"\"\n        # This is for extremely simple cases and benchmarks.\n        if isinstance(o, basestring):\n            if isinstance(o, str):\n                _encoding = self.encoding\n                if (_encoding is not None \n                        and not (_encoding == 'utf-8')):\n                    o = o.decode(_encoding)\n            if self.ensure_ascii:\n                return encode_basestring_ascii(o)\n            else:\n                return encode_basestring(o)\n        # This doesn't pass the iterator directly to ''.join() because the\n        # exceptions aren't as detailed.  The list call should be roughly\n        # equivalent to the PySequence_Fast that ''.join() would do.\n        chunks = list(self.iterencode(o))\n        return ''.join(chunks)"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\nencode the given object and yield each string representation as available.", "response": "def iterencode(self, o):\n        \"\"\"\n        Encode the given object and yield each string\n        representation as available.\n        \n        For example::\n            \n            for chunk in JSONEncoder().iterencode(bigobject):\n                mysocket.write(chunk)\n        \"\"\"\n        if self.check_circular:\n            markers = {}\n        else:\n            markers = None\n        return self._iterencode(o, markers)"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef get_key_map(self):\n        '''\n        Return a dict in the form of\n\n        SHOEBOT_KEY_NAME, GTK_VALUE\n\n        Shoebot key names look like KEY_LEFT, whereas Gdk uses KEY_Left\n        - Shoebot key names are derived from Nodebox 1, which was a mac\n          app.\n        '''\n        kdict = {}\n        for gdk_name in dir(Gdk):\n            nb_name = gdk_name.upper()\n            kdict[nb_name] = getattr(Gdk, gdk_name)\n        return kdict", "response": "Return a dict of the key names and values of the Shoebot application."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nreturning the output target: the buffer if one was specified, otherwise the filename, numbered per frame when multifile is set.", "response": "def _output_file(self, frame):\n        \"\"\"\n        If filename was used output a filename, along with multifile\n        numbered filenames will be used.\n\n        If buff was specified it is returned.\n\n        :return: Output buff or filename.\n        \"\"\"\n        if self.buff:\n            return self.buff\n        elif self.multifile:\n            return self.file_root + \"_%03d\" % frame + self.file_ext\n        else:\n            return self.filename"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef create_rcontext(self, size, frame):\n        if self.format == 'pdf':\n            surface = cairo.PDFSurface(self._output_file(frame), *size)\n        elif self.format in ('ps', 'eps'):\n            surface = cairo.PSSurface(self._output_file(frame), *size)\n        elif self.format == 'svg':\n            surface = cairo.SVGSurface(self._output_file(frame), *size)\n        elif self.format == 'surface':\n            surface = self.target\n        else:\n            surface = cairo.ImageSurface(cairo.FORMAT_ARGB32, *size)\n        return cairo.Context(surface)", "response": "Create a new context object for the current image."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef rendering_finished(self, size, frame, cairo_ctx):\n        surface = cairo_ctx.get_target()\n        if self.format == 'png':\n            surface.write_to_png(self._output_file(frame))\n        surface.finish()\n        surface.flush()", "response": "Called when the CairoCanvas has rendered a bot"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef output_closure(self, target, file_number=None):\n        '''\n        Function to output to a cairo surface\n\n        target is a cairo Context or filename\n        if file_number is set, then files will be numbered\n        (this is usually set to the current frame number)\n        '''\n        def output_context(ctx):\n            target_ctx = target\n            target_ctx.set_source_surface(ctx.get_target())\n            target_ctx.paint()\n            return target_ctx\n\n        def output_surface(ctx):\n            target_ctx = cairo.Context(target)\n            target_ctx.set_source_surface(ctx.get_target())\n            target_ctx.paint()\n            return target_ctx\n\n        def output_file(ctx):\n            root, extension = os.path.splitext(target)\n            if file_number:\n                filename = '%s_%04d%s' % (root, file_number, extension)\n            else:\n                filename = target\n\n            extension = extension.lower()\n            if extension == '.png':\n                surface = ctx.get_target()\n                surface.write_to_png(target)\n            elif extension == '.pdf':\n                target_ctx = cairo.Context(cairo.PDFSurface(filename, *self.size_or_default()))\n                target_ctx.set_source_surface(ctx.get_target())\n                target_ctx.paint()\n            elif extension in ('.ps', '.eps'):\n                target_ctx = cairo.Context(cairo.PSSurface(filename, *self.size_or_default()))\n                if extension == '.eps':\n                    target_ctx.set_eps(extension='.eps')\n                target_ctx.set_source_surface(ctx.get_target())\n                target_ctx.paint()\n            elif extension == '.svg':\n                target_ctx = cairo.Context(cairo.SVGSurface(filename, *self.size_or_default()))\n                
target_ctx.set_source_surface(ctx.get_target())\n                target_ctx.paint()\n            return filename\n\n        if isinstance(target, cairo.Context):\n            return output_context\n        elif isinstance(target, cairo.Surface):\n            return output_surface\n        else:\n            return output_file", "response": "Return a closure that outputs a rendered context to target, which may be a cairo Context, a cairo Surface or a filename. If file_number is set, output files are numbered with it."}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\nstrips links from the description and gathers them in a links property.", "response": "def _parse(self):\n        \n        \"\"\" Strips links from the definition and gathers them in a links property.\n        \"\"\"\n        \n        p1 = \"\\[.*?\\](.*?)\\[\\/.*?\\]\"\n        p2 = \"\\[(.*?)\\]\"\n        self.links = []\n        for p in (p1,p2):\n            for link in re.findall(p, self.description):\n                self.links.append(link)\n            self.description = re.sub(p, \"\\\\1\", self.description)\n            \n        self.description = self.description.strip()"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\ncreate a CairoCanvas and CairoImageSink for an attachment to a bot.", "response": "def create_canvas(src, format=None, outputfile=None, multifile=False, buff=None, window=False, title=None,\n                  fullscreen=None, show_vars=False):\n    \"\"\"\n    Create canvas and sink for attachment to a bot\n\n    canvas is what draws images, 'sink' is the final consumer of the images\n\n    :param src: Defaults for title or outputfile if not specified.\n\n    :param format: CairoImageSink image format, if using buff instead of outputfile\n    :param buff: CairoImageSink buffer object to send output to\n\n    :param outputfile: CairoImageSink output filename e.g. \"hello.svg\"\n    :param multifile: CairoImageSink if True,\n\n    :param title: ShoebotWindow - set window title\n    :param fullscreen: ShoebotWindow - set window title\n    :param show_vars: ShoebotWindow - display variable window\n\n    Two kinds of sink are provided: CairoImageSink and ShoebotWindow\n\n    ShoebotWindow\n\n    Displays a window to draw shoebot inside.\n\n\n    CairoImageSink\n\n    Output to a filename (or files if multifile is set), or a buffer object.\n    \"\"\"\n    from core import CairoCanvas, CairoImageSink # https://github.com/shoebot/shoebot/issues/206\n\n    if outputfile:\n        sink = CairoImageSink(outputfile, format, multifile, buff)\n    elif window or show_vars:\n        from gui import ShoebotWindow\n        if not title:\n            if src and os.path.isfile(src):\n                title = os.path.splitext(os.path.basename(src))[0] + ' - Shoebot'\n            else:\n                title = 'Untitled - Shoebot'\n        sink = ShoebotWindow(title, show_vars, fullscreen=fullscreen)\n    else:\n        if src and isinstance(src, cairo.Surface):\n            outputfile = src\n            format = 'surface'\n        elif src and os.path.isfile(src):\n            outputfile = 
os.path.splitext(os.path.basename(src))[0] + '.' + (format or 'svg')\n        else:\n            outputfile = 'output.svg'\n        sink = CairoImageSink(outputfile, format, multifile, buff)\n    canvas = CairoCanvas(sink)\n\n    return canvas"}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\ncreates a bot with the same canvas attached to it", "response": "def create_bot(src=None, grammar=NODEBOX, format=None, outputfile=None, iterations=1, buff=None, window=False,\n               title=None, fullscreen=None, server=False, port=7777, show_vars=False, vars=None, namespace=None):\n    \"\"\"\n    Create a canvas and a bot with the same canvas attached to it\n\n    bot parameters\n    :param grammar: DRAWBOT or NODEBOX - decides what kind of bot is created\n    :param vars: preset dictionary of vars from the called\n\n    canvas parameters:\n    ... everything else ...\n\n    See create_canvas for details on those parameters.\n\n    \"\"\"\n    canvas = create_canvas(src, format, outputfile, iterations > 1, buff, window, title, fullscreen=fullscreen,\n                           show_vars=show_vars)\n\n    if grammar == DRAWBOT:\n        from shoebot.grammar import DrawBot\n        bot = DrawBot(canvas, namespace=namespace, vars=vars)\n    else:\n        from shoebot.grammar import NodeBot\n        bot = NodeBot(canvas, namespace=namespace, vars=vars)\n\n    if server:\n        from shoebot.sbio import SocketServer\n        socket_server = SocketServer(bot, \"\", port=port)\n    return bot"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef run(src,\n        grammar=NODEBOX,\n        format=None,\n        outputfile=None,\n        iterations=1,\n        buff=None,\n        window=True,\n        title=None,\n        fullscreen=None,\n        close_window=False,\n        server=False,\n        port=7777,\n        show_vars=False,\n        vars=None,\n        namespace=None,\n        run_shell=False,\n        args=[],\n        verbose=False,\n        background_thread=True):\n    \"\"\"\n    Create and run a bot, the arguments all correspond to sanitized\n    commandline options.\n\n    :param background_thread: If True then use a background thread.\n\n\n    Other args are split into create_args and run_args\n\n    See create_bot for details on create_args\n\n    run_args are passed to bot.run - see Nodebot.run or Drawbot.run\n\n\n\n    Background thread:\n\n    readline in python is blocking, running the app in a background\n    thread opens up the main thread for IO on stdin/stdout, which\n    can be used for communication with shoebot when livecoding is\n    enabled.\n\n    See shoebot.io for implementation of the shell, and the gedit\n    plugin for an example of using livecoding.\n    \"\"\"\n    # Munge shoebogt sys.argv\n    sys.argv = [sys.argv[\n                    0]] + args  # Remove shoebot parameters so sbot can be used in place of the python interpreter (e.g. 
for sphinx).\n\n    # arguments for create_bot\n    create_args = [src,\n                   grammar,\n                   format,\n                   outputfile,\n                   iterations,\n                   buff,\n                   window,\n                   title,\n                   fullscreen,\n                   server,\n                   port,\n                   show_vars]\n    create_kwargs = dict(vars=vars, namespace=namespace)\n    run_args = [src]\n    run_kwargs = dict(\n        iterations=iterations,\n        frame_limiter=window,\n        verbose=verbose,\n        # run forever except 1. windowed mode is off 2. if --close-window was specified and\n        # 3. if an output file was indicated\n        run_forever=window and not (close_window or bool(outputfile)),\n    )\n\n    # Run shoebot in a background thread so we can run a cmdline shell in the current thread\n    if background_thread:\n        sbot_thread = ShoebotThread(\n            create_args=create_args,\n            create_kwargs=create_kwargs,\n            run_args=run_args,\n            run_kwargs=run_kwargs,\n            send_sigint=run_shell\n        )\n        sbot_thread.start()\n        sbot = sbot_thread.sbot\n    else:\n        print('background thread disabled')\n        # This is a debug option, things should always work using the\n        # background thread (crosses fingers)\n        if run_shell:\n            # python readline is blocking, so ui must run in a seperate\n            # thread\n            raise ValueError('UI Must run in a separate thread to shell and shell needs main thread')\n\n        sbot_thread = None\n        sbot = create_bot(*create_args, **create_kwargs)\n        sbot.run(*run_args, **run_kwargs)\n\n    if run_shell:\n        import shoebot.sbio.shell\n        shell = shoebot.sbio.shell.ShoebotCmd(sbot, trusted=True)\n        try:\n            shell.cmdloop()\n        except KeyboardInterrupt as e:\n            publish_event(QUIT_EVENT)  # Handle 
Ctrl-C\n            # KeyboardInterrupt is generated by os.kill from the other thread\n            if verbose:\n                raise\n            else:\n                return\n    elif background_thread:\n        try:\n            while sbot_thread.is_alive():\n                sleep(1)\n        except KeyboardInterrupt:\n            publish_event(QUIT_EVENT)\n\n    if all((background_thread, sbot_thread)):\n        sbot_thread.join()\n\n    return sbot", "response": "Create and run a bot from the given source, optionally in a background thread and with an interactive shell, and return the bot."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef save_as(self):\n        chooser = ShoebotFileChooserDialog(_('Save File'), None, Gtk.FileChooserAction.SAVE,\n                                           (Gtk.STOCK_SAVE, Gtk.ResponseType.ACCEPT,\n                                            Gtk.STOCK_CANCEL, Gtk.ResponseType.CANCEL))\n        chooser.set_do_overwrite_confirmation(True)\n        chooser.set_transient_for(self)\n        saved = chooser.run() == Gtk.ResponseType.ACCEPT\n        if saved:\n            old_filename = self.filename\n            self.source_buffer.filename = chooser.get_filename()\n            if not self.save():\n                self.filename = old_filename\n        chooser.destroy()\n        return saved", "response": "Show a save dialog and save the buffer under the chosen filename; return True if the dialog was accepted."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef add_variables(self):\n        for k, v in self.bot._vars.items():\n            if not hasattr(v, 'type'):\n                raise AttributeError(\n                    '%s is not a Shoebot Variable - see https://shoebot.readthedocs.io/en/latest/commands.html#dynamic-variables' % k)\n            self.add_variable(v)", "response": "Add every variable in the bot via add_variable, raising AttributeError for any value that is not a Shoebot Variable."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\nupdate the value of a variable in the current object", "response": "def update_var(self, name, value):\n        \"\"\"\n        :return: success, err_msg_if_failed\n        \"\"\"\n        widget = self.widgets.get(name)\n        if widget is None:\n            return False, 'No widget found matching, {}'.format(name)\n\n        try:\n            if isinstance(widget, Gtk.CheckButton):\n                widget.set_active(value)\n                return True, widget.get_active()\n            elif isinstance(widget, Gtk.Entry):\n                widget.set_text(value)\n                return True, widget.get_text()\n            else:\n                widget.set_value(value)\n                return True, widget.get_value()\n        except Exception as e:\n            return False, str(e)"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\nbe called when a slider is changed.", "response": "def widget_changed(self, widget, v):\n        ''' Called when a slider is adjusted. '''\n        # set the appropriate bot var\n        if v.type is NUMBER:\n            self.bot._namespace[v.name] = widget.get_value()\n            self.bot._vars[v.name].value = widget.get_value()  ## Not sure if this is how to do this - stu\n            publish_event(VARIABLE_UPDATED_EVENT, v)  # pretty dumb for now\n        elif v.type is BOOLEAN:\n            self.bot._namespace[v.name] = widget.get_active()\n            self.bot._vars[v.name].value = widget.get_active()  ## Not sure if this is how to do this - stu\n            publish_event(VARIABLE_UPDATED_EVENT, v)  # pretty dumb for now\n        elif v.type is TEXT:\n            self.bot._namespace[v.name] = widget.get_text()\n            self.bot._vars[v.name].value = widget.get_text()  ## Not sure if this is how to do this - stu\n            publish_event(VARIABLE_UPDATED_EVENT, v)"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef var_added(self, v):\n        self.add_variable(v)\n\n        self.window.set_size_request(400, 35 * len(self.widgets.keys()))\n        self.window.show_all()", "response": "Called when a new variable was added in the bot while it ran"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef var_deleted(self, v):\n        widget = self.widgets[v.name]\n\n        # widgets are all in a single container ..\n        parent = widget.get_parent()\n        self.container.remove(parent)\n        del self.widgets[v.name]\n\n        self.window.set_size_request(400, 35 * len(self.widgets.keys()))\n        self.window.show_all()", "response": "Remove the widget for a variable that was deleted in the bot and resize the window."}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nparse a SVG string into a list of paths.", "response": "def parse(svg, cached=False, _copy=True):\n    \n    \"\"\" Returns cached copies unless otherwise specified.\n    \"\"\"\n    \n    if not cached:\n        dom = parser.parseString(svg)\n        paths = parse_node(dom, [])\n    else:\n        id = _cache.id(svg)\n        if not _cache.has_key(id):\n            dom = parser.parseString(svg)\n            _cache.save(id, parse_node(dom, []))\n        paths = _cache.load(id, _copy)\n   \n    return paths"}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\nreturns the attribute of the XML element.", "response": "def get_attribute(element, attribute, default=0):\n    \n    \"\"\" Returns XML element's attribute, or default if none.\n    \"\"\" \n    \n    a = element.getAttribute(attribute)\n    if a == \"\": \n        return default\n    return a"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef parse_transform(e, path):\n    \n    \"\"\" Transform the path according to a defined matrix.\n    \n    Attempts to extract a transform=\"matrix()|translate()\" attribute.\n    Transforms the path accordingly.\n    \n    \"\"\"\n    \n    t = get_attribute(e, \"transform\", default=\"\")\n    \n    for mode in (\"matrix\", \"translate\"):\n        if t.startswith(mode):\n            v = t.replace(mode, \"\").lstrip(\"(\").rstrip(\")\")\n            v = v.replace(\", \", \",\").replace(\" \", \",\")\n            v = [float(x) for x in v.split(\",\")]\n            from nodebox.graphics import Transform\n            t = Transform()            \n            if mode == \"matrix\":\n                t._set_matrix(v)\n            elif mode == \"translate\":\n                t.translate(*v)\n            path = t.transformBezierPath(path)\n            break\n\n    # Transformations can also be defined as <g transform=\"matrix()\"><path /><g>\n    # instead of <g><path transform=\"matrix() /></g>.\n    e = e.parentNode\n    if e and e.tagName == \"g\":\n        path = parse_transform(e, path)\n        \n    return path", "response": "Parses the transform attribute of the element e and returns the path."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef add_color_info(e, path):\n    \n    \"\"\" Expand the path with color information.\n    \n    Attempts to extract fill and stroke colors\n    from the element and adds it to path attributes.\n    \n    \"\"\"\n    \n    _ctx.colormode(RGB, 1.0)\n    \n    def _color(hex, alpha=1.0):\n        if hex == \"none\": return None\n        n = int(hex[1:],16)\n        r = (n>>16)&0xff\n        g = (n>>8)&0xff\n        b = n&0xff\n        return _ctx.color(r/255.0, g/255.0, b/255.0, alpha)\n\n    path.fill = (0,0,0,0)\n    path.stroke = (0,0,0,0)\n    path.strokewidth = 0\n\n    # See if we can find an opacity attribute,\n    # which is the color's alpha.\n    alpha = get_attribute(e, \"opacity\", default=\"\")\n    if alpha == \"\":\n        alpha = 1.0\n    else:\n        alpha = float(alpha)\n    \n    # Colors stored as fill=\"\" or stroke=\"\" attributes.\n    try: path.fill = _color(get_attribute(e, \"fill\", default=\"#00000\"), alpha)\n    except: \n        pass\n    try: path.stroke = _color(get_attribute(e, \"stroke\", default=\"none\"), alpha)\n    except: \n        pass\n    try: path.strokewidth = float(get_attribute(e, \"stroke-width\", default=\"1\"))\n    except: \n        pass\n    \n    # Colors stored as a CSS style attribute, for example:\n    # style=\"fill:#ff6600;stroke:#ffe600;stroke-width:0.06742057\"\n    style = get_attribute(e, \"style\", default=\"\").split(\";\")\n    for s in style:\n        try:\n            if s.startswith(\"fill:\"):\n                path.fill = _color(s.replace(\"fill:\", \"\"))\n            elif s.startswith(\"stroke:\"):\n                path.stroke = _color(s.replace(\"stroke:\", \"\"))\n            elif s.startswith(\"stroke-width:\"):\n                path.strokewidth = float(s.replace(\"stroke-width:\", \"\"))\n        except:\n            pass    \n\n    # A path with beginning and ending coordinate\n    # at the same 
location is considered closed.\n    # Unless it contains a MOVETO somewhere in the middle.\n    path.closed = False\n    if path[0].x == path[len(path)-1].x and \\\n       path[0].y == path[len(path)-1].y: \n        path.closed = True\n    for i in range(1,-1):\n        if path[i].cmd == MOVETO:\n            path.closed = False\n        \n    return path", "response": "Adds color information to the path attributes of the current node."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef download(self, size=SIZE_LARGE, thumbnail=False, wait=60, asynchronous=False):\n        \n        \"\"\" Downloads this image to cache.\n        \n        Calling the download() method instantiates an asynchronous URLAccumulator.\n        Once it is done downloading, this image will have its path property\n        set to an image file in the cache.\n        \n        \"\"\"\n        \n        if thumbnail == True: size = SIZE_THUMBNAIL # backwards compatibility\n        self._size = disambiguate_size(size)\n        if self._size == SIZE_THUMBNAIL:\n            url = self.url.replace(\"/preview/\", \"/med/\")\n        else:\n            url = self.url\n        \n        cache = \"morguefile\"\n        extension = os.path.splitext(url)[1]\n        URLAccumulator.__init__(self, url, wait, asynchronous, cache, extension, 2)\n        \n        if not asynchronous:\n            return self.path", "response": "Downloads this image to the cache."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nreturning a copy of the event handler remembering the last node clicked.", "response": "def copy(self, graph):\n    \n        \"\"\" Returns a copy of the event handler, remembering the last node clicked.\n        \"\"\"\n    \n        e = events(graph, self._ctx)\n        e.clicked = self.clicked\n        return e"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef update(self):\n    \n        \"\"\" Interacts with the graph by clicking or dragging nodes.\n        Hovering a node fires the callback function events.hover().\n        Clicking a node fires the callback function events.click().\n        \"\"\"\n    \n        if self.mousedown:\n        \n            # When not pressing or dragging, check each node.\n            if not self.pressed and not self.dragged:\n                for n in self.graph.nodes:\n                    if self.mouse in n:\n                        self.pressed = n\n                        break\n                    \n            # If a node is pressed, check if a drag is started.\n            elif self.pressed and not self.mouse in self.pressed:\n                self.dragged = self.pressed\n                self.pressed = None\n            \n            # Drag the node (right now only for springgraphs).\n            elif self.dragged and self.graph.layout.type == \"spring\":\n                self.drag(self.dragged)\n                self.graph.layout.i = min(100, max(2, self.graph.layout.n-100))\n    \n        # Mouse is clicked on a node, fire callback.\n        elif self.pressed and self.mouse in self.pressed:\n            self.clicked = self.pressed\n            self.pressed = None\n            self.graph.layout.i = 2\n            self.click(self.clicked)\n    \n        # Mouse up.\n        else:\n            self.hovered = None\n            self.pressed = None\n            self.dragged = None\n        \n            # Hovering over a node?\n            for n in self.graph.nodes:\n                if self.mouse in n:\n                    self.hovered = n\n                    self.hover(n)\n                    break", "response": "Handles mouse interaction with the graph: tracks pressed, dragged and hovered nodes and fires the click and hover callbacks."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef drag(self, node):\n\n        \"\"\" Drags given node to mouse location.\n        \"\"\"\n    \n        dx = self.mouse.x - self.graph.x\n        dy = self.mouse.y - self.graph.y\n\n        # A dashed line indicates the drag vector.\n        s = self.graph.styles.default\n        self._ctx.nofill()\n        self._ctx.nostroke()\n        if s.stroke: \n            self._ctx.strokewidth(s.strokewidth)\n            self._ctx.stroke(\n                s.stroke.r, \n                s.stroke.g, \n                s.stroke.g, \n                0.75\n            )\n        p = self._ctx.line(node.x, node.y, dx, dy, draw=False)\n        try: p._nsBezierPath.setLineDash_count_phase_([2,4], 2, 50)\n        except:\n            pass\n        self._ctx.drawpath(p)\n        r = node.__class__(None).r * 0.75\n        self._ctx.oval(dx-r/2, dy-r/2, r, r)\n    \n        node.vx = dx / self.graph.d\n        node.vy = dy / self.graph.d", "response": "Drags given node to mouse location."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef hover(self, node):\n        \n        \"\"\" Displays a popup when hovering over a node.\n        \"\"\"\n        \n        if self.popup == False: return\n        if self.popup == True or self.popup.node != node:\n            if self.popup_text.has_key(node.id):\n                texts = self.popup_text[node.id]\n            else:\n                texts = None\n            self.popup = popup(self._ctx, node, texts)\n        self.popup.draw()", "response": "Displays a popup when hovering over a node."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\nreturning a cached textpath of the given text in queue.", "response": "def textpath(self, i):\n        \n        \"\"\" Returns a cached textpath of the given text in queue.\n        \"\"\"\n        \n        if len(self._textpaths) == i:\n            self._ctx.font(self.font, self.fontsize)\n            txt = self.q[i]\n            if len(self.q) > 1:\n                # Indicate current text (e.g. 5/13).\n                txt += \" (\"+str(i+1)+\"/\" + str(len(self.q))+\")\"\n            p = self._ctx.textpath(txt, 0, 0, width=self._w)\n            h = self._ctx.textheight(txt, width=self._w)\n            self._textpaths.append((p, h))\n\n        return self._textpaths[i]"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef update(self):\n        \n        \"\"\" Rotates the queued texts and determines display time.\n        \"\"\"\n        \n        if self.delay > 0:\n            # It takes a while for the popup to appear.\n            self.delay -= 1; return\n            \n        if self.fi == 0:\n            # Only one text in queue, displayed infinitely.\n            if len(self.q) == 1: \n                self.fn = float(\"inf\")\n            # Else, display time depends on text length.\n            else:\n                self.fn = len(self.q[self.i]) / self.speed\n                self.fn = max(self.fn, self.mf)            \n            \n        self.fi += 1\n        if self.fi > self.fn:\n            # Rotate to the next text in queue.\n            self.fi = 0\n            self.i = (self.i+1) % len(self.q)", "response": "Rotates the queued texts and determines how long each one is displayed."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef draw(self):\n        \n        \"\"\" Draws a popup rectangle with a rotating text queue.        \n        \"\"\" \n        \n        if len(self.q) > 0:\n            self.update()\n            \n            if self.delay == 0:\n                \n                # Rounded rectangle in the given background color.\n                p, h = self.textpath(self.i)\n                f = self.fontsize\n                self._ctx.fill(self.background)\n                self._ctx.rect(\n                    self.node.x + f*1.0, \n                    self.node.y + f*0.5, \n                    self._w + f, \n                    h + f*1.5, \n                    roundness=0.2\n                )\n                \n                # Fade in/out the current text.\n                alpha = 1.0\n                if self.fi < 5: \n                    alpha = 0.2 * self.fi\n                if self.fn-self.fi < 5: \n                    alpha = 0.2 * (self.fn-self.fi)\n                self._ctx.fill(\n                    self.text.r,\n                    self.text.g,\n                    self.text.b,\n                    self.text.a * alpha\n                )\n                \n                self._ctx.translate(self.node.x + f*2.0, self.node.y + f*2.5)\n                self._ctx.drawpath(p)", "response": "Draws a popup rectangle with a rotating text queue."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef write_main(argv):\n    if len(argv) != 1:\n        print(\"Please provide the name of a file to write.\")\n        return 1\n\n    filename = argv[0]\n    resource_name = \"files/\" + filename\n    tweaks_name = amend_filename(filename, \"_tweaks\")\n\n    if not pkg_resources.resource_exists(\"edx_lint\", resource_name):\n        print(u\"Don't have file %r to write.\" % filename)\n        return 2\n\n    if os.path.exists(filename):\n        print(u\"Checking existing copy of %s\" % filename)\n        tef = TamperEvidentFile(filename)\n        if not tef.validate():\n            bak_name = amend_filename(filename, \"_backup\")\n            print(u\"Your copy of %s seems to have been edited, renaming it to %s\" % (filename, bak_name))\n            if os.path.exists(bak_name):\n                print(u\"A previous %s exists, deleting it\" % bak_name)\n                os.remove(bak_name)\n            os.rename(filename, bak_name)\n\n    print(u\"Reading edx_lint/files/%s\" % filename)\n    cfg = configparser.RawConfigParser()\n    resource_string = pkg_resources.resource_string(\"edx_lint\", resource_name).decode(\"utf8\")\n\n    # pkg_resources always reads binary data (in both python2 and python3).\n    # ConfigParser.read_string only exists in python3, so we have to wrap the string\n    # from pkg_resources in a cStringIO so that we can pass it into ConfigParser.readfp.\n    if six.PY2:\n        cfg.readfp(cStringIO(resource_string), resource_name)\n    else:\n        cfg.read_string(resource_string, resource_name)     # pylint: disable=no-member\n\n    if os.path.exists(tweaks_name):\n        print(u\"Applying local tweaks from %s\" % tweaks_name)\n        cfg_tweaks = configparser.RawConfigParser()\n        cfg_tweaks.read([tweaks_name])\n\n        merge_configs(cfg, cfg_tweaks)\n\n    print(u\"Writing %s\" % filename)\n    output_text = 
cStringIO()\n    output_text.write(WARNING_HEADER.format(filename=filename, tweaks_name=tweaks_name))\n    cfg.write(output_text)\n\n    out_tef = TamperEvidentFile(filename)\n    if six.PY2:\n        output_bytes = output_text.getvalue()\n    else:\n        output_bytes = output_text.getvalue().encode(\"utf8\")\n    out_tef.write(output_bytes)\n\n    return 0", "response": "Write a local copy of FILENAME using FILENAME_tweaks for local tweaks."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef amend_filename(filename, amend):\n    base, ext = os.path.splitext(filename)\n    amended_name = base + amend + ext\n    return amended_name", "response": "Amends a filename with a suffix."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef check_main(argv):\n    if len(argv) != 1:\n        print(\"Please provide the name of a file to check.\")\n        return 1\n\n    filename = argv[0]\n\n    if os.path.exists(filename):\n        print(u\"Checking existing copy of %s\" % filename)\n        tef = TamperEvidentFile(filename)\n        if tef.validate():\n            print(u\"Your copy of %s is good\" % filename)\n        else:\n            print(u\"Your copy of %s seems to have been edited\" % filename)\n    else:\n        print(u\"You don't have a copy of %s\" % filename)\n\n    return 0", "response": "Check that FILENAME has not been edited since writing."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef merge_configs(main, tweaks):\n    for section in tweaks.sections():\n        for option in tweaks.options(section):\n            value = tweaks.get(section, option)\n            if option.endswith(\"+\"):\n                option = option[:-1]\n                value = main.get(section, option) + value\n            main.set(section, option, value)", "response": "Merge tweaks into a main config file."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef validate(self):\n\n        with open(self.filename, \"rb\") as f:\n            text = f.read()\n\n        start_last_line = text.rfind(b\"\\n\", 0, -1)\n        if start_last_line == -1:\n            return False\n\n        original_text = text[:start_last_line+1]\n        last_line = text[start_last_line+1:]\n\n        expected_hash = hashlib.sha1(original_text).hexdigest().encode('utf8')\n        match = re.search(b\"[0-9a-f]{40}\", last_line)\n        if not match:\n            return False\n        actual_hash = match.group(0)\n        return actual_hash == expected_hash", "response": "Check if the file still has its original contents."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\nchecking that a checker's visitors are correctly named.", "response": "def check_visitors(cls):\n    \"\"\"Check that a checker's visitors are correctly named.\n\n    A checker has methods named visit_NODETYPE, but it's easy to mis-name\n    a visit method, and it will never be called.  This decorator checks\n    the class to see that all of its visitors are named after an existing\n    node class.\n    \"\"\"\n    for name in dir(cls):\n        if name.startswith(\"visit_\"):\n            if name[6:] not in CLASS_NAMES:\n                raise Exception(u\"Method {} doesn't correspond to a node class\".format(name))\n    return cls"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef usable_class_name(node):\n    name = node.qname()\n    for prefix in [\"__builtin__.\", \"builtins.\", \".\"]:\n        if name.startswith(prefix):\n            name = name[len(prefix):]\n    return name", "response": "Make a reasonable class name for a class node."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef parse_pylint_output(pylint_output):\n    for line in pylint_output:\n        if not line.strip():\n            continue\n\n        if line[0:5] in (\"-\"*5, \"*\"*5):\n            continue\n\n        parsed = PYLINT_PARSEABLE_REGEX.search(line)\n        if parsed is None:\n            LOG.warning(\n                u\"Unable to parse %r. If this is a lint failure, please re-run pylint with the \"\n                u\"--output-format=parseable option, otherwise, you can ignore this message.\",\n                line\n            )\n            continue\n\n        parsed_dict = parsed.groupdict()\n        parsed_dict['linenum'] = int(parsed_dict['linenum'])\n        yield PylintError(**parsed_dict)", "response": "Parse the pylint output into PylintError tuples."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef format_pylint_disables(error_names, tag=True):\n    tag_str = \"lint-amnesty, \" if tag else \"\"\n    if error_names:\n        return u\"  # {tag}pylint: disable={disabled}\".format(\n            disabled=\", \".join(sorted(error_names)),\n            tag=tag_str,\n        )\n    else:\n        return \"\"", "response": "Format a list of error_names into a pylint disable line."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef fix_pylint(line, errors):\n    if not errors:\n        yield line\n        return\n\n    current = PYLINT_EXCEPTION_REGEX.search(line)\n    if current:\n        original_errors = {disable.strip() for disable in current.group('disables').split(',')}\n    else:\n        original_errors = set()\n\n    disabled_errors = set(original_errors)\n\n    for error in errors:\n        if error.error_name == 'useless-suppression':\n            parsed = re.search(\"\"\"Useless suppression of '(?P<error_name>[^']+)'\"\"\", error.error_msg)\n            disabled_errors.discard(parsed.group('error_name'))\n        elif error.error_name == 'missing-docstring' and error.error_msg == 'Missing module docstring':\n            yield format_pylint_disables({error.error_name}).strip() + '\\n'\n        else:\n            disabled_errors.add(error.error_name)\n\n    disable_string = format_pylint_disables(disabled_errors, not disabled_errors <= original_errors)\n\n    if current:\n        yield PYLINT_EXCEPTION_REGEX.sub(disable_string, line)\n    else:\n        yield re.sub(r'($\\s*)', disable_string + r'\\1', line, count=1)", "response": "Yield any modified versions of line needed to address the errors in errors."}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nadd ``# pylint: disable`` clauses to add exceptions to all existing pylint errors in a codebase.", "response": "def pylint_amnesty(pylint_output):\n    \"\"\"\n    Add ``# pylint: disable`` clauses to add exceptions to all existing pylint errors in a codebase.\n    \"\"\"\n    errors = defaultdict(lambda: defaultdict(set))\n    for pylint_error in parse_pylint_output(pylint_output):\n        errors[pylint_error.filename][pylint_error.linenum].add(pylint_error)\n\n    for file_with_errors in sorted(errors):\n        try:\n            opened_file = open(file_with_errors)\n        except IOError:\n            LOG.warning(u\"Unable to open %s for edits\", file_with_errors, exc_info=True)\n        else:\n            with opened_file as input_file:\n                output_lines = []\n                for line_num, line in enumerate(input_file, start=1):\n                    output_lines.extend(\n                        fix_pylint(\n                            line,\n                            errors[file_with_errors][line_num]\n                        )\n                    )\n\n            with open(file_with_errors, 'w') as output_file:\n                output_file.writelines(output_lines)"}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\nprints the help string for the edx_lint command.", "response": "def show_help():\n    \"\"\"Print the help string for the edx_lint command.\"\"\"\n    print(\"\"\"\\\nManage local config files from masters in edx_lint.\n\nCommands:\n\"\"\")\n    for cmd in [write_main, check_main, list_main]:\n        print(cmd.__doc__.lstrip(\"\\n\"))"}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\nparses an HTML JSON form submission as per the W3C Draft spec.", "response": "def parse_json_form(dictionary, prefix=''):\n    \"\"\"\n    Parse an HTML JSON form submission as per the W3C Draft spec\n    An implementation of \"The application/json encoding algorithm\"\n    http://www.w3.org/TR/html-json-forms/\n    \"\"\"\n    # Step 1: Initialize output object\n    output = {}\n    for name, value in get_all_items(dictionary):\n        # TODO: implement is_file flag\n\n        # Step 2: Compute steps array\n        steps = parse_json_path(name)\n\n        # Step 3: Initialize context\n        context = output\n\n        # Step 4: Iterate through steps\n        for step in steps:\n            # Step 4.1 Retrieve current value from context\n            current_value = get_value(context, step.key, Undefined())\n\n            # Steps 4.2, 4.3: Set JSON value on context\n            context = set_json_value(\n                context=context,\n                step=step,\n                current_value=current_value,\n                entry_value=value,\n                is_file=False,\n            )\n    # Convert any remaining Undefined array entries to None\n    output = clean_undefined(output)\n    output = clean_empty_string(output)\n\n    # Account for DRF prefix (not part of JSON form spec)\n    result = get_value(output, prefix, Undefined())\n    if isinstance(result, Undefined):\n        return output\n    else:\n        return result"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef parse_json_path(path):\n\n    # Steps 1, 2, 3\n    original_path = path\n    steps = []\n\n    # Step 11 (Failure)\n    failed = [\n        JsonStep(\n            type=\"object\",\n            key=original_path,\n            last=True,\n            failed=True,\n        )\n    ]\n\n    # Other variables for later use\n    digit_re = re.compile(r'^\\[([0-9]+)\\]')\n    key_re = re.compile(r'^\\[([^\\]]+)\\]')\n\n    # Step 4 - Find characters before first [ (if any)\n    parts = path.split(\"[\")\n    first_key = parts[0]\n    if parts[1:]:\n        path = \"[\" + \"[\".join(parts[1:])\n    else:\n        path = \"\"\n\n    # Step 5 - According to spec, keys cannot start with [\n    # NOTE: This was allowed in older DRF versions, so disabling rule for now\n    # if not first_key:\n    #     return failed\n\n    # Step 6 - Save initial step\n    steps.append(JsonStep(\n        type=\"object\",\n        key=first_key,\n    ))\n\n    # Step 7 - Simple single-step case (no [ found)\n    if not path:\n        steps[-1].last = True\n        return steps\n\n    # Step 8 - Loop\n    while path:\n        # Step 8.1 - Check for single-item array\n        if path[:2] == \"[]\":\n            path = path[2:]\n            steps.append(JsonStep(\n                type=\"array\",\n                key=0,\n            ))\n            continue\n\n        # Step 8.2 - Check for array[index]\n        digit_match = digit_re.match(path)\n        if digit_match:\n            path = digit_re.sub(\"\", path)\n            steps.append(JsonStep(\n                type=\"array\",\n                key=int(digit_match.group(1)),\n            ))\n            continue\n\n        # Step 8.3 - Check for object[key]\n        key_match = key_re.match(path)\n        if key_match:\n            path = key_re.sub(\"\", path)\n            steps.append(JsonStep(\n                type=\"object\",\n   
             key=key_match.group(1),\n            ))\n            continue\n\n        # Step 8.4 - Invalid key format\n        return failed\n\n    # Step 9\n    next_step = None\n    for step in reversed(steps):\n        if next_step:\n            step.next_type = next_step.type\n        else:\n            step.last = True\n        next_step = step\n\n    return steps", "response": "Parses a string as a JSON encoding path"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef set_json_value(context, step, current_value, entry_value, is_file):\n\n    # TODO: handle is_file\n\n    # Add empty values to array so indexing works like JavaScript\n    if isinstance(context, list) and isinstance(step.key, int):\n        undefined_count = step.key - len(context) + 1\n        if undefined_count > 1000:\n            raise ParseException(\"Too many consecutive empty values!\")\n        elif undefined_count > 0:\n            context += [Undefined()] * undefined_count\n\n    # Step 7: Handle last step\n    if step.last:\n        if isinstance(current_value, Undefined):\n            # Step 7.1: No existing value\n            key = step.key\n            if isinstance(context, dict) and isinstance(key, int):\n                key = str(key)\n            if step.append:\n                context[key] = [entry_value]\n            else:\n                context[key] = entry_value\n        elif isinstance(current_value, list):\n            # Step 7.2: Existing value is an Array, assume multi-valued field\n            # and add entry to end.\n\n            # FIXME: What if the other items in the array had explicit keys and\n            # this one is supposed to be the \"\" value?\n            # (See step 8.4 and Example 7)\n            context[step.key].append(entry_value)\n\n        elif isinstance(current_value, dict) and not is_file:\n            # Step 7.3: Existing value is an Object\n            return set_json_value(\n                context=current_value,\n                step=JsonStep(type=\"object\", key=\"\", last=True),\n                current_value=current_value.get(\"\", Undefined()),\n                entry_value=entry_value,\n                is_file=is_file,\n            )\n        else:\n            # Step 7.4: Existing value is a scalar; preserve both values\n            context[step.key] = [current_value, entry_value]\n\n        # Step 7.5\n        
return context\n\n    # Step 8: Handle intermediate steps\n    if isinstance(current_value, Undefined):\n        # 8.1: No existing value\n        if step.next_type == \"array\":\n            context[step.key] = []\n        else:\n            context[step.key] = {}\n        return context[step.key]\n    elif isinstance(current_value, dict):\n        # Step 8.2: Existing value is an Object\n        return get_value(context, step.key, Undefined())\n    elif isinstance(current_value, list):\n        # Step 8.3: Existing value is an Array\n        if step.next_type == \"array\":\n            return current_value\n        # Convert array to object to facilitate mixed keys\n        obj = {}\n        for i, item in enumerate(current_value):\n            if not isinstance(item, Undefined):\n                obj[str(i)] = item\n        context[step.key] = obj\n        return obj\n    else:\n        # 8.4: Existing value is a scalar; convert to Object, preserving\n        # current value via an empty key\n        obj = {'': current_value}\n        context[step.key] = obj\n        return obj", "response": "Apply a JSON value to a context object."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\nmimic JavaScript Object/Array behavior by allowing access to nonexistent indexes.", "response": "def get_value(obj, key, default=None):\n    \"\"\"\n    Mimic JavaScript Object/Array behavior by allowing access to nonexistent\n    indexes.\n    \"\"\"\n    if isinstance(obj, dict):\n        return obj.get(key, default)\n    elif isinstance(obj, list):\n        try:\n            return obj[key]\n        except IndexError:\n            return default"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef clean_undefined(obj):\n    if isinstance(obj, list):\n        return [\n            None if isinstance(item, Undefined) else item\n            for item in obj\n        ]\n    if isinstance(obj, dict):\n        for key in obj:\n            obj[key] = clean_undefined(obj[key])\n    return obj", "response": "Convert Undefined array entries to None (null)."}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nreplacing empty string form values with None.", "response": "def clean_empty_string(obj):\n    \"\"\"\n    Replace empty form values with None, since the is_html_input() check in\n    Field won't work after we convert to JSON.\n    (FIXME: What about allow_blank=True?)\n    \"\"\"\n    if obj == '':\n        return None\n    if isinstance(obj, list):\n        return [\n            None if item == '' else item\n            for item in obj\n        ]\n    if isinstance(obj, dict):\n        for key in obj:\n            obj[key] = clean_empty_string(obj[key])\n    return obj"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef trans_new(name, transform, inverse, breaks=None,\n              minor_breaks=None, _format=None,\n              domain=(-np.inf, np.inf), doc='', **kwargs):\n    \"\"\"\n    Create a transformation class object\n\n    Parameters\n    ----------\n    name : str\n        Name of the transformation\n    transform : callable ``f(x)``\n        A function (preferably a `ufunc`) that computes\n        the transformation.\n    inverse : callable ``f(x)``\n        A function (preferably a `ufunc`) that computes\n        the inverse of the transformation.\n    breaks : callable ``f(limits)``\n        Function to compute the breaks for this transform.\n        If None, then a default good enough for a linear\n        domain is used.\n    minor_breaks : callable ``f(major, limits)``\n        Function to compute the minor breaks for this\n        transform. If None, then a default good enough for\n        a linear domain is used.\n    _format : callable ``f(breaks)``\n        Function to format the generated breaks.\n    domain : array_like\n        Domain over which the transformation is valid.\n        It should be of length 2.\n    doc : str\n        Docstring for the class.\n    **kwargs : dict\n        Attributes of the transform, e.g if base is passed\n        in kwargs, then `t.base` would be a valied attribute.\n\n    Returns\n    -------\n    out : trans\n        Transform class\n    \"\"\"\n    def _get(func):\n        if isinstance(func, (classmethod, staticmethod, MethodType)):\n            return func\n        else:\n            return staticmethod(func)\n\n    klass_name = '{}_trans'.format(name)\n\n    d = {'transform': _get(transform),\n         'inverse': _get(inverse),\n         'domain': domain,\n         '__doc__': doc,\n         **kwargs}\n\n    if breaks:\n        d['breaks_'] = _get(breaks)\n\n    if minor_breaks:\n        d['minor_breaks'] = 
_get(minor_breaks)\n\n    if _format:\n        d['format'] = _get(_format)\n\n    return type(klass_name, (trans,), d)", "response": "Create a new transformation class."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\ncreate a log transform class for the base.", "response": "def log_trans(base=None, **kwargs):\n    \"\"\"\n    Create a log transform class for *base*\n\n    Parameters\n    ----------\n    base : float\n        Base for the logarithm. If None, then\n        the natural log is used.\n    kwargs : dict\n        Keyword arguments passed onto\n        :func:`trans_new`. Should not include\n        the `transform` or `inverse`.\n\n    Returns\n    -------\n    out : type\n        Log transform class\n    \"\"\"\n    # transform function\n    if base is None:\n        name = 'log'\n        base = np.exp(1)\n        transform = np.log\n    elif base == 10:\n        name = 'log10'\n        transform = np.log10\n    elif base == 2:\n        name = 'log2'\n        transform = np.log2\n    else:\n        name = 'log{}'.format(base)\n\n        def transform(x):\n            return np.log(x)/np.log(base)\n\n    # inverse function\n    def inverse(x):\n        try:\n            return base ** x\n        except TypeError:\n            return [base**val for val in x]\n\n    if 'domain' not in kwargs:\n        kwargs['domain'] = (sys.float_info.min, np.inf)\n\n    if 'breaks' not in kwargs:\n        kwargs['breaks'] = log_breaks(base=base)\n\n    kwargs['base'] = base\n    kwargs['_format'] = log_format(base)\n\n    _trans = trans_new(name, transform, inverse, **kwargs)\n\n    if 'minor_breaks' not in kwargs:\n        n = int(base) - 2\n        _trans.minor_breaks = trans_minor_breaks(_trans, n=n)\n\n    return _trans"}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\ncreate an exponential transform class for a given base.", "response": "def exp_trans(base=None, **kwargs):\n    \"\"\"\n    Create an exponential transform class for *base*\n\n    This is inverse of the log transform.\n\n    Parameters\n    ----------\n    base : float\n        Base of the logarithm\n    kwargs : dict\n        Keyword arguments passed onto\n        :func:`trans_new`. Should not include\n        the `transform` or `inverse`.\n\n    Returns\n    -------\n    out : type\n        Exponential transform class\n    \"\"\"\n    # default to e\n    if base is None:\n        name = 'power_e'\n        base = np.exp(1)\n    else:\n        name = 'power_{}'.format(base)\n\n    # transform function\n    def transform(x):\n        return base ** x\n\n    # inverse function\n    def inverse(x):\n        return np.log(x)/np.log(base)\n\n    kwargs['base'] = base\n    return trans_new(name, transform, inverse, **kwargs)"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef boxcox_trans(p, **kwargs):\n    if np.abs(p) < 1e-7:\n        return log_trans()\n\n    def transform(x):\n        return (x**p - 1) / (p * np.sign(x-1))\n\n    def inverse(x):\n        return (np.abs(x) * p + np.sign(x)) ** (1 / p)\n\n    kwargs['p'] = p\n    kwargs['name'] = kwargs.get('name', 'pow_{}'.format(p))\n    kwargs['transform'] = transform\n    kwargs['inverse'] = inverse\n    return trans_new(**kwargs)", "response": "Returns a new Boxcox Transformation object for the given power parameter."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef probability_trans(distribution, *args, **kwargs):\n    import scipy.stats as stats\n    cdists = {k for k in dir(stats)\n              if hasattr(getattr(stats, k), 'cdf')}\n    if distribution not in cdists:\n        msg = \"Unknown distribution '{}'\"\n        raise ValueError(msg.format(distribution))\n\n    try:\n        doc = kwargs.pop('_doc')\n    except KeyError:\n        doc = ''\n\n    try:\n        name = kwargs.pop('_name')\n    except KeyError:\n        name = 'prob_{}'.format(distribution)\n\n    def transform(x):\n        return getattr(stats, distribution).cdf(x, *args, **kwargs)\n\n    def inverse(x):\n        return getattr(stats, distribution).ppf(x, *args, **kwargs)\n\n    return trans_new(name,\n                     transform, inverse, domain=(0, 1),\n                     doc=doc)", "response": "Create a probability transformation class from a scipy.stats distribution, using its cdf as the transform and its ppf as the inverse."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef gettrans(t):\n    obj = t\n    # Make sure trans object is instantiated\n    if isinstance(obj, str):\n        name = '{}_trans'.format(obj)\n        obj = globals()[name]()\n    if callable(obj):\n        obj = obj()\n    if isinstance(obj, type):\n        obj = obj()\n\n    if not isinstance(obj, trans):\n        raise ValueError(\"Could not get transform object.\")\n\n    return obj", "response": "Returns a trans object for the specified name"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef breaks(self, limits):\n        # clip the breaks to the domain,\n        # e.g. probabilities will be in [0, 1] domain\n        vmin = np.max([self.domain[0], limits[0]])\n        vmax = np.min([self.domain[1], limits[1]])\n        breaks = np.asarray(self.breaks_([vmin, vmax]))\n\n        # Some methods(mpl_breaks, extended_breaks) that\n        # calculate breaks take the limits as guide posts and\n        # not hard limits.\n        breaks = breaks.compress((breaks >= self.domain[0]) &\n                                 (breaks <= self.domain[1]))\n        return breaks", "response": "Calculate breaks in data space and return them in transformed space."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\ntransform from date to a numerical format", "response": "def transform(x):\n        \"\"\"\n        Transform from date to a numerical format\n        \"\"\"\n        try:\n            x = date2num(x)\n        except AttributeError:\n            # numpy datetime64\n            # This is not ideal because the operations do not\n            # preserve the np.datetime64 type. May be need\n            # a datetime64_trans\n            x = [pd.Timestamp(item) for item in x]\n            x = date2num(x)\n        return x"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef transform(x):\n        # microseconds\n        try:\n            x = np.array([_x.total_seconds()*10**6 for _x in x])\n        except TypeError:\n            x = x.total_seconds()*10**6\n        return x", "response": "Transform from Timedelta to numerical format"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef inverse(x):\n        try:\n            x = [datetime.timedelta(microseconds=i) for i in x]\n        except TypeError:\n            x = datetime.timedelta(microseconds=x)\n        return x", "response": "Transform to Timedelta from numerical format"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef transform(x):\n        # nanoseconds\n        try:\n            x = np.array([_x.value for _x in x])\n        except TypeError:\n            x = x.value\n        return x", "response": "Transform from Timedelta to numerical format"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef inverse(x):\n        try:\n            x = [pd.Timedelta(int(i)) for i in x]\n        except TypeError:\n            x = pd.Timedelta(int(x))\n        return x", "response": "Transform to Timedelta from numerical format"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef rescale(x, to=(0, 1), _from=None):\n    if _from is None:\n        _from = np.min(x), np.max(x)\n    return np.interp(x, _from, to)", "response": "Rescales numeric vector to have specified minimum and maximum."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef rescale_mid(x, to=(0, 1), _from=None, mid=0):\n    array_like = True\n\n    try:\n        len(x)\n    except TypeError:\n        array_like = False\n        x = [x]\n\n    if not hasattr(x, 'dtype'):\n        x = np.asarray(x)\n\n    if _from is None:\n        _from = np.array([np.min(x), np.max(x)])\n    else:\n        _from = np.asarray(_from)\n\n    if (zero_range(_from) or zero_range(to)):\n        out = np.repeat(np.mean(to), len(x))\n    else:\n        extent = 2 * np.max(np.abs(_from - mid))\n        out = (x - mid) / extent * np.diff(to) + np.mean(to)\n\n    if not array_like:\n        out = out[0]\n    return out", "response": "Rescale numeric vector to have specified minimum, midpoint, and maximum."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef rescale_max(x, to=(0, 1), _from=None):\n    array_like = True\n\n    try:\n        len(x)\n    except TypeError:\n        array_like = False\n        x = [x]\n\n    if not hasattr(x, 'dtype'):\n        x = np.asarray(x)\n\n    if _from is None:\n        _from = np.array([np.min(x), np.max(x)])\n\n    out = x/_from[1] * to[1]\n\n    if not array_like:\n        out = out[0]\n    return out", "response": "Rescales numeric vector to have specified maximum."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef squish_infinite(x, range=(0, 1)):\n    xtype = type(x)\n\n    if not hasattr(x, 'dtype'):\n        x = np.asarray(x)\n\n    x[x == -np.inf] = range[0]\n    x[x == np.inf] = range[1]\n\n    if not isinstance(x, xtype):\n        x = xtype(x)\n    return x", "response": "Truncates infinite values to a range."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef squish(x, range=(0, 1), only_finite=True):\n    xtype = type(x)\n\n    if not hasattr(x, 'dtype'):\n        x = np.asarray(x)\n\n    finite = np.isfinite(x) if only_finite else True\n\n    x[np.logical_and(x < range[0], finite)] = range[0]\n    x[np.logical_and(x > range[1], finite)] = range[1]\n\n    if not isinstance(x, xtype):\n        x = xtype(x)\n    return x", "response": "Squish values into range."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\nconverting any values outside of range to a new array_like object.", "response": "def censor(x, range=(0, 1), only_finite=True):\n    \"\"\"\n    Convert any values outside of range to a **NULL** type object.\n\n    Parameters\n    ----------\n    x : array_like\n        Values to manipulate\n    range : tuple\n        (min, max) giving desired output range\n    only_finite : bool\n        If True (the default), will only modify\n        finite values.\n\n    Returns\n    -------\n    x : array_like\n        Censored array\n\n    Examples\n    --------\n    >>> a = [1, 2, np.inf, 3, 4, -np.inf, 5]\n    >>> censor(a, (0, 10))\n    [1, 2, inf, 3, 4, -inf, 5]\n    >>> censor(a, (0, 10), False)\n    [1, 2, nan, 3, 4, nan, 5]\n    >>> censor(a, (2, 4))\n    [nan, 2, inf, 3, 4, -inf, nan]\n\n    Notes\n    -----\n    All values in ``x`` should be of the same type. ``only_finite`` parameter\n    is not considered for Datetime and Timedelta types.\n\n    The **NULL** type object depends on the type of values in **x**.\n\n    - :class:`float` - :py:`float('nan')`\n    - :class:`int` - :py:`float('nan')`\n    - :class:`datetime.datetime` : :py:`np.datetime64(NaT)`\n    - :class:`datetime.timedelta` : :py:`np.timedelta64(NaT)`\n\n    \"\"\"\n    if not len(x):\n        return x\n\n    py_time_types = (datetime.datetime, datetime.timedelta)\n    np_pd_time_types = (pd.Timestamp, pd.Timedelta,\n                        np.datetime64, np.timedelta64)\n    x0 = first_element(x)\n\n    # Yes, we want type not isinstance\n    if type(x0) in py_time_types:\n        return _censor_with(x, range, 'NaT')\n\n    if not hasattr(x, 'dtype') and isinstance(x0, np_pd_time_types):\n        return _censor_with(x, range, type(x0)('NaT'))\n\n    x_array = np.asarray(x)\n    if pdtypes.is_number(x0) and not isinstance(x0, np.timedelta64):\n        null = float('nan')\n    elif com.is_datetime_arraylike(x_array):\n        
null = pd.Timestamp('NaT')\n    elif pdtypes.is_datetime64_dtype(x_array):\n        null = np.datetime64('NaT')\n    elif isinstance(x0, pd.Timedelta):\n        null = pd.Timedelta('NaT')\n    elif pdtypes.is_timedelta64_dtype(x_array):\n        null = np.timedelta64('NaT')\n    else:\n        raise ValueError(\n            \"Do not know how to censor values of type \"\n            \"{}\".format(type(x0)))\n\n    if only_finite:\n        try:\n            finite = np.isfinite(x)\n        except TypeError:\n            finite = np.repeat(True, len(x))\n    else:\n        finite = np.repeat(True, len(x))\n\n    if hasattr(x, 'dtype'):\n        outside = (x < range[0]) | (x > range[1])\n        bool_idx = finite & outside\n        x = x.copy()\n        x[bool_idx] = null\n    else:\n        x = [null if not range[0] <= val <= range[1] and f else val\n             for val, f in zip(x, finite)]\n\n    return x"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef _censor_with(x, range, value=None):\n    return [val if range[0] <= val <= range[1] else value\n            for val in x]", "response": "Censor any values outside of range with value"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef zero_range(x, tol=np.finfo(float).eps * 100):\n    try:\n        if len(x) == 1:\n            return True\n    except TypeError:\n        return True\n\n    if len(x) != 2:\n        raise ValueError('x must be length 1 or 2')\n\n    # Deals with array_likes that have non-standard indices\n    x = tuple(x)\n\n    # datetime - pandas, cpython\n    if isinstance(x[0], (pd.Timestamp, datetime.datetime)):\n        # date2num include timezone info, .toordinal() does not\n        x = date2num(x)\n    # datetime - numpy\n    elif isinstance(x[0], np.datetime64):\n        return x[0] == x[1]\n    # timedelta - pandas, cpython\n    elif isinstance(x[0], (pd.Timedelta, datetime.timedelta)):\n        x = x[0].total_seconds(), x[1].total_seconds()\n    # timedelta - numpy\n    elif isinstance(x[0], np.timedelta64):\n        return x[0] == x[1]\n    elif not isinstance(x[0], (float, int, np.number)):\n        raise TypeError(\n            \"zero_range objects cannot work with objects \"\n            \"of type '{}'\".format(type(x[0])))\n\n    if any(np.isnan(x)):\n        return np.nan\n\n    if x[0] == x[1]:\n        return True\n\n    if all(np.isinf(x)):\n        return False\n\n    m = np.abs(x).min()\n    if m == 0:\n        return False\n\n    return np.abs((x[0] - x[1]) / m) < tol", "response": "Returns True if the range of vector x is close to zero."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef expand_range(range, mul=0, add=0, zero_width=1):\n    x = range\n\n    # Enforce tuple\n    try:\n        x[0]\n    except TypeError:\n        x = (x, x)\n\n    # The expansion cases\n    if zero_range(x):\n        new = x[0]-zero_width/2, x[0]+zero_width/2\n    else:\n        dx = (x[1] - x[0]) * mul + add\n        new = x[0]-dx, x[1]+dx\n\n    return new", "response": "Expand a range of data with a multiplicative or additive constant."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef expand_range_distinct(range, expand=(0, 0, 0, 0), zero_width=1):\n\n    if len(expand) == 2:\n        expand = tuple(expand) * 2\n\n    lower = expand_range(range, expand[0], expand[1], zero_width)[0]\n    upper = expand_range(range, expand[2], expand[3], zero_width)[1]\n    return (lower, upper)", "response": "Expand a range using separate multiplicative and additive constants for the lower and upper limits."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef _extend_breaks(self, major):\n        trans = self.trans\n        trans = trans if isinstance(trans, type) else trans.__class__\n        # so far we are only certain about this extending stuff\n        # making sense for log transform\n        is_log = trans.__name__.startswith('log')\n        diff = np.diff(major)\n        step = diff[0]\n        if is_log and all(diff == step):\n            major = np.hstack([major[0]-step, major, major[-1]+step])\n        return major", "response": "Extend the major breaks with one extra break at each end, but only for log transforms with evenly spaced breaks."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef best_units(self, sequence):\n        # Read\n        #   [(0.9, 's'),\n        #    (9, 'm')]\n        # as, break ranges between 0.9 seconds (inclusive)\n        # and 9 minutes are represented in seconds. And so on.\n        ts_range = self.value(max(sequence)) - self.value(min(sequence))\n        package = self.determine_package(sequence[0])\n        if package == 'pandas':\n            cuts = [\n                (0.9, 'us'),\n                (0.9, 'ms'),\n                (0.9, 's'),\n                (9, 'm'),\n                (6, 'h'),\n                (4, 'd'),\n                (4, 'w'),\n                (4, 'M'),\n                (3, 'y')]\n            denomination = NANOSECONDS\n            base_units = 'ns'\n        else:\n            cuts = [\n                (0.9, 's'),\n                (9, 'm'),\n                (6, 'h'),\n                (4, 'd'),\n                (4, 'w'),\n                (4, 'M'),\n                (3, 'y')]\n            denomination = SECONDS\n            base_units = 'ms'\n\n        for size, units in reversed(cuts):\n            if ts_range >= size*denomination[units]:\n                return units\n\n        return base_units", "response": "Determine good units for representing a sequence of timedeltas"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef scaled_limits(self):\n        _min = self.limits[0]/self.factor\n        _max = self.limits[1]/self.factor\n        return _min, _max", "response": "Return the limits divided by the scaling factor as a (min, max) tuple."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef numeric_to_timedelta(self, numerics):\n        if self.package == 'pandas':\n            return [self.type(int(x*self.factor), units='ns')\n                    for x in numerics]\n        else:\n            return [self.type(seconds=x*self.factor)\n                    for x in numerics]", "response": "Convert sequence of numerics to timedelta"}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nconverting timedelta to a number corresponding to the appropriate units.", "response": "def to_numeric(self, td):\n        \"\"\"\n        Convert timedelta to a number corresponding to the\n        appropriate units. The appropriate units are those\n        determined when the object is initialised.\n        \"\"\"\n        if self.package == 'pandas':\n            return td.value/NANOSECONDS[self.units]\n        else:\n            return td.total_seconds()/SECONDS[self.units]"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef round_any(x, accuracy, f=np.round):\n    if not hasattr(x, 'dtype'):\n        x = np.asarray(x)\n\n    return f(x / accuracy) * accuracy", "response": "Round to multiple of any number."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nreturning the minimum and maximum of a sequence of items.", "response": "def min_max(x, na_rm=False, finite=True):\n    \"\"\"\n    Return the minimum and maximum of x\n\n    Parameters\n    ----------\n    x : array_like\n        Sequence\n    na_rm : bool\n        Whether to remove ``nan`` values.\n    finite : bool\n        Whether to consider only finite values.\n\n    Returns\n    -------\n    out : tuple\n        (minimum, maximum) of x\n    \"\"\"\n    if not hasattr(x, 'dtype'):\n        x = np.asarray(x)\n\n    if na_rm and finite:\n        x = x[np.isfinite(x)]\n    elif not na_rm and np.any(np.isnan(x)):\n        return np.nan, np.nan\n    elif na_rm:\n        x = x[~np.isnan(x)]\n    elif finite:\n        x = x[~np.isinf(x)]\n\n    if (len(x)):\n        return np.min(x), np.max(x)\n    else:\n        return float('-inf'), float('inf')"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef match(v1, v2, nomatch=-1, incomparables=None, start=0):\n    v2_indices = {}\n    for i, x in enumerate(v2):\n        if x not in v2_indices:\n            v2_indices[x] = i\n\n    v1_to_v2_map = [nomatch] * len(v1)\n    skip = set(incomparables) if incomparables else set()\n    for i, x in enumerate(v1):\n        if x in skip:\n            continue\n\n        try:\n            v1_to_v2_map[i] = v2_indices[x] + start\n        except KeyError:\n            pass\n\n    return v1_to_v2_map", "response": "Return, for each element of v1, the position of its first match in v2 (offset by start), or nomatch if there is none."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef precision(x):\n    from .bounds import zero_range\n\n    rng = min_max(x, na_rm=True)\n    if zero_range(rng):\n        span = np.abs(rng[0])\n    else:\n        span = np.diff(rng)[0]\n\n    if span == 0:\n        return 1\n    else:\n        return 10 ** int(np.floor(np.log10(span)))", "response": "Return the precision of x: the largest power of 10 that does not exceed the span of its values, or 1 if the span is zero."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef multitype_sort(a):\n    types = defaultdict(list)\n    numbers = {int, float, complex}\n\n    for x in a:\n        t = type(x)\n        if t in numbers:\n            types['number'].append(x)\n        else:\n            types[t].append(x)\n\n    for t in types:\n        types[t] = np.sort(types[t])\n\n    return list(chain(*(types[t] for t in types)))", "response": "Sort elements of multiple types\n    a is assumed to contain elements of different types such that\n    a plain sort would raise a TypeError."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef nearest_int(x):\n    if x == 0:\n        return np.int64(0)\n    elif x > 0:\n        return np.int64(x + 0.5)\n    else:\n        return np.int64(x - 0.5)", "response": "Return nearest long integer to x"}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\ncheck if value is close to an integer.", "response": "def is_close_to_int(x):\n    \"\"\"\n    Check if value is close to an integer\n\n    Parameters\n    ----------\n    x : float\n        Numeric value to check\n\n    Returns\n    -------\n    out : bool\n    \"\"\"\n    if not np.isfinite(x):\n        return False\n    return abs(x - nearest_int(x)) < 1e-10"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef same_log10_order_of_magnitude(x, delta=0.1):\n    dmin = np.log10(np.min(x)*(1-delta))\n    dmax = np.log10(np.max(x)*(1+delta))\n    return np.floor(dmin) == np.floor(dmax)", "response": "Return True if the values in x, with the minimum and maximum padded by a fraction delta, all fall within the same base-10 order of magnitude."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef _tidyup_labels(self, labels):\n        def remove_zeroes(s):\n            \"\"\"\n            Remove unnecessary zeros for float string s\n            \"\"\"\n            tup = s.split('e')\n            if len(tup) == 2:\n                mantissa = tup[0].rstrip('0').rstrip('.')\n                exponent = int(tup[1])\n                if exponent:\n                    s = '%se%d' % (mantissa, exponent)\n                else:\n                    s = mantissa\n            return s\n\n        def as_exp(s):\n            \"\"\"\n            Float string s as in exponential format\n            \"\"\"\n            return s if 'e' in s else '{:1.0e}'.format(float(s))\n\n        # If any are in exponential format, make all of\n        # them exponential\n        has_e = np.array(['e' in x for x in labels])\n        if not np.all(has_e) and not np.all(~has_e):\n            labels = [as_exp(x) for x in labels]\n\n        labels = [remove_zeroes(x) for x in labels]\n        return labels", "response": "Tidy up a list of numeric label strings: if any label is in exponential format, convert all of them to it, then strip unnecessary zeros from the exponential-format labels."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\nreturning a list of colors in a circular HLS color space.", "response": "def hls_palette(n_colors=6, h=.01, l=.6, s=.65):\n    \"\"\"\n    Get a set of evenly spaced colors in HLS hue space.\n\n    h, l, and s should be between 0 and 1\n\n    Parameters\n    ----------\n\n    n_colors : int\n        number of colors in the palette\n    h : float\n        first hue\n    l : float\n        lightness\n    s : float\n        saturation\n\n    Returns\n    -------\n    palette : list\n        List of colors as RGB hex strings.\n\n    See Also\n    --------\n    husl_palette : Make a palette using evenly spaced circular\n        hues in the HUSL system.\n\n    Examples\n    --------\n    >>> len(hls_palette(2))\n    2\n    >>> len(hls_palette(9))\n    9\n    \"\"\"\n    hues = np.linspace(0, 1, n_colors + 1)[:-1]\n    hues += h\n    hues %= 1\n    hues -= hues.astype(int)\n    palette = [colorsys.hls_to_rgb(h_i, l, s) for h_i in hues]\n    return palette"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef husl_palette(n_colors=6, h=.01, s=.9, l=.65):\n    hues = np.linspace(0, 1, n_colors + 1)[:-1]\n    hues += h\n    hues %= 1\n    hues *= 359\n    s *= 99\n    l *= 99\n    palette = [husl.husl_to_rgb(h_i, s, l) for h_i in hues]\n    return palette", "response": "Return a list of RGB colors with evenly spaced hues in the HUSL color space."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef area_pal(range=(1, 6)):\n    def area_palette(x):\n        return rescale(np.sqrt(x), to=range, _from=(0, 1))\n    return area_palette", "response": "Point area palette (continuous).\n\n    Parameters\n    ----------\n    range : tuple\n        Numeric vector of length two, giving range of possible sizes.\n        Should be greater than 0.\n\n    Returns\n    -------\n    out : function\n        Palette function that takes a sequence of values\n        in the range ``[0, 1]`` and returns values in\n        the specified range.\n\n    Examples\n    --------\n    >>> x = np.arange(0, .6, .1)**2\n    >>> palette = area_pal()\n    >>> palette(x)\n    array([1. , 1.5, 2. , 2.5, 3. , 3.5])\n\n    The results are equidistant because the input ``x`` is in\n    area space, i.e it is squared."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef abs_area(max):\n    def abs_area_palette(x):\n        return rescale(np.sqrt(np.abs(x)), to=(0, max), _from=(0, 1))\n    return abs_area_palette", "response": "Point area palette (continuous), with area proportional to value.\n\n    Parameters\n    ----------\n    max : float\n        A number representing the maximum size\n\n    Returns\n    -------\n    out : function\n        Palette function that takes a sequence of values\n        in the range ``[0, 1]`` and returns values in the range\n        ``[0, max]``.\n\n    Examples\n    --------\n    >>> x = np.arange(0, .8, .1)**2\n    >>> palette = abs_area(5)\n    >>> palette(x)\n    array([0. , 0.5, 1. , 1.5, 2. , 2.5, 3. , 3.5])\n\n    Compared to :func:`area_pal`, :func:`abs_area` will handle values\n    in the range ``[-1, 0]`` without returning ``np.nan``. And values\n    whose absolute value is greater than 1 will be clipped to the\n    maximum."}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nmap values in the range [0, 1] onto colors", "response": "def ratios_to_colors(values, colormap):\n    \"\"\"\n    Map values in the range [0, 1] onto colors\n\n    Parameters\n    ----------\n    values : array_like | float\n        Numeric(s) in the range [0, 1]\n    colormap : cmap\n        Matplotlib colormap to use for the mapping\n\n    Returns\n    -------\n    out : list | float\n        Color(s) corresponding to the values\n    \"\"\"\n    iterable = True\n    try:\n        iter(values)\n    except TypeError:\n        iterable = False\n        values = [values]\n\n    color_tuples = colormap(values)\n    try:\n        hex_colors = [mcolors.rgb2hex(t) for t in color_tuples]\n    except IndexError:\n        hex_colors = mcolors.rgb2hex(color_tuples)\n    return hex_colors if iterable else hex_colors[0]"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef gradient_n_pal(colors, values=None, name='gradientn'):\n    # Note: For better results across devices and media types,\n    # it would be better to do the interpolation in\n    # Lab color space.\n    if values is None:\n        colormap = mcolors.LinearSegmentedColormap.from_list(\n            name, colors)\n    else:\n        colormap = mcolors.LinearSegmentedColormap.from_list(\n            name, list(zip(values, colors)))\n\n    def _gradient_n_pal(vals):\n        return ratios_to_colors(vals, colormap)\n\n    return _gradient_n_pal", "response": "Create an n-color gradient palette function that maps values in the range [0, 1] onto colors interpolated between the given colors."}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\ncreating a continuous color palette using an MPL colormap.", "response": "def cmap_pal(name=None, lut=None):\n    \"\"\"\n    Create a continuous palette using an MPL colormap\n\n    Parameters\n    ----------\n    name : str\n        Name of colormap\n    lut : None | int\n        This is the number of entries desired in the lookup table.\n        Default is ``None``, leave it up to Matplotlib.\n\n    Returns\n    -------\n    out : function\n        Continuous color palette that takes a single\n        parameter, either a :class:`float` or a sequence\n        of floats, maps those value(s) onto the palette\n        and returns color(s). The float(s) must be\n        in the range [0, 1].\n\n    Examples\n    --------\n    >>> palette = cmap_pal('viridis')\n    >>> palette([.1, .2, .3, .4, .5])\n    ['#482475', '#414487', '#355f8d', '#2a788e', '#21918c']\n    \"\"\"\n    colormap = get_cmap(name, lut)\n\n    def _cmap_pal(vals):\n        return ratios_to_colors(vals, colormap)\n\n    return _cmap_pal"}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\ncreates a discrete palette using an MPL Listed colormap.", "response": "def cmap_d_pal(name=None, lut=None):\n    \"\"\"\n    Create a discrete palette using an MPL Listed colormap\n\n    Parameters\n    ----------\n    name : str\n        Name of colormap\n    lut : None | int\n        This is the number of entries desired in the lookup table.\n        Default is ``None``, leave it up to Matplotlib.\n\n    Returns\n    -------\n    out : function\n        A discrete color palette that takes a single\n        :class:`int` parameter ``n`` and returns ``n``\n        colors. The maximum value of ``n`` varies\n        depending on the parameters.\n\n    Examples\n    --------\n    >>> palette = cmap_d_pal('viridis')\n    >>> palette(5)\n    ['#440154', '#3b528b', '#21918c', '#5cc863', '#fde725']\n    \"\"\"\n    colormap = get_cmap(name, lut)\n\n    if not isinstance(colormap, mcolors.ListedColormap):\n        raise ValueError(\n            \"For a discrete palette, cmap must be of type \"\n            \"matplotlib.colors.ListedColormap\")\n\n    ncolors = len(colormap.colors)\n\n    def _cmap_d_pal(n):\n        if n > ncolors:\n            raise ValueError(\n                \"cmap `{}` has {} colors you requested {} \"\n                \"colors.\".format(name, ncolors, n))\n\n        if ncolors < 256:\n            return [mcolors.rgb2hex(c) for c in colormap.colors[:n]]\n        else:\n            # Assume these are continuous and get colors equally spaced\n            # intervals  e.g. viridis is defined with 256 colors\n            idx = np.linspace(0, ncolors-1, n).round().astype(int)\n            return [mcolors.rgb2hex(colormap.colors[i]) for i in idx]\n\n    return _cmap_d_pal"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef desaturate_pal(color, prop, reverse=False):\n    if not 0 <= prop <= 1:\n        raise ValueError(\"prop must be between 0 and 1\")\n\n    # Get rgb tuple rep\n    # Convert to hls\n    # Desaturate the saturation channel\n    # Convert back to rgb\n    rgb = mcolors.colorConverter.to_rgb(color)\n    h, l, s = colorsys.rgb_to_hls(*rgb)\n    s *= prop\n    desaturated_color = colorsys.hls_to_rgb(h, l, s)\n    colors = [color, desaturated_color]\n    if reverse:\n        colors = colors[::-1]\n    return gradient_n_pal(colors, name='desaturated')", "response": "Create a continuous color palette that desaturates a color by some proportion."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef manual_pal(values):\n    max_n = len(values)\n\n    def _manual_pal(n):\n        if n > max_n:\n            msg = (\"Palette can return a maximum of {} values. \"\n                   \"{} were requested from it.\")\n            warnings.warn(msg.format(max_n, n))\n\n        return values[:n]\n\n    return _manual_pal", "response": "Create a palette function from a list of values."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef cubehelix_pal(start=0, rot=.4, gamma=1.0, hue=0.8,\n                  light=.85, dark=.15, reverse=False):\n    \"\"\"\n    Utility for creating continuous palette from the cubehelix system.\n\n    This produces a colormap with linearly-decreasing (or increasing)\n    brightness. That means that information will be preserved if printed to\n    black and white or viewed by someone who is colorblind.\n\n    Parameters\n    ----------\n    start : float (0 <= start <= 3)\n        The hue at the start of the helix.\n    rot : float\n        Rotations around the hue wheel over the range of the palette.\n    gamma : float (0 <= gamma)\n        Gamma factor to emphasize darker (gamma < 1) or lighter (gamma > 1)\n        colors.\n    hue : float (0 <= hue <= 1)\n        Saturation of the colors.\n    dark : float (0 <= dark <= 1)\n        Intensity of the darkest color in the palette.\n    light : float (0 <= light <= 1)\n        Intensity of the lightest color in the palette.\n    reverse : bool\n        If True, the palette will go from dark to light.\n\n    Returns\n    -------\n    out : function\n        Continuous color palette that takes a single\n        :class:`int` parameter ``n`` and returns ``n``\n        equally spaced colors.\n\n\n    References\n    ----------\n    Green, D. A. (2011). \"A colour scheme for the display of astronomical\n    intensity images\". Bulletin of the Astromical Society of India, Vol. 39,\n    p. 
289-295.\n\n    Examples\n    --------\n    >>> palette = cubehelix_pal()\n    >>> palette(5)\n    ['#edd1cb', '#d499a7', '#aa688f', '#6e4071', '#2d1e3e']\n    \"\"\"\n    cdict = mpl._cm.cubehelix(gamma, start, rot, hue)\n    cubehelix_cmap = mpl.colors.LinearSegmentedColormap('cubehelix', cdict)\n\n    def cubehelix_palette(n):\n        values = np.linspace(light, dark, n)\n        return [mcolors.rgb2hex(cubehelix_cmap(x)) for x in values]\n\n    return cubehelix_palette", "response": "Utility for creating a continuous color palette from the cubehelix system."}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\napplies the class - specific function to scale data continuously.", "response": "def apply(cls, x, palette, na_value=None, trans=None):\n        \"\"\"\n        Scale data continuously\n\n        Parameters\n        ----------\n        x : array_like\n            Continuous values to scale\n        palette : callable ``f(x)``\n            Palette to use\n        na_value : object\n            Value to use for missing values.\n        trans : trans\n            How to transform the data before scaling. If\n            ``None``, no transformation is done.\n\n        Returns\n        -------\n        out : array_like\n            Scaled values\n        \"\"\"\n        if trans is not None:\n            x = trans.transform(x)\n\n        limits = cls.train(x)\n        return cls.map(x, palette, limits, na_value)"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef train(cls, new_data, old=None):\n        if not len(new_data):\n            return old\n\n        if not hasattr(new_data, 'dtype'):\n            new_data = np.asarray(new_data)\n\n        if new_data.dtype.kind not in CONTINUOUS_KINDS:\n            raise TypeError(\n                \"Discrete value supplied to continuous scale\")\n\n        if old is not None:\n            new_data = np.hstack([new_data, old])\n\n        return min_max(new_data, na_rm=True, finite=True)", "response": "Train a continuous scale: return the (min, max) limits of the new data combined with any old limits."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef map(cls, x, palette, limits, na_value=None, oob=censor):\n        x = oob(rescale(x, _from=limits))\n        pal = palette(x)\n        try:\n            pal[pd.isnull(x)] = na_value\n        except TypeError:\n            pal = [v if not pd.isnull(v) else na_value for v in pal]\n\n        return pal", "response": "Map values to continuous palette."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef train(cls, new_data, old=None, drop=False, na_rm=False):\n        if not len(new_data):\n            return old\n\n        if old is None:\n            old = []\n\n        # Get the missing values (NaN & Nones) locations and remove them\n        nan_bool_idx = pd.isnull(new_data)\n        has_na = np.any(nan_bool_idx)\n        if not hasattr(new_data, 'dtype'):\n            new_data = np.asarray(new_data)\n        new_data = new_data[~nan_bool_idx]\n\n        if new_data.dtype.kind not in DISCRETE_KINDS:\n            raise TypeError(\n                \"Continuous value supplied to discrete scale\")\n\n        # Train i.e. get the new values\n        if pdtypes.is_categorical_dtype(new_data):\n            try:\n                new = list(new_data.cat.categories)  # series\n            except AttributeError:\n                new = list(new_data.categories)      # plain categorical\n            if drop:\n                present = set(new_data.drop_duplicates())\n                new = [i for i in new if i in present]\n        else:\n            try:\n                new = np.unique(new_data)\n                new.sort()\n            except TypeError:\n                # new_data probably has nans and other types\n                new = list(set(new_data))\n                new = multitype_sort(new)\n\n        # Add nan if required\n        if has_na and not na_rm:\n            new = np.hstack([new, np.nan])\n\n        # update old\n        old_set = set(old)\n        return list(old) + [i for i in new if (i not in old_set)]", "response": "Train a discrete scale: return the old limits followed by any unique values in the new data that are not already among them."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef map(cls, x, palette, limits, na_value=None):\n        n = len(limits)\n        pal = palette(n)[match(x, limits)]\n        try:\n            pal[pd.isnull(x)] = na_value\n        except TypeError:\n            pal = [v if not pd.isnull(v) else na_value for v in pal]\n\n        return pal", "response": "Map values to a discrete palette."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef parse(type: Type):\n\n        def decorator(parser):\n            EnvVar.parsers[type] = parser\n            return parser\n\n        return decorator", "response": "Decorator for registering a parser for a specific attribute type."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef _patched_run_hook(hook_name, project_dir, context):\n    if hook_name == 'post_gen_project':\n        with temple.utils.cd(project_dir):\n            temple.utils.write_temple_config(context['cookiecutter'],\n                                             context['template'],\n                                             context['version'])\n    return cc_hooks.run_hook(hook_name, project_dir, context)", "response": "Used to patch cookiecutter's run_hook function so that the temple config is written before the post_gen_project hook runs."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef _generate_files(repo_dir, config, template, version):\n    with unittest.mock.patch('cookiecutter.generate.run_hook', side_effect=_patched_run_hook):\n        cc_generate.generate_files(repo_dir=repo_dir,\n                                   context={'cookiecutter': config,\n                                            'template': template,\n                                            'version': version},\n                                   overwrite_if_exists=False,\n                                   output_dir='.')", "response": "Generates the project files from the cookiecutter template, with run_hook patched so that the temple config is also written."}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\nsets up a new project from a template.", "response": "def setup(template, version=None):\n    \"\"\"Sets up a new project from a template\n\n    Note that the `temple.constants.TEMPLE_ENV_VAR` is set to 'setup' during the duration\n    of this function.\n\n    Args:\n        template (str): The git SSH path to a template\n        version (str, optional): The version of the template to use when updating. Defaults\n            to the latest version\n    \"\"\"\n    temple.check.is_git_ssh_path(template)\n    temple.check.not_in_git_repo()\n\n    repo_path = temple.utils.get_repo_path(template)\n    msg = (\n        'You will be prompted for the parameters of your new project.'\n        ' Please read the docs at https://github.com/{} before entering parameters.'\n    ).format(repo_path)\n    print(msg)\n\n    cc_repo_dir, config = temple.utils.get_cookiecutter_config(template, version=version)\n\n    if not version:\n        with temple.utils.cd(cc_repo_dir):\n            ret = temple.utils.shell('git rev-parse HEAD', stdout=subprocess.PIPE)\n            version = ret.stdout.decode('utf-8').strip()\n\n    _generate_files(repo_dir=cc_repo_dir, config=config, template=template, version=version)"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef _parse_link_header(headers):\n    links = {}\n    if 'link' in headers:\n        link_headers = headers['link'].split(', ')\n        for link_header in link_headers:\n            (url, rel) = link_header.split('; ')\n            url = url[1:-1]\n            rel = rel[5:-1]\n            links[rel] = url\n    return links", "response": "Parses GitHub's Link header for pagination into a dict mapping each rel to its URL."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef _code_search(query, github_user=None):\n    github_client = temple.utils.GithubClient()\n    headers = {'Accept': 'application/vnd.github.v3.text-match+json'}\n\n    resp = github_client.get('/search/code',\n                             params={'q': query, 'per_page': 100},\n                             headers=headers)\n\n    if resp.status_code == requests.codes.unprocessable_entity and github_user:\n        raise temple.exceptions.InvalidGithubUserError(\n            'Invalid Github user or org - \"{}\"'.format(github_user))\n    resp.raise_for_status()\n\n    resp_data = resp.json()\n\n    repositories = collections.defaultdict(dict)\n    while True:\n        repositories.update({\n            'git@github.com:{}.git'.format(repo['repository']['full_name']): repo['repository']\n            for repo in resp_data['items']\n        })\n\n        next_url = _parse_link_header(resp.headers).get('next')\n        if next_url:\n            resp = requests.get(next_url, headers=headers)\n            resp.raise_for_status()\n            resp_data = resp.json()\n        else:\n            break\n\n    return repositories", "response": "Perform a Github API code search"}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nlist all temple templates and packages associated with the given template.", "response": "def ls(github_user, template=None):\n    \"\"\"Lists all temple templates and packages associated with those templates\n\n    If ``template`` is None, returns the available templates for the configured\n    Github org.\n\n    If ``template`` is a Github path to a template, returns all projects spun\n    up with that template.\n\n    ``ls`` uses the github search API to find results.\n\n    Note that the `temple.constants.TEMPLE_ENV_VAR` is set to 'ls' for the duration of this\n    function.\n\n    Args:\n        github_user (str): The github user or org being searched.\n        template (str, optional): The template git repo path. If provided, lists\n            all projects that have been created with the provided template. Note\n            that the template path is the SSH path\n            (e.g. git@github.com:CloverHealth/temple.git)\n\n    Returns:\n        dict: A dictionary of repository information keyed on the SSH Github url\n\n    Raises:\n        `InvalidGithubUserError`: When ``github_user`` is invalid\n    \"\"\"\n    temple.check.has_env_vars(temple.constants.GITHUB_API_TOKEN_ENV_VAR)\n\n    if template:\n        temple.check.is_git_ssh_path(template)\n        search_q = 'user:{} filename:{} {}'.format(\n            github_user,\n            temple.constants.TEMPLE_CONFIG_FILE,\n            template)\n    else:\n        search_q = 'user:{} cookiecutter.json in:path'.format(github_user)\n\n    results = _code_search(search_q, github_user)\n    return collections.OrderedDict(sorted(results.items()))"}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nupdate the contents of a temple package with the latest template.", "response": "def update(check, enter_parameters, version):\n    \"\"\"\n    Update package with latest template. Must be inside of the project\n    folder to run.\n\n    Using \"-e\" will prompt for re-entering the template parameters\n    even if the project is up to date.\n\n    Use \"-v\" to update to a particular version of a template.\n\n    Using \"-c\" will perform a check that the project is up to date\n    with the latest version of the template (or the version specified by \"-v\").\n    No updating will happen when using this option.\n    \"\"\"\n    if check:\n        if temple.update.up_to_date(version=version):\n            print('Temple package is up to date')\n        else:\n            msg = (\n                'This temple package is out of date with the latest template.'\n                ' Update your package by running \"temple update\" and committing changes.'\n            )\n            raise temple.exceptions.NotUpToDateWithTemplateError(msg)\n    else:\n        temple.update.update(new_version=version, enter_parameters=enter_parameters)"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef ls(github_user, template, long_format):\n    github_urls = temple.ls.ls(github_user, template=template)\n    for ssh_path, info in github_urls.items():\n        if long_format:\n            print(ssh_path, '-', info['description'] or '(no project description found)')\n        else:\n            print(ssh_path)", "response": "List all packages created with temple."}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nswitching a project's template to a different template.", "response": "def switch(template, version):\n    \"\"\"\n    Switch a project's template to a different template.\n    \"\"\"\n    temple.update.update(new_template=template, new_version=version)"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef _in_git_repo():\n    ret = temple.utils.shell('git rev-parse', stderr=subprocess.DEVNULL, check=False)\n    return ret.returncode == 0", "response": "Returns True if the current working directory is inside a git repo, False otherwise"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef _has_branch(branch):\n    ret = temple.utils.shell('git rev-parse --verify {}'.format(branch),\n                             stderr=subprocess.DEVNULL,\n                             stdout=subprocess.DEVNULL,\n                             check=False)\n    return ret.returncode == 0", "response": "Return True if the target branch exists."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef not_has_branch(branch):\n    if _has_branch(branch):\n        msg = 'Cannot proceed while {} branch exists; remove and try again.'.format(branch)\n        raise temple.exceptions.ExistingBranchError(msg)", "response": "Raises an exception if the specified branch exists."}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\nraises InvalidEnvironmentError when any of the given environment variables is not set", "response": "def has_env_vars(*env_vars):\n    \"\"\"Raises `InvalidEnvironmentError` when any of the given environment variables isn't set\"\"\"\n    for env_var in env_vars:\n        if not os.environ.get(env_var):\n            msg = (\n                'Must set {} environment variable. View docs for setting up environment at {}'\n            ).format(env_var, temple.constants.TEMPLE_DOCS_URL)\n            raise temple.exceptions.InvalidEnvironmentError(msg)"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef is_temple_project():\n    if not os.path.exists(temple.constants.TEMPLE_CONFIG_FILE):\n        msg = 'No {} file found in repository.'.format(temple.constants.TEMPLE_CONFIG_FILE)\n        raise temple.exceptions.InvalidTempleProjectError(msg)", "response": "Raises InvalidTempleProjectError if repository is not a temple project"}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\ndetermines the current git branch", "response": "def _get_current_branch():\n    \"\"\"Determine the current git branch\"\"\"\n    result = temple.utils.shell('git rev-parse --abbrev-ref HEAD', stdout=subprocess.PIPE)\n    return result.stdout.decode('utf8').strip()"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef clean():\n    temple.check.in_git_repo()\n\n    current_branch = _get_current_branch()\n    update_branch = temple.constants.UPDATE_BRANCH_NAME\n    temp_update_branch = temple.constants.TEMP_UPDATE_BRANCH_NAME\n\n    if current_branch in (update_branch, temp_update_branch):\n        err_msg = (\n            'You must change from the \"{}\" branch since it will be deleted during cleanup'\n        ).format(current_branch)\n        raise temple.exceptions.InvalidCurrentBranchError(err_msg)\n\n    if temple.check._has_branch(update_branch):\n        temple.utils.shell('git branch -D {}'.format(update_branch))\n    if temple.check._has_branch(temp_update_branch):\n        temple.utils.shell('git branch -D {}'.format(temp_update_branch))", "response": "Cleans up temporary resources by deleting the update and temporary update branches if they exist. Raises InvalidCurrentBranchError if one of those branches is currently checked out."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef _cookiecutter_configs_have_changed(template, old_version, new_version):\n    temple.check.is_git_ssh_path(template)\n    repo_path = temple.utils.get_repo_path(template)\n    github_client = temple.utils.GithubClient()\n    api = '/repos/{}/contents/cookiecutter.json'.format(repo_path)\n\n    old_config_resp = github_client.get(api, params={'ref': old_version})\n    old_config_resp.raise_for_status()\n    new_config_resp = github_client.get(api, params={'ref': new_version})\n    new_config_resp.raise_for_status()\n\n    return old_config_resp.json()['content'] != new_config_resp.json()['content']", "response": "Checks whether the cookiecutter.json file differs between the old and new versions of the template"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef _apply_template(template, target, *, checkout, extra_context):\n    with tempfile.TemporaryDirectory() as tempdir:\n        repo_dir = cc_main.cookiecutter(\n            template,\n            checkout=checkout,\n            no_input=True,\n            output_dir=tempdir,\n            extra_context=extra_context)\n        for item in os.listdir(repo_dir):\n            src = os.path.join(repo_dir, item)\n            dst = os.path.join(target, item)\n            if os.path.isdir(src):\n                if os.path.exists(dst):\n                    shutil.rmtree(dst)\n                shutil.copytree(src, dst)\n            else:\n                if os.path.exists(dst):\n                    os.remove(dst)\n                shutil.copy2(src, dst)", "response": "Apply a template to a temporary directory and then copy results to target."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\ncheck if a temple project is up to date with the given version.", "response": "def up_to_date(version=None):\n    \"\"\"Checks if a temple project is up to date with the repo\n\n    Note that the `temple.constants.TEMPLE_ENV_VAR` is set to 'update' for the duration of this\n    function.\n\n    Args:\n        version (str, optional): Update against this git SHA or branch of the template\n\n    Returns:\n        boolean: True if up to date with ``version`` (or latest version), False otherwise\n\n    Raises:\n        `NotInGitRepoError`: When running outside of a git repo\n        `InvalidTempleProjectError`: When not inside a valid temple repository\n    \"\"\"\n    temple.check.in_git_repo()\n    temple.check.is_temple_project()\n\n    temple_config = temple.utils.read_temple_config()\n    old_template_version = temple_config['_version']\n    new_template_version = version or _get_latest_template_version(temple_config['_template'])\n\n    return new_template_version == old_template_version"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef _needs_new_cc_config_for_update(old_template, old_version, new_template, new_version):\n    if old_template != new_template:\n        return True\n    else:\n        return _cookiecutter_configs_have_changed(new_template,\n                                                  old_version,\n                                                  new_version)", "response": "Returns True if a new cookiecutter config needs to be obtained from the user."}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\nupdates the temple project with the latest template and version of the new template.", "response": "def update(old_template=None, old_version=None, new_template=None, new_version=None,\n           enter_parameters=False):\n    \"\"\"Updates the temple project to the latest template\n\n    Proceeds in the following steps:\n\n    1. Ensure we are inside the project repository\n    2. Obtain the latest version of the package template\n    3. If the package is up to date with the latest template, return\n    4. If not, create an empty template branch with a new copy of the old template\n    5. Create an update branch from HEAD and merge in the new template copy\n    6. Create a new copy of the new template and merge into the empty template branch\n    7. Merge the updated empty template branch into the update branch\n    8. Ensure temple.yaml reflects what is in the template branch\n    9. Remove the empty template branch\n\n    Note that the `temple.constants.TEMPLE_ENV_VAR` is set to 'update' for the\n    duration of this function.\n\n    Two branches will be created during the update process, one named\n    ``_temple_update`` and one named ``_temple_update_temp``. At the end of\n    the process, ``_temple_update_temp`` will be removed automatically. The\n    work will be left in ``_temple_update`` in an uncommitted state for\n    review. The update will fail early if either of these branches exist\n    before the process starts.\n\n    Args:\n        old_template (str, default=None): The old template from which to update. Defaults\n            to the template in temple.yaml\n        old_version (str, default=None): The old version of the template. Defaults to\n            the version in temple.yaml\n        new_template (str, default=None): The new template for updating. 
Defaults to the\n            template in temple.yaml\n        new_version (str, default=None): The new version of the new template to update.\n            Defaults to the latest version of the new template\n        enter_parameters (bool, default=False): Force entering template parameters for the project\n\n    Raises:\n        `NotInGitRepoError`: When not inside of a git repository\n        `InvalidTempleProjectError`: When not inside a valid temple repository\n        `InDirtyRepoError`: When an update is triggered while the repo is in a dirty state\n        `ExistingBranchError`: When an update is triggered and there is an existing\n            update branch\n\n    Returns:\n        boolean: True if update was performed or False if template was already up to date\n    \"\"\"\n    update_branch = temple.constants.UPDATE_BRANCH_NAME\n    temp_update_branch = temple.constants.TEMP_UPDATE_BRANCH_NAME\n\n    temple.check.in_git_repo()\n    temple.check.in_clean_repo()\n    temple.check.is_temple_project()\n    temple.check.not_has_branch(update_branch)\n    temple.check.not_has_branch(temp_update_branch)\n    temple.check.has_env_vars(temple.constants.GITHUB_API_TOKEN_ENV_VAR)\n\n    temple_config = temple.utils.read_temple_config()\n    old_template = old_template or temple_config['_template']\n    new_template = new_template or temple_config['_template']\n    old_version = old_version or temple_config['_version']\n    new_version = new_version or _get_latest_template_version(new_template)\n\n    if new_template == old_template and new_version == old_version and not enter_parameters:\n        print('No updates have happened to the template, so no files were updated')\n        return False\n\n    print('Creating branch {} for processing the update'.format(update_branch))\n    temple.utils.shell('git checkout -b {}'.format(update_branch),\n                       stderr=subprocess.DEVNULL)\n\n    print('Creating temporary working branch 
{}'.format(temp_update_branch))\n    temple.utils.shell('git checkout --orphan {}'.format(temp_update_branch),\n                       stderr=subprocess.DEVNULL)\n    temple.utils.shell('git rm -rf .',\n                       stdout=subprocess.DEVNULL)\n    _apply_template(old_template,\n                    '.',\n                    checkout=old_version,\n                    extra_context=temple_config)\n    temple.utils.shell('git add .')\n    temple.utils.shell(\n        'git commit --no-verify -m \"Initialize template from version {}\"'.format(old_version),\n        stdout=subprocess.DEVNULL,\n        stderr=subprocess.DEVNULL)\n\n    print('Merge old template history into update branch.')\n    temple.utils.shell('git checkout {}'.format(update_branch),\n                       stderr=subprocess.DEVNULL)\n    temple.utils.shell(\n        'git merge -s ours --no-edit --allow-unrelated-histories {}'.format(temp_update_branch),\n        stderr=subprocess.DEVNULL)\n\n    print('Update template in temporary branch.')\n    temple.utils.shell('git checkout {}'.format(temp_update_branch),\n                       stderr=subprocess.DEVNULL)\n    temple.utils.shell('git rm -rf .',\n                       stdout=subprocess.DEVNULL)\n\n    # If the cookiecutter.json files have changed or the templates have changed,\n    # the user will need to re-enter the cookiecutter config\n    needs_new_cc_config = _needs_new_cc_config_for_update(old_template, old_version,\n                                                          new_template, new_version)\n    if needs_new_cc_config:\n        if old_template != new_template:\n            cc_config_input_msg = (\n                'You will be prompted for the parameters of the new template.'\n                ' Please read the docs at https://github.com/{} before entering parameters.'\n                ' Press enter to continue'\n            ).format(temple.utils.get_repo_path(new_template))\n        else:\n            cc_config_input_msg = 
(\n                'A new template variable has been defined in the updated template.'\n                ' You will be prompted to enter all of the variables again. Variables'\n                ' already configured in your project will have their values set as'\n                ' defaults. Press enter to continue'\n            )\n\n        input(cc_config_input_msg)\n\n    # Even if there is no detected need to re-enter the cookiecutter config, the user\n    # can still re-enter config parameters with the \"enter_parameters\" flag\n    if needs_new_cc_config or enter_parameters:\n        _, temple_config = (\n            temple.utils.get_cookiecutter_config(new_template,\n                                                 default_config=temple_config,\n                                                 version=new_version))\n\n    _apply_template(new_template,\n                    '.',\n                    checkout=new_version,\n                    extra_context=temple_config)\n    temple.utils.write_temple_config(temple_config, new_template, new_version)\n\n    temple.utils.shell('git add .')\n    temple.utils.shell(\n        'git commit --no-verify -m \"Update template to version {}\"'.format(new_version),\n        stdout=subprocess.DEVNULL,\n        stderr=subprocess.DEVNULL)\n\n    print('Merge updated template into update branch.')\n    temple.utils.shell('git checkout {}'.format(update_branch),\n                       stderr=subprocess.DEVNULL)\n    temple.utils.shell('git merge --no-commit {}'.format(temp_update_branch),\n                       check=False,\n                       stdout=subprocess.DEVNULL,\n                       stderr=subprocess.DEVNULL)\n    # The temple.yaml file should always reflect what is in the new template\n    temple.utils.shell('git checkout --theirs {}'.format(temple.constants.TEMPLE_CONFIG_FILE),\n                       check=False,\n                       stdout=subprocess.DEVNULL,\n                       
stderr=subprocess.DEVNULL)\n\n    print('Remove temporary template branch {}'.format(temp_update_branch))\n    temple.utils.shell('git branch -D {}'.format(temp_update_branch),\n                       stdout=subprocess.DEVNULL)\n\n    print(textwrap.dedent(\"\"\"\\\n        Updating complete!\n\n        Please review the changes with \"git status\" for any errors or\n        conflicts. Once you are satisfied with the changes, add, commit,\n        push, and open a PR with the branch {}\n    \"\"\").format(update_branch))\n    return True"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef shell(cmd, check=True, stdin=None, stdout=None, stderr=None):\n    return subprocess.run(cmd, shell=True, check=check, stdin=stdin, stdout=stdout, stderr=stderr)", "response": "Runs a command in a subprocess shell, with check=True by default"}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nread the temple YAML configuration file in the repository", "response": "def read_temple_config():\n    \"\"\"Reads the temple YAML configuration file in the repository\"\"\"\n    with open(temple.constants.TEMPLE_CONFIG_FILE) as temple_config_file:\n        return yaml.load(temple_config_file, Loader=yaml.SafeLoader)"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef write_temple_config(temple_config, template, version):\n    with open(temple.constants.TEMPLE_CONFIG_FILE, 'w') as temple_config_file:\n        versioned_config = {\n            **temple_config,\n            **{'_version': version, '_template': template},\n        }\n        yaml.dump(versioned_config, temple_config_file, Dumper=yaml.SafeDumper)", "response": "Writes the temple YAML configuration"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nreturn the configuration used for cookiecutter templating", "response": "def get_cookiecutter_config(template, default_config=None, version=None):\n    \"\"\"Obtains the configuration used for cookiecutter templating\n\n    Args:\n        template: Path to the template\n        default_config (dict, optional): The default configuration\n        version (str, optional): The git SHA or branch to use when\n            checking out template. Defaults to latest version\n\n    Returns:\n        tuple: The cookiecutter repo directory and the config dict\n    \"\"\"\n    default_config = default_config or {}\n    config_dict = cc_config.get_user_config()\n    repo_dir, _ = cc_repository.determine_repo_dir(\n        template=template,\n        abbreviations=config_dict['abbreviations'],\n        clone_to_dir=config_dict['cookiecutters_dir'],\n        checkout=version,\n        no_input=True)\n    context_file = os.path.join(repo_dir, 'cookiecutter.json')\n    context = cc_generate.generate_context(\n        context_file=context_file,\n        default_context={**config_dict['default_context'], **default_config})\n    return repo_dir, cc_prompt.prompt_for_config(context)"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef set_cmd_env_var(value):\n    def func_decorator(function):\n        @functools.wraps(function)\n        def wrapper(*args, **kwargs):\n            previous_cmd_env_var = os.getenv(temple.constants.TEMPLE_ENV_VAR)\n            os.environ[temple.constants.TEMPLE_ENV_VAR] = value\n            try:\n                ret_val = function(*args, **kwargs)\n            finally:\n                if previous_cmd_env_var is None:\n                    del os.environ[temple.constants.TEMPLE_ENV_VAR]\n                else:\n                    os.environ[temple.constants.TEMPLE_ENV_VAR] = previous_cmd_env_var\n\n            return ret_val\n        return wrapper\n    return func_decorator", "response": "Decorator that sets the temple command env var to value"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef _call_api(self, verb, url, **request_kwargs):\n        api = 'https://api.github.com{}'.format(url)\n        auth_headers = {'Authorization': 'token {}'.format(self.api_token)}\n        headers = {**auth_headers, **request_kwargs.pop('headers', {})}\n        return getattr(requests, verb)(api, headers=headers, **request_kwargs)", "response": "Perform a GitHub API call"}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\ndeploy the package and documentation to PyPI.", "response": "def deploy(target):\n    \"\"\"Deploys the package and documentation.\n\n    Proceeds in the following steps:\n\n    1. Ensures proper environment variables are set and checks that we are on Circle CI\n    2. Tags the repository with the new version\n    3. Creates a standard distribution and a wheel\n    4. Updates version.py to have the proper version\n    5. Commits the ChangeLog, AUTHORS, and version.py file\n    6. Pushes to PyPI\n    7. Pushes the tags and newly committed files\n\n    Raises:\n        `EnvironmentError`:\n            - Not running on CircleCI\n            - `*_PYPI_USERNAME` and/or `*_PYPI_PASSWORD` environment variables\n               are missing\n            - Attempting to deploy to production from a branch that isn't master\n    \"\"\"\n    # Ensure proper environment\n    if not os.getenv(CIRCLECI_ENV_VAR):     # pragma: no cover\n        raise EnvironmentError('Must be on CircleCI to run this script')\n\n    current_branch = os.getenv('CIRCLE_BRANCH')\n    if (target == 'PROD') and (current_branch != 'master'):\n        raise EnvironmentError((\n            'Refusing to deploy to production from branch {current_branch!r}. '\n            'Production deploys can only be made from master.'\n        ).format(current_branch=current_branch))\n\n    if target in ('PROD', 'TEST'):\n        pypi_username = os.getenv('{target}_PYPI_USERNAME'.format(target=target))\n        pypi_password = os.getenv('{target}_PYPI_PASSWORD'.format(target=target))\n    else:\n        raise ValueError(\n            \"Deploy target must be 'PROD' or 'TEST', got {target!r}.\".format(target=target))\n\n    if not (pypi_username and pypi_password):  # pragma: no cover\n        raise EnvironmentError((\n            \"Missing '{target}_PYPI_USERNAME' and/or '{target}_PYPI_PASSWORD' \"\n            \"environment variables. 
These are required to push to PyPI.\"\n        ).format(target=target))\n\n    # Twine requires these environment variables to be set. Subprocesses will\n    # inherit these when we invoke them, so no need to pass them on the command\n    # line. We want to avoid that in case something's logging each command run.\n    os.environ['TWINE_USERNAME'] = pypi_username\n    os.environ['TWINE_PASSWORD'] = pypi_password\n\n    # Set up git on circle to push to the current branch\n    _shell('git config --global user.email \"oss@cloverhealth.com\"')\n    _shell('git config --global user.name \"Circle CI\"')\n    _shell('git config push.default current')\n\n    # Obtain the version to deploy\n    ret = _shell('make version', stdout=subprocess.PIPE)\n    version = ret.stdout.decode('utf-8').strip()\n\n    print('Deploying version {version!r}...'.format(version=version))\n\n    # Tag the version\n    _shell('git tag -f -a {version} -m \"Version {version}\"'.format(version=version))\n\n    # Update the version\n    _shell(\n        'sed -i.bak \"s/^__version__ = .*/__version__ = {version!r}/\" */version.py'.format(\n            version=version))\n\n    # Create a standard distribution and a wheel\n    _shell('python setup.py sdist bdist_wheel')\n\n    # Add the updated ChangeLog and AUTHORS\n    _shell('git add ChangeLog AUTHORS */version.py')\n\n    # Start the commit message with \"Merge\" so that PBR will ignore it in the\n    # ChangeLog. Use [skip ci] to ensure CircleCI doesn't recursively deploy.\n    _shell('git commit --no-verify -m \"Merge autogenerated files [skip ci]\"')\n\n    # Push the distributions to PyPI.\n    _pypi_push('dist')\n\n    # Push the tag and AUTHORS / ChangeLog after successful PyPI deploy\n    _shell('git push --follow-tags')\n\n    print('Deployment complete. Latest version is {version}.'.format(version=version))"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef run(self):\n\t\tfilename = \".DS_Store\"\n\t\tcommand = \"find {path} -type f -name \\\"{filename}\\\" \".format(path = self.path, filename = filename)\n\t\tcmd = CommandHelper(command)\n\t\tcmd.execute()\n\t\tfiles = cmd.output.split(\"\\n\")\n\t\tfor f in files:\n\t\t\tif not f.endswith(filename):\n\t\t\t\tcontinue\n\n\t\t\t# Ignore paths excluded\n\t\t\trel_path = f.replace(self.path, \"\")\n\t\t\tif rel_path.startswith(tuple(self.CONFIG['exclude_paths'])):\n\t\t\t\tcontinue\n\n\t\t\tissue = Issue()\n\t\t\tissue.name = \"File .DS_Store detected\"\n\t\t\tissue.potential = False\n\t\t\tissue.severity = Issue.SEVERITY_LOW\n\n\t\t\t# Get only relative path\n\t\t\tissue.file = rel_path\n\n\t\t\tself.saveIssue(issue)", "response": "Finds .DS_Store files under the scan path and saves a low-severity issue for each one that is not in an excluded path"}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nsetting the _path property to the absolute path to scan.", "response": "def path(self, value):\n\t\t\"\"\"\n\t\tSetter for 'path' property\n\n\t\tArgs:\n\t\t\tvalue (str): Absolute path to scan\n\n\t\t\"\"\"\n\t\tif not value.endswith('/'):\n\t\t\tself._path = '{v}/'.format(v=value)\n\t\telse:\n\t\t\tself._path = value"}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nparsing the checker config values", "response": "def parseConfig(cls, value):\n\t\t\"\"\"\n\t\tParse the config values\n\n\t\tArgs:\n\t\t\tvalue (dict): Dictionary which contains the checker config\n\n\t\tReturns:\n\t\t\tdict: The checker config with parsed values\n\t\t\"\"\"\n\t\tif 'enabled' in value:\n\t\t\tvalue['enabled'] = bool(value['enabled'])\n\n\t\tif 'exclude_paths' in value:\n\t\t\tvalue['exclude_paths'] = [n.strip() for n in ast.literal_eval(value['exclude_paths'])]\n\n\t\treturn value"}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\ncheck whether a program is installed on the machine.", "response": "def isInstalled(value):\n\t\t\"\"\"\n\t\tCheck whether a program is installed on the machine.\n\n\t\tArgs:\n\t\t\tvalue (str): Name of the program\n\n\t\tReturns:\n\t\t\tbool: True if the program is installed, False otherwise\n\t\t\"\"\"\n\n\t\tfunction = \"\"\"\n\t\tfunction is_installed {\n\t\t  local return_=1;\n\t\t  type $1 >/dev/null 2>&1 || { local return_=0; };\n\t\t  echo \"$return_\";\n\t\t}\"\"\"\n\t\tcommand = \"\"\"bash -c '{f}; echo $(is_installed \\\"{arg}\\\")'\"\"\".format(f = function, arg=value)\n\t\tcmd = CommandHelper(command)\n\t\tcmd.execute()\n\n\t\treturn \"1\" in cmd.output"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\nreturning the OS name.", "response": "def getOSName(self):\n\t\t\"\"\"\n\t\tGet the OS name. If OS is linux, returns the Linux distribution name\n\n\t\tReturns:\n\t\t\tstr: OS name\n\t\t\"\"\"\n\t\t_system = platform.system()\n\t\tif _system in [self.__class__.OS_WINDOWS, self.__class__.OS_MAC, self.__class__.OS_LINUX]:\n\t\t\tif _system == self.__class__.OS_LINUX:\n\t\t\t\t_dist = platform.linux_distribution()[0]\n\t\t\t\tif _dist.lower() == self.__class__.OS_UBUNTU.lower():\n\t\t\t\t\treturn self.__class__.OS_UBUNTU\n\t\t\t\telif _dist.lower() == self.__class__.OS_DEBIAN.lower():\n\t\t\t\t\treturn self.__class__.OS_DEBIAN\n\t\t\t\telif _dist.lower() == self.__class__.OS_CENTOS.lower():\n\t\t\t\t\treturn self.__class__.OS_CENTOS\n\t\t\t\telif _dist.lower() == self.__class__.OS_REDHAT.lower():\n\t\t\t\t\treturn self.__class__.OS_REDHAT\n\t\t\t\telif _dist.lower() == self.__class__.OS_KALI.lower():\n\t\t\t\t\treturn self.__class__.OS_KALI\n\t\t\treturn _system\n\t\telse:\n\t\t\treturn None"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef execute(self, shell = True):\n\t\tprocess = Popen(self.command, stdout=PIPE, stderr=PIPE, shell=shell)\n\t\tself.output, self.errors = process.communicate()", "response": "Executes the command set on the instance and stores its stdout and stderr in the output and errors attributes"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef _debug(message, color=None, attrs=None):\n\t\tif attrs is None:\n\t\t\tattrs = []\n\t\tif color is not None:\n\t\t\tprint(colored(message, color, attrs=attrs))\n\t\telse:\n\t\t\tif len(attrs) > 0:\n\t\t\t\tprint(colored(message, \"white\", attrs=attrs))\n\t\t\telse:\n\t\t\t\tprint(message)", "response": "Print a message, optionally applying a color and text attributes."}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\ncreates required directories and copies checkers and reports.", "response": "def setup():\n\t\t\"\"\"\n\t\t\tCreates required directories and copies checkers and reports.\n\t\t\"\"\"\n\n\n\t\t# # Check if dir is writable\n\t\t# if not os.access(AtomShieldsScanner.HOME, os.W_OK):\n\t\t# \tAtomShieldsScanner.HOME = os.path.expanduser(\"~/.atomshields\")\n\t\t# \tAtomShieldsScanner.CHECKERS_DIR = os.path.join(AtomShieldsScanner.HOME, \"checkers\")\n\t\t# \tAtomShieldsScanner.REPORTS_DIR = os.path.join(AtomShieldsScanner.HOME, \"reports\")\n\n\n\t\tif not os.path.isdir(AtomShieldsScanner.CHECKERS_DIR):\n\t\t\tos.makedirs(AtomShieldsScanner.CHECKERS_DIR)\n\t\tif not os.path.isdir(AtomShieldsScanner.REPORTS_DIR):\n\t\t\tos.makedirs(AtomShieldsScanner.REPORTS_DIR)\n\n\n\t\t# Copy all checkers\n\t\tfor f in AtomShieldsScanner._getFiles(os.path.join(os.path.dirname(os.path.realpath(__file__)), \"checkers\"), \"*.py\"):\n\t\t\tAtomShieldsScanner.installChecker(f)\n\t\t# Copy all reports\n\t\tfor f in AtomShieldsScanner._getFiles(os.path.join(os.path.dirname(os.path.realpath(__file__)), \"reports\"), \"*.py\"):\n\t\t\tAtomShieldsScanner.installReport(f)\n\n\t\tAtomShieldsScanner._executeMassiveMethod(path=AtomShieldsScanner.CHECKERS_DIR, method=\"install\", args={})\n\n\n\t\tconfig_dir = os.path.dirname(AtomShieldsScanner.CONFIG_PATH)\n\t\tif not os.path.isdir(config_dir):\n\t\t\tos.makedirs(config_dir)"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\nadding a plugin's config section to the ConfigParser.", "response": "def _addConfig(instance, config, parent_section):\n\t\t\"\"\"\n\t\tWrites a section for a plugin.\n\n\t\tArgs:\n\t\t\tinstance (object): Class instance for plugin\n\t\t\tconfig (object): Object (ConfigParser) which holds the current config\n\t\t\tparent_section (str): Parent section for plugin. Usually 'checkers' or 'reports'\n\t\t\"\"\"\n\t\ttry:\n\t\t\tsection_name = \"{p}/{n}\".format(p = parent_section, n=instance.NAME.lower())\n\t\t\tconfig.add_section(section_name)\n\t\t\tfor k in instance.CONFIG.keys():\n\t\t\t\tconfig.set(section_name, k, instance.CONFIG[k])\n\t\texcept Exception as e:\n\t\t\tprint(\"[!] %s\" % e)"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef getConfig(self, section = None):\n\t\tdata = {}\n\t\tif section is None:\n\t\t\tfor s in self.config.sections():\n\t\t\t\tif '/' in s:\n\t\t\t\t\t# Subsection: create the parent entry if it does not exist yet\n\t\t\t\t\tparent, _s = s.split('/')\n\t\t\t\t\tdata.setdefault(parent, {})[_s] = dict(self.config.items(s))\n\t\t\t\telse:\n\t\t\t\t\tdata[s] = dict(self.config.items(s))\n\t\telse:\n\t\t\t# Only one section will be returned\n\t\t\tdata = dict(self.config.items(section))\n\t\treturn data", "response": "Returns a dictionary which contains the current config, either for all sections or only for the given section."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef _getClassInstance(path, args=None):\n\t\tif not path.endswith(\".py\"):\n\t\t\treturn None\n\n\t\tif args is None:\n\t\t\targs = {}\n\n\t\tclassname = AtomShieldsScanner._getClassName(path)\n\t\tbasename = os.path.basename(path).replace(\".py\", \"\")\n\t\tsys.path.append(os.path.dirname(path))\n\t\ttry:\n\t\t\tmod = __import__(basename, globals(), locals(), [classname], -1)\n\t\t\tclass_ = getattr(mod, classname)\n\t\t\tinstance = class_(**args)\n\t\texcept Exception as e:\n\t\t\tAtomShieldsScanner._debug(\"[!] %s\" % e)\n\t\t\treturn None\n\t\tfinally:\n\t\t\tsys.path.remove(os.path.dirname(path))\n\t\treturn instance", "response": "Returns a class instance from a .py file."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef _executeMassiveMethod(path, method, args=None, classArgs = None):\n\t\tresponse = {}\n\n\t\tif args is None:\n\t\t\targs = {}\n\n\t\tif classArgs is None:\n\t\t\tclassArgs = {}\n\n\t\tsys.path.append(path)\n\t\texclude = [\"__init__.py\", \"base.py\"]\n\t\tfor f in AtomShieldsScanner._getFiles(path, \"*.py\", exclude=exclude):\n\t\t\ttry:\n\t\t\t\tinstance = AtomShieldsScanner._getClassInstance(path = f, args = classArgs)\n\t\t\t\tif instance is not None:\n\t\t\t\t\tif callable(method):\n\t\t\t\t\t\targs[\"instance\"] = instance\n\t\t\t\t\t\toutput = method(**args)\n\t\t\t\t\t\tresponse[instance.__class__.NAME] = output\n\t\t\t\t\telse:\n\t\t\t\t\t\tif hasattr(instance, method):\n\t\t\t\t\t\t\toutput = getattr(instance, method)(**args)\n\t\t\t\t\t\t\tresponse[instance.__class__.NAME] = output\n\t\t\t\t\t\telse:\n\t\t\t\t\t\t\tcontinue\n\n\t\t\texcept Exception as e:\n\t\t\t\tAtomShieldsScanner._debug(\"[!] %s\" % e)\n\t\tsys.path.remove(path)\n\t\treturn response", "response": "Execute a method on each class instance located in path."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef run(self):\n\n\t\tself.checkProperties()\n\n\t\tself.debug(\"[*] Iniciando escaneo de AtomShields con las siguientes propiedades. . . \")\n\n\t\tself.showScanProperties()\n\n\t\tself.loadConfig()\n\n\t\t# Init time counter\n\t\tinit_ts = datetime.now()\n\n\t\t# Execute plugins\n\t\tcwd = os.getcwd()\n\t\tos.chdir(self.path)\n\t\tissues = self.executeCheckers()\n\t\tos.chdir(cwd)\n\n\n\n\n\t\t# Finish time counter\n\t\tend_ts = datetime.now()\n\t\tduration = '{}'.format(end_ts - init_ts)\n\n\t\t# Process and set issues\n\t\tfor plugin in issues.keys():\n\t\t\tvalue = issues[plugin]\n\t\t\tif isinstance(value, list):\n\t\t\t\tmap(self.saveIssue, value)\n\t\t\telse:\n\t\t\t\tself.saveIssue(value)\n\n\n\n\t\t# Execute reports\n\t\tprint \"\"\n\t\tself.executeReports()\n\n\n\t\t# Print summary output.\n\t\tself.debug(\"\")\n\t\tself.debug(\"Duration: {t}\".format(t=duration))\n\t\tself.showSummary()\n\n\t\treturn self.issues", "response": "Execute all checks and reports and set the issues to the current state."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef install():\n\t\tcmd = CommandHelper()\n\t\tcmd.install(\"npm\")\n\n\t\tcmd = CommandHelper()\n\t\tcmd.install(\"nodejs-legacy\")\n\n\t\t# Install retire with npm\n\t\tcmd = CommandHelper()\n\t\tcmd.command = \"npm install -g retire\"\n\t\tcmd.execute()\n\n\t\tif cmd.errors:\n\t\t\tfrom termcolor import colored\n\t\t\tprint(colored(cmd.errors, \"red\"))\n\t\telse:\n\t\t\tprint(cmd.output)", "response": "Installs the dependencies: npm, nodejs-legacy, and the retire package (installed globally with npm)."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\nsetting the potential property of the object.", "response": "def potential(self, value):\n\t\t\"\"\"\n\t\tSetter for 'potential' property\n\n\t\tArgs:\n\t\t\tvalue (bool): True if a potential is required. False else\n\n\t\t\"\"\"\n\t\tif value:\n\t\t\tself._potential = True\n\t\telse:\n\t\t\tself._potential = False"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef get(name, default=None, allow_default=True):\n    return Config().get(name, default, allow_default=allow_default)", "response": "Get a value from the configuration file."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef env(key, default):\n    value = os.environ.get(key, None)\n    if value is not None:\n        log.info('    %s = %r', key.lower().replace('_', '.'), value)\n        return value\n\n    key = key.lower().replace('_', '.')\n    value = get(key)\n    if value is not None:\n        return value\n\n    return default", "response": "Helper to try to get a setting from the environment or pyconfig or a provided default."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\ntrying to get key from environment variables.", "response": "def env_key(key, default):\n    \"\"\"\n    Try to get `key` from the environment.\n\n    This mutates `key` to replace dots with underscores and makes it all\n    uppercase.\n\n        my.database.host => MY_DATABASE_HOST\n\n    \"\"\"\n    env = key.upper().replace('.', '_')\n    return os.environ.get(env, default)"}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nchange a setting value.", "response": "def set(self, name, value):\n        \"\"\" Changes a setting value.\n\n            This implements a locking mechanism to ensure some level of thread\n            safety.\n\n            :param str name: Setting key name.\n            :param value: Setting value.\n\n        \"\"\"\n        if not self.settings.get('pyconfig.case_sensitive', False):\n            name = name.lower()\n        log.info(\"    %s = %s\", name, repr(value))\n\n        # Acquire our lock to change the config\n        with self.mut_lock:\n            self.settings[name] = value"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef _update(self, conf_dict, base_name=None):\n        for name in conf_dict:\n            # Skip private names\n            if name.startswith('_'):\n                continue\n            value = conf_dict[name]\n            # Skip Namespace if it's imported\n            if value is Namespace:\n                continue\n            # Use a base namespace\n            if base_name:\n                name = base_name + '.' + name\n            if isinstance(value, Namespace):\n                for name, value in value.iteritems(name):\n                    self.set(name, value)\n            # Automatically call any functions in the settings module, and if\n            # they return a value other than None, that value becomes a setting\n            elif callable(value):\n                value = value()\n                if value is not None:\n                    self.set(name, value)\n            else:\n                self.set(name, value)", "response": "Updates the current configuration with the values in conf_dict."}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nloads all the config plugins and localconfig modules and sets the current configuration to the current values.", "response": "def load(self, clear=False):\n        \"\"\"\n        Loads all the config plugin modules to build a working configuration.\n\n        If there is a ``localconfig`` module on the python path, it will be\n        loaded last, overriding other settings.\n\n        :param bool clear: Clear out the previous settings before loading\n\n        \"\"\"\n        if clear:\n            self.settings = {}\n\n        defer = []\n\n        # Load all config plugins\n        for conf in pkg_resources.iter_entry_points('pyconfig'):\n            if conf.attrs:\n                raise RuntimeError(\"config must be a module\")\n\n            mod_name = conf.module_name\n            base_name = conf.name if conf.name != 'any' else None\n\n            log.info(\"Loading module '%s'\", mod_name)\n            mod_dict = runpy.run_module(mod_name)\n\n            # If this module wants to be deferred, save it for later\n            if mod_dict.get('deferred', None) is deferred:\n                log.info(\"Deferring module '%s'\", mod_name)\n                mod_dict.pop('deferred')\n                defer.append((mod_name, base_name, mod_dict))\n                continue\n\n            self._update(mod_dict, base_name)\n\n        # Load deferred modules\n        for mod_name, base_name, mod_dict in defer:\n            log.info(\"Loading deferred module '%s'\", mod_name)\n            self._update(mod_dict, base_name)\n\n        if etcd().configured:\n            # Load etcd stuff\n            mod_dict = etcd().load()\n            if mod_dict:\n                self._update(mod_dict)\n\n        # Allow localconfig overrides\n        mod_dict = None\n        try:\n            mod_dict = runpy.run_module('localconfig')\n        except ImportError:\n            pass\n        except 
ValueError as err:\n            if getattr(err, 'message') != '__package__ set to non-string':\n                raise\n\n            # This is a bad work-around to make this work transparently...\n            # shouldn't really access core stuff like this, but Fuck It[tm]\n            mod_name = 'localconfig'\n            if sys.version_info < (2, 7):\n                loader, code, fname = runpy._get_module_details(mod_name)\n            else:\n                _, loader, code, fname = runpy._get_module_details(mod_name)\n            mod_dict = runpy._run_code(code, {}, {}, mod_name, fname, loader,\n                    pkg_name=None)\n\n        if mod_dict:\n            log.info(\"Loading module 'localconfig'\")\n            self._update(mod_dict)\n\n        self.call_reload_hooks()"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef get(self, name, default, allow_default=True):\n        if not self.settings.get('pyconfig.case_sensitive', False):\n            name = name.lower()\n        if name not in self.settings:\n            if not allow_default:\n                raise LookupError('No setting \"{name}\"'.format(name=name))\n            self.settings[name] = default\n        return self.settings[name]", "response": "Get a value from the settings dictionary."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef init(self, hosts=None, cacert=None, client_cert=None, client_key=None):\n        # Try to get the etcd module\n        try:\n            import etcd\n            self.module = etcd\n        except ImportError:\n            pass\n\n        if not self.module:\n            return\n\n        self._parse_jetconfig()\n\n        # Check env for overriding configuration or pyconfig setting\n        hosts = env('PYCONFIG_ETCD_HOSTS', hosts)\n        protocol = env('PYCONFIG_ETCD_PROTOCOL', None)\n        cacert = env('PYCONFIG_ETCD_CACERT', cacert)\n        client_cert = env('PYCONFIG_ETCD_CERT', client_cert)\n        client_key = env('PYCONFIG_ETCD_KEY', client_key)\n\n        # Parse auth string if there is one\n        username = None\n        password = None\n        auth = env('PYCONFIG_ETCD_AUTH', None)\n        if auth:\n            auth = auth.split(':')\n            auth.append('')\n            username = auth[0]\n            password = auth[1]\n\n        # Create new etcd instance\n        hosts = self._parse_hosts(hosts)\n        if hosts is None:\n            return\n\n        kw = {}\n        # Need this when passing a list of hosts to python-etcd, which we\n        # always do, even if it's a list of one\n        kw['allow_reconnect'] = True\n\n        # Grab optional protocol argument\n        if protocol:\n            kw['protocol'] = protocol\n\n        # Add auth to constructor if we got it\n        if username:\n            kw['username'] = username\n        if password:\n            kw['password'] = password\n\n        # Assign the SSL args if we have 'em\n        if cacert:\n            kw['ca_cert'] = os.path.abspath(cacert)\n        if client_cert and client_key:\n            kw['cert'] = ((os.path.abspath(client_cert),\n                os.path.abspath(client_key)))\n        elif client_cert:\n            kw['cert'] = 
os.path.abspath(client_cert)\n        if cacert or client_cert or client_key:\n            kw['protocol'] = 'https'\n\n        self.client = self.module.Client(hosts, **kw)", "response": "Initialize etcd client instance and other business."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nload the settings from etcd and return a dictionary of settings.", "response": "def load(self, prefix=None, depth=None):\n        \"\"\"\n        Return a dictionary of settings loaded from etcd.\n\n        \"\"\"\n        prefix = prefix or self.prefix\n        prefix = '/' + prefix.strip('/') + '/'\n        if depth is None:\n            depth = self.inherit_depth\n\n        if not self.configured:\n            log.debug(\"etcd not available\")\n            return\n\n        if self.watching:\n            log.info(\"Starting watcher for %r\", prefix)\n            self.start_watching()\n\n        log.info(\"Loading from etcd %r\", prefix)\n        try:\n            result = self.client.get(prefix)\n        except self.module.EtcdKeyNotFound:\n            result = None\n        if not result:\n            log.info(\"No configuration found\")\n            return {}\n\n        # Iterate over the returned keys from etcd\n        update = {}\n        for item in result.children:\n            key = item.key\n            value = item.value\n            # Try to parse them as JSON strings, just in case it works\n            try:\n                value = pytool.json.from_json(value)\n            except:\n                pass\n\n            # Make the key lower-case if we're not case-sensitive\n            if not self.case_sensitive:\n                key = key.lower()\n\n            # Strip off the prefix that we're using\n            if key.startswith(prefix):\n                key = key[len(prefix):]\n\n            # Store the key/value to update the config\n            update[key] = value\n\n        # Access cached settings directly to avoid recursion\n        inherited = Config().settings.get(self.inherit_key,\n                update.get(self.inherit_key, None))\n        if depth > 0 and inherited:\n            log.info(\"    ... 
inheriting ...\")\n            inherited = self.load(inherited, depth - 1) or {}\n            inherited.update(update)\n            update = inherited\n\n        return update"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\nreturn a generator which yields events as they happen.", "response": "def get_watcher(self):\n        \"\"\"\n        Return an etcd watching generator which yields events as they happen.\n\n        \"\"\"\n        if not self.watching:\n            raise StopIteration()\n        return self.client.eternal_watch(self.prefix, recursive=True)"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef start_watching(self):\n        # Don't create a new watcher thread if we already have one running\n        if self.watcher and self.watcher.is_alive():\n            return\n\n        # Create a new watcher thread and start it\n        self.watcher = Watcher()\n        self.watcher.start()", "response": "Starts watching etcd for changes."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef _parse_hosts(self, hosts):\n        # Default host\n        if hosts is None:\n            return\n\n        # If it's a string, we allow comma separated strings\n        if isinstance(hosts, six.string_types):\n            # Split comma-separated list\n            hosts = [host.strip() for host in hosts.split(',')]\n            # Split host and port\n            hosts = [host.split(':') for host in hosts]\n            # Coerce ports to int\n            hosts = [(host[0], int(host[1])) for host in hosts]\n\n        # The python-etcd client explicitly checks for a tuple type\n        return tuple(hosts)", "response": "Parse hosts into a tuple of tuples."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\nparse the JETCONFIG_ETCD environment variable and set the corresponding PYCONFIG_ETCD_* environment variables from it.", "response": "def _parse_jetconfig(self):\n        \"\"\"\n        Undocumented cross-compatibility functionality with jetconfig\n        (https://github.com/shakefu/jetconfig) that is very sloppy.\n\n        \"\"\"\n        conf = env('JETCONFIG_ETCD', None)\n\n        if not conf:\n            return\n\n        import urlparse\n\n        auth = None\n        port = None\n        conf = conf.split(',').pop()\n        entry = urlparse.urlparse(conf)\n        scheme = entry.scheme\n        host = entry.netloc or entry.path # Path is where it goes if there's no\n                                          # scheme on the URL\n\n        if '@' in host:\n            auth, host = host.split('@')\n\n        if ':' in host:\n            host, port = host.split(':')\n\n        if not port and scheme == 'https':\n            port = '443'\n\n        if scheme:\n            os.environ['PYCONFIG_ETCD_PROTOCOL'] = scheme\n\n        if auth:\n            os.environ['PYCONFIG_ETCD_AUTH'] = auth\n\n        if port:\n            host = host + \":\" + port\n\n        os.environ['PYCONFIG_ETCD_HOSTS'] = host"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef main():\n    parser = argparse.ArgumentParser(description=\"Helper for working with \"\n            \"pyconfigs\")\n    target_group = parser.add_mutually_exclusive_group()\n    target_group.add_argument('-f', '--filename',\n            help=\"parse an individual file or directory\",\n            metavar='F')\n    target_group.add_argument('-m', '--module',\n            help=\"parse a package or module, recursively looking inside it\",\n            metavar='M')\n    parser.add_argument('-v', '--view-call',\n            help=\"show the actual pyconfig call made (default: show namespace)\",\n            action='store_true')\n    parser.add_argument('-l', '--load-configs',\n            help=\"query the currently set value for each key found\",\n            action='store_true')\n    key_group = parser.add_mutually_exclusive_group()\n    key_group.add_argument('-a', '--all',\n            help=\"show keys which don't have defaults set\",\n            action='store_true')\n    key_group.add_argument('-k', '--only-keys',\n            help=\"show a list of discovered keys without values\",\n            action='store_true')\n    parser.add_argument('-n', '--natural-sort',\n            help=\"sort by filename and line (default: alphabetical by key)\",\n            action='store_true')\n    parser.add_argument('-s', '--source',\n            help=\"show source annotations (implies --natural-sort)\",\n            action='store_true')\n    parser.add_argument('-c', '--color',\n            help=\"toggle output colors (default: %s)\" % bool(pygments),\n            action='store_const', default=bool(pygments),\n            const=(not bool(pygments)))\n    args = parser.parse_args()\n\n    if args.color and not pygments:\n        _error(\"Pygments is required for color output.\\n\"\n                \"    pip install pygments\")\n\n    if args.module:\n        
_handle_module(args)\n\n    if args.filename:\n        _handle_file(args)", "response": "Main function for the pyconfig command."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef _handle_module(args):\n    module = _get_module_filename(args.module)\n    if not module:\n        _error(\"Could not load module or package: %r\", args.module)\n    elif isinstance(module, Unparseable):\n        _error(\"Could not determine module source: %r\", args.module)\n\n    _parse_and_output(module, args)", "response": "Handles the -m argument."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef _error(msg, *args):\n    print(msg % args, file=sys.stderr)\n    sys.exit(1)", "response": "Print an error message and exit."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef _get_module_filename(module):\n    # Split up the module and its containing package, if it has one\n    module = module.split('.')\n    package = '.'.join(module[:-1])\n    module = module[-1]\n\n    try:\n        if not package:\n            # We aren't accessing a module within a package, but rather a top\n            # level package, so it's a straight up import\n            module = __import__(module)\n        else:\n            # Import the package containing our desired module\n            package = __import__(package, fromlist=[module])\n            # Get the module from that package\n            module = getattr(package, module, None)\n\n        filename = getattr(module, '__file__', None)\n        if not filename:\n            # No filename? Nothing to do here\n            return Unparseable()\n\n        # If we get a .pyc, strip the c to get .py so we can parse the source\n        if filename.endswith('.pyc'):\n            filename = filename[:-1]\n            if not (os.path.exists(filename) and os.path.isfile(filename)):\n                # If there's only a .pyc and no .py it's a compiled package or\n                # egg and we can't get at the source for parsing\n                return Unparseable()\n        # If we have a package, we want the directory not the init file\n        if filename.endswith('__init__.py'):\n            filename = filename[:-11]\n\n        # Yey, we found it\n        return filename\n    except ImportError:\n        # Definitely not a valid module or package\n        return", "response": "Get the filename of a module if it can be imported."}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\nparses filename appropriately and then output the calls according to the args.", "response": "def _parse_and_output(filename, args):\n    \"\"\"\n    Parse `filename` appropriately and then output calls according to the\n    `args` specified.\n\n    :param filename: A file or directory\n    :param args: Command arguments\n    :type filename: str\n\n    \"\"\"\n    relpath = os.path.dirname(filename)\n    if os.path.isfile(filename):\n        calls = _parse_file(filename, relpath)\n    elif os.path.isdir(filename):\n        calls = _parse_dir(filename, relpath)\n    else:\n        # XXX(shakefu): This is an error of some sort, maybe symlinks?\n        # Probably need some thorough testing\n        _error(\"Could not determine file type: %r\", filename)\n\n    if not calls:\n        # XXX(shakefu): Probably want to change this to not be an error and\n        # just be a normal fail (e.g. command runs, no output).\n        _error(\"No pyconfig calls.\")\n\n    if args.load_configs:\n        # We want to iterate over the configs and add any keys which haven't\n        # already been found\n        keys = set()\n        for call in calls:\n            keys.add(call.key)\n\n        # Iterate the loaded keys and make _PyconfigCall instances\n        conf = pyconfig.Config()\n        for key, value in conf.settings.items():\n            if key in keys:\n                continue\n            calls.append(_PyconfigCall('set', key, value, [None]*4))\n\n    _output(calls, args)"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\noutput the list of _PyconfigCall instances in order to be able to display them.", "response": "def _output(calls, args):\n    \"\"\"\n    Outputs `calls`.\n\n    :param calls: List of :class:`_PyconfigCall` instances\n    :param args: :class:`~argparse.ArgumentParser` instance\n    :type calls: list\n    :type args: argparse.ArgumentParser\n\n    \"\"\"\n    # Sort the keys appropriately\n    if args.natural_sort or args.source:\n        calls = sorted(calls, key=lambda c: (c.filename, c.lineno))\n    else:\n        calls = sorted(calls, key=lambda c: c.key)\n\n    out = []\n\n    # Handle displaying only the list of keys\n    if args.only_keys:\n        keys = set()\n        for call in calls:\n            if call.key in keys:\n                continue\n            out.append(_format_call(call, args))\n            keys.add(call.key)\n\n        out = '\\n'.join(out)\n        if args.color:\n            out = _colorize(out)\n        print(out, end=' ')\n\n        # We're done here\n        return\n\n    # Build a list of keys which have default values available, so that we can\n    # toggle between displaying only those keys with defaults and all keys\n    keys = set()\n    for call in calls:\n        if call.default:\n            keys.add(call.key)\n\n    for call in calls:\n        if not args.all and not call.default and call.key in keys:\n            continue\n        out.append(_format_call(call, args))\n\n    out = '\\n'.join(out)\n    if args.color:\n        out = _colorize(out)\n    print(out, end=' ')"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef _format_call(call, args):\n    out = ''\n    if args.source:\n        out += call.annotation() + '\\n'\n\n    if args.only_keys:\n        out += call.get_key()\n        return out\n\n    if args.view_call:\n        out += call.as_call()\n    elif args.load_configs:\n        out += call.as_live()\n    else:\n        out += call.as_namespace()\n\n    return out", "response": "Format a pyconfig call object for the command line."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\nreturning output colorized with Pygments if available.", "response": "def _colorize(output):\n    \"\"\"\n    Return `output` colorized with Pygments, if available.\n\n    \"\"\"\n    if not pygments:\n        return output\n    # Available styles\n    # ['monokai', 'manni', 'rrt', 'perldoc', 'borland', 'colorful', 'default',\n    # 'murphy', 'vs', 'trac', 'tango', 'fruity', 'autumn', 'bw', 'emacs',\n    # 'vim', 'pastie', 'friendly', 'native']\n    return pygments.highlight(output,\n            pygments.lexers.PythonLexer(),\n            pygments.formatters.Terminal256Formatter(style='monokai'))"}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nreturning a list of :class:`_PyconfigCall` from recursively parsing `directory`. :param directory: Directory to walk looking for python files :param relpath: Path to make filenames relative to :type directory: str :type relpath: str", "response": "def _parse_dir(directory, relpath):\n    \"\"\"\n    Return a list of :class:`_PyconfigCall` from recursively parsing\n    `directory`.\n\n    :param directory: Directory to walk looking for python files\n    :param relpath: Path to make filenames relative to\n    :type directory: str\n    :type relpath: str\n\n    \"\"\"\n    relpath = os.path.dirname(relpath)\n    pyconfig_calls = []\n    for root, dirs, files in os.walk(directory):\n        for filename in files:\n            if not filename.endswith('.py'):\n                continue\n            filename = os.path.join(root, filename)\n            pyconfig_calls.extend(_parse_file(filename, relpath))\n\n    return pyconfig_calls"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef _parse_file(filename, relpath=None):\n    with open(filename, 'r') as source:\n        source = source.read()\n\n    pyconfig_calls = []\n    try:\n        nodes = ast.parse(source, filename=filename)\n    except SyntaxError:\n        # XXX(Jake): We might want to handle this differently\n        return []\n\n    # Look for UTF-8 encoding\n    first_lines = source[0:200]\n    match = re.match('^#.*coding[:=].?([a-zA-Z0-9-_]+).*', first_lines)\n    if match:\n        try:\n            coding = match.group(1)\n            source = source.decode(coding)\n        except:\n            print(\"# Error decoding file, may not parse correctly:\", filename)\n\n    try:\n        # Split the source into lines so we can reference it easily\n        source = source.split('\\n')\n    except:\n        print(\"# Error parsing file, ignoring:\", filename);\n        return []\n\n    # Make the filename relative to the given path, if needed\n    if relpath:\n        filename = os.path.relpath(filename, relpath)\n\n    for call in ast.walk(nodes):\n        if not isinstance(call, _ast.Call):\n            # Skip any node that isn't a Call\n            continue\n\n        func = call.func\n        if not isinstance(call.func, _ast.Attribute):\n            # We're looking for calls to pyconfig.*, so the function has to be\n            # an Attribute node, otherwise skip it\n            continue\n\n        if getattr(func.value, 'id', None) != 'pyconfig':\n            # If the Attribute value isn't a Name (doesn't have an `id`) or it\n            # isn't 'pyconfig', then we skip\n            continue\n\n        if func.attr not in ['get', 'set', 'setting']:\n            # If the Attribute attr isn't one of the pyconfig API methods, then\n            # we skip\n            continue\n\n        # Now we parse the call arguments as best we can\n        args = []\n        if call.args:\n            
arg = call.args[0]\n            if isinstance(arg, _ast.Str):\n                args.append(arg.s)\n            else:\n                args.append(_map_arg(arg))\n\n        for arg in call.args[1:]:\n            args.append(_map_arg(arg))\n\n        line = (filename, source[call.lineno-1], call.lineno, call.col_offset)\n        call = _PyconfigCall(func.attr, args[0], args[1:], line)\n        pyconfig_calls.append(call)\n\n    return pyconfig_calls", "response": "Parse a file into a list of _PyconfigCall objects."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\nreturning `arg` appropriately parsed or mapped to a usable value.", "response": "def _map_arg(arg):\n    \"\"\"\n    Return `arg` appropriately parsed or mapped to a usable value.\n\n    \"\"\"\n    # Grab the easy to parse values\n    if isinstance(arg, _ast.Str):\n        return repr(arg.s)\n    elif isinstance(arg, _ast.Num):\n        return arg.n\n    elif isinstance(arg, _ast.Name):\n        name = arg.id\n        if name == 'True':\n            return True\n        elif name == 'False':\n            return False\n        elif name == 'None':\n            return None\n        return name\n    else:\n        # Everything else we don't bother with\n        return Unparseable()"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef as_namespace(self, namespace=None):\n        key = self.key\n        if namespace and key.startswith(namespace):\n            key = key[len(namespace) + 1:]\n\n        return \"%s = %s\" % (self.get_key(), self._default() or NotSet())", "response": "Return this call as if it was being assigned in a pyconfig namespace."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nreturn this call as if it was being assigned in a pyconfig namespace but load the actual value currently available in pyconfig.", "response": "def as_live(self):\n        \"\"\"\n        Return this call as if it were being assigned in a pyconfig namespace,\n        but load the actual value currently available in pyconfig.\n\n        \"\"\"\n        key = self.get_key()\n        default = pyconfig.get(key)\n        if default:\n            default = repr(default)\n        else:\n            default = self._default() or NotSet()\n        return \"%s = %s\" % (key, default)"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef as_call(self):\n        default = self._default()\n        default = ', ' + default if default else ''\n        return \"pyconfig.%s(%r%s)\" % (self.method, self.get_key(), default)", "response": "Return this call as it is called in its source."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef get_key(self):\n        if not isinstance(self.key, Unparseable):\n            return self.key\n\n        line = self.source[self.col_offset:]\n        regex = re.compile('''pyconfig\\.[eginst]+\\(([^,]+).*?\\)''')\n        match = regex.match(line)\n        if not match:\n            return Unparseable()\n\n        return \"<%s>\" % match.group(1)", "response": "Return the call key even if it has to be parsed from the source."}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nreturning only the default value if there is one.", "response": "def _default_value_only(self):\n        \"\"\"\n        Return only the default value, if there is one.\n\n        \"\"\"\n        line = self.source[self.col_offset:]\n        regex = re.compile('''pyconfig\\.[eginst]+\\(['\"][^)]+?['\"], ?(.*?)\\)''')\n        match = regex.match(line)\n        if not match:\n            return ''\n\n        return match.group(1)"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef _default(self):\n        try:\n            # Check if it's iterable\n            iter(self.default)\n        except TypeError:\n            return repr(self.default)\n\n        # This is to look for unparsable values, and if we find one, we try to\n        # directly parse the string\n        for v in self.default:\n            if isinstance(v, Unparseable):\n                default = self._default_value_only()\n                if default:\n                    return default\n        # Otherwise just make it a string and go\n        return ', '.join(str(v) for v in self.default)", "response": "Return the default argument formatted nicely."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef _get_param_names(self):\n        template = Template(self.yaml_string)\n        names = ['yaml_string']  # always include the template\n        for match in re.finditer(template.pattern, template.template):\n            name = match.group('named') or match.group('braced')\n            assert name is not None\n            names.append(name)\n        return names", "response": "Get mappable parameters from YAML."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef _get_dataset(self, X, y=None):\n        from pylearn2.datasets import DenseDesignMatrix\n\n        X = np.asarray(X)\n        assert X.ndim > 1\n        if y is not None:\n            y = self._get_labels(y)\n        if X.ndim == 2:\n            return DenseDesignMatrix(X=X, y=y)\n        return DenseDesignMatrix(topo_view=X, y=y)", "response": "Construct a pylearn2 dataset."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef _get_labels(self, y):\n        y = np.asarray(y)\n        if y.ndim == 1:\n            return y.reshape((y.size, 1))\n        assert y.ndim == 2\n        return y", "response": "Construct pylearn2 dataset labels."}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nfit a new training set and runs main loop.", "response": "def fit(self, X, y=None):\n        \"\"\"\n        Build a trainer and run main_loop.\n\n        Parameters\n        ----------\n        X : array_like\n            Training examples.\n        y : array_like, optional\n            Labels.\n        \"\"\"\n        from pylearn2.config import yaml_parse\n        from pylearn2.train import Train\n\n        # build trainer\n        params = self.get_params()\n        yaml_string = Template(self.yaml_string).substitute(params)\n        self.trainer = yaml_parse.load(yaml_string)\n        assert isinstance(self.trainer, Train)\n        if self.trainer.dataset is not None:\n            raise ValueError('Train YAML database must evaluate to None.')\n        self.trainer.dataset = self._get_dataset(X, y)\n\n        # update monitoring dataset(s)\n        if (hasattr(self.trainer.algorithm, 'monitoring_dataset') and\n                self.trainer.algorithm.monitoring_dataset is not None):\n            monitoring_dataset = self.trainer.algorithm.monitoring_dataset\n            if len(monitoring_dataset) == 1 and '' in monitoring_dataset:\n                monitoring_dataset[''] = self.trainer.dataset\n            else:\n                monitoring_dataset['train'] = self.trainer.dataset\n            self.trainer.algorithm._set_monitoring_dataset(monitoring_dataset)\n        else:\n            self.trainer.algorithm._set_monitoring_dataset(\n                self.trainer.dataset)\n\n        # run main loop\n        self.trainer.main_loop()"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\ngetting model predictions for a test dataset.", "response": "def _predict(self, X, method='fprop'):\n        \"\"\"\n        Get model predictions.\n\n        See pylearn2.scripts.mlp.predict_csv and\n        http://fastml.com/how-to-get-predictions-from-pylearn2/.\n\n        Parameters\n        ----------\n        X : array_like\n            Test dataset.\n        method : str\n            Model method to call for prediction.\n        \"\"\"\n        import theano\n\n        X_sym = self.trainer.model.get_input_space().make_theano_batch()\n        y_sym = getattr(self.trainer.model, method)(X_sym)\n        f = theano.function([X_sym], y_sym, allow_input_downcast=True)\n        return f(X)"}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\nconstructs pylearn2 dataset labels.", "response": "def _get_labels(self, y):\n        \"\"\"\n        Construct pylearn2 dataset labels.\n\n        Parameters\n        ----------\n        y : array_like, optional\n            Labels.\n        \"\"\"\n        y = np.asarray(y)\n        assert y.ndim == 1\n        # convert to one-hot\n        labels = np.unique(y).tolist()\n        oh = np.zeros((y.size, len(labels)), dtype=float)\n        for i, label in enumerate(y):\n            oh[i, labels.index(label)] = 1.\n        return oh"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef load(self):\n        from pylearn2.config import yaml_parse\n        from pylearn2.datasets import Dataset\n\n        dataset = yaml_parse.load(self.yaml_string)\n        assert isinstance(dataset, Dataset)\n        data = dataset.iterator(mode='sequential', num_batches=1,\n                                data_specs=dataset.data_specs,\n                                return_tuple=True).next()\n        if len(data) == 2:\n            X, y = data\n            y = np.squeeze(y)\n            if self.one_hot:\n                y = np.argmax(y, axis=1)\n        else:\n            X = data\n            y = None\n        return X, y", "response": "Load the pylearn2 dataset described by the YAML string and return it as (X, y)."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nfitting the model with random restarts.", "response": "def fit(self):\n        \"\"\"\n        Fits the model with random restarts.\n        :return:\n        \"\"\"\n        self.model.optimize_restarts(num_restarts=self.num_restarts, verbose=False)"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef _create_kernel(self):\n        # Check kernels\n        kernels = self.kernel_params\n        if not isinstance(kernels, list):\n            raise RuntimeError('Must provide enumeration of kernels')\n        for kernel in kernels:\n            if sorted(list(kernel.keys())) != ['name', 'options', 'params']:\n                raise RuntimeError(\n                    'strategy/params/kernels must contain keys: \"name\", \"options\", \"params\"')\n\n        # Turn into entry points.\n        # TODO use eval to allow user to specify internal variables for kernels (e.g. V) in config file.\n        kernels = []\n        for kern in self.kernel_params:\n            params = kern['params']\n            options = kern['options']\n            name = kern['name']\n            kernel_ep = load_entry_point(name, 'strategy/params/kernels')\n            if issubclass(kernel_ep, KERNEL_BASE_CLASS):\n                if options['independent']:\n                    # TODO Catch errors here?  Estimator entry points don't catch instantiation errors\n                    kernel = np.sum([kernel_ep(1, active_dims=[i], **params) for i in range(self.n_dims)])\n                else:\n                    kernel = kernel_ep(self.n_dims, **params)\n            if not isinstance(kernel, KERNEL_BASE_CLASS):\n                raise RuntimeError('strategy/params/kernel must load a'\n                                   'GPy derived Kernel')\n            kernels.append(kernel)\n\n        self.kernel = np.sum(kernels)", "response": "Create an additive kernel by summing the kernels configured in kernel_params."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef fit_and_score_estimator(estimator, parameters, cv, X, y=None, scoring=None,\n                            iid=True, n_jobs=1, verbose=1,\n                            pre_dispatch='2*n_jobs'):\n    \"\"\"Fit and score an estimator with cross-validation\n\n    This function is basically a copy of sklearn's\n    model_selection._BaseSearchCV._fit(), which is the core of the GridSearchCV\n    fit() method. Unfortunately, that class does _not_ return the training\n    set scores, which we want to save in the database, and because of the\n    way it's written, you can't change it by subclassing or monkeypatching.\n\n    This function uses some undocumented internal sklearn APIs (non-public).\n    It was written against sklearn version 0.16.1. Prior Versions are likely\n    to fail due to changes in the design of cross_validation module.\n\n    Returns\n    -------\n    out : dict, with keys 'mean_test_score' 'test_scores', 'train_scores'\n        The scores on the training and test sets, as well as the mean test set\n        score.\n    \"\"\"\n\n    scorer = check_scoring(estimator, scoring=scoring)\n    n_samples = num_samples(X)\n    X, y = check_arrays(X, y, allow_lists=True, sparse_format='csr',\n                        allow_nans=True)\n    if y is not None:\n        if len(y) != n_samples:\n            raise ValueError('Target variable (y) has a different number '\n                             'of samples (%i) than data (X: %i samples)'\n                             % (len(y), n_samples))\n    cv = check_cv(cv=cv, y=y, classifier=is_classifier(estimator))\n\n    out = Parallel(\n        n_jobs=n_jobs, verbose=verbose, pre_dispatch=pre_dispatch\n    )(\n        delayed(_fit_and_score)(clone(estimator), X, y, scorer,\n                                train, test, verbose, parameters,\n                                fit_params=None)\n        for train, test in cv.split(X, 
y))\n\n    assert len(out) == cv.n_splits\n\n    train_scores, test_scores = [], []\n    n_train_samples, n_test_samples = [], []\n    for test_score, n_test, train_score, n_train, _ in out:\n        train_scores.append(train_score)\n        test_scores.append(test_score)\n        n_test_samples.append(n_test)\n        n_train_samples.append(n_train)\n\n    train_scores, test_scores = map(list, check_arrays(train_scores,\n                                                       test_scores,\n                                                       warn_nans=True,\n                                                       replace_nans=True))\n\n    if iid:\n        if verbose > 0 and is_msmbuilder_estimator(estimator):\n            print('[CV] Using MSMBuilder API n_samples averaging')\n            print('[CV]   n_train_samples: %s' % str(n_train_samples))\n            print('[CV]   n_test_samples: %s' % str(n_test_samples))\n        mean_test_score = np.average(test_scores, weights=n_test_samples)\n        mean_train_score = np.average(train_scores, weights=n_train_samples)\n    else:\n        mean_test_score = np.average(test_scores)\n        mean_train_score = np.average(train_scores)\n\n    grid_scores = {\n        'mean_test_score': mean_test_score, 'test_scores': test_scores,\n        'mean_train_score': mean_train_score, 'train_scores': train_scores,\n        'n_test_samples': n_test_samples, 'n_train_samples': n_train_samples}\n    return grid_scores", "response": "Fit and score an estimator with cross-validation."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef init_subclass_by_name(baseclass, short_name, params):\n    sc = baseclass.__subclasses__()\n    for kls in sc:\n        if kls.short_name == short_name or \\\n           (_is_collection(kls.short_name) and short_name in kls.short_name):\n            try:\n                return kls(**params)\n            except TypeError as e:\n                spec = inspect.getargspec(kls.__init__)\n\n                # try to give nice errors to the user if the params failed\n                # to bind\n                if 'unexpected' in str(e):\n                    avail = join_quoted(spec.args[1:])\n                    raise RuntimeError(\n                        \"%s's %s. Available params for this subclass are: %s.\"\n                        % (short_name, str(e), avail))\n                elif 'takes exactly' in str(e):\n                    required = join_quoted(spec.args[1:-len(spec.defaults)])\n                    raise RuntimeError(\n                        \"%s's %s. Required params for this subclass are %s.\"\n                        % (short_name, str(e), required))\n                elif 'takes at least' in str(e):\n                    required = join_quoted(spec.args[1:-len(spec.defaults)])\n                    optional = join_quoted(spec.args[-len(spec.defaults):])\n                    raise RuntimeError(\n                        \"%s's %s. Required params for this subclass are: %s. \"\n                        \"Optional params are: %s\" % (\n                            short_name, str(e), required, optional))\n                # :(\n                raise\n\n    chain = itertools.chain.from_iterable(\n        e.short_name if _is_collection(e.short_name) else [e.short_name]\n        for e in sc)\n    avail_names = ', '.join(str(n) for n in chain)\n    raise ValueError('\"%s\" is not a recognized subclass. 
available names '\n                     'are: %s' % (short_name, avail_names))", "response": "Find the subclass of baseclass whose short_name class attribute matches short_name, then instantiate and return that subclass with the supplied params."}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef in_directory(path):\n    curdir = os.path.abspath(os.curdir)\n    os.chdir(path)\n    yield\n    os.chdir(curdir)", "response": "Context manager that changes the current directory to the given path."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef format_timedelta(td_object):\n    def get_total_seconds(td):\n        # timedelta.total_seconds not in py2.6\n        return (td.microseconds +\n                (td.seconds + td.days * 24 * 3600) * 1e6) / 1e6\n\n    seconds = int(get_total_seconds(td_object))\n    periods = [('year',    60*60*24*365),\n               ('month',   60*60*24*30),\n               ('day',     60*60*24),\n               ('hour',    60*60),\n               ('minute',  60),\n               ('second',  1)]\n\n    strings = []\n    for period_name, period_seconds in periods:\n        if seconds > period_seconds:\n            period_value, seconds = divmod(seconds, period_seconds)\n            if period_value == 1:\n                strings.append(\"%s %s\" % (period_value, period_name))\n            else:\n                strings.append(\"%s %ss\" % (period_value, period_name))\n\n    return \", \".join(strings)", "response": "Format a timedelta object as a human-readable string for display to users."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that\nworks like assert_all_finite, but only for ndarray.", "response": "def _assert_all_finite(X):\n    \"\"\"Like assert_all_finite, but only for ndarray.\"\"\"\n    X = np.asanyarray(X)\n    # First try an O(n) time, O(1) space solution for the common case that\n    # everything is finite; fall back to O(n) space np.isfinite to prevent\n    # false positives from overflow in sum method\n    if (X.dtype.char in np.typecodes['AllFloat'] and\n            not np.isfinite(X.sum()) and not np.isfinite(X).all()):\n        raise ValueError(\"Input contains NaN, infinity\"\n                         \" or a value too large for %r.\" % X.dtype)"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef _warn_if_not_finite(X):\n    X = np.asanyarray(X)\n    # First try an O(n) time, O(1) space solution for the common case that\n    # everything is finite; fall back to O(n) space np.isfinite to prevent\n    # false positives from overflow in sum method\n    if (X.dtype.char in np.typecodes['AllFloat'] and\n            not np.isfinite(X.sum()) and not np.isfinite(X).all()):\n        warnings.warn(\"Result contains NaN, infinity\"\n                      \" or a value too large for %r.\" % X.dtype,\n                      category=UserWarning)", "response": "Warn if array contains non-finite elements."}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef num_samples(x, is_nested=False):\n    if hasattr(x, 'fit'):\n        # Don't get num_samples from an ensembles length!\n        raise TypeError('Expected sequence or array-like, got '\n                        'estimator %s' % x)\n\n    if is_nested:\n        return sum(num_samples(xx, is_nested=False) for xx in x)\n\n    if not hasattr(x, '__len__') and not hasattr(x, 'shape'):\n        if hasattr(x, '__array__'):\n            x = np.asarray(x)\n        else:\n            raise TypeError(\"Expected sequence or array-like, got %s\" %\n                            type(x))\n    if hasattr(x, 'shape'):\n        if len(x.shape) == 0:\n            raise TypeError(\"Singleton array %r cannot be considered\"\n                            \" a valid collection.\" % x)\n        return x.shape[0]\n    else:\n        return len(x)", "response": "Return the number of samples in array-like x."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef check_arrays(*arrays, **options):\n    sparse_format = options.pop('sparse_format', None)\n    if sparse_format not in (None, 'csr', 'csc', 'dense'):\n        raise ValueError('Unexpected sparse format: %r' % sparse_format)\n    copy = options.pop('copy', False)\n    check_ccontiguous = options.pop('check_ccontiguous', False)\n    dtype = options.pop('dtype', None)\n    warn_nans = options.pop('warn_nans', False)\n    replace_nans = options.pop('replace_nans', False)\n    allow_lists = options.pop('allow_lists', False)\n    allow_nans = options.pop('allow_nans', False)\n    allow_nd = options.pop('allow_nd', False)\n\n    if options:\n        raise TypeError(\"Unexpected keyword arguments: %r\" % options.keys())\n\n    if len(arrays) == 0:\n        return None\n\n    n_samples = num_samples(arrays[0])\n\n    checked_arrays = []\n    for array in arrays:\n        array_orig = array\n        if array is None:\n            # special case: ignore optional y=None kwarg pattern\n            checked_arrays.append(array)\n            continue\n        size = num_samples(array)\n\n        if size != n_samples:\n            raise ValueError(\"Found array with dim %d. Expected %d\"\n                             % (size, n_samples))\n\n        if not allow_lists or hasattr(array, \"shape\"):\n            if sp.issparse(array):\n                if sparse_format == 'csr':\n                    array = array.tocsr()\n                elif sparse_format == 'csc':\n                    array = array.tocsc()\n                elif sparse_format == 'dense':\n                    raise TypeError('A sparse matrix was passed, but dense '\n                                    'data is required. 
Use X.toarray() to '\n                                    'convert to a dense numpy array.')\n                if check_ccontiguous:\n                    array.data = np.ascontiguousarray(array.data, dtype=dtype)\n                elif hasattr(array, 'data'):\n                    array.data = np.asarray(array.data, dtype=dtype)\n                elif array.dtype != dtype:\n                    array = array.astype(dtype)\n                if not allow_nans:\n                    if hasattr(array, 'data'):\n                        _assert_all_finite(array.data)\n                    else:\n                        _assert_all_finite(array.values())\n            else:\n                if check_ccontiguous:\n                    array = np.ascontiguousarray(array, dtype=dtype)\n                else:\n                    array = np.asarray(array, dtype=dtype)\n                if warn_nans:\n                    allow_nans = True\n                    _warn_if_not_finite(array)\n                if replace_nans:\n                    array = np.nan_to_num(array)\n                if not allow_nans:\n                    _assert_all_finite(array)\n\n            if not allow_nd and array.ndim >= 3:\n                raise ValueError(\"Found array with dim %d. Expected <= 2\" %\n                                 array.ndim)\n\n        if copy and array is array_orig:\n            array = array.copy()\n        checked_arrays.append(array)\n\n    return checked_arrays", "response": "Checks whether all arrays in arrays have consistent first dimensions."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef is_repeated_suggestion(params, history):\n        if any(params == hparams and hstatus == 'SUCCEEDED'\n               for hparams, hscore, hstatus in history):\n            return True\n        else:\n            return False", "response": "Return True if params matches an entry in history that has already been evaluated with status 'SUCCEEDED'."}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nsuggest the value of a synthetic object based on the objective function based on the objective function evaluation history using a tree of Parzen estimators TPE and the hyperopt package.", "response": "def suggest(self, history, searchspace):\n        \"\"\"\n        Suggest params to maximize an objective function based on the\n        function evaluation history using a tree of Parzen estimators (TPE),\n        as implemented in the hyperopt package.\n\n        Use of this function requires that hyperopt be installed.\n        \"\"\"\n        # This function is very odd, because as far as I can tell there's\n        # no real documented API for any of the internals of hyperopt. Its\n        # execution model is that hyperopt calls your objective function\n        # (instead of merely providing you with suggested points, and then\n        # you calling the function yourself), and its very tricky (for me)\n        # to use the internal hyperopt data structures to get these predictions\n        # out directly.\n\n        # so they path we take in this function is to construct a synthetic\n        # hyperopt.Trials database which from the `history`, and then call\n        # hyoperopt.fmin with a dummy objective function that logs the value\n        # used, and then return that value to our client.\n\n        # The form of the hyperopt.Trials database isn't really documented in\n        # the code -- most of this comes from reverse engineering it, by\n        # running fmin() on a simple function and then inspecting the form of\n        # the resulting trials object.\n        if 'hyperopt' not in sys.modules:\n            raise ImportError('No module named hyperopt')\n\n        random = check_random_state(self.seed)\n        hp_searchspace = searchspace.to_hyperopt()\n\n        trials = Trials()\n        for i, (params, scores, status) in enumerate(history):\n            if status == 
'SUCCEEDED':\n                # we're doing maximization, hyperopt.fmin() does minimization,\n                # so we need to swap the sign\n                result = {'loss': -np.mean(scores), 'status': STATUS_OK}\n            elif status == 'PENDING':\n                result = {'status': STATUS_RUNNING}\n            elif status == 'FAILED':\n                result = {'status': STATUS_FAIL}\n            else:\n                raise RuntimeError('unrecognized status: %s' % status)\n\n            # the vals key in the trials dict is basically just the params\n            # dict, but enum variables (hyperopt hp.choice() nodes) are\n            # different, because the index of the parameter is specified\n            # in vals, not the parameter itself.\n\n            vals = {}\n            for var in searchspace:\n                if isinstance(var, EnumVariable):\n                    # get the index in the choices of the parameter, and use\n                    # that.\n                    matches = [\n                        i for i, c in enumerate(var.choices)\n                        if c == params[var.name]\n                    ]\n                    assert len(matches) == 1\n                    vals[var.name] = matches\n                else:\n                    # the other big difference is that all of the param values\n                    # are wrapped in length-1 lists.\n                    vals[var.name] = [params[var.name]]\n\n            trials.insert_trial_doc({\n                'misc': {\n                    'cmd': ('domain_attachment', 'FMinIter_Domain'),\n                    'idxs': dict((k, [i]) for k in hp_searchspace.keys()),\n                    'tid': i,\n                    'vals': vals,\n                    'workdir': None\n                },\n                'result': result,\n                'tid': i,\n                # bunch of fixed fields that hyperopt seems to require\n                'owner': None,\n                'spec': None,\n           
     'state': 2,\n                'book_time': None,\n                'exp_key': None,\n                'refresh_time': None,\n                'version': 0\n            })\n\n        trials.refresh()\n        chosen_params_container = []\n\n        def suggest(*args, **kwargs):\n            return tpe.suggest(*args,\n                               **kwargs,\n                               gamma=self.gamma,\n                               n_startup_jobs=self.seeds)\n\n        def mock_fn(x):\n            # http://stackoverflow.com/a/3190783/1079728\n            # to get around no nonlocal keywork in python2\n            chosen_params_container.append(x)\n            return 0\n\n        fmin(fn=mock_fn,\n             algo=tpe.suggest,\n             space=hp_searchspace,\n             trials=trials,\n             max_evals=len(trials.trials) + 1,\n             **self._hyperopt_fmin_random_kwarg(random))\n        chosen_params = chosen_params_container[0]\n\n        return chosen_params"}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\nmerges the default values from two sources into a single config object.", "response": "def _merge_defaults(self, config):\n        \"\"\"The config object loads its values from two sources, with the\n        following precedence:\n\n            1. data/default_config.yaml\n            2. The config file itself, passed in to this object in the\n               constructor as `path`.\n\n        in case of conflict, the config file dominates.\n        \"\"\"\n        fn = resource_filename('osprey', join('data', 'default_config.yaml'))\n        with open(fn) as f:\n            default = parse(f)\n        return reduce(dict_merge, [default, config])"}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\ncreates a Config object from a dict.", "response": "def fromdict(cls, config, check_fields=True):\n        \"\"\"Create a Config object from config dict directly.\"\"\"\n        m = super(Config, cls).__new__(cls)\n        m.path = '.'\n        m.verbose = False\n        m.config = m._merge_defaults(config)\n        if check_fields:\n            m._check_fields()\n        return m"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nget an entry from within a section using a '/' delimiter", "response": "def get_value(self, field, default=None):\n        \"\"\"Get an entry from within a section, using a '/' delimiter\"\"\"\n        section, key = field.split('/')\n        return self.get_section(section).get(key, default)"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nget the estimator instance of a ( subclass of )", "response": "def estimator(self):\n        \"\"\"Get the estimator, an instance of a (subclass of)\n        sklearn.base.BaseEstimator\n\n        It can be loaded either from a pickle, from a string using eval(),\n        or from an entry point.\n\n        e.g.\n\n        estimator:\n            # only one of the following can actually be active in a given\n            # config file.\n            pickle: path-to-pickle-file.pkl\n            eval: \"Pipeline([('cluster': KMeans())])\"\n            entry_point: sklearn.linear_model.LogisticRegression\n            module: myestimator\n        \"\"\"\n        module_path = self.get_value('estimator/module')\n        if module_path is not None:\n            with prepend_syspath(dirname(abspath(self.path))):\n                estimator_module = importlib.import_module(module_path)\n            estimator = estimator_module.estimator()\n            if not isinstance(estimator, sklearn.base.BaseEstimator):\n                raise RuntimeError('estimator/pickle must load a '\n                                   'sklearn-derived Estimator')\n            return estimator\n\n        evalstring = self.get_value('estimator/eval')\n        if evalstring is not None:\n            got = self.get_value('estimator/eval_scope')\n            if isinstance(got, six.string_types):\n                got = [got]\n            elif isinstance(got, list):\n                pass\n            else:\n                raise RuntimeError('unexpected type for estimator/eval_scope')\n\n            scope = {}\n            for pkg_name in got:\n                if pkg_name in eval_scopes.__all__:\n                    scope.update(getattr(eval_scopes, pkg_name)())\n                else:\n                    try:\n                        pkg = importlib.import_module(pkg_name)\n                    except ImportError as e:\n                
        raise RuntimeError(str(e))\n                    scope.update(eval_scopes.import_all_estimators(pkg))\n\n            try:\n                estimator = eval(evalstring, {}, scope)\n                if not isinstance(estimator, sklearn.base.BaseEstimator):\n                    raise RuntimeError('estimator/pickle must load a '\n                                       'sklearn-derived Estimator')\n                return estimator\n            except:\n                print('-'*78, file=sys.stderr)\n                print('Error parsing estimator/eval', file=sys.stderr)\n                print('-'*78, file=sys.stderr)\n\n                traceback.print_exc(file=sys.stderr)\n                print('-'*78, file=sys.stderr)\n                sys.exit(1)\n\n        entry_point = self.get_value('estimator/entry_point')\n        if entry_point is not None:\n            estimator = load_entry_point(entry_point, 'estimator/entry_point')\n            if issubclass(estimator, sklearn.base.BaseEstimator):\n                estimator = estimator(\n                    **self.get_value('estimator/params', default={}))\n            if not isinstance(estimator, sklearn.base.BaseEstimator):\n                raise RuntimeError('estimator/pickle must load a '\n                                   'sklearn-derived Estimator')\n            return estimator\n\n        # load estimator from pickle field\n        pkl = self.get_value('estimator/pickle')\n        if pkl is not None:\n            pickl_dir = dirname(abspath(self.path))\n            path = join(pickl_dir, pkl)\n            if not isfile(path):\n                raise RuntimeError('estimator/pickle %s is not a file' % pkl)\n            with open(path, 'rb') as f:\n                with prepend_syspath(pickl_dir):\n                    estimator = cPickle.load(f)\n                if not isinstance(estimator, sklearn.base.BaseEstimator):\n                    raise RuntimeError('estimator/pickle must load a '\n                            
           'sklearn-derived Estimator')\n                return estimator\n\n        raise RuntimeError('no estimator field')"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef get_best_candidate(self):\n        # TODO make this best mean response\n        self.incumbent = self.surrogate.Y.max()\n\n        # Objective function\n        def z(x):\n            # TODO make spread of points around x and take mean value.\n            x = x.copy().reshape(-1, self.n_dims)\n            y_mean, y_var = self.surrogate.predict(x)\n            af = self._acquisition_function(y_mean=y_mean, y_var=y_var)\n            # TODO make -1 dependent on flag in inputs for either max or minimization\n            return (-1) * af\n\n        # Optimization loop\n        af_values = []\n        af_args = []\n\n        for i in range(self.n_iter):\n            init = self._get_random_point()\n            res = minimize(z, init, bounds=self.n_dims * [(0., 1.)],\n                           options={'maxiter': int(self.max_iter), 'disp': 0})\n            af_args.append(res.x)\n            af_values.append(res.fun)\n\n        # Choose the best\n        af_values = np.array(af_values).flatten()\n        af_args = np.array(af_args)\n        best_index = int(np.argmin(af_values))\n        best_candidate = af_args[best_index]\n        return best_candidate", "response": "Returns the best candidate hyper-parameters, defined as the point that minimizes the negated acquisition function over `n_iter` randomly initialized optimizer runs."}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\nplots 1. All iterations", "response": "def plot_1(data, *args):\n    \"\"\"Plot 1. All iterations (scatter plot)\"\"\"\n    df_all = pd.DataFrame(data)\n    df_params = nonconstant_parameters(data)\n    return build_scatter_tooltip(\n        x=df_all['id'], y=df_all['mean_test_score'], tt=df_params,\n        title='All Iterations')"}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nplotting 2. Running best score", "response": "def plot_2(data, *args):\n    \"\"\"Plot 2. Running best score (scatter plot)\"\"\"\n    df_all = pd.DataFrame(data)\n    df_params = nonconstant_parameters(data)\n    x = [df_all['id'][0]]\n    y = [df_all['mean_test_score'][0]]\n    params = [df_params.loc[0]]\n    for i in range(len(df_all)):\n        if df_all['mean_test_score'][i] > y[-1]:\n            x.append(df_all['id'][i])\n            y.append(df_all['mean_test_score'][i])\n            params.append(df_params.loc[i])\n    return build_scatter_tooltip(\n        x=x, y=y, tt=pd.DataFrame(params), title='Running best')"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef plot_3(data, ss, *args):\n\n    if len(data) <= 1:\n        warnings.warn(\"Only one datapoint. Could not compute t-SNE embedding.\")\n        return None\n\n    scores = np.array([d['mean_test_score'] for d in data])\n    # maps each parameters to a vector of floats\n    warped = np.array([ss.point_to_unit(d['parameters']) for d in data])\n\n    # Embed into 2 dimensions with t-SNE\n    X = TSNE(n_components=2).fit_transform(warped)\n\n    e_scores = np.exp(scores)\n    mine, maxe = np.min(e_scores), np.max(e_scores)\n    color = (e_scores - mine) / (maxe - mine)\n    mapped_colors = list(map(rgb2hex, cm.get_cmap('RdBu_r')(color)))\n\n    p = bk.figure(title='t-SNE (unsupervised)', tools=TOOLS)\n\n    df_params = nonconstant_parameters(data)\n    df_params['score'] = scores\n    df_params['x'] = X[:, 0]\n    df_params['y'] = X[:, 1]\n    df_params['color'] = mapped_colors\n    df_params['radius'] = 1\n    p.circle(\n        x='x', y='y', color='color', radius='radius',\n        source=ColumnDataSource(data=df_params), fill_alpha=0.6,\n        line_color=None)\n    cp = p\n    hover = cp.select(dict(type=HoverTool))\n    format_tt = [(s, '@%s' % s) for s in df_params.columns]\n    hover.tooltips = OrderedDict([(\"index\", \"$index\")] + format_tt)\n\n    xax, yax = p.axis\n    xax.axis_label = 't-SNE coord 1'\n    yax.axis_label = 't-SNE coord 2'\n    return p", "response": "Plots the t-SNE embedding of the parameters, colored by mean test score."}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nscatter plot of score vs each param", "response": "def plot_4(data, *args):\n    \"\"\"Scatter plot of score vs each param\n    \"\"\"\n    params = nonconstant_parameters(data)\n    scores = np.array([d['mean_test_score'] for d in data])\n    order = np.argsort(scores)\n\n    for key in params.keys():\n        if params[key].dtype == np.dtype('bool'):\n            params[key] = params[key].astype(int)\n    p_list = []\n    for key in params.keys():\n        x = params[key][order]\n        y = scores[order]\n        params = params.loc[order]\n        try:\n            radius = (np.max(x) - np.min(x)) / 100.0\n        except:\n            print(\"error making plot4 for '%s'\" % key)\n            continue\n\n        p_list.append(build_scatter_tooltip(\n            x=x, y=y, radius=radius, add_line=False, tt=params,\n            xlabel=key, title='Score vs %s' % key))\n    return p_list"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\nadding a jump to the internal state.", "response": "def add_jump(self, name, min, max, num, warp=None, var_type=float):\n        \"\"\" An integer/float-valued enumerable with `num` items, bounded\n        between [`min`, `max`]. Note that the right endpoint of the interval\n        includes `max`. This is a wrapper around the add_enum. `jump` can be\n        a float or int.\n        \"\"\"\n        if not isinstance(var_type, type):\n            if var_type == 'int':\n                var_type = int\n            elif var_type == 'float':\n                var_type = float\n            else:\n                raise ValueError('var_type (%s) is not supported. use '\n                                 '\"int\" or \"float\",' % (var_type))\n\n        min, max = map(var_type, (min, max))\n        num = int(num)\n\n        if not warp:\n            choices = np.linspace(min, max, num=num, dtype=var_type)\n        elif (min > 0) and warp == 'log':\n            choices = np.logspace(np.log10(min), np.log10(max), num=num,\n                                  dtype=var_type)\n        elif (min <= 0) and warp == 'log':\n            raise ValueError('variable %s: log-warping requires min > 0' % name)\n        else:\n            raise ValueError('variable %s: warp=%s is not supported. use '\n                             'None or \"log\",' % (name, warp))\n\n        self.variables[name] = EnumVariable(name, choices.tolist())"}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nadds an integer - valued variable to the internal dictionary.", "response": "def add_int(self, name, min, max, warp=None):\n        \"\"\"An integer-valued dimension bounded between `min` <= x <= `max`.\n        Note that the right endpoint of the interval includes `max`.\n\n        When `warp` is None, the base measure associated with this dimension\n        is a categorical distribution with each weight on each of the integers\n        in [min, max]. With `warp == 'log'`, the base measure is a uniform\n        distribution on the log of the variable, with bounds at `log(min)` and\n        `log(max)`. This is appropriate for variables that are \"naturally\" in\n        log-space. Other `warp` functions are not supported (yet), but may be\n        at a later time. Please note that this functionality is not supported\n        for `hyperopt_tpe`.\n        \"\"\"\n        min, max = map(int, (min, max))\n        if max < min:\n            raise ValueError('variable %s: max < min error' % name)\n        if warp not in (None, 'log'):\n            raise ValueError('variable %s: warp=%s is not supported. use '\n                             'None or \"log\",' % (name, warp))\n        if min <= 0 and warp == 'log':\n            raise ValueError('variable %s: log-warping requires min > 0' % name)\n\n        self.variables[name] = IntVariable(name, min, max, warp)"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\nadding a floating point - valued variable to the internal dictionary.", "response": "def add_float(self, name, min, max, warp=None):\n        \"\"\"A floating point-valued dimension bounded `min` <= x < `max`\n\n        When `warp` is None, the base measure associated with this dimension\n        is a uniform distribution on [min, max). With `warp == 'log'`, the\n        base measure is a uniform distribution on the log of the variable,\n        with bounds at `log(min)` and `log(max)`. This is appropriate for\n        variables that are \"naturally\" in log-space. Other `warp` functions\n        are not supported (yet), but may be at a later time.\n        \"\"\"\n        min, max = map(float, (min, max))\n        if not min < max:\n            raise ValueError('variable %s: min >= max error' % name)\n        if warp not in (None, 'log'):\n            raise ValueError('variable %s: warp=%s is not supported. use '\n                             'None or \"log\",' % (name, warp))\n        if min <= 0 and warp == 'log':\n            raise ValueError('variable %s: log-warping requires min > 0' % name)\n\n        self.variables[name] = FloatVariable(name, min, max, warp)"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef bresenham(x0, y0, x1, y1):\n    dx = x1 - x0\n    dy = y1 - y0\n\n    xsign = 1 if dx > 0 else -1\n    ysign = 1 if dy > 0 else -1\n\n    dx = abs(dx)\n    dy = abs(dy)\n\n    if dx > dy:\n        xx, xy, yx, yy = xsign, 0, 0, ysign\n    else:\n        dx, dy = dy, dx\n        xx, xy, yx, yy = 0, ysign, xsign, 0\n\n    D = 2*dy - dx\n    y = 0\n\n    for x in range(dx + 1):\n        yield x0 + x*xx + y*yx, y0 + x*xy + y*yy\n        if D >= 0:\n            y += 1\n            D -= 2*dx\n        D += 2*dy", "response": "Yields the integer coordinates on the line from (x0, y0) to (x1, y1), using Bresenham's line algorithm."}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\ntry to find a pattern that matches the source and call a parser method to create Python objects.", "response": "def find_match(self):\n        \"\"\"Try to find a pattern that matches the source and call a parser\n        method to create Python objects.\n\n        A callback that raises an IgnoredMatchException indicates that the\n        given string data is ignored by the parser and no objects are created.\n\n        If none of the patterns match, a NoMatchException is raised.\n        \"\"\"\n        for pattern, callback in self.rules:\n            match = pattern.match(self.source, pos=self.pos)\n\n            if not match:\n                continue\n\n            try:\n                node = callback(match)\n            except IgnoredMatchException:\n                pass\n            else:\n                self.seen.append(node)\n\n            return match\n\n        raise NoMatchException(\n            'None of the known patterns match for {}'\n            ''.format(self.source[self.pos:])\n        )"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef add_child(self, child):\n        if not isinstance(child, ChildMixin):\n            raise TypeError(\n                'Requires instance of TreeElement. '\n                'Got {}'.format(type(child))\n            )\n        child.parent = self\n        self._children.append(child)", "response": "Adds a child to the tree."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef get_ip_packet(data, client_port, server_port, is_loopback=False):\n    header = _loopback if is_loopback else _ethernet\n\n    try:\n        header.unpack(data)\n    except Exception as ex:\n        raise ValueError('Bad header: %s' % ex)\n\n    tcp_p = getattr(header.data, 'data', None)\n    if type(tcp_p) != dpkt.tcp.TCP:\n        raise ValueError('Not a TCP packet')\n\n    if tcp_p.dport == server_port:\n        if client_port != 0 and tcp_p.sport != client_port:\n            raise ValueError('Request from different client')\n    elif tcp_p.sport == server_port:\n        if client_port != 0 and tcp_p.dport != client_port:\n            raise ValueError('Reply for different client')\n    else:\n        raise ValueError('Packet not for/from client/server')\n\n    return header.data", "response": "Get IP packet from data."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef listening_ports():\n    ports = []\n\n    if not os.path.exists(PROC_TCP):\n        return ports\n\n    with open(PROC_TCP) as fh:\n        for line in fh:\n            if '00000000:0000' not in line:\n                continue\n            parts = line.lstrip(' ').split(' ')\n            if parts[2] != '00000000:0000':\n                continue\n\n            local_port = parts[1].split(':')[1]\n            local_port = int('0x' + local_port, base=16)\n            ports.append(local_port)\n\n    return ports", "response": "Reads listening ports from the tcp file."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\nget stats & show them", "response": "def report(self):\n        \"\"\" get stats & show them \"\"\"\n        self._output.write('\\r')\n\n        sort_by = 'avg'\n        results = {}\n        for key, latencies in self._latencies_by_method.items():\n            result = {}\n            result['count'] = len(latencies)\n            result['avg'] = sum(latencies) / len(latencies)\n            result['min'] = min(latencies)\n            result['max'] = max(latencies)\n            latencies = sorted(latencies)\n            result['p90'] = percentile(latencies, 0.90)\n            result['p95'] = percentile(latencies, 0.95)\n            result['p99'] = percentile(latencies, 0.99)\n            result['p999'] = percentile(latencies, 0.999)\n            results[key] = result\n\n        headers = ['method', 'count', 'avg', 'min', 'max', 'p90', 'p95', 'p99', 'p999']\n        data = []\n        results = sorted(results.items(), key=lambda it: it[1][sort_by], reverse=True)\n\n        def row(key, res):\n            data = [key] + [res[header] for header in headers[1:]]\n            return tuple(data)\n\n        data = [row(key, result) for key, result in results]\n\n        self._output.write('%s\\n' % tabulate(data, headers=headers))\n        self._output.flush()"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef of_messages(cls, msg_a, msg_b):\n        ok_to_diff, reason = cls.can_diff(msg_a, msg_b)\n        if not ok_to_diff:\n            raise ValueError(reason)\n        return [cls.of_structs(x.value, y.value)\n                for x, y in zip(msg_a.args, msg_b.args)\n                if x.field_type == 'struct']", "response": "Diff two Thrift messages by comparing their struct args; raises ValueError if they can't be diffed."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef can_diff(msg_a, msg_b):\n        if msg_a.method != msg_b.method:\n            return False, 'method name of messages do not match'\n        if len(msg_a.args) != len(msg_b.args) \\\n                or not msg_a.args.is_isomorphic_to(msg_b.args):\n            return False, 'argument signature of methods do not match'\n        return True, None", "response": "Check if two Thrift messages are diff ready."}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nreturns true if all fields of this object are isomorphic to other.", "response": "def is_isomorphic_to(self, other):\n        \"\"\"\n        Returns true if all fields of other struct are isomorphic to this\n        struct's fields\n        \"\"\"\n        return (isinstance(other, self.__class__)\n                and\n                len(self.fields) == len(other.fields)\n                and\n                all(a.is_isomorphic_to(b) for a, b in zip(self.fields,\n                                                          other.fields)))"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef is_isomorphic_to(self, other):\n        return (isinstance(other, self.__class__)\n                and self.field_type == other.field_type\n                and self.field_id == other.field_id)", "response": "Returns true if this one is the same as other."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef read(cls, data,\n             protocol=None,\n             fallback_protocol=TBinaryProtocol,\n             finagle_thrift=False,\n             max_fields=MAX_FIELDS,\n             max_list_size=MAX_LIST_SIZE,\n             max_map_size=MAX_MAP_SIZE,\n             max_set_size=MAX_SET_SIZE,\n             read_values=False):\n        \"\"\" tries to deserialize a message, might fail if data is missing \"\"\"\n\n        # do we have enough data?\n        if len(data) < cls.MIN_MESSAGE_SIZE:\n            raise ValueError('not enough data')\n\n        if protocol is None:\n            protocol = cls.detect_protocol(data, fallback_protocol)\n        trans = TTransport.TMemoryBuffer(data)\n        proto = protocol(trans)\n\n        # finagle-thrift prepends a RequestHeader\n        #\n        # See: http://git.io/vsziG\n        header = None\n        if finagle_thrift:\n            try:\n                header = ThriftStruct.read(\n                    proto,\n                    max_fields,\n                    max_list_size,\n                    max_map_size,\n                    max_set_size,\n                    read_values)\n            except:\n                # reset stream, maybe it's not finagle-thrift\n                trans = TTransport.TMemoryBuffer(data)\n                proto = protocol(trans)\n\n        # unpack the message\n        method, mtype, seqid = proto.readMessageBegin()\n        mtype = cls.message_type_to_str(mtype)\n\n        if len(method) == 0 or method.isspace() or method.startswith(' '):\n            raise ValueError('no method name')\n\n        if len(method) > cls.MAX_METHOD_LENGTH:\n            raise ValueError('method name too long')\n\n        # we might have made it until this point by mere chance, so filter out\n        # suspicious method names\n        valid = range(33, 127)\n        if any(ord(char) not in valid for char in method):\n      
      raise ValueError('invalid method name: %s' % method)\n\n        args = ThriftStruct.read(\n            proto,\n            max_fields,\n            max_list_size,\n            max_map_size,\n            max_set_size,\n            read_values)\n\n        proto.readMessageEnd()\n\n        # Note: this is a bit fragile, the right thing would be to count bytes\n        # as we read them (i.e.: when calling readI32, etc).\n        msglen = trans._buffer.tell()\n\n        return cls(method, mtype, seqid, args, header, msglen), msglen", "response": "Reads a message from the given data stream."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\ndetecting protocol from data.", "response": "def detect_protocol(cls, data, default=None):\n        \"\"\" TODO: support fbthrift, finagle-thrift, finagle-mux, CORBA \"\"\"\n        if cls.is_compact_protocol(data):\n            return TCompactProtocol\n        elif cls.is_binary_protocol(data):\n            return TBinaryProtocol\n        elif cls.is_json_protocol(data):\n            return TJSONProtocol\n\n        if default is None:\n            raise ValueError('Unknown protocol')\n\n        return default"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef pop(self, nbytes):\n        size = 0\n        popped = []\n        with self._lock_packets:\n            while size < nbytes:\n                try:\n                    packet = self._packets.pop(0)\n                    size += len(packet.data.data)\n                    self._remaining -= len(packet.data.data)\n                    popped.append(packet)\n                except IndexError:\n                    break\n        return popped", "response": "pops packets with _at least_ nbytes of payload"}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\npops data from the queue and return it", "response": "def pop_data(self, nbytes):\n        \"\"\" similar to pop, but returns payload + last timestamp \"\"\"\n        last_timestamp = 0\n        data = []\n        for packet in self.pop(nbytes):\n            last_timestamp = packet.timestamp\n            data.append(packet.data.data)\n\n        return ''.join(data), last_timestamp"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\npush the given IP packet into the queue", "response": "def push(self, ip_packet):\n        \"\"\" push the packet into the queue \"\"\"\n\n        data_len = len(ip_packet.data.data)\n        seq_id = ip_packet.data.seq\n\n        if data_len == 0:\n            self._next_seq_id = seq_id\n            return False\n\n        # have we seen this packet?\n        if self._next_seq_id != -1 and seq_id != self._next_seq_id:\n            return False\n\n        self._next_seq_id = seq_id + data_len\n\n        with self._lock_packets:\n            # Note: we only account for payload (i.e.: tcp data)\n            self._length += len(ip_packet.data.data)\n            self._remaining += len(ip_packet.data.data)\n\n            self._packets.append(ip_packet)\n\n        return True"}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\ndeals with incoming packets", "response": "def run(self, *args, **kwargs):\n        \"\"\" Deal with the incoming packets \"\"\"\n        while True:\n            try:\n                timestamp, ip_p = self._queue.popleft()\n\n                src_ip = get_ip(ip_p, ip_p.src)\n                dst_ip = get_ip(ip_p, ip_p.dst)\n\n                src = intern('%s:%s' % (src_ip, ip_p.data.sport))\n                dst = intern('%s:%s' % (dst_ip, ip_p.data.dport))\n                key = intern('%s<->%s' % (src, dst))\n\n                stream = self._streams.get(key)\n                if stream is None:\n                    stream = Stream(src, dst)\n                    self._streams[key] = stream\n\n                # HACK: save the timestamp\n                setattr(ip_p, 'timestamp', timestamp)\n                pushed = stream.push(ip_p)\n\n                if not pushed:\n                    continue\n\n                # let listeners know about the updated stream\n                for handler in self._handlers:\n                    try:\n                        handler(stream)\n                    except Exception as ex:\n                        print('handler exception: %s' % ex)\n            except Exception:\n                time.sleep(0.00001)"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef getLogin(filename, user, passwd):\n    '''\n    write user/passwd to login file or get them from file.\n    This method is not Py3 safe (byte vs. str)\n    '''\n    if filename is None:\n        return (user, passwd)\n    isPy2 = sys.version_info[0] == 2\n    if os.path.exists(filename):\n        print(\"Using file {} for Login\".format(filename))\n        with open(filename, \"r\") as loginfile:\n            encoded_cred = loginfile.read()\n            print(\"encoded: {}\".format(encoded_cred))\n            if isPy2:\n                decoded_cred = b64decode(encoded_cred)\n            else:\n                decoded_cred = b64decode(encoded_cred).decode('utf-8')\n            login = decoded_cred.split(':', 1)\n            return (login[0], login[1])\n    else:\n        if user is None or passwd is None:\n            raise ValueError(\"user and password must not be None\")\n        print(\"Writing file {} for Login\".format(filename))\n        with open(filename, \"wb\") as loginfile:\n            creds = user+\":\"+passwd\n            if isPy2:\n                encoded_cred = b64encode(creds)\n            else:\n                encoded_cred = b64encode(creds.encode('utf-8'))\n            print(\"encoded: {}\".format(encoded_cred))\n            loginfile.write(encoded_cred)\n        return (user, passwd)", "response": "Reads a base64-encoded user:password pair from the login file if it exists; otherwise writes the given credentials to the file and returns them. If no filename is given, the arguments are returned unchanged."}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nwait for a list of requests to finish.", "response": "def wait_for_requests(pbclient, request_ids=None,\n                      timeout=0, initial_wait=5, scaleup=10):\n    '''\n    Waits for a list of requests to finish until timeout.\n    timeout==0 is interpreted as infinite wait time.\n    Returns a dict of request_id -> result.\n    result is a tuple (return code, request status, message) where return code\n    0  : request successful\n    1  : request failed\n    -1 : timeout exceeded\n    The wait_period is increased every scaleup steps to adjust for long\n    running requests.\n    '''\n    done = dict()\n    if not request_ids:\n        print(\"empty request list\")\n        return done\n    total_wait = 0\n    wait_period = initial_wait\n    next_scaleup = scaleup * wait_period\n    wait = True\n    while wait:\n        for request_id in request_ids:\n            if request_id in done:\n                continue\n            request_status = pbclient.get_request(request_id, status=True)\n            state = request_status['metadata']['status']\n            if state == \"DONE\":\n                done[request_id] = (0, state, request_status['metadata']['message'])\n                print(\"Request '{}' is in state '{}'.\".format(request_id, state))\n            if state == 'FAILED':\n                done[request_id] = (1, state, request_status['metadata']['message'])\n                print(\"Request '{}' is in state '{}'.\".format(request_id, state))\n        # end for(request_ids)\n        if len(done) == len(request_ids):\n            wait = False\n        else:\n            print(\"{} of {} requests are finished. 
Sleeping for {} seconds...\"\n                  .format(len(done), len(request_ids), wait_period))\n            sleep(wait_period)\n            total_wait += wait_period\n            if timeout != 0 and total_wait > timeout:\n                wait = False\n            next_scaleup -= wait_period\n            if next_scaleup == 0:\n                wait_period += initial_wait\n                next_scaleup = scaleup * wait_period\n                print(\"scaling up wait_period to {}, next change in {} seconds\"\n                      .format(wait_period, next_scaleup))\n        # end if/else(done)\n    # end while(wait)\n    if len(done) != len(request_ids):\n        for request_id in request_ids:\n            if request_id in done:\n                continue\n            done[request_id] = (-1, state, \"request not finished before timeout\")\n    return done"}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nreturn all disk images within a given location with a given image name.", "response": "def get_disk_image_by_name(pbclient, location, image_name):\n    \"\"\"\n    Returns all disk images within a location with a given image name.\n    The name must match exactly.\n    The list may be empty.\n    \"\"\"\n    all_images = pbclient.list_images()\n    matching = [i for i in all_images['items'] if\n                i['properties']['name'] == image_name and\n                i['properties']['imageType'] == \"HDD\" and\n                i['properties']['location'] == location]\n    return matching"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef _nsattr(self, attr, ns=None):\n        ''' returns an attribute name w/ namespace prefix'''\n        if ns is None:\n            return attr\n        return '{' + self._ns[ns] + '}' + attr", "response": "returns an attribute name w/ namespace prefix"}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\nreads the user configuration file.", "response": "def _read_config(self, filename=None):\n        \"\"\"\n        Read the user configuration\n        \"\"\"\n        if filename:\n            self._config_filename = filename\n        else:\n            try:\n                import appdirs\n            except ImportError:\n                raise Exception(\"Missing dependency for determining config path. Please install \"\n                                \"the 'appdirs' Python module.\")\n            self._config_filename = appdirs.user_config_dir(_LIBRARY_NAME, \"ProfitBricks\") + \".ini\"\n        if not self._config:\n            self._config = configparser.ConfigParser()\n            self._config.optionxform = str\n            self._config.read(self._config_filename)"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef _save_config(self, filename=None):\n        if filename is None:\n            filename = self._config_filename\n        parent_path = os.path.dirname(filename)\n        if not os.path.isdir(parent_path):\n            os.makedirs(parent_path)\n        with open(filename, \"w\") as configfile:\n            self._config.write(configfile)", "response": "Save the user configuration."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\ndetermine the username, falling back to the configuration file or a user prompt.", "response": "def _get_username(self, username=None, use_config=True, config_filename=None):\n        \"\"\"Determine the username\n\n        If a username is given, this name is used. Otherwise the configuration\n        file will be consulted if `use_config` is set to True. The user is asked\n        for the username if the username is not available. Then the username is\n        stored in the configuration file.\n\n        :param      username: Username (used directly if given)\n        :type       username: ``str``\n\n        :param      use_config: Whether to read username from configuration file\n        :type       use_config: ``bool``\n\n        :param      config_filename: Path to the configuration file\n        :type       config_filename: ``str``\n\n        \"\"\"\n        if not username and use_config:\n            if self._config is None:\n                self._read_config(config_filename)\n            username = self._config.get(\"credentials\", \"username\", fallback=None)\n\n        if not username:\n            username = input(\"Please enter your username: \").strip()\n            while not username:\n                username = input(\"No username specified. Please enter your username: \").strip()\n            if 'credentials' not in self._config:\n                self._config.add_section('credentials')\n            self._config.set(\"credentials\", \"username\", username)\n            self._save_config()\n\n        return username"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef _get_password(self, password, use_config=True, config_filename=None,\n                      use_keyring=HAS_KEYRING):\n        \"\"\"\n        Determine the user password\n\n        If the password is given, this password is used. Otherwise\n        this function will try to get the password from the user's keyring\n        if `use_keyring` is set to True.\n\n        :param      username: Username (used directly if given)\n        :type       username: ``str``\n\n        :param      use_config: Whether to read username from configuration file\n        :type       use_config: ``bool``\n\n        :param      config_filename: Path to the configuration file\n        :type       config_filename: ``str``\n\n        \"\"\"\n        if not password and use_config:\n            if self._config is None:\n                self._read_config(config_filename)\n            password = self._config.get(\"credentials\", \"password\", fallback=None)\n\n        if not password and use_keyring:\n            logger = logging.getLogger(__name__)\n            question = (\"Please enter your password for {} on {}: \"\n                        .format(self.username, self.host_base))\n            if HAS_KEYRING:\n                password = keyring.get_password(self.keyring_identificator, self.username)\n                if password is None:\n                    password = getpass.getpass(question)\n                    try:\n                        keyring.set_password(self.keyring_identificator, self.username, password)\n                    except keyring.errors.PasswordSetError as error:\n                        logger.warning(\"Storing password in keyring '%s' failed: %s\",\n                                       self.keyring_identificator, error)\n            else:\n                logger.warning(\"Install the 'keyring' Python module to store your password \"\n                      
         \"securely in your keyring!\")\n                password = self._config.get(\"credentials\", \"password\", fallback=None)\n                if password is None:\n                    password = getpass.getpass(question)\n                    store_plaintext_passwords = self._config.get(\n                        \"preferences\", \"store-plaintext-passwords\", fallback=None)\n                    if store_plaintext_passwords != \"no\":\n                        question = (\"Do you want to store your password in plain text in \" +\n                                    self._config_filename())\n                        answer = ask(question, [\"yes\", \"no\", \"never\"], \"no\")\n                        if answer == \"yes\":\n                            self._config.set(\"credentials\", \"password\", password)\n                            self._save_config()\n                        elif answer == \"never\":\n                            if \"preferences\" not in self._config:\n                                self._config.add_section(\"preferences\")\n                            self._config.set(\"preferences\", \"store-plaintext-passwords\", \"no\")\n                            self._save_config()\n\n        return password", "response": "Get the user's password from the given argument, the configuration file, the keyring, or an interactive prompt."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nretrieve a data center by its ID.", "response": "def get_datacenter(self, datacenter_id, depth=1):\n        \"\"\"\n        Retrieves a data center by its ID.\n\n        :param      datacenter_id: The unique ID of the data center.\n        :type       datacenter_id: ``str``\n\n        :param      depth: The depth of the response data.\n        :type       depth: ``int``\n\n        \"\"\"\n        response = self._perform_request(\n            '/datacenters/%s?depth=%s' % (datacenter_id, str(depth)))\n\n        return response"}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\nreturns a data center response by its name.", "response": "def get_datacenter_by_name(self, name, depth=1):\n        \"\"\"\n        Retrieves a data center by its name.\n\n        Either returns the data center response or raises an Exception\n        if none or more than one data center was found with the name.\n        The search for the name is done in this increasingly relaxed order:\n\n        - exact name match\n        - case-insensitive name match\n        - data center starts with the name\n        - data center starts with the name (case insensitive)\n        - name appears in the data center name\n        - name appears in the data center name (case insensitive)\n\n        :param      name: The name of the data center.\n        :type       name: ``str``\n\n        :param      depth: The depth of the response data.\n        :type       depth: ``int``\n        \"\"\"\n        all_data_centers = self.list_datacenters(depth=depth)['items']\n        data_center = find_item_by_name(all_data_centers, lambda i: i['properties']['name'], name)\n        if not data_center:\n            raise NameError(\"No data center found with name \"\n                            \"containing '{name}'.\".format(name=name))\n        if len(data_center) > 1:\n            raise NameError(\"Found {n} data centers with the name '{name}': {names}\".format(\n                n=len(data_center),\n                name=name,\n                names=\", \".join(d['properties']['name'] for d in data_center)\n            ))\n        return data_center[0]"}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\ndeletes the data center and all its components.", "response": "def delete_datacenter(self, datacenter_id):\n        \"\"\"\n        Removes the data center and all its components such as servers, NICs,\n        load balancers, volumes.\n\n        :param      datacenter_id: The unique ID of the data center.\n        :type       datacenter_id: ``str``\n\n        \"\"\"\n        response = self._perform_request(\n            url='/datacenters/%s' % (datacenter_id),\n            method='DELETE')\n\n        return response"}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\ncreates a new datacenter and returns a dictionary of all the relevant data center items.", "response": "def create_datacenter(self, datacenter):\n        \"\"\"\n        Creates a data center -- both simple and complex are supported.\n\n        \"\"\"\n        server_items = []\n        volume_items = []\n        lan_items = []\n        loadbalancer_items = []\n\n        entities = dict()\n\n        properties = {\n            \"name\": datacenter.name\n        }\n\n        # Omit 'location', if not provided, to receive\n        # a meaningful error message.\n        if datacenter.location:\n            properties['location'] = datacenter.location\n\n        # Optional Properties\n        if datacenter.description:\n            properties['description'] = datacenter.description\n\n        # Servers\n        if datacenter.servers:\n            for server in datacenter.servers:\n                server_items.append(self._create_server_dict(server))\n\n            servers = {\n                \"items\": server_items\n            }\n\n            server_entities = {\n                \"servers\": servers\n            }\n\n            entities.update(server_entities)\n\n        # Volumes\n        if datacenter.volumes:\n            for volume in datacenter.volumes:\n                volume_items.append(self._create_volume_dict(volume))\n\n            volumes = {\n                \"items\": volume_items\n            }\n\n            volume_entities = {\n                \"volumes\": volumes\n            }\n\n            entities.update(volume_entities)\n\n        # Load Balancers\n        if datacenter.loadbalancers:\n            for loadbalancer in datacenter.loadbalancers:\n                loadbalancer_items.append(\n                    self._create_loadbalancer_dict(\n                        loadbalancer\n                    )\n                )\n\n            loadbalancers = {\n    
            \"items\": loadbalancer_items\n            }\n\n            loadbalancer_entities = {\n                \"loadbalancers\": loadbalancers\n            }\n\n            entities.update(loadbalancer_entities)\n\n        # LANs\n        if datacenter.lans:\n            for lan in datacenter.lans:\n                lan_items.append(\n                    self._create_lan_dict(lan)\n                )\n\n            lans = {\n                \"items\": lan_items\n            }\n\n            lan_entities = {\n                \"lans\": lans\n            }\n\n            entities.update(lan_entities)\n\n        if not entities:\n            raw = {\n                \"properties\": properties,\n            }\n        else:\n            raw = {\n                \"properties\": properties,\n                \"entities\": entities\n            }\n\n        data = json.dumps(raw)\n\n        response = self._perform_request(\n            url='/datacenters',\n            method='POST',\n            data=data)\n\n        return response"}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nretrieves a single firewall rule by ID.", "response": "def get_firewall_rule(self, datacenter_id,\n                          server_id, nic_id, firewall_rule_id):\n        \"\"\"\n        Retrieves a single firewall rule by ID.\n\n        :param      datacenter_id: The unique ID of the data center.\n        :type       datacenter_id: ``str``\n\n        :param      server_id: The unique ID of the server.\n        :type       server_id: ``str``\n\n        :param      nic_id: The unique ID of the NIC.\n        :type       nic_id: ``str``\n\n        :param      firewall_rule_id: The unique ID of the firewall rule.\n        :type       firewall_rule_id: ``str``\n\n        \"\"\"\n        response = self._perform_request(\n            '/datacenters/%s/servers/%s/nics/%s/firewallrules/%s' % (\n                datacenter_id,\n                server_id,\n                nic_id,\n                firewall_rule_id))\n\n        return response"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef delete_firewall_rule(self, datacenter_id, server_id,\n                             nic_id, firewall_rule_id):\n        \"\"\"\n        Removes a firewall rule from the NIC.\n\n        :param      datacenter_id: The unique ID of the data center.\n        :type       datacenter_id: ``str``\n\n        :param      server_id: The unique ID of the server.\n        :type       server_id: ``str``\n\n        :param      nic_id: The unique ID of the NIC.\n        :type       nic_id: ``str``\n\n        :param      firewall_rule_id: The unique ID of the firewall rule.\n        :type       firewall_rule_id: ``str``\n\n        \"\"\"\n        response = self._perform_request(\n            url='/datacenters/%s/servers/%s/nics/%s/firewallrules/%s' % (\n                datacenter_id,\n                server_id,\n                nic_id,\n                firewall_rule_id),\n            method='DELETE')\n\n        return response", "response": "This method deletes a firewall rule from the NIC."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef create_firewall_rule(self, datacenter_id, server_id,\n                             nic_id, firewall_rule):\n        \"\"\"\n        Creates a firewall rule on the specified NIC and server.\n\n        :param      datacenter_id: The unique ID of the data center.\n        :type       datacenter_id: ``str``\n\n        :param      server_id: The unique ID of the server.\n        :type       server_id: ``str``\n\n        :param      nic_id: The unique ID of the NIC.\n        :type       nic_id: ``str``\n\n        :param      firewall_rule: A firewall rule dict.\n        :type       firewall_rule: ``dict``\n\n        \"\"\"\n        properties = {\n            \"name\": firewall_rule.name\n        }\n\n        if firewall_rule.protocol:\n            properties['protocol'] = firewall_rule.protocol\n\n        # Optional Properties\n        if firewall_rule.source_mac:\n            properties['sourceMac'] = firewall_rule.source_mac\n\n        if firewall_rule.source_ip:\n            properties['sourceIp'] = firewall_rule.source_ip\n\n        if firewall_rule.target_ip:\n            properties['targetIp'] = firewall_rule.target_ip\n\n        if firewall_rule.port_range_start:\n            properties['portRangeStart'] = firewall_rule.port_range_start\n\n        if firewall_rule.port_range_end:\n            properties['portRangeEnd'] = firewall_rule.port_range_end\n\n        if firewall_rule.icmp_type:\n            properties['icmpType'] = firewall_rule.icmp_type\n\n        if firewall_rule.icmp_code:\n            properties['icmpCode'] = firewall_rule.icmp_code\n\n        data = {\n            \"properties\": properties\n        }\n\n        response = self._perform_request(\n            url='/datacenters/%s/servers/%s/nics/%s/firewallrules' % (\n                datacenter_id,\n                server_id,\n                nic_id),\n            method='POST',\n 
           data=json.dumps(data))\n\n        return response", "response": "Creates a firewall rule on the specified NIC and server."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\nupdate a firewall rule in a server.", "response": "def update_firewall_rule(self, datacenter_id, server_id,\n                             nic_id, firewall_rule_id, **kwargs):\n        \"\"\"\n        Updates a firewall rule.\n\n        :param      datacenter_id: The unique ID of the data center.\n        :type       datacenter_id: ``str``\n\n        :param      server_id: The unique ID of the server.\n        :type       server_id: ``str``\n\n        :param      nic_id: The unique ID of the NIC.\n        :type       nic_id: ``str``\n\n        :param      firewall_rule_id: The unique ID of the firewall rule.\n        :type       firewall_rule_id: ``str``\n\n        \"\"\"\n        data = {}\n\n        for attr, value in kwargs.items():\n            if attr == 'source_mac':\n                data['sourceMac'] = value\n            elif attr == 'source_ip':\n                data['sourceIp'] = value\n            elif attr == 'target_ip':\n                data['targetIp'] = value\n            elif attr == 'port_range_start':\n                data['portRangeStart'] = value\n            elif attr == 'port_range_end':\n                data['portRangeEnd'] = value\n            elif attr == 'icmp_type':\n                data['icmpType'] = value\n            elif attr == 'icmp_code':\n                data['icmpCode'] = value\n            else:\n                data[self._underscore_to_camelcase(attr)] = value\n\n        response = self._perform_request(\n            url='/datacenters/%s/servers/%s/nics/%s/firewallrules/%s' % (\n                datacenter_id,\n                server_id,\n                nic_id,\n                firewall_rule_id),\n            method='PATCH',\n            data=json.dumps(data))\n\n        return response"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef delete_image(self, image_id):\n        response = self._perform_request(url='/images/' + image_id,\n                                         method='DELETE')\n        return response", "response": "Removes only user created images."}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\nupdates all properties of an image.", "response": "def update_image(self, image_id, **kwargs):\n        \"\"\"\n        Replace all properties of an image.\n\n        \"\"\"\n        data = {}\n\n        for attr, value in kwargs.items():\n            data[self._underscore_to_camelcase(attr)] = value\n\n        response = self._perform_request(url='/images/' + image_id,\n                                         method='PATCH',\n                                         data=json.dumps(data))\n        return response"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef delete_ipblock(self, ipblock_id):\n        response = self._perform_request(\n            url='/ipblocks/' + ipblock_id, method='DELETE')\n\n        return response", "response": "Removes a single IP block from your account."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nreserve an IP block within your account.", "response": "def reserve_ipblock(self, ipblock):\n        \"\"\"\n        Reserves an IP block within your account.\n\n        \"\"\"\n        properties = {\n            \"name\": ipblock.name\n        }\n\n        if ipblock.location:\n            properties['location'] = ipblock.location\n\n        if ipblock.size:\n            properties['size'] = str(ipblock.size)\n\n        raw = {\n            \"properties\": properties,\n        }\n\n        response = self._perform_request(\n            url='/ipblocks', method='POST', data=json.dumps(raw))\n\n        return response"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\nretrieving a single LAN by ID.", "response": "def get_lan(self, datacenter_id, lan_id, depth=1):\n        \"\"\"\n        Retrieves a single LAN by ID.\n\n        :param      datacenter_id: The unique ID of the data center.\n        :type       datacenter_id: ``str``\n\n        :param      lan_id: The unique ID of the LAN.\n        :type       lan_id: ``str``\n\n        :param      depth: The depth of the response data.\n        :type       depth: ``int``\n\n        \"\"\"\n        response = self._perform_request(\n            '/datacenters/%s/lans/%s?depth=%s' % (\n                datacenter_id,\n                lan_id,\n                str(depth)))\n\n        return response"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef list_lans(self, datacenter_id, depth=1):\n        response = self._perform_request(\n            '/datacenters/%s/lans?depth=%s' % (\n                datacenter_id,\n                str(depth)))\n\n        return response", "response": "Returns a list of LANs in the specified data center."}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nremoving a LAN from the data center. :param datacenter_id: The unique ID of the data center. :type datacenter_id: ``str`` :param lan_id: The unique ID of the LAN. :type lan_id: ``str``", "response": "def delete_lan(self, datacenter_id, lan_id):\n        \"\"\"\n        Removes a LAN from the data center.\n\n        :param      datacenter_id: The unique ID of the data center.\n        :type       datacenter_id: ``str``\n\n        :param      lan_id: The unique ID of the LAN.\n        :type       lan_id: ``str``\n\n        \"\"\"\n        response = self._perform_request(\n            url='/datacenters/%s/lans/%s' % (\n                datacenter_id, lan_id), method='DELETE')\n\n        return response"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef create_lan(self, datacenter_id, lan):\n        data = json.dumps(self._create_lan_dict(lan))\n\n        response = self._perform_request(\n            url='/datacenters/%s/lans' % datacenter_id,\n            method='POST',\n            data=data)\n\n        return response", "response": "Creates a LAN in the data center."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef update_lan(self, datacenter_id, lan_id, name=None,\n                   public=None, ip_failover=None):\n        \"\"\"\n        Updates a LAN\n\n        :param      datacenter_id: The unique ID of the data center.\n        :type       datacenter_id: ``str``\n\n        :param      lan_id: The unique ID of the LAN.\n        :type       lan_id: ``str``\n\n        :param      name: The new name of the LAN.\n        :type       name: ``str``\n\n        :param      public: Indicates if the LAN is public.\n        :type       public: ``bool``\n\n        :param      ip_failover: A list of IP fail-over dicts.\n        :type       ip_failover: ``list``\n\n        \"\"\"\n        data = {}\n\n        if name:\n            data['name'] = name\n\n        if public is not None:\n            data['public'] = public\n\n        if ip_failover:\n            data['ipFailover'] = ip_failover\n\n        response = self._perform_request(\n            url='/datacenters/%s/lans/%s' % (datacenter_id, lan_id),\n            method='PATCH',\n            data=json.dumps(data))\n\n        return response", "response": "Updates a LAN in a data center."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef get_lan_members(self, datacenter_id, lan_id, depth=1):\n        response = self._perform_request(\n            '/datacenters/%s/lans/%s/nics?depth=%s' % (\n                datacenter_id,\n                lan_id,\n                str(depth)))\n\n        return response", "response": "Returns the list of NICs that are part of the LAN."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\nretrieving a single load balancer by ID.", "response": "def get_loadbalancer(self, datacenter_id, loadbalancer_id):\n        \"\"\"\n        Retrieves a single load balancer by ID.\n\n        :param      datacenter_id: The unique ID of the data center.\n        :type       datacenter_id: ``str``\n\n        :param      loadbalancer_id: The unique ID of the load balancer.\n        :type       loadbalancer_id: ``str``\n\n        \"\"\"\n        response = self._perform_request(\n            '/datacenters/%s/loadbalancers/%s' % (\n                datacenter_id, loadbalancer_id))\n\n        return response"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef list_loadbalancers(self, datacenter_id, depth=1):\n        response = self._perform_request(\n            '/datacenters/%s/loadbalancers?depth=%s' % (\n                datacenter_id, str(depth)))\n\n        return response", "response": "Returns a list of load balancers in the data center."}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\nremoves the load balancer from the data center. :param datacenter_id: The unique ID of the data center. :type datacenter_id: ``str`` :param loadbalancer_id: The unique ID of the load balancer. :type loadbalancer_id: ``str``", "response": "def delete_loadbalancer(self, datacenter_id, loadbalancer_id):\n        \"\"\"\n        Removes the load balancer from the data center.\n\n        :param      datacenter_id: The unique ID of the data center.\n        :type       datacenter_id: ``str``\n\n        :param      loadbalancer_id: The unique ID of the load balancer.\n        :type       loadbalancer_id: ``str``\n\n        \"\"\"\n        response = self._perform_request(\n            url='/datacenters/%s/loadbalancers/%s' % (\n                datacenter_id, loadbalancer_id), method='DELETE')\n\n        return response"}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\ncreate a load balancer within the specified data center.", "response": "def create_loadbalancer(self, datacenter_id, loadbalancer):\n        \"\"\"\n        Creates a load balancer within the specified data center.\n\n        :param      datacenter_id: The unique ID of the data center.\n        :type       datacenter_id: ``str``\n\n        :param      loadbalancer: The load balancer object to be created.\n        :type       loadbalancer: ``dict``\n\n        \"\"\"\n        data = json.dumps(self._create_loadbalancer_dict(loadbalancer))\n\n        response = self._perform_request(\n            url='/datacenters/%s/loadbalancers' % datacenter_id,\n            method='POST',\n            data=data)\n\n        return response"}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nupdate a load balancer entry.", "response": "def update_loadbalancer(self, datacenter_id,\n                            loadbalancer_id, **kwargs):\n        \"\"\"\n        Updates a load balancer\n\n        :param      datacenter_id: The unique ID of the data center.\n        :type       datacenter_id: ``str``\n\n        :param      loadbalancer_id: The unique ID of the load balancer.\n        :type       loadbalancer_id: ``str``\n\n        \"\"\"\n        data = {}\n\n        for attr, value in kwargs.items():\n            data[self._underscore_to_camelcase(attr)] = value\n\n        response = self._perform_request(\n            url='/datacenters/%s/loadbalancers/%s' % (datacenter_id,\n                                                      loadbalancer_id),\n            method='PATCH',\n            data=json.dumps(data))\n\n        return response"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef get_loadbalancer_members(self, datacenter_id, loadbalancer_id,\n                                 depth=1):\n        \"\"\"\n        Retrieves the list of NICs that are associated with a load balancer.\n\n        :param      datacenter_id: The unique ID of the data center.\n        :type       datacenter_id: ``str``\n\n        :param      loadbalancer_id: The unique ID of the load balancer.\n        :type       loadbalancer_id: ``str``\n\n        :param      depth: The depth of the response data.\n        :type       depth: ``int``\n\n        \"\"\"\n        response = self._perform_request(\n            '/datacenters/%s/loadbalancers/%s/balancednics?depth=%s' % (\n                datacenter_id, loadbalancer_id, str(depth)))\n\n        return response", "response": "Returns the list of NICs that are associated with a load balancer."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef add_loadbalanced_nics(self, datacenter_id,\n                              loadbalancer_id, nic_id):\n        \"\"\"\n        Associates a NIC with the given load balancer.\n\n        :param      datacenter_id: The unique ID of the data center.\n        :type       datacenter_id: ``str``\n\n        :param      loadbalancer_id: The unique ID of the load balancer.\n        :type       loadbalancer_id: ``str``\n\n        :param      nic_id: The ID of the NIC.\n        :type       nic_id: ``str``\n\n        \"\"\"\n        data = '{ \"id\": \"' + nic_id + '\" }'\n\n        response = self._perform_request(\n            url='/datacenters/%s/loadbalancers/%s/balancednics' % (\n                datacenter_id,\n                loadbalancer_id),\n            method='POST',\n            data=data)\n\n        return response", "response": "Associates a NIC with the given load balancer in the data center."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef get_loadbalanced_nic(self, datacenter_id,\n                             loadbalancer_id, nic_id, depth=1):\n        \"\"\"\n        Gets the properties of a load balanced NIC.\n\n        :param      datacenter_id: The unique ID of the data center.\n        :type       datacenter_id: ``str``\n\n        :param      loadbalancer_id: The unique ID of the load balancer.\n        :type       loadbalancer_id: ``str``\n\n        :param      nic_id: The unique ID of the NIC.\n        :type       nic_id: ``str``\n\n        :param      depth: The depth of the response data.\n        :type       depth: ``int``\n\n        \"\"\"\n        response = self._perform_request(\n            '/datacenters/%s/loadbalancers/%s/balancednics/%s?depth=%s' % (\n                datacenter_id,\n                loadbalancer_id,\n                nic_id,\n                str(depth)))\n\n        return response", "response": "Get the properties of a load balanced NIC."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef remove_loadbalanced_nic(self, datacenter_id,\n                                loadbalancer_id, nic_id):\n        \"\"\"\n        Removes a NIC from the load balancer.\n\n        :param      datacenter_id: The unique ID of the data center.\n        :type       datacenter_id: ``str``\n\n        :param      loadbalancer_id: The unique ID of the load balancer.\n        :type       loadbalancer_id: ``str``\n\n        :param      nic_id: The unique ID of the NIC.\n        :type       nic_id: ``str``\n\n        \"\"\"\n        response = self._perform_request(\n            url='/datacenters/%s/loadbalancers/%s/balancednics/%s' % (\n                datacenter_id,\n                loadbalancer_id,\n                nic_id),\n            method='DELETE')\n\n        return response", "response": "This method removes a load balanced NIC from the load balancer."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nretrieving a single location by ID.", "response": "def get_location(self, location_id, depth=0):\n        \"\"\"\n        Retrieves a single location by ID.\n\n        :param      location_id: The unique ID of the location.\n        :type       location_id: ``str``\n\n        \"\"\"\n        response = self._perform_request('/locations/%s?depth=%s' % (location_id, depth))\n        return response"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nretrieving a NIC by its ID.", "response": "def get_nic(self, datacenter_id, server_id, nic_id, depth=1):\n        \"\"\"\n        Retrieves a NIC by its ID.\n\n        :param      datacenter_id: The unique ID of the data center.\n        :type       datacenter_id: ``str``\n\n        :param      server_id: The unique ID of the server.\n        :type       server_id: ``str``\n\n        :param      nic_id: The unique ID of the NIC.\n        :type       nic_id: ``str``\n\n        :param      depth: The depth of the response data.\n        :type       depth: ``int``\n\n        \"\"\"\n        response = self._perform_request(\n            '/datacenters/%s/servers/%s/nics/%s?depth=%s' % (\n                datacenter_id,\n                server_id,\n                nic_id,\n                str(depth)))\n\n        return response"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nreturn a list of all NICs bound to the specified server.", "response": "def list_nics(self, datacenter_id, server_id, depth=1):\n        \"\"\"\n        Retrieves a list of all NICs bound to the specified server.\n\n        :param      datacenter_id: The unique ID of the data center.\n        :type       datacenter_id: ``str``\n\n        :param      server_id: The unique ID of the server.\n        :type       server_id: ``str``\n\n        :param      depth: The depth of the response data.\n        :type       depth: ``int``\n\n        \"\"\"\n        response = self._perform_request(\n            '/datacenters/%s/servers/%s/nics?depth=%s' % (\n                datacenter_id,\n                server_id,\n                str(depth)))\n\n        return response"}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\nremoves a NIC from the server. :param datacenter_id: The unique ID of the data center. :type datacenter_id: ``str`` :param server_id: The unique ID of the server. :type server_id: ``str`` :param nic_id: The unique ID of the NIC. :type nic_id: ``str``", "response": "def delete_nic(self, datacenter_id, server_id, nic_id):\n        \"\"\"\n        Removes a NIC from the server.\n\n        :param      datacenter_id: The unique ID of the data center.\n        :type       datacenter_id: ``str``\n\n        :param      server_id: The unique ID of the server.\n        :type       server_id: ``str``\n\n        :param      nic_id: The unique ID of the NIC.\n        :type       nic_id: ``str``\n\n        \"\"\"\n        response = self._perform_request(\n            url='/datacenters/%s/servers/%s/nics/%s' % (\n                datacenter_id,\n                server_id,\n                nic_id),\n            method='DELETE')\n\n        return response"}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\ncreates a NIC on the specified server.", "response": "def create_nic(self, datacenter_id, server_id, nic):\n        \"\"\"\n        Creates a NIC on the specified server.\n\n        :param      datacenter_id: The unique ID of the data center.\n        :type       datacenter_id: ``str``\n\n        :param      server_id: The unique ID of the server.\n        :type       server_id: ``str``\n\n        :param      nic: A NIC dict.\n        :type       nic: ``dict``\n\n        \"\"\"\n\n        data = json.dumps(self._create_nic_dict(nic))\n\n        response = self._perform_request(\n            url='/datacenters/%s/servers/%s/nics' % (\n                datacenter_id,\n                server_id),\n            method='POST',\n            data=data)\n\n        return response"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef update_nic(self, datacenter_id, server_id,\n                   nic_id, **kwargs):\n        \"\"\"\n        Updates a NIC with the parameters provided.\n\n        :param      datacenter_id: The unique ID of the data center.\n        :type       datacenter_id: ``str``\n\n        :param      server_id: The unique ID of the server.\n        :type       server_id: ``str``\n\n        :param      nic_id: The unique ID of the NIC.\n        :type       nic_id: ``str``\n\n        \"\"\"\n        data = {}\n\n        for attr, value in kwargs.items():\n            data[self._underscore_to_camelcase(attr)] = value\n\n        response = self._perform_request(\n            url='/datacenters/%s/servers/%s/nics/%s' % (\n                datacenter_id,\n                server_id,\n                nic_id),\n            method='PATCH',\n            data=json.dumps(data))\n\n        return response", "response": "Updates a NIC with the parameters provided."}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\nretrieves a single request by ID.", "response": "def get_request(self, request_id, status=False):\n        \"\"\"\n        Retrieves a single request by ID.\n\n        :param      request_id: The unique ID of the request.\n        :type       request_id: ``str``\n\n        :param      status: Retrieve the full status of the request.\n        :type       status: ``bool``\n\n        \"\"\"\n        if status:\n            response = self._perform_request(\n                '/requests/' + request_id + '/status')\n        else:\n            response = self._perform_request(\n                '/requests/%s' % request_id)\n\n        return response"}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nretrieving a server by its ID.", "response": "def get_server(self, datacenter_id, server_id, depth=1):\n        \"\"\"\n        Retrieves a server by its ID.\n\n        :param      datacenter_id: The unique ID of the data center.\n        :type       datacenter_id: ``str``\n\n        :param      server_id: The unique ID of the server.\n        :type       server_id: ``str``\n\n        :param      depth: The depth of the response data.\n        :type       depth: ``int``\n\n        \"\"\"\n        response = self._perform_request(\n            '/datacenters/%s/servers/%s?depth=%s' % (\n                datacenter_id,\n                server_id,\n                str(depth)))\n\n        return response"}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nreturn a list of all servers bound to the specified data center.", "response": "def list_servers(self, datacenter_id, depth=1):\n        \"\"\"\n        Retrieves a list of all servers bound to the specified data center.\n\n        :param      datacenter_id: The unique ID of the data center.\n        :type       datacenter_id: ``str``\n\n        :param      depth: The depth of the response data.\n        :type       depth: ``int``\n\n        \"\"\"\n        response = self._perform_request(\n            '/datacenters/%s/servers?depth=%s' % (datacenter_id, str(depth)))\n\n        return response"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef delete_server(self, datacenter_id, server_id):\n        response = self._perform_request(\n            url='/datacenters/%s/servers/%s' % (\n                datacenter_id,\n                server_id),\n            method='DELETE')\n\n        return response", "response": "This method deletes the server from your data center."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\ncreating a server within the data center.", "response": "def create_server(self, datacenter_id, server):\n        \"\"\"\n        Creates a server within the data center.\n\n        :param      datacenter_id: The unique ID of the data center.\n        :type       datacenter_id: ``str``\n\n        :param      server: A dict of the server to be created.\n        :type       server: ``dict``\n\n        \"\"\"\n\n        data = json.dumps(self._create_server_dict(server))\n\n        response = self._perform_request(\n            url='/datacenters/%s/servers' % (datacenter_id),\n            method='POST',\n            data=data)\n\n        return response"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef update_server(self, datacenter_id, server_id, **kwargs):\n        data = {}\n\n        for attr, value in kwargs.items():\n            if attr == 'boot_volume':\n                boot_volume_properties = {\n                    \"id\": value\n                }\n                boot_volume_entities = {\n                    \"bootVolume\": boot_volume_properties\n                }\n                data.update(boot_volume_entities)\n            else:\n                data[self._underscore_to_camelcase(attr)] = value\n\n        response = self._perform_request(\n            url='/datacenters/%s/servers/%s' % (\n                datacenter_id,\n                server_id),\n            method='PATCH',\n            data=json.dumps(data))\n\n        return response", "response": "Updates a server with the parameters provided."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\nretrieve a list of volumes attached to a server.", "response": "def get_attached_volumes(self, datacenter_id, server_id, depth=1):\n        \"\"\"\n        Retrieves a list of volumes attached to the server.\n\n        :param      datacenter_id: The unique ID of the data center.\n        :type       datacenter_id: ``str``\n\n        :param      server_id: The unique ID of the server.\n        :type       server_id: ``str``\n\n        :param      depth: The depth of the response data.\n        :type       depth: ``int``\n\n        \"\"\"\n        response = self._perform_request(\n            '/datacenters/%s/servers/%s/volumes?depth=%s' % (\n                datacenter_id,\n                server_id,\n                str(depth)))\n\n        return response"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\nretrieve the information about an attached volume.", "response": "def get_attached_volume(self, datacenter_id, server_id, volume_id):\n        \"\"\"\n        Retrieves volume information.\n\n        :param      datacenter_id: The unique ID of the data center.\n        :type       datacenter_id: ``str``\n\n        :param      server_id: The unique ID of the server.\n        :type       server_id: ``str``\n\n        :param      volume_id: The unique ID of the volume.\n        :type       volume_id: ``str``\n\n        \"\"\"\n        response = self._perform_request(\n            '/datacenters/%s/servers/%s/volumes/%s' % (\n                datacenter_id,\n                server_id,\n                volume_id))\n\n        return response"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nattaching a volume to a server.", "response": "def attach_volume(self, datacenter_id, server_id, volume_id):\n        \"\"\"\n        Attaches a volume to a server.\n\n        :param      datacenter_id: The unique ID of the data center.\n        :type       datacenter_id: ``str``\n\n        :param      server_id: The unique ID of the server.\n        :type       server_id: ``str``\n\n        :param      volume_id: The unique ID of the volume.\n        :type       volume_id: ``str``\n\n        \"\"\"\n        data = '{ \"id\": \"' + volume_id + '\" }'\n\n        response = self._perform_request(\n            url='/datacenters/%s/servers/%s/volumes' % (\n                datacenter_id,\n                server_id),\n            method='POST',\n            data=data)\n\n        return response"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef detach_volume(self, datacenter_id, server_id, volume_id):\n        response = self._perform_request(\n            url='/datacenters/%s/servers/%s/volumes/%s' % (\n                datacenter_id,\n                server_id,\n                volume_id),\n            method='DELETE')\n\n        return response", "response": "Detaches a volume from a server."}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nreturn a list of CDROMs attached to the server.", "response": "def get_attached_cdroms(self, datacenter_id, server_id, depth=1):\n        \"\"\"\n        Retrieves a list of CDROMs attached to the server.\n\n        :param      datacenter_id: The unique ID of the data center.\n        :type       datacenter_id: ``str``\n\n        :param      server_id: The unique ID of the server.\n        :type       server_id: ``str``\n\n        :param      depth: The depth of the response data.\n        :type       depth: ``int``\n\n        \"\"\"\n        response = self._perform_request(\n            '/datacenters/%s/servers/%s/cdroms?depth=%s' % (\n                datacenter_id,\n                server_id,\n                str(depth)))\n\n        return response"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\nretrieving an attached CDROM.", "response": "def get_attached_cdrom(self, datacenter_id, server_id, cdrom_id):\n        \"\"\"\n        Retrieves an attached CDROM.\n\n        :param      datacenter_id: The unique ID of the data center.\n        :type       datacenter_id: ``str``\n\n        :param      server_id: The unique ID of the server.\n        :type       server_id: ``str``\n\n        :param      cdrom_id: The unique ID of the CDROM.\n        :type       cdrom_id: ``str``\n\n        \"\"\"\n        response = self._perform_request(\n            '/datacenters/%s/servers/%s/cdroms/%s' % (\n                datacenter_id,\n                server_id,\n                cdrom_id))\n\n        return response"}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nattach a CDROM to a server. :param datacenter_id: The unique ID of the data center. :type datacenter_id: ``str`` :param server_id: The unique ID of the server. :type server_id: ``str`` :param cdrom_id: The unique ID of the CDROM. :type cdrom_id: ``str``", "response": "def attach_cdrom(self, datacenter_id, server_id, cdrom_id):\n        \"\"\"\n        Attaches a CDROM to a server.\n\n        :param      datacenter_id: The unique ID of the data center.\n        :type       datacenter_id: ``str``\n\n        :param      server_id: The unique ID of the server.\n        :type       server_id: ``str``\n\n        :param      cdrom_id: The unique ID of the CDROM.\n        :type       cdrom_id: ``str``\n\n        \"\"\"\n        data = '{ \"id\": \"' + cdrom_id + '\" }'\n\n        response = self._perform_request(\n            url='/datacenters/%s/servers/%s/cdroms' % (\n                datacenter_id,\n                server_id),\n            method='POST',\n            data=data)\n\n        return response"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\ndetach a CDROM from a server.", "response": "def detach_cdrom(self, datacenter_id, server_id, cdrom_id):\n        \"\"\"\n        Detaches a CDROM from a server.\n\n        :param      datacenter_id: The unique ID of the data center.\n        :type       datacenter_id: ``str``\n\n        :param      server_id: The unique ID of the server.\n        :type       server_id: ``str``\n\n        :param      cdrom_id: The unique ID of the CDROM.\n        :type       cdrom_id: ``str``\n\n        \"\"\"\n        response = self._perform_request(\n            url='/datacenters/%s/servers/%s/cdroms/%s' % (\n                datacenter_id,\n                server_id,\n                cdrom_id),\n            method='DELETE')\n\n        return response"}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nstarts the server. :param datacenter_id: The unique ID of the data center. :type datacenter_id: ``str`` :param server_id: The unique ID of the server. :type server_id: ``str``", "response": "def start_server(self, datacenter_id, server_id):\n        \"\"\"\n        Starts the server.\n\n        :param      datacenter_id: The unique ID of the data center.\n        :type       datacenter_id: ``str``\n\n        :param      server_id: The unique ID of the server.\n        :type       server_id: ``str``\n\n        \"\"\"\n        response = self._perform_request(\n            url='/datacenters/%s/servers/%s/start' % (\n                datacenter_id,\n                server_id),\n            method='POST-ACTION')\n\n        return response"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef stop_server(self, datacenter_id, server_id):\n        response = self._perform_request(\n            url='/datacenters/%s/servers/%s/stop' % (\n                datacenter_id,\n                server_id),\n            method='POST-ACTION')\n\n        return response", "response": "Stops the server.\n\n        :param      datacenter_id: The unique ID of the data center.\n        :type       datacenter_id: ``str``\n\n        :param      server_id: The unique ID of the server.\n        :type       server_id: ``str``"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\nremove a snapshot from your account.", "response": "def delete_snapshot(self, snapshot_id):\n        \"\"\"\n        Removes a snapshot from your account.\n\n        :param      snapshot_id: The unique ID of the snapshot.\n        :type       snapshot_id: ``str``\n\n        \"\"\"\n        response = self._perform_request(\n            url='/snapshots/' + snapshot_id, method='DELETE')\n\n        return response"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nupdating a snapshot in your account.", "response": "def update_snapshot(self, snapshot_id, **kwargs):\n        \"\"\"\n        Updates a snapshot in your account.\n\n        :param      snapshot_id: The unique ID of the snapshot.\n        :type       snapshot_id: ``str``\n        \"\"\"\n        data = {}\n\n        for attr, value in kwargs.items():\n            data[self._underscore_to_camelcase(attr)] = value\n\n        response = self._perform_request(\n            url='/snapshots/' + snapshot_id, method='PATCH', data=json.dumps(data))\n\n        return response"}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\ncreates a snapshot of the specified volume.", "response": "def create_snapshot(self, datacenter_id, volume_id,\n                        name=None, description=None):\n        \"\"\"\n        Creates a snapshot of the specified volume.\n\n        :param      datacenter_id: The unique ID of the data center.\n        :type       datacenter_id: ``str``\n\n        :param      volume_id: The unique ID of the volume.\n        :type       volume_id: ``str``\n\n        :param      name: The name given to the snapshot.\n        :type       name: ``str``\n\n        :param      description: The description given to the snapshot.\n        :type       description: ``str``\n\n        \"\"\"\n\n        data = {'name': name, 'description': description}\n\n        response = self._perform_request(\n            '/datacenters/%s/volumes/%s/create-snapshot' % (\n                datacenter_id, volume_id),\n            method='POST-ACTION-JSON',\n            data=urlencode(data))\n\n        return response"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef restore_snapshot(self, datacenter_id, volume_id, snapshot_id):\n        data = {'snapshotId': snapshot_id}\n\n        response = self._perform_request(\n            url='/datacenters/%s/volumes/%s/restore-snapshot' % (\n                datacenter_id,\n                volume_id),\n            method='POST-ACTION',\n            data=urlencode(data))\n\n        return response", "response": "Restores a snapshot to the specified volume."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef remove_snapshot(self, snapshot_id):\n        response = self._perform_request(\n            url='/snapshots/' + snapshot_id, method='DELETE')\n\n        return response", "response": "Removes a snapshot from your account."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef get_group(self, group_id, depth=1):\n        response = self._perform_request(\n            '/um/groups/%s?depth=%s' % (group_id, str(depth)))\n\n        return response", "response": "Retrieves a single group by ID."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\ncreating a new group and setting group privileges.", "response": "def create_group(self, group):\n        \"\"\"\n        Creates a new group and sets group privileges.\n\n        :param      group: The group object to be created.\n        :type       group: ``dict``\n\n        \"\"\"\n        data = json.dumps(self._create_group_dict(group))\n\n        response = self._perform_request(\n            url='/um/groups',\n            method='POST',\n            data=data)\n\n        return response"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef update_group(self, group_id, **kwargs):\n        properties = {}\n\n        # make the key camel-case transformable\n        if 'create_datacenter' in kwargs:\n            kwargs['create_data_center'] = kwargs.pop('create_datacenter')\n\n        for attr, value in kwargs.items():\n            properties[self._underscore_to_camelcase(attr)] = value\n\n        data = {\n            \"properties\": properties\n        }\n\n        response = self._perform_request(\n            url='/um/groups/%s' % group_id,\n            method='PUT',\n            data=json.dumps(data))\n\n        return response", "response": "Updates a group.\n\n        :param      group_id: The unique ID of the group.\n        :type       group_id: ``str``"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\nremoving a group.", "response": "def delete_group(self, group_id):\n        \"\"\"\n        Removes a group.\n\n        :param      group_id: The unique ID of the group.\n        :type       group_id: ``str``\n\n        \"\"\"\n        response = self._perform_request(\n            url='/um/groups/%s' % group_id,\n            method='DELETE')\n\n        return response"}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\nreturns a list of all shares through a group.", "response": "def list_shares(self, group_id, depth=1):\n        \"\"\"\n        Retrieves a list of all shares through a group.\n\n        :param      group_id: The unique ID of the group.\n        :type       group_id: ``str``\n\n        :param      depth: The depth of the response data.\n        :type       depth: ``int``\n\n        \"\"\"\n        response = self._perform_request(\n            '/um/groups/%s/shares?depth=%s' % (group_id, str(depth)))\n\n        return response"}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nretrieves a specific resource share available to a group.", "response": "def get_share(self, group_id, resource_id, depth=1):\n        \"\"\"\n        Retrieves a specific resource share available to a group.\n\n        :param      group_id: The unique ID of the group.\n        :type       group_id: ``str``\n\n        :param      resource_id: The unique ID of the resource.\n        :type       resource_id: ``str``\n\n        :param      depth: The depth of the response data.\n        :type       depth: ``int``\n\n        \"\"\"\n        response = self._perform_request(\n            '/um/groups/%s/shares/%s?depth=%s'\n            % (group_id, resource_id, str(depth)))\n\n        return response"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef add_share(self, group_id, resource_id, **kwargs):\n        properties = {}\n\n        for attr, value in kwargs.items():\n            properties[self._underscore_to_camelcase(attr)] = value\n\n        data = {\n            \"properties\": properties\n        }\n\n        response = self._perform_request(\n            url='/um/groups/%s/shares/%s' % (group_id, resource_id),\n            method='POST',\n            data=json.dumps(data))\n\n        return response", "response": "Adds a resource share to a group."}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nremove a resource share from a group.", "response": "def delete_share(self, group_id, resource_id):\n        \"\"\"\n        Removes a resource share from a group.\n\n        :param      group_id: The unique ID of the group.\n        :type       group_id: ``str``\n\n        :param      resource_id: The unique ID of the resource.\n        :type       resource_id: ``str``\n\n        \"\"\"\n        response = self._perform_request(\n            url='/um/groups/%s/shares/%s' % (group_id, resource_id),\n            method='DELETE')\n\n        return response"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef get_user(self, user_id, depth=1):\n        response = self._perform_request(\n            '/um/users/%s?depth=%s' % (user_id, str(depth)))\n\n        return response", "response": "Retrieves a single user by ID."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\ncreating a new user object.", "response": "def create_user(self, user):\n        \"\"\"\n        Creates a new user.\n\n        :param      user: The user object to be created.\n        :type       user: ``dict``\n\n        \"\"\"\n        data = self._create_user_dict(user=user)\n\n        response = self._perform_request(\n            url='/um/users',\n            method='POST',\n            data=json.dumps(data))\n\n        return response"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef update_user(self, user_id, **kwargs):\n        properties = {}\n\n        for attr, value in kwargs.items():\n            properties[self._underscore_to_camelcase(attr)] = value\n\n        data = {\n            \"properties\": properties\n        }\n\n        response = self._perform_request(\n            url='/um/users/%s' % user_id,\n            method='PUT',\n            data=json.dumps(data))\n\n        return response", "response": "Updates a user with the parameters provided."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef delete_user(self, user_id):\n        response = self._perform_request(\n            url='/um/users/%s' % user_id,\n            method='DELETE')\n\n        return response", "response": "Removes a user."}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nreturn a list of all users that are members of a particular group.", "response": "def list_group_users(self, group_id, depth=1):\n        \"\"\"\n        Retrieves a list of all users that are members of a particular group.\n\n        :param      group_id: The unique ID of the group.\n        :type       group_id: ``str``\n\n        :param      depth: The depth of the response data.\n        :type       depth: ``int``\n\n        \"\"\"\n        response = self._perform_request(\n            '/um/groups/%s/users?depth=%s' % (group_id, str(depth)))\n\n        return response"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef add_group_user(self, group_id, user_id):\n        data = {\n            \"id\": user_id\n        }\n\n        response = self._perform_request(\n            url='/um/groups/%s/users' % group_id,\n            method='POST',\n            data=json.dumps(data))\n\n        return response", "response": "Adds an existing user to a group."}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef remove_group_user(self, group_id, user_id):\n        response = self._perform_request(\n            url='/um/groups/%s/users/%s' % (group_id, user_id),\n            method='DELETE')\n\n        return response", "response": "Removes a user from a group."}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef list_resources(self, resource_type=None, depth=1):\n        if resource_type is not None:\n            response = self._perform_request(\n                '/um/resources/%s?depth=%s' % (resource_type, str(depth)))\n        else:\n            response = self._perform_request(\n                '/um/resources?depth=' + str(depth))\n\n        return response", "response": "Retrieves a list of all resources, optionally filtered by resource type."}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nretrieve a single resource of a particular type.", "response": "def get_resource(self, resource_type, resource_id, depth=1):\n        \"\"\"\n        Retrieves a single resource of a particular type.\n\n        :param      resource_type: The resource type: datacenter, image,\n                                   snapshot or ipblock.\n        :type       resource_type: ``str``\n\n        :param      resource_id: The unique ID of the resource.\n        :type       resource_id: ``str``\n\n        :param      depth: The depth of the response data.\n        :type       depth: ``int``\n\n        \"\"\"\n        response = self._perform_request(\n            '/um/resources/%s/%s?depth=%s' % (\n                resource_type, resource_id, str(depth)))\n\n        return response"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef get_volume(self, datacenter_id, volume_id):\n        response = self._perform_request(\n            '/datacenters/%s/volumes/%s' % (datacenter_id, volume_id))\n\n        return response", "response": "Retrieves a single volume by ID."}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\nreturns a list of volumes in the data center.", "response": "def list_volumes(self, datacenter_id, depth=1):\n        \"\"\"\n        Retrieves a list of volumes in the data center.\n\n        :param      datacenter_id: The unique ID of the data center.\n        :type       datacenter_id: ``str``\n\n        :param      depth: The depth of the response data.\n        :type       depth: ``int``\n\n        \"\"\"\n        response = self._perform_request(\n            '/datacenters/%s/volumes?depth=%s' % (datacenter_id, str(depth)))\n\n        return response"}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nremove a volume from the data center. :param datacenter_id: The unique ID of the data center. :type datacenter_id: ``str`` :param volume_id: The unique ID of the volume. :type volume_id: ``str``", "response": "def delete_volume(self, datacenter_id, volume_id):\n        \"\"\"\n        Removes a volume from the data center.\n\n        :param      datacenter_id: The unique ID of the data center.\n        :type       datacenter_id: ``str``\n\n        :param      volume_id: The unique ID of the volume.\n        :type       volume_id: ``str``\n\n        \"\"\"\n        response = self._perform_request(\n            url='/datacenters/%s/volumes/%s' % (\n                datacenter_id, volume_id), method='DELETE')\n\n        return response"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef create_volume(self, datacenter_id, volume):\n\n        data = (json.dumps(self._create_volume_dict(volume)))\n\n        response = self._perform_request(\n            url='/datacenters/%s/volumes' % datacenter_id,\n            method='POST',\n            data=data)\n\n        return response", "response": "Creates a new volume within the specified data center."}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef wait_for_completion(self, response, timeout=3600, initial_wait=5, scaleup=10):\n        if not response:\n            return\n        logger = logging.getLogger(__name__)\n        wait_period = initial_wait\n        next_increase = time.time() + wait_period * scaleup\n        if timeout:\n            timeout = time.time() + timeout\n        while True:\n            request = self.get_request(request_id=response['requestId'], status=True)\n\n            if request['metadata']['status'] == 'DONE':\n                break\n            elif request['metadata']['status'] == 'FAILED':\n                raise PBFailedRequest(\n                    'Request {0} failed to complete: {1}'.format(\n                        response['requestId'], request['metadata']['message']),\n                    response['requestId']\n                )\n\n            current_time = time.time()\n            if timeout and current_time > timeout:\n                raise PBTimeoutError('Timed out waiting for request {0}.'.format(\n                    response['requestId']), response['requestId'])\n\n            if current_time > next_increase:\n                wait_period *= 2\n                next_increase = time.time() + wait_period * scaleup\n                scaleup *= 2\n\n            logger.info(\"Request %s is in state '%s'. Sleeping for %i seconds...\",\n                        response['requestId'], request['metadata']['status'], wait_period)\n            time.sleep(wait_period)", "response": "Wait for a request to complete."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\nreturn the given string as a string of bytes. That means in Python2 as a str object, and in Python3 as a bytes object.", "response": "def _b(s, encoding='utf-8'):\n        \"\"\"\n        Returns the given string as a string of bytes. That means in\n        Python2 as a str object, and in Python3 as a bytes object.\n        Raises a TypeError, if it cannot be converted.\n        \"\"\"\n        if six.PY2:\n            # This is Python2\n            if isinstance(s, str):\n                return s\n            elif isinstance(s, unicode):  # noqa, pylint: disable=undefined-variable\n                return s.encode(encoding)\n        else:\n            # And this is Python3\n            if isinstance(s, bytes):\n                return s\n            elif isinstance(s, str):\n                return s.encode(encoding)\n\n        raise TypeError(\"Invalid argument %r for _b()\" % (s,))"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\nconverting Python snake case back to mixed case.", "response": "def _underscore_to_camelcase(value):\n        \"\"\"\n        Convert Python snake case back to mixed case.\n        \"\"\"\n        def camelcase():\n            yield str.lower\n            while True:\n                yield str.capitalize\n\n        c = camelcase()\n        return \"\".join(next(c)(x) if x else '_' for x in value.split(\"_\"))"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef ask(question, options, default):\n    assert default in options\n\n    question += \" ({})? \".format(\"/\".join(o.upper() if o == default else o for o in options))\n    selected = None\n    while selected not in options:\n        selected = input(question).strip().lower()\n        if selected == \"\":\n            selected = default\n        else:\n            if selected not in options:\n                question = \"Please type '{}'{comma} or '{}': \".format(\n                    \"', '\".join(options[:-1]), options[-1],\n                    comma=',' if len(options) > 2 else '',\n                )\n    return selected", "response": "Ask the user a question with a list of allowed answers."}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nfinding a given item in a given list by a matching name.", "response": "def find_item_by_name(list_, namegetter, name):\n    \"\"\"\n    Find an item in a given list by a matching name.\n\n    The search for the name is done in this increasingly relaxed order:\n\n    - exact name match\n    - case-insensitive name match\n    - attribute starts with the name\n    - attribute starts with the name (case insensitive)\n    - name appears in the attribute\n    - name appears in the attribute (case insensitive)\n\n    :param    list_: A list of elements\n    :type     list_: ``list``\n\n    :param    namegetter: Function that returns the name for a given\n                          element in the list\n    :type     namegetter: ``function``\n\n    :param    name: Name to search for\n    :type     name: ``str``\n\n    \"\"\"\n    matching_items = [i for i in list_ if namegetter(i) == name]\n    if not matching_items:\n        prog = re.compile(re.escape(name) + '$', re.IGNORECASE)\n        matching_items = [i for i in list_ if prog.match(namegetter(i))]\n    if not matching_items:\n        prog = re.compile(re.escape(name))\n        matching_items = [i for i in list_ if prog.match(namegetter(i))]\n    if not matching_items:\n        prog = re.compile(re.escape(name), re.IGNORECASE)\n        matching_items = [i for i in list_ if prog.match(namegetter(i))]\n    if not matching_items:\n        prog = re.compile(re.escape(name))\n        matching_items = [i for i in list_ if prog.search(namegetter(i))]\n    if not matching_items:\n        prog = re.compile(re.escape(name), re.IGNORECASE)\n        matching_items = [i for i in list_ if prog.search(namegetter(i))]\n    return matching_items"}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\ncreates a Datacenter dict for the given datacenter.", "response": "def create_datacenter_dict(pbclient, datacenter):\n    \"\"\"\n    Creates a Datacenter dict -- both simple and complex are supported.\n    This is copied from createDatacenter() and uses private methods.\n\n    \"\"\"\n    # pylint: disable=protected-access\n    server_items = []\n    volume_items = []\n    lan_items = []\n    loadbalancer_items = []\n\n    entities = dict()\n\n    properties = {\n        \"name\": datacenter.name,\n        \"location\": datacenter.location,\n    }\n    # Optional Properties\n    if datacenter.description:\n        properties['description'] = datacenter.description\n\n    # Servers\n    if datacenter.servers:\n        for server in datacenter.servers:\n            server_items.append(pbclient._create_server_dict(server))\n        servers = {\n            \"items\": server_items\n        }\n        server_entities = {\n            \"servers\": servers\n        }\n        entities.update(server_entities)\n\n    # Volumes\n    if datacenter.volumes:\n        for volume in datacenter.volumes:\n            volume_items.append(pbclient._create_volume_dict(volume))\n        volumes = {\n            \"items\": volume_items\n        }\n        volume_entities = {\n            \"volumes\": volumes\n        }\n        entities.update(volume_entities)\n\n    # Load Balancers\n    if datacenter.loadbalancers:\n        for loadbalancer in datacenter.loadbalancers:\n            loadbalancer_items.append(\n                pbclient._create_loadbalancer_dict(\n                    loadbalancer\n                    )\n                )\n        loadbalancers = {\n            \"items\": loadbalancer_items\n        }\n        loadbalancer_entities = {\n            \"loadbalancers\": loadbalancers\n        }\n        entities.update(loadbalancer_entities)\n\n    # LANs\n    if 
datacenter.lans:\n        for lan in datacenter.lans:\n            lan_items.append(\n                pbclient._create_lan_dict(lan)\n                )\n        lans = {\n            \"items\": lan_items\n        }\n        lan_entities = {\n            \"lans\": lans\n        }\n        entities.update(lan_entities)\n\n    if not entities:\n        raw = {\n            \"properties\": properties,\n        }\n    else:\n        raw = {\n            \"properties\": properties,\n            \"entities\": entities\n        }\n\n    return raw"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\nparse command line options and create a server/volume composite.", "response": "def main(argv=None):\n    '''Parse command line options and create a server/volume composite.'''\n\n    if argv is None:\n        argv = sys.argv\n    else:\n        sys.argv.extend(argv)\n\n    program_name = os.path.basename(sys.argv[0])\n    program_version = \"v%s\" % __version__\n    program_build_date = str(__updated__)\n    program_version_message = '%%(prog)s %s (%s)' % (program_version,\n                                                     program_build_date)\n    program_shortdesc = __import__('__main__').__doc__.split(\"\\n\")[1]\n    program_license = '''%s\n\n  Created by J. Buchhammer on %s.\n  Copyright 2016 ProfitBricks GmbH. All rights reserved.\n\n  Licensed under the Apache License 2.0\n  http://www.apache.org/licenses/LICENSE-2.0\n\n  Distributed on an \"AS IS\" basis without warranties\n  or conditions of any kind, either express or implied.\n\nUSAGE\n''' % (program_shortdesc, str(__date__))\n    # Setup argument parser\n    parser = ArgumentParser(description=program_license,\n                            formatter_class=RawDescriptionHelpFormatter)\n    parser.add_argument('-u', '--user', dest='user', help='the login name')\n    parser.add_argument('-p', '--password', dest='password',\n                        help='the login password')\n    parser.add_argument('-L', '--Login', dest='loginfile', default=None,\n                        help='the login file to use')\n    parser.add_argument('-i', '--infile', dest='infile', default=None,\n                        required=True, help='the input file name')\n    parser.add_argument('-D', '--DCname', dest='dcname', default=None,\n                        help='new datacenter name')\n# TODO: add/overwrite image password for creation\n#    parser.add_argument('-P', '--imagepassword', dest='imgpassword',\n#                        default=None, 
help='the image password')\n    parser.add_argument('-v', '--verbose', dest=\"verbose\", action=\"count\",\n                        default=0, help=\"set verbosity level [default: %(default)s]\")\n    parser.add_argument('-V', '--version', action='version',\n                        version=program_version_message)\n\n    # Process arguments\n    args = parser.parse_args()\n    global verbose\n    verbose = args.verbose\n\n    if verbose > 0:\n        print(\"Verbose mode on\")\n        print(\"start {} with args {}\".format(program_name, str(args)))\n\n    (user, password) = getLogin(args.loginfile, args.user, args.password)\n    if user is None or password is None:\n        raise ValueError(\"user or password resolved to None\")\n    pbclient = ProfitBricksService(user, password)\n\n    usefile = args.infile\n    print(\"read dc from {}\".format(usefile))\n    dcdef = read_dc_definition(usefile)\n    if verbose > 0:\n        print(\"using DC-DEF {}\".format(json.dumps(dcdef)))\n\n    # setup dc:\n    #     + create empty dc\n    #     + create volumes (unattached), map uuid to servers (custom dict)\n    #     + create servers\n    #     + create nics\n    #     + attach volumes\n\n    if 'custom' in dcdef and 'id' in dcdef['custom']:\n        dc_id = dcdef['custom']['id']\n        print(\"using existing DC w/ id {}\".format(str(dc_id)))\n    else:\n        if args.dcname is not None:\n            print(\"Overwrite DC name w/ '{}'\".format(args.dcname))\n            dcdef['properties']['name'] = args.dcname\n        dc = getDatacenterObject(dcdef)\n        # print(\"create DC {}\".format(str(dc)))\n        response = pbclient.create_datacenter(dc)\n        dc_id = response['id']\n        if 'custom' not in dcdef:\n            dcdef['custom'] = dict()\n        dcdef['custom']['id'] = dc_id\n        result = wait_for_request(pbclient, response['requestId'])\n        print(\"wait loop returned {}\".format(result))\n        tmpfile = usefile+\".tmp_postdc\"\n        
write_dc_definition(dcdef, tmpfile)\n\n    requests = []\n    print(\"create Volumes {}\".format(str(dc)))\n    # we do NOT consider dangling volumes, only server-attached ones\n    for server in dcdef['entities']['servers']['items']:\n        print(\"- server {}\".format(server['properties']['name']))\n        if 'volumes' not in server['entities']:\n            print(\" server {} has no volumes\".format(server['properties']['name']))\n            continue\n        for volume in server['entities']['volumes']['items']:\n            if 'custom' in volume and 'id' in volume['custom']:\n                vol_id = volume['custom']['id']\n                print(\"using existing volume w/ id {}\".format(str(vol_id)))\n            else:\n                dcvol = getVolumeObject(volume)\n                print(\"OBJ: {}\".format(str(dcvol)))\n                response = pbclient.create_volume(dc_id, dcvol)\n                volume.update({'custom': {'id': response['id']}})\n                requests.append(response['requestId'])\n        # end for(volume)\n    # end for(server)\n    if requests:\n        result = wait_for_requests(pbclient, requests, initial_wait=10, scaleup=15)\n        print(\"wait loop returned {}\".format(str(result)))\n        tmpfile = usefile+\".tmp_postvol\"\n        write_dc_definition(dcdef, tmpfile)\n    else:\n        print(\"all volumes existed already\")\n\n    requests = []\n    print(\"create Servers {}\".format(str(dc)))\n    # we do NOT consider dangling volumes, only server-attached ones\n    for server in dcdef['entities']['servers']['items']:\n        print(\"- server {}\".format(server['properties']['name']))\n        if 'custom' in server and 'id' in server['custom']:\n            srv_id = server['custom']['id']\n            print(\"using existing server w/ id {}\".format(str(srv_id)))\n        else:\n            dcsrv = getServerObject(server)\n            print(\"OBJ: {}\".format(str(dcsrv)))\n            response = 
pbclient.create_server(dc_id, dcsrv)\n            server.update({'custom': {'id': response['id']}})\n            requests.append(response['requestId'])\n    # end for(server)\n    if requests:\n        result = wait_for_requests(pbclient, requests, initial_wait=10, scaleup=15)\n        print(\"wait loop returned {}\".format(str(result)))\n        tmpfile = usefile+\".tmp_postsrv\"\n        write_dc_definition(dcdef, tmpfile)\n    else:\n        print(\"all servers existed already\")\n\n# TODO: only do this if we have lan entities\n    requests = []\n    # Huuh, looks like we get unpredictable order for LANs!\n    # Nope, order of creation determines the LAN id,\n    # thus we better wait for each request\n    print(\"create LANs {}\".format(str(dc)))\n    for lan in dcdef['entities']['lans']['items']:\n        print(\"- lan {}\".format(lan['properties']['name']))\n        dclan = getLANObject(lan)\n        print(\"OBJ: {}\".format(str(dclan)))\n        response = pbclient.create_lan(dc_id, dclan)\n        lan.update({'custom': {'id': response['id']}})\n        result = wait_for_request(pbclient, response['requestId'])\n        print(\"wait loop returned {}\".format(str(result)))\n    # end for(lan)\n    tmpfile = usefile+\".tmp_postlan\"\n    write_dc_definition(dcdef, tmpfile)\n\n    requests = []\n    # Attention:\n    # NICs appear in OS in the order, they are created.\n    # But DCD rearranges the display by ascending MAC addresses.\n    # This does not change the OS order.\n    # MAC may not be available from create response,\n    # thus we wait for each request :-(\n    print(\"create NICs {}\".format(str(dc)))\n    for server in dcdef['entities']['servers']['items']:\n        print(\"- server {}\".format(server['properties']['name']))\n        srv_id = server['custom']['id']\n        if 'nics' not in server['entities']:\n            print(\" server {} has no NICs\".format(server['properties']['name']))\n            continue\n        macmap = dict()\n        
for nic in server['entities']['nics']['items']:\n            dcnic = getNICObject(nic)\n            response = pbclient.create_nic(dc_id, srv_id, dcnic)\n            # print(\"dcnic response {}\".format(str(response)))\n            # mac = response['properties']['mac'] # we don't get it here !?\n            nic_id = response['id']\n            result = wait_for_request(pbclient, response['requestId'])\n            print(\"wait loop returned {}\".format(str(result)))\n            response = pbclient.get_nic(dc_id, srv_id, nic_id, 2)\n            mac = response['properties']['mac']\n            print(\"dcnic has MAC {} for {}\".format(mac, nic_id))\n            macmap[mac] = nic_id\n        # end for(nic)\n        macs = sorted(macmap)\n        print(\"macs will be displayed by DCD in the following order:\")\n        for mac in macs:\n            print(\"mac {} -> id{}\".format(mac, macmap[mac]))\n    # end for(server)\n    tmpfile = usefile+\".tmp_postnic\"\n    write_dc_definition(dcdef, tmpfile)\n\n    requests = []\n    # don't know if we get a race here too, so better wait for each request :-/\n    print(\"attach volumes {}\".format(str(dc)))\n    for server in dcdef['entities']['servers']['items']:\n        print(\"- server {}\".format(server['properties']['name']))\n        if 'volumes' not in server['entities']:\n            print(\" server {} has no volumes\".format(server['properties']['name']))\n            continue\n        srv_id = server['custom']['id']\n        for volume in server['entities']['volumes']['items']:\n            print(\"OBJ: {}\".format(volume['properties']['name']))\n            response = pbclient.attach_volume(dc_id, srv_id, volume['custom']['id'])\n            result = wait_for_request(pbclient, response['requestId'])\n            print(\"wait loop returned {}\".format(str(result)))\n        # end for(volume)\n    # end for(server)\n    tmpfile = usefile+\".tmp_postatt\"\n    write_dc_definition(dcdef, tmpfile)\n\n    # TODO: do we 
need to set boot volume for each server?\n    # looks like it's working without\n\n    return 0"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\ngetting info of servers of a data center", "response": "def getServerInfo(pbclient=None, dc_id=None):\n    ''' gets info of servers of a data center'''\n    if pbclient is None:\n        raise ValueError(\"argument 'pbclient' must not be None\")\n    if dc_id is None:\n        raise ValueError(\"argument 'dc_id' must not be None\")\n    # list of all found server's info\n    server_info = []\n    # depth 1 is enough for props/meta\n    servers = pbclient.list_servers(dc_id, 1)\n    for server in servers['items']:\n        props = server['properties']\n        info = dict(id=server['id'], name=props['name'],\n                    state=server['metadata']['state'],\n                    vmstate=props['vmState'])\n        server_info.append(info)\n    # end for(servers)\n    return server_info"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef getServerStates(pbclient=None, dc_id=None, serverid=None, servername=None):\n    ''' gets states of a server'''\n    if pbclient is None:\n        raise ValueError(\"argument 'pbclient' must not be None\")\n    if dc_id is None:\n        raise ValueError(\"argument 'dc_id' must not be None\")\n    server = None\n    if serverid is None:\n        if servername is None:\n            raise ValueError(\"one of 'serverid' or 'servername' must be specified\")\n        # so, arg.servername is set (to whatever)\n        server_info = select_where(getServerInfo(pbclient, dc_id),\n                                   ['id', 'name', 'state', 'vmstate'],\n                                   name=servername)\n        if len(server_info) > 1:\n            raise NameError(\"ambiguous server name '{}'\".format(servername))\n        if len(server_info) == 1:\n            server = server_info[0]\n    else:\n        # get by ID may also fail if it's removed\n        # in this case, catch exception (message 404) and be quiet for a while\n        # unfortunately this has changed from Py2 to Py3\n        try:\n            server_info = pbclient.get_server(dc_id, serverid, 1)\n            server = dict(id=server_info['id'],\n                          name=server_info['properties']['name'],\n                          state=server_info['metadata']['state'],\n                          vmstate=server_info['properties']['vmState'])\n        except Exception:\n            ex = sys.exc_info()[1]\n            if ex.args[0] is not None and ex.args[0] == 404:\n                print(\"Server w/ ID {} not found\".format(serverid))\n                server = None\n            else:\n                raise ex\n        # end try/except\n    # end if/else(serverid)\n    return server", "response": "Returns a dict with the id, name, state and vmstate of a single server, looked up by ID or by name, or None if no matching server is found."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\nwaiting for a server to reach a defined state", "response": "def wait_for_server(pbclient=None, dc_id=None, serverid=None,\n                    indicator='state', state='AVAILABLE', timeout=300):\n    '''\n    wait for a server/VM to reach a defined state for a specified time\n    indicator := {state|vmstate} specifies if server or VM state is tested\n    state specifies the status the indicator should have\n    '''\n    if pbclient is None:\n        raise ValueError(\"argument 'pbclient' must not be None\")\n    if dc_id is None:\n        raise ValueError(\"argument 'dc_id' must not be None\")\n    if serverid is None:\n        raise ValueError(\"argument 'serverid' must not be None\")\n    total_sleep_time = 0\n    seconds = 5\n    # stays None if the timeout is not positive or the server is never found\n    server = None\n    while total_sleep_time < timeout:\n        time.sleep(seconds)\n        total_sleep_time += seconds\n        if total_sleep_time == 60:\n            # Increase polling interval after one minute\n            seconds = 10\n        elif total_sleep_time == 600:\n            # Increase polling interval after 10 minutes\n            seconds = 20\n        server = getServerStates(pbclient, dc_id, serverid)\n        # getServerStates returns None while the server cannot be found\n        if server is not None and server[indicator] == state:\n            break\n    # end while(total_sleep_time)\n    return server"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\nwaiting for the data center to become available.", "response": "def wait_for_datacenter(client, data_center_id):\n    '''\n    Poll the data center to become available (for the next provisioning job)\n    '''\n    total_sleep_time = 0\n    seconds = 5\n    while True:\n        state = client.get_datacenter(data_center_id)['metadata']['state']\n        if verbose:\n            print(\"datacenter is {}\".format(state))\n        if state == \"AVAILABLE\":\n            break\n        time.sleep(seconds)\n        total_sleep_time += seconds\n        if total_sleep_time == 60:\n            # Increase polling interval after one minute\n            seconds = 10\n        elif total_sleep_time == 600:\n            # Increase polling interval after 10 minutes\n            seconds = 20"}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\nwrites user/passwd to login file or get them from file. This method is not Py3 safe (byte vs. str)", "response": "def getLogin(filename, user, passwd):\n    '''\n    write user/passwd to login file or get them from file.\n    This method is not Py3 safe (byte vs. str)\n    '''\n    if filename is None:\n        return (user, passwd)\n    if os.path.exists(filename):\n        print(\"Using file {} for Login\".format(filename))\n        with open(filename, \"r\") as loginfile:\n            encoded_cred = loginfile.read()\n            decoded_cred = b64decode(encoded_cred)\n            login = decoded_cred.split(':', 1)\n            return (login[0], login[1])\n    else:\n        if user is None or passwd is None:\n            raise ValueError(\"user and password must not be None\")\n        print(\"Writing file {} for Login\".format(filename))\n        with open(filename, \"w\") as loginfile:\n            encoded_cred = b64encode(user+\":\"+passwd)\n            loginfile.write(encoded_cred)\n        return (user, passwd)"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef wait_for_request(pbclient, request_id,\n                     timeout=0, initial_wait=5, scaleup=10):\n    '''\n    Waits for a request to finish until timeout.\n    timeout==0 is interpreted as infinite wait time.\n    Returns a tuple (return code, request status, message) where return code\n    0  : request successful\n    1  : request failed\n    -1 : timeout exceeded\n    The wait_period is increased every scaleup steps to adjust for long\n    running requests.\n    '''\n    total_wait = 0\n    wait_period = initial_wait\n    next_scaleup = scaleup * wait_period\n    wait = True\n    while wait:\n        request_status = pbclient.get_request(request_id, status=True)\n        state = request_status['metadata']['status']\n        if state == \"DONE\":\n            return(0, state, request_status['metadata']['message'])\n        if state == 'FAILED':\n            return(1, state, request_status['metadata']['message'])\n        if verbose > 0:\n            print(\"Request '{}' is in state '{}'. Sleeping for {} seconds...\"\n                  .format(request_id, state, wait_period))\n        sleep(wait_period)\n        total_wait += wait_period\n        if timeout != 0 and total_wait > timeout:\n            wait = False\n        next_scaleup -= wait_period\n        if next_scaleup == 0:\n            wait_period += initial_wait\n            next_scaleup = scaleup * wait_period\n            if verbose > 0:\n                print(\"scaling up wait_period to {}, next change in {} seconds\"\n                      .format(wait_period, next_scaleup))\n    # end while(wait)\n    return(-1, state, \"request not finished before timeout\")", "response": "Waits for a request to finish."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef main(argv=None):\n    '''Parse command line options and create a server/volume composite.'''\n\n    if argv is None:\n        argv = sys.argv\n    else:\n        sys.argv.extend(argv)\n\n    program_name = os.path.basename(sys.argv[0])\n    program_version = \"v%s\" % __version__\n    program_build_date = str(__updated__)\n    program_version_message = '%%(prog)s %s (%s)' % (program_version,\n                                                     program_build_date)\n    program_shortdesc = __import__('__main__').__doc__.split(\"\\n\")[1]\n    program_license = '''%s\n\n  Created by J. Buchhammer on %s.\n  Copyright 2016 ProfitBricks GmbH. All rights reserved.\n\n  Licensed under the Apache License 2.0\n  http://www.apache.org/licenses/LICENSE-2.0\n\n  Distributed on an \"AS IS\" basis without warranties\n  or conditions of any kind, either express or implied.\n\nUSAGE\n''' % (program_shortdesc, str(__date__))\n\n    try:\n        # Setup argument parser\n        parser = ArgumentParser(description=program_license,\n                                formatter_class=RawDescriptionHelpFormatter)\n        parser.add_argument('-u', '--user', dest='user', help='the login name')\n        parser.add_argument('-p', '--password', dest='password',\n                            help='the login password')\n        parser.add_argument('-L', '--Login', dest='loginfile', default=None,\n                            help='the login file to use')\n        parser.add_argument('-d', '--datacenterid', dest='datacenterid',\n                            required=True, default=None,\n                            help='datacenter of the new server')\n        parser.add_argument('-l', '--lanid', dest='lanid', required=True,\n                            default=None, help='LAN of the new server')\n        parser.add_argument('-n', '--name', dest='servername',\n                        
    default=\"SRV_\"+datetime.now().isoformat(),\n                            help='name of the new server')\n        parser.add_argument('-c', '--cores', dest='cores', type=int,\n                            default=2, help='CPU cores')\n        parser.add_argument('-r', '--ram', dest='ram', type=int, default=4,\n                            help='RAM in GB')\n        parser.add_argument('-s', '--storage', dest='storage', type=int,\n                            default=4, help='storage in GB')\n        parser.add_argument('-b', '--boot', dest='bootdevice', default=\"HDD\",\n                            help='boot device')\n        parser.add_argument('-i', '--imageid', dest='imageid', default=None,\n                            help='installation image')\n        parser.add_argument('-P', '--imagepassword', dest='imgpassword',\n                            default=None, help='the image password')\n        parser.add_argument('-v', '--verbose', dest=\"verbose\", action=\"count\",\n                            help=\"set verbosity level [default: %(default)s]\")\n        parser.add_argument('-V', '--version', action='version',\n                            version=program_version_message)\n\n        # Process arguments\n        args = parser.parse_args()\n        global verbose\n        verbose = args.verbose\n        dc_id = args.datacenterid\n        lan_id = args.lanid\n        servername = args.servername\n\n        if verbose > 0:\n            print(\"Verbose mode on\")\n            print(\"start {} with args {}\".format(program_name, str(args)))\n\n        # Test images (location de/fra)\n        # CDROM: 7fc885b3-c9a6-11e5-aa10-52540005ab80   # debian-8.3.0-amd64-netinst.iso\n        # HDD:   28007a6d-c88a-11e5-aa10-52540005ab80   # CentOS-7-server-2016-02-01\n        hdimage = args.imageid\n        cdimage = None\n        if args.bootdevice == \"CDROM\":\n            hdimage = None\n            cdimage = args.imageid\n        print(\"using boot device {} with image 
{}\"\n              .format(args.bootdevice, args.imageid))\n\n        (user, password) = getLogin(args.loginfile, args.user, args.password)\n        if user is None or password is None:\n            raise ValueError(\"user or password resolved to None\")\n        pbclient = ProfitBricksService(user, password)\n\n        first_nic = NIC(name=\"local\", ips=[], dhcp=True, lan=lan_id)\n        volume = Volume(name=servername+\"-Disk\", size=args.storage,\n                        image=hdimage, image_password=args.imgpassword)\n        server = Server(name=servername, cores=args.cores, ram=args.ram*1024,\n                        create_volumes=[volume], nics=[first_nic],\n                        boot_cdrom=cdimage)\n        print(\"creating server..\")\n        if verbose > 0:\n            print(\"SERVER: {}\".format(str(server)))\n        response = pbclient.create_server(dc_id, server)\n        print(\"wait for provisioning..\")\n        wait_for_request(pbclient, response[\"requestId\"])\n        server_id = response['id']\n        print(\"Server provisioned with ID {}\".format(server_id))\n        nics = pbclient.list_nics(dc_id, server_id, 1)\n        # server should have exactly one nic, but we only test empty nic list\n        if not nics['items']:\n            raise CLIError(\"No NICs found for newly created server {}\"\n                           .format(server_id))\n        nic0 = nics['items'][0]\n        if verbose > 0:\n            print(\"NIC0: {}\".format(str(nic0)))\n        (nic_id, nic_mac) = (nic0['id'], nic0['properties']['mac'])\n        print(\"NIC of new Server has ID {} and MAC {}\".format(nic_id, nic_mac))\n        print(\"{} finished w/o errors\".format(program_name))\n        return 0\n\n    except KeyboardInterrupt:\n        # handle keyboard interrupt #\n        return 0\n    except Exception:\n        traceback.print_exc()\n        sys.stderr.write(\"\\n\" + program_name + \":  for help use --help\\n\")\n        return 2", "response": 
"Parse command line options and create a server with one NIC and one volume in a ProfitBricks data center."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef get_dc_inventory(pbclient, dc=None):\n    ''' gets inventory of one data center'''\n    if pbclient is None:\n        raise ValueError(\"argument 'pbclient' must not be None\")\n    if dc is None:\n        raise ValueError(\"argument 'dc' must not be None\")\n    dc_inv = []   # inventory list to return\n    dcid = dc['id']\n    # dc_data contains dc specific columns\n    dc_data = [dcid, dc['properties']['name'], dc['properties']['location']]\n    # first get the servers\n    # this will build a hash to relate volumes to servers later\n    # depth 3 is enough to get into volume/nic level plus details\n    servers = pbclient.list_servers(dcid, 3)\n    print(\"found %i servers in data center %s\" % (len(servers['items']), dc['properties']['name']))\n    if verbose > 2:\n        print(str(servers))\n    # this will build a hash to relate volumes to servers later\n    bound_vols = dict()   # hash volume-to-server relations\n    for server in servers['items']:\n        if verbose > 2:\n            print(\"SERVER: %s\" % str(server))\n        serverid = server['id']\n        # server_data contains server specific columns for later output\n        server_data = [\n            server['type'], serverid, server['properties']['name'],\n            server['metadata']['state']\n        ]\n        # OS is determined by boot device (volume||cdrom), not a server property.\n        # Might even be unspecified\n        bootOS = \"NONE\"\n        bootdev = server['properties']['bootVolume']\n        if bootdev is None:\n            bootdev = server['properties']['bootCdrom']\n            print(\"server %s has boot device %s\" % (serverid, \"CDROM\"))\n        if bootdev is None:\n            print(\"server %s has NO boot device\" % (serverid))\n        else:\n            bootOS = bootdev['properties']['licenceType']\n        server_data += [bootOS, 
server['properties']['cores'], server['properties']['ram']]\n        server_vols = server['entities']['volumes']['items']\n        n_volumes = len(server_vols)\n        total_disk = 0\n        licence_type = \"\"\n        for vol in server_vols:\n            total_disk += vol['properties']['size']\n            licence_type = str(vol['properties']['licenceType'])\n            bound_vols[vol['id']] = serverid\n            if verbose:\n                print(\"volume %s is connected to %s w/ OS %s\" % (\n                    vol['id'], bound_vols[vol['id']], licence_type))\n        server_nics = server['entities']['nics']['items']\n        n_nics = len(server_nics)\n        server_data += [\n            n_nics, n_volumes, total_disk, \"\",\n            server['metadata']['createdDate'], server['metadata']['lastModifiedDate']\n        ]\n        dc_inv.append(dc_data + server_data)\n    # end for(servers)\n\n    # and now the volumes...\n    volumes = pbclient.list_volumes(dcid, 2)   # depth 2 gives max. details\n    for volume in volumes['items']:\n        if verbose > 2:\n            print(\"VOLUME: %s\" % str(volume))\n        volid = volume['id']\n        vol_data = [\n            volume['type'], volid, volume['properties']['name'], volume['metadata']['state'],\n            volume['properties']['licenceType'], \"\", \"\", \"\", \"\", volume['properties']['size']\n        ]\n        connect = 'NONE'\n        if volid in bound_vols:\n            connect = bound_vols[volid]\n        vol_data += [\n            connect, volume['metadata']['createdDate'], volume['metadata']['lastModifiedDate']\n        ]\n        dc_inv.append(dc_data + vol_data)\n    # end for(volumes)\n    return dc_inv", "response": "gets inventory of one data center"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef get_dc_network(pbclient, dc=None):\n    ''' gets inventory of one data center'''\n    if pbclient is None:\n        raise ValueError(\"argument 'pbclient' must not be None\")\n    if dc is None:\n        raise ValueError(\"argument 'dc' must not be None\")\n    print(\"getting networks..\")\n    dcid = dc['id']\n    # dc_data contains dc specific columns\n    dc_data = [dcid, dc['properties']['name'], dc['properties']['location']]\n    lbs = pbclient.list_loadbalancers(dcid, 2)\n    # build lookup hash for loadbalancer's ID->name\n    lbnames = dict([(lb['id'], lb['properties']['name']) for lb in lbs['items']])\n    if verbose > 2:\n        print(\"LBs: %s\" % (str(lbs)))\n    lans = pbclient.list_lans(dcid, 3)\n    lan_inv = []\n    # lookup hash for server's ID->name\n    servernames = dict()\n    for lan in lans['items']:\n        if verbose > 1:\n            print(\"LAN: %s\" % str(lan))\n        lan_data = dc_data + [\n            \"LAN \"+lan['id'], lan['properties']['name'], lan['properties']['public'],\n            lan['metadata']['state']\n        ]\n        nics = lan['entities']['nics']['items']\n        lan_data.append(len(nics))\n        if nics:\n            for nic in nics:\n                nic_props = nic['properties']\n                # get the serverid of this nic by href\n                # !!! HUUUUH this might also be a loadbalancer ID,\n                # although it's '/servers/<id>/...' 
!!!\n                serverid = re.sub(r'^.*servers/([^/]+)/nics.*', r'\\1', nic['href'])\n                if serverid in lbnames:\n                    servertype = \"LB\"\n                    servername = lbnames[serverid]\n                    print(\"server entry for %s is LOADBALANCER %s\" % (serverid, servername))\n                else:\n                    servertype = \"Server\"\n                    if serverid not in servernames:\n                        if verbose:\n                            print(\"add server entry for %s\" % serverid)\n                        server = pbclient.get_server(dcid, serverid, 0)\n                        servernames[serverid] = server['properties']['name']\n                    servername = servernames[serverid]\n                # end if/else(serverid)\n                ips = [str(ip) for ip in nic_props['ips']]\n                nic_data = [\n                    nic['id'], nic_props['mac'], nic_props['dhcp'], ips, nic_props['name'],\n                    nic_props['firewallActive'], servertype, serverid, servername\n                ]\n                lan_inv.append(lan_data+nic_data)\n            # end for(nics)\n        else:\n            lan_inv.append(lan_data)\n    # end for(lans)\n    return lan_inv", "response": "gets the network inventory (LANs and the NICs attached to them) of one data center"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef main(argv=None):                # IGNORE:C0111\n    '''Command line options.'''\n\n    if argv is None:\n        argv = sys.argv\n    else:\n        sys.argv.extend(argv)\n\n    program_name = os.path.basename(sys.argv[0])\n    program_version = \"v%s\" % __version__\n    program_build_date = str(__updated__)\n    program_version_message = '%%(prog)s %s (%s)' % (program_version, program_build_date)\n    program_shortdesc = __import__('__main__').__doc__.split(\"\\n\")[1]\n    program_license = '''%s\n\n  Created by J.Buchhammer on %s.\n  Copyright 2016 ProfitBricks GmbH. All rights reserved.\n\n  Licensed under the Apache License 2.0\n  http://www.apache.org/licenses/LICENSE-2.0\n\n  Distributed on an \"AS IS\" basis without warranties\n  or conditions of any kind, either express or implied.\n\nUSAGE\n''' % (program_shortdesc, str(__date__))\n\n    try:\n        # Setup argument parser\n        parser = ArgumentParser(\n            description=program_license,\n            formatter_class=RawDescriptionHelpFormatter)\n        parser.add_argument(\n            '-u', '--user', dest='user', required=True, help='the login name')\n        parser.add_argument(\n            '-p', '--password', dest='password', help='the login password')\n        parser.add_argument(\n            '-d', '--datacenter', '--datacenterid', dest='datacenterid', nargs='?', const='*',\n            help='show server/storage of datacenter(s)')\n        parser.add_argument(\n            '-i', '--image', dest='show_images', action=\"store_true\",\n            help='show images and snapshots')\n        parser.add_argument(\n            '-b', '--ipblock', dest='show_ipblocks', action=\"store_true\",\n            help='show reserved IP blocks')\n        parser.add_argument(\n            '-n', '--network', dest='show_networks', action=\"store_true\",\n            help='show network 
assignments')\n#        parser.add_argument(\n#            '-r', '--request', dest='show_requests', action=\"store_true\",\n#            help='show requests')\n        parser.add_argument(\n            \"-v\", \"--verbose\", dest=\"verbose\", action=\"count\", default=0,\n            help=\"set verbosity level [default: %(default)s]\")\n        parser.add_argument(\n            '-V', '--version', action='version', version=program_version_message)\n\n        # Process arguments\n        args = parser.parse_args()\n        global verbose\n        verbose = args.verbose   # this is a global to be used in methods\n        user = args.user\n        password = args.password\n        datacenterid = args.datacenterid\n\n        print(\"Welcome to PB-API %s\\n\" % user)\n        if password is None:\n            password = getpass()\n        if verbose > 0:\n            print(\"Verbose mode on\")\n            print(\"using python \", sys.version_info)\n\n        pbclient = ProfitBricksService(user, password)\n\n        if datacenterid is not None:\n            datacenters = {}\n            if datacenterid == '*':\n                # the default depth=1 is sufficient, higher values don't provide more details\n                datacenters = pbclient.list_datacenters()\n            else:\n                datacenters['items'] = []\n                datacenters['items'] = [pbclient.get_datacenter(datacenterid, 1)]\n            if verbose > 1:\n                print(pp(datacenters))\n            print(\"retrieved %i datacenters \" % len(datacenters['items']))\n\n            # dump inventory to file\n            with open(\"pb_datacenter_inventory.csv\", 'w') as csvfile:\n                csvwriter = csv.writer(csvfile, delimiter=';', lineterminator='\\n')\n                csvwriter.writerow([\n                    'DCID', 'DCName', 'Loc', 'RscType', 'RscID', 'RscName', 'State', 'LicType',\n                    'Cores', 'RAM', '# NICs', '# Volumes', '(Total) Storage', 'Connected to',\n  
                  'Created', 'Modified'\n                ])\n                for dc in datacenters['items']:\n                    try:\n                        dc_inv = get_dc_inventory(pbclient, dc)\n                        if verbose:\n                            print(\"DC %s has %i inventory entries\" % (dc['id'], len(dc_inv)))\n                        for row in dc_inv:\n                            csvwriter.writerow(row)\n                    except Exception:\n                        traceback.print_exc()\n                        exit(2)\n                # end for(datacenters)\n\n        if args.show_images:\n            with open(\"pb_datacenter_images.csv\", 'w') as csvfile:\n                csvwriter = csv.writer(csvfile, delimiter=';', lineterminator='\\n')\n                csvwriter.writerow([\n                    'Visibility', 'Loc', 'RscType', 'SubType', 'RscID', 'RscName',\n                    'State', 'LicType', 'Size', 'Created', 'Modified'\n                ])\n                img_inv = get_images(pbclient)\n                for row in img_inv:\n                    csvwriter.writerow(row)\n                snap_inv = get_snapshots(pbclient)\n                for row in snap_inv:\n                    csvwriter.writerow(row)\n\n        if args.show_ipblocks:\n            with open(\"pb_datacenter_ipblocks.csv\", 'w') as csvfile:\n                csvwriter = csv.writer(csvfile, delimiter=';', lineterminator='\\n')\n                csvwriter.writerow([\n                    'Loc', 'RscType', 'RscID', 'State', 'Size', 'IP addresses'])\n                ipblocks = get_ipblocks(pbclient)\n                for row in ipblocks:\n                    csvwriter.writerow(row)\n\n        # file is automatically closed after with block\n        if args.show_networks:\n            # the default depth=1 is sufficient, higher values don't provide more details\n            datacenters = pbclient.list_datacenters()\n            print(\"retrieved %i datacenters \" % 
len(datacenters['items']))\n            with open(\"pb_datacenter_networks.csv\", 'w') as csvfile:\n                csvwriter = csv.writer(csvfile, delimiter=';', lineterminator='\\n')\n                csvwriter.writerow([\n                    'DCID', 'DCName', 'Loc',\n                    'LAN ID', 'LAN name', 'public', 'State', '# NICs',\n                    'NIC ID', 'MAC address', 'DHCP', 'IP(s)', 'NIC name', 'Firewall',\n                    'Connected to', 'ID', 'Name'])\n\n                for dc in datacenters['items']:\n                    try:\n                        dc_net = get_dc_network(pbclient, dc)\n                        if verbose:\n                            print(\"DC %s has %i network entries\" % (dc['id'], len(dc_net)))\n                        for row in dc_net:\n                            csvwriter.writerow(row)\n                    except Exception:\n                        traceback.print_exc()\n                        exit(2)\n                # end for(datacenters)\n\n        # just for fun:\n#         if args.show_requests:\n#             get_requests(pbclient)\n        print(\"%s finished w/o errors\" % program_name)\n        return 0\n    except KeyboardInterrupt:\n        # handle keyboard interrupt\n        return 0\n    except Exception:\n        traceback.print_exc()\n        sys.stderr.write(\"\\n\" + program_name + \":  for help use --help\\n\")\n        return 2", "response": "Command-line entry point that logs in to the ProfitBricks API and writes datacenter, image, IP block, and network inventories to CSV files."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef main(argv=None):\n    '''Parse command line options and dump a datacenter to snapshots and file.'''\n\n    if argv is None:\n        argv = sys.argv\n    else:\n        sys.argv.extend(argv)\n\n    program_name = os.path.basename(sys.argv[0])\n    program_version = \"v%s\" % __version__\n    program_build_date = str(__updated__)\n    program_version_message = '%%(prog)s %s (%s)' % (program_version, program_build_date)\n    program_shortdesc = __import__('__main__').__doc__.split(\"\\n\")[1]\n    program_license = '''%s\n\n  Created by J. Buchhammer on %s.\n  Copyright 2016 ProfitBricks GmbH. All rights reserved.\n\n  Licensed under the Apache License 2.0\n  http://www.apache.org/licenses/LICENSE-2.0\n\n  Distributed on an \"AS IS\" basis without warranties\n  or conditions of any kind, either express or implied.\n\nUSAGE\n''' % (program_shortdesc, str(__date__))\n\n    try:\n        # Setup argument parser\n        parser = ArgumentParser(description=program_license,\n                                formatter_class=RawDescriptionHelpFormatter)\n        parser.add_argument('-u', '--user', dest='user', help='the login name')\n        parser.add_argument('-p', '--password', dest='password',\n                            help='the login password')\n        parser.add_argument('-L', '--Login', dest='loginfile', default=None,\n                            help='the login file to use')\n        parser.add_argument('-d', '--datacenterid', dest='dc_id',\n                            required=True, default=None,\n                            help='datacenter ID of the server')\n        parser.add_argument('-o', '--outfile', dest='outfile',\n                            default='dc-def_'+datetime.now().strftime('%Y-%m-%d_%H%M%S'),\n                            help='the output file name')\n        parser.add_argument('-S', '--Stopalways', dest='stopalways', 
action='store_true',\n                            help='power off even when VM is running')\n        parser.add_argument('-v', '--verbose', dest=\"verbose\", action=\"count\",\n                            default=0, help=\"set verbosity level [default: %(default)s]\")\n        parser.add_argument('-V', '--version', action='version',\n                            version=program_version_message)\n\n        # Process arguments\n        args = parser.parse_args()\n        global verbose\n        verbose = args.verbose\n\n        if verbose > 0:\n            print(\"Verbose mode on\")\n            print(\"start {} with args {}\".format(program_name, str(args)))\n\n        outfile = args.outfile\n        if outfile.endswith(\".json\"):\n            # splitext returns a (root, ext) tuple; keep only the root\n            outfile = os.path.splitext(outfile)[0]\n        print(\"Using output file base name '{}'\".format(outfile))\n\n        (user, password) = getLogin(args.loginfile, args.user, args.password)\n        if user is None or password is None:\n            raise ValueError(\"user or password resolved to None\")\n        pbclient = ProfitBricksService(user, password)\n\n        dc_id = args.dc_id\n\n        # first get all server's VM and OS state to see if we can start\n        srv_info = getServerInfo(pbclient, dc_id)\n        srvon = 0\n        for server in srv_info:\n            if server['vmstate'] != 'SHUTOFF':\n                print(\"VM {} is in state {}, but should be SHUTOFF\"\n                      .format(server['name'], server['vmstate']))\n                srvon += 1\n        # end for(srv_info)\n        if srvon > 0 and not args.stopalways:\n            print(\"shutdown running OS before trying again\")\n            return 1\n        # now power off all VMs before starting the snapshots\n        for server in srv_info:\n            controlServerState(pbclient, dc_id, server['id'], action='POWEROFF')\n\n        # now let's go\n        dcdef = pbclient.get_datacenter(dc_id, 5)\n        print(\"starting dump of datacenter 
{}\".format(dcdef['properties']['name']))\n        dcdef_file = outfile+'_source.json'\n        print(\"write source dc to {}\".format(dcdef_file))\n        write_dc_definition(dcdef, dcdef_file)\n        print(\"get existing Snapshots\")\n        # first get existing snapshots\n        known_snapshots = dict()\n        snapshots = pbclient.list_snapshots()\n        for snap in snapshots['items']:\n            print(\"SNAP : {}\".format(json.dumps(snap)))\n            known_snapshots[snap['properties']['name']] = snap['id']\n        print(\"create Snapshots, this may take a while ..\")\n        # we do NOT consider dangling volumes, only server-attached ones\n        vol_snapshots = dict()   # map volume id==snapshot name snapshot id\n        for server in dcdef['entities']['servers']['items']:\n            print(\"- server {}\".format(server['properties']['name']))\n            if 'volumes' not in server['entities']:\n                print(\" server {} has no volumes\"\n                      .format(server['properties']['name']))\n                continue\n            # The volumes are attached by order of creation\n            # Thus we must sort them to keep the order in the clone\n            print(\"setting volume order by deviceNumber\")\n            volumes = server['entities']['volumes']['items']\n            new_order = sorted(volumes, key=lambda vol: vol['properties']['deviceNumber'])\n            server['entities']['volumes']['items'] = new_order\n            for volume in server['entities']['volumes']['items']:\n                vol_id = volume['id']   # this will be the name too\n                if vol_id in known_snapshots:\n                    print(\"use existing snapshot {} of volume {}\"\n                          .format(vol_id, volume['properties']['name']))\n                    vol_snapshots[vol_id] = known_snapshots[vol_id]\n                else:\n                    print(\"taking snapshot {} of volume {}\"\n                          
.format(vol_id, volume['properties']['name']))\n                    response = pbclient.create_snapshot(dc_id, vol_id, vol_id,\n                                                        \"auto-created by pb_snapshotDatacenter\")\n                    # response has no request id, need to check metadata state (BUSY, AVAILABLE..)\n                    vol_snapshots[vol_id] = response['id']\n                    print(\"snapshot in progress: {}\".format(str(response)))\n            # end for(volume)\n        # end for(server)\n        print(\"Waiting for snapshots to complete\")\n        snapdone = dict()\n        while len(snapdone) != len(vol_snapshots):\n            sleep(10)\n            for snap_id in vol_snapshots.values():\n                print(\"looking for {}\".format(snap_id))\n                if snap_id in snapdone:\n                    continue\n                snapshot = pbclient.get_snapshot(snap_id)\n                print(\"snapshot {} is in state {}\"\n                      .format(snap_id, snapshot['metadata']['state']))\n                if snapshot['metadata']['state'] == 'AVAILABLE':\n                    snapdone[snap_id] = snapshot['metadata']['state']\n            # end for(vol_snapshots)\n        # end while(snapdone)\n\n        # now replace the volumes image IDs\n        print(\"setting snapshot id to volumes\")\n        for server in dcdef['entities']['servers']['items']:\n            print(\"- server {}\".format(server['properties']['name']))\n            if 'volumes' not in server['entities']:\n                print(\" server {} has no volumes\"\n                      .format(server['properties']['name']))\n                continue\n            for volume in server['entities']['volumes']['items']:\n                vol_id = volume['id']   # this will be the name too\n                volume['properties']['image'] = vol_snapshots[vol_id]\n            # end for(volume)\n        # end for(server)\n\n        # As it came out, the LAN id is rearranged 
by order of creation\n        # Thus we must sort the LANs to keep the order in the clone\n        print(\"setting LAN order by id\")\n        lans = dcdef['entities']['lans']['items']\n        new_order = sorted(lans, key=lambda lan: lan['id'])\n        dcdef['entities']['lans']['items'] = new_order\n\n        # now sort unordered NICs by MAC and save the dcdef\n        # reason is, that NICs seem to be ordered by MAC, but API response\n        # doesn't guarantee the order, which we need for re-creation\n        print(\"setting NIC order by MAC\")\n        for server in dcdef['entities']['servers']['items']:\n            print(\"- server {}\".format(server['properties']['name']))\n            if 'nics' not in server['entities']:\n                print(\" server {} has no nics\"\n                      .format(server['properties']['name']))\n                continue\n            nics = server['entities']['nics']['items']\n            # print(\"NICs before {}\".format(json.dumps(nics)))\n            new_order = sorted(nics, key=lambda nic: nic['properties']['mac'])\n            # print(\"NICs after {}\".format(json.dumps(new_order)))\n            server['entities']['nics']['items'] = new_order\n        # end for(server)\n        dcdef_file = outfile+'.json'\n        print(\"write snapshot dc to {}\".format(dcdef_file))\n        write_dc_definition(dcdef, dcdef_file)\n\n        return 0\n\n    except KeyboardInterrupt:\n        # handle keyboard interrupt\n        return 0\n    except Exception:\n        traceback.print_exc()\n        sys.stderr.write(\"\\n\" + program_name + \":  for help use --help\\n\")\n        return 2", "response": "Parse command line options, power off the datacenter's servers, snapshot each server-attached volume, and write the datacenter definition to JSON files."}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nget details about the currently authenticated user", "response": "def get_self(session, user_details=None):\n    \"\"\"\n    Get details about the currently authenticated user\n    \"\"\"\n    # Set compact to true\n    if user_details:\n        user_details['compact'] = True\n    response = make_get_request(session, 'self', params_data=user_details)\n    json_data = response.json()\n    if response.status_code == 200:\n        return json_data['result']\n    else:\n        raise SelfNotRetrievedException(\n            message=json_data['message'],\n            error_code=json_data['error_code'],\n            request_id=json_data['request_id']\n        )"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef get_user_by_id(session, user_id, user_details=None):\n    if user_details:\n        user_details['compact'] = True\n    response = make_get_request(\n        session, 'users/{}'.format(user_id), params_data=user_details)\n    json_data = response.json()\n    if response.status_code == 200:\n        return json_data['result']\n    else:\n        raise UserNotFoundException(\n            message=json_data['message'],\n            error_code=json_data['error_code'],\n            request_id=json_data['request_id']\n        )", "response": "Get details about specific user"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef get_self_user_id(session):\n    response = make_get_request(session, 'self')\n    if response.status_code == 200:\n        return response.json()['result']['id']\n    else:\n        raise UserIdNotRetrievedException(\n            'Error retrieving user id: %s' % response.text, response.text)", "response": "Get the currently authenticated user id"}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\nadds a list of jobs to the currently authenticated user", "response": "def add_user_jobs(session, job_ids):\n    \"\"\"\n    Add a list of jobs to the currently authenticated user\n    \"\"\"\n    jobs_data = {\n        'jobs[]': job_ids\n    }\n    response = make_post_request(session, 'self/jobs', json_data=jobs_data)\n    json_data = response.json()\n    if response.status_code == 200:\n        return json_data['status']\n    else:\n        raise UserJobsNotAddedException(\n            message=json_data['message'],\n            error_code=json_data['error_code'],\n            request_id=json_data['request_id'])"}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\nreplaces the currently authenticated user's list of jobs with a new list of jobs.", "response": "def set_user_jobs(session, job_ids):\n    \"\"\"\n    Replace the currently authenticated user's list of jobs with a new list of\n    jobs\n    \"\"\"\n    jobs_data = {\n        'jobs[]': job_ids\n    }\n    response = make_put_request(session, 'self/jobs', json_data=jobs_data)\n    json_data = response.json()\n    if response.status_code == 200:\n        return json_data['status']\n    else:\n        raise UserJobsNotSetException(\n            message=json_data['message'],\n            error_code=json_data['error_code'],\n            request_id=json_data['request_id'])"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef delete_user_jobs(session, job_ids):\n    jobs_data = {\n        'jobs[]': job_ids\n    }\n    response = make_delete_request(session, 'self/jobs', json_data=jobs_data)\n    json_data = response.json()\n    if response.status_code == 200:\n        return json_data['status']\n    else:\n        raise UserJobsNotDeletedException(\n            message=json_data['message'],\n            error_code=json_data['error_code'],\n            request_id=json_data['request_id'])", "response": "Remove a list of jobs from the currently authenticated user"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\ngetting one or more users", "response": "def get_users(session, query):\n    \"\"\"\n    Get one or more users\n    \"\"\"\n    # GET /api/users/0.1/users\n    response = make_get_request(session, 'users', params_data=query)\n    json_data = response.json()\n    if response.status_code == 200:\n        return json_data['result']\n    else:\n        raise UsersNotFoundException(\n            message=json_data['message'],\n            error_code=json_data['error_code'],\n            request_id=json_data['request_id'])"}
{"SOURCE": "codesearchnet", "instruction": "Given the following Python 3 function, write the documentation\ndef create_project(session, title, description,\n                   currency, budget, jobs):\n    \"\"\"\n    Create a project\n    \"\"\"\n    project_data = {'title': title,\n                    'description': description,\n                    'currency': currency,\n                    'budget': budget,\n                    'jobs': jobs\n                    }\n\n    # POST /api/projects/0.1/projects/\n    response = make_post_request(session, 'projects', json_data=project_data)\n    json_data = response.json()\n    if response.status_code == 200:\n        project_data = json_data['result']\n        p = Project(project_data)\n        p.url = urljoin(session.url, 'projects/%s' % p.seo_url)\n        return p\n    else:\n        raise ProjectNotCreatedException(message=json_data['message'],\n                                         error_code=json_data['error_code'],\n                                         request_id=json_data['request_id'],\n                                         )", "response": "Create a project and return it as a Project object with its URL set."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef create_hireme_project(session, title, description,\n                          currency, budget, jobs, hireme_initial_bid):\n    \"\"\"\n    Create a fixed project\n    \"\"\"\n    jobs.append(create_job_object(id=417))  # Hire Me job, required\n\n    project_data = {'title': title,\n                    'description': description,\n                    'currency': currency,\n                    'budget': budget,\n                    'jobs': jobs,\n                    'hireme': True,\n                    'hireme_initial_bid': hireme_initial_bid\n                    }\n\n    # POST /api/projects/0.1/projects/\n    response = make_post_request(session, 'projects', json_data=project_data)\n    json_data = response.json()\n    if response.status_code == 200:\n        project_data = json_data['result']\n        p = Project(project_data)\n        p.url = urljoin(session.url, 'projects/%s' % p.seo_url)\n        return p\n    else:\n        raise ProjectNotCreatedException(message=json_data['message'],\n                                         error_code=json_data['error_code'],\n                                         request_id=json_data['request_id'],\n                                         )", "response": "Create a HireMe project"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef get_projects(session, query):\n    # GET /api/projects/0.1/projects\n    response = make_get_request(session, 'projects', params_data=query)\n    json_data = response.json()\n    if response.status_code == 200:\n        return json_data['result']\n    else:\n        raise ProjectsNotFoundException(\n            message=json_data['message'],\n            error_code=json_data['error_code'],\n            request_id=json_data['request_id'])", "response": "Get one or more projects"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef get_project_by_id(session, project_id, project_details=None, user_details=None):\n    # GET /api/projects/0.1/projects/<int:project_id>\n    query = {}\n    if project_details:\n        query.update(project_details)\n    if user_details:\n        query.update(user_details)\n    response = make_get_request(\n        session, 'projects/{}'.format(project_id), params_data=query)\n    json_data = response.json()\n    if response.status_code == 200:\n        return json_data['result']\n    else:\n        raise ProjectsNotFoundException(\n            message=json_data['message'],\n            error_code=json_data['error_code'],\n            request_id=json_data['request_id']\n        )", "response": "Get a single project by ID"}
{"SOURCE": "codesearchnet", "instruction": "How would you code a function in Python 3 to\nsearch for all projects in the database.", "response": "def search_projects(session,\n                    query,\n                    search_filter=None,\n                    project_details=None,\n                    user_details=None,\n                    limit=10,\n                    offset=0,\n                    active_only=None):\n    \"\"\"\n    Search for all projects\n    \"\"\"\n    search_data = {\n        'query': query,\n        'limit': limit,\n        'offset': offset,\n    }\n    if search_filter:\n        search_data.update(search_filter)\n    if project_details:\n        search_data.update(project_details)\n    if user_details:\n        search_data.update(user_details)\n\n    # GET /api/projects/0.1/projects/all/\n    # GET /api/projects/0.1/projects/active/\n    endpoint = 'projects/{}'.format('active' if active_only else 'all')\n    response = make_get_request(session, endpoint, params_data=search_data)\n    json_data = response.json()\n    if response.status_code == 200:\n        return json_data['result']\n    else:\n        raise ProjectsNotFoundException(\n            message=json_data['message'],\n            error_code=json_data['error_code'],\n            request_id=json_data['request_id'])"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate a brief explanation for the following Python 3 code\ndef place_project_bid(session, project_id, bidder_id, description, amount,\n                      period, milestone_percentage):\n    \"\"\"\n    Place a bid on a project\n    \"\"\"\n    bid_data = {\n        'project_id': project_id,\n        'bidder_id': bidder_id,\n        'description': description,\n        'amount': amount,\n        'period': period,\n        'milestone_percentage': milestone_percentage,\n    }\n    # POST /api/projects/0.1/bids/\n    response = make_post_request(session, 'bids', json_data=bid_data)\n    json_data = response.json()\n    if response.status_code == 200:\n        bid_data = json_data['result']\n        return Bid(bid_data)\n    else:\n        raise BidNotPlacedException(message=json_data['message'],\n                                    error_code=json_data['error_code'],\n                                    request_id=json_data['request_id'])", "response": "Place a bid on a project"}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef get_bids(session, project_ids=[], bid_ids=[], limit=10, offset=0):\n    get_bids_data = {}\n    if bid_ids:\n        get_bids_data['bids[]'] = bid_ids\n    if project_ids:\n        get_bids_data['projects[]'] = project_ids\n    get_bids_data['limit'] = limit\n    get_bids_data['offset'] = offset\n    # GET /api/projects/0.1/bids/\n    response = make_get_request(session, 'bids', params_data=get_bids_data)\n    json_data = response.json()\n    if response.status_code == 200:\n        return json_data['result']\n    else:\n        raise BidsNotFoundException(\n            message=json_data['message'], error_code=json_data['error_code'],\n            request_id=json_data['request_id']\n        )", "response": "Get a list of bids, optionally filtered by project IDs or bid IDs."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef get_milestones(session, project_ids=[], milestone_ids=[], user_details=None, limit=10, offset=0):\n    get_milestones_data = {}\n    if milestone_ids:\n        get_milestones_data['milestones[]'] = milestone_ids\n    if project_ids:\n        get_milestones_data['projects[]'] = project_ids\n    get_milestones_data['limit'] = limit\n    get_milestones_data['offset'] = offset\n\n    # Add projections if they exist\n    if user_details:\n        get_milestones_data.update(user_details)\n\n    # GET /api/projects/0.1/milestones/\n\n    response = make_get_request(\n        session, 'milestones', params_data=get_milestones_data)\n    json_data = response.json()\n    if response.status_code == 200:\n        return json_data['result']\n    else:\n        raise MilestonesNotFoundException(\n            message=json_data['message'],\n            error_code=json_data['error_code'],\n            request_id=json_data['request_id']\n        )", "response": "Get the list of milestones for a project."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef get_milestone_by_id(session, milestone_id, user_details=None):\n    # GET /api/projects/0.1/milestones/{milestone_id}/\n    endpoint = 'milestones/{}'.format(milestone_id)\n\n    response = make_get_request(session, endpoint, params_data=user_details)\n    json_data = response.json()\n    if response.status_code == 200:\n        return json_data['result']\n    else:\n        raise MilestonesNotFoundException(\n            message=json_data['message'],\n            error_code=json_data['error_code'],\n            request_id=json_data['request_id']\n        )", "response": "Get a specific milestone by ID."}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\naward a bid on a project", "response": "def award_project_bid(session, bid_id):\n    \"\"\"\n    Award a bid on a project\n    \"\"\"\n    headers = {\n        'Content-Type': 'application/x-www-form-urlencoded'\n    }\n    bid_data = {\n        'action': 'award'\n    }\n    # PUT /api/projects/0.1/bids/{bid_id}/?action=award\n    endpoint = 'bids/{}'.format(bid_id)\n    response = make_put_request(session, endpoint, headers=headers,\n                                params_data=bid_data)\n    json_data = response.json()\n    if response.status_code == 200:\n        return json_data['status']\n    else:\n        json_data = response.json()\n        raise BidNotAwardedException(\n            message=json_data['message'],\n            error_code=json_data['error_code'],\n            request_id=json_data['request_id']\n        )"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nrevoke a bid on a project", "response": "def revoke_project_bid(session, bid_id):\n    \"\"\"\n    Revoke a bid on a project\n    \"\"\"\n    headers = {\n        'Content-Type': 'application/x-www-form-urlencoded'\n    }\n    bid_data = {\n        'action': 'revoke'\n    }\n    # PUT /api/projects/0.1/bids/{bid_id}/?action=revoke\n    endpoint = 'bids/{}'.format(bid_id)\n    response = make_put_request(session, endpoint, headers=headers,\n                                params_data=bid_data)\n    json_data = response.json()\n    if response.status_code == 200:\n        return json_data['status']\n    else:\n        json_data = response.json()\n        raise BidNotRevokedException(message=json_data['message'],\n                                     error_code=json_data['error_code'],\n                                     request_id=json_data['request_id'])"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef accept_project_bid(session, bid_id):\n    headers = {\n        'Content-Type': 'application/x-www-form-urlencoded'\n    }\n    bid_data = {\n        'action': 'accept'\n    }\n    # POST /api/projects/0.1/bids/{bid_id}/?action=accept\n    endpoint = 'bids/{}'.format(bid_id)\n    response = make_put_request(session, endpoint, headers=headers,\n                                params_data=bid_data)\n    json_data = response.json()\n    if response.status_code == 200:\n        return json_data['status']\n    else:\n        json_data = response.json()\n        raise BidNotAcceptedException(message=json_data['message'],\n                                      error_code=json_data['error_code'],\n                                      request_id=json_data['request_id'])", "response": "Accept a bid on a project"}
{"SOURCE": "codesearchnet", "instruction": "Can you create a Python 3 function that\nretracts a bid on a project", "response": "def retract_project_bid(session, bid_id):\n    \"\"\"\n    Retract a bid on a project\n    \"\"\"\n    headers = {\n        'Content-Type': 'application/x-www-form-urlencoded'\n    }\n    bid_data = {\n        'action': 'retract'\n    }\n    # POST /api/projects/0.1/bids/{bid_id}/?action=retract\n    endpoint = 'bids/{}'.format(bid_id)\n    response = make_put_request(session, endpoint, headers=headers,\n                                params_data=bid_data)\n    json_data = response.json()\n    if response.status_code == 200:\n        return json_data['status']\n    else:\n        json_data = response.json()\n        raise BidNotRetractedException(message=json_data['message'],\n                                       error_code=json_data['error_code'],\n                                       request_id=json_data['request_id'])"}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\nhighlights a bid on a project", "response": "def highlight_project_bid(session, bid_id):\n    \"\"\"\n    Highlight a bid on a project\n    \"\"\"\n    headers = {\n        'Content-Type': 'application/x-www-form-urlencoded'\n    }\n    bid_data = {\n        'action': 'highlight'\n    }\n    # POST /api/projects/0.1/bids/{bid_id}/?action=highlight\n    endpoint = 'bids/{}'.format(bid_id)\n    response = make_put_request(session, endpoint, headers=headers,\n                                params_data=bid_data)\n    json_data = response.json()\n    if response.status_code == 200:\n        return json_data['status']\n    else:\n        json_data = response.json()\n        raise BidNotHighlightedException(message=json_data['message'],\n                                         error_code=json_data['error_code'],\n                                         request_id=json_data['request_id'])"}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef create_milestone_payment(session, project_id, bidder_id, amount,\n                             reason, description):\n    \"\"\"\n    Create a milestone payment\n    \"\"\"\n    milestone_data = {\n        'project_id': project_id,\n        'bidder_id': bidder_id,\n        'amount': amount,\n        'reason': reason,\n        'description': description\n    }\n    # POST /api/projects/0.1/milestones/\n    response = make_post_request(session, 'milestones',\n                                 json_data=milestone_data)\n    json_data = response.json()\n    if response.status_code == 200:\n        milestone_data = json_data['result']\n        return Milestone(milestone_data)\n    else:\n        raise MilestoneNotCreatedException(message=json_data['message'],\n                                           error_code=json_data['error_code'],\n                                           request_id=json_data['request_id'])", "response": "Create a milestone payment"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef post_track(session, user_id, project_id, latitude, longitude):\n    tracking_data = {\n        'user_id': user_id,\n        'project_id': project_id,\n        'track_point': {\n            'latitude': latitude,\n            'longitude': longitude\n        }\n    }\n\n    # POST /api/projects/0.1/tracks/\n    response = make_post_request(session, 'tracks',\n                                 json_data=tracking_data)\n    json_data = response.json()\n    if response.status_code == 200:\n        return json_data['result']\n    else:\n        raise TrackNotCreatedException(message=json_data['message'],\n                                       error_code=json_data['error_code'],\n                                       request_id=json_data['request_id'])", "response": "Create a track on a project"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef update_track(session, track_id, latitude, longitude, stop_tracking=False):\n\n    tracking_data = {\n        'track_point': {\n            'latitude': latitude,\n            'longitude': longitude,\n        },\n        'stop_tracking': stop_tracking\n    }\n\n    # PUT /api/projects/0.1/tracks/{track_id}/\n\n    response = make_put_request(session, 'tracks/{}'.format(track_id),\n                                json_data=tracking_data)\n    json_data = response.json()\n    if response.status_code == 200:\n        return json_data['result']\n    else:\n        raise TrackNotUpdatedException(message=json_data['message'],\n                                       error_code=json_data['error_code'],\n                                       request_id=json_data['request_id'])", "response": "Updates the current location of a track."}
{"SOURCE": "codesearchnet", "instruction": "Can you generate the documentation for the following Python 3 function\ndef get_track_by_id(session, track_id, track_point_limit=None, track_point_offset=None):\n\n    tracking_data = {}\n    if track_point_limit:\n        tracking_data['track_point_limit'] = track_point_limit\n    if track_point_offset:\n        tracking_data['track_point_offset'] = track_point_offset\n\n    # GET /api/projects/0.1/tracks/{track_id}/\n\n    response = make_get_request(session, 'tracks/{}'.format(track_id),\n                                params_data=tracking_data)\n    json_data = response.json()\n    if response.status_code == 200:\n        return json_data['result']\n    else:\n        raise TrackNotFoundException(message=json_data['message'],\n                                     error_code=json_data['error_code'],\n                                     request_id=json_data['request_id'])", "response": "Gets a specific track by ID"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function to\nrelease a milestone payment", "response": "def release_milestone_payment(session, milestone_id, amount):\n    \"\"\"\n    Release a milestone payment\n    \"\"\"\n    params_data = {\n        'action': 'release',\n    }\n    milestone_data = {\n        'amount': amount,\n    }\n    # PUT /api/projects/0.1/milestones/{milestone_id}/?action=release\n    endpoint = 'milestones/{}'.format(milestone_id)\n    response = make_put_request(session, endpoint, params_data=params_data,\n                                json_data=milestone_data)\n    json_data = response.json()\n    if response.status_code == 200:\n        return json_data['status']\n    else:\n        raise MilestoneNotReleasedException(message=json_data['message'],\n                                            error_code=json_data['error_code'],\n                                            request_id=json_data['request_id'])"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function that can\nrequest the release of a milestone payment.", "response": "def request_release_milestone_payment(session, milestone_id):\n    \"\"\"\n    Request the release of a milestone payment\n    \"\"\"\n    params_data = {\n        'action': 'request_release',\n    }\n    # PUT /api/projects/0.1/milestones/{milestone_id}/?action=request_release\n    endpoint = 'milestones/{}'.format(milestone_id)\n    response = make_put_request(session, endpoint, params_data=params_data)\n    json_data = response.json()\n    if response.status_code == 200:\n        return json_data['status']\n    else:\n        raise MilestoneNotRequestedReleaseException(\n            message=json_data['message'],\n            error_code=json_data['error_code'],\n            request_id=json_data['request_id'])"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef cancel_milestone_payment(session, milestone_id):\n    params_data = {\n        'action': 'cancel',\n    }\n    # PUT /api/projects/0.1/milestones/{milestone_id}/?action=cancel\n    endpoint = 'milestones/{}'.format(milestone_id)\n    response = make_put_request(session, endpoint, params_data=params_data)\n    json_data = response.json()\n    if response.status_code == 200:\n        return json_data['status']\n    else:\n        raise MilestoneNotCancelledException(\n            message=json_data['message'],\n            error_code=json_data['error_code'],\n            request_id=json_data['request_id'])", "response": "Cancel a milestone payment"}
{"SOURCE": "codesearchnet", "instruction": "Explain what the following Python 3 code does\ndef create_milestone_request(session, project_id, bid_id, description, amount):\n    milestone_request_data = {\n        'project_id': project_id,\n        'bid_id': bid_id,\n        'description': description,\n        'amount': amount,\n    }\n    # POST /api/projects/0.1/milestone_requests/\n    response = make_post_request(session, 'milestone_requests',\n                                 json_data=milestone_request_data)\n    json_data = response.json()\n    if response.status_code == 200:\n        milestone_request_data = json_data['result']\n        return MilestoneRequest(milestone_request_data)\n    else:\n        raise MilestoneRequestNotCreatedException(\n            message=json_data['message'],\n            error_code=json_data['error_code'],\n            request_id=json_data['request_id'])", "response": "Create a new MilestoneRequest"}
{"SOURCE": "codesearchnet", "instruction": "How would you explain what the following Python 3 function does\ndef accept_milestone_request(session, milestone_request_id):\n    params_data = {\n        'action': 'accept',\n    }\n    # POST /api/projects/0.1/milestone_requests/{milestone_request_id}/?action=\n    # accept\n    endpoint = 'milestone_requests/{}'.format(milestone_request_id)\n    response = make_put_request(session, endpoint, params_data=params_data)\n    json_data = response.json()\n    if response.status_code == 200:\n        return json_data['status']\n    else:\n        raise MilestoneRequestNotAcceptedException(\n            message=json_data['message'],\n            error_code=json_data['error_code'],\n            request_id=json_data['request_id'])", "response": "Accept a milestone request"}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\nrejects a milestone request", "response": "def reject_milestone_request(session, milestone_request_id):\n    \"\"\"\n    Reject a milestone request\n    \"\"\"\n    params_data = {\n        'action': 'reject',\n    }\n    # POST /api/projects/0.1/milestone_requests/{milestone_request_id}/?action=\n    # reject\n    endpoint = 'milestone_requests/{}'.format(milestone_request_id)\n    response = make_put_request(session, endpoint, params_data=params_data)\n    json_data = response.json()\n    if response.status_code == 200:\n        return json_data['status']\n    else:\n        raise MilestoneRequestNotRejectedException(\n            message=json_data['message'],\n            error_code=json_data['error_code'],\n            request_id=json_data['request_id'])"}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\ndeleting a milestone request", "response": "def delete_milestone_request(session, milestone_request_id):\n    \"\"\"\n    Delete a milestone request\n    \"\"\"\n    params_data = {\n        'action': 'delete',\n    }\n    # POST /api/projects/0.1/milestone_requests/{milestone_request_id}/?action=\n    # delete\n    endpoint = 'milestone_requests/{}'.format(milestone_request_id)\n    response = make_put_request(session, endpoint, params_data=params_data)\n    json_data = response.json()\n    if response.status_code == 200:\n        return json_data['status']\n    else:\n        raise MilestoneRequestNotDeletedException(\n            message=json_data['message'],\n            error_code=json_data['error_code'],\n            request_id=json_data['request_id'])"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nposting a review.", "response": "def post_review(session, review):\n    \"\"\"\n    Post a review\n    \"\"\"\n    # POST /api/projects/0.1/reviews/\n    response = make_post_request(session, 'reviews', json_data=review)\n    json_data = response.json()\n    if response.status_code == 200:\n        return json_data['status']\n    else:\n        raise ReviewNotPostedException(\n            message=json_data['message'],\n            error_code=json_data['error_code'],\n            request_id=json_data['request_id'])"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script to\nget a list of jobs", "response": "def get_jobs(session, job_ids, seo_details, lang):\n    \"\"\"\n    Get a list of jobs\n    \"\"\"\n    get_jobs_data = {\n        'jobs[]': job_ids,\n        'seo_details': seo_details,\n        'lang': lang,\n    }\n    # GET /api/projects/0.1/jobs/\n    response = make_get_request(session, 'jobs', params_data=get_jobs_data)\n    json_data = response.json()\n    if response.status_code == 200:\n        return json_data['result']\n    else:\n        raise JobsNotFoundException(\n            message=json_data['message'],\n            error_code=json_data['error_code'],\n            request_id=json_data['request_id'])"}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\ncreate a thread", "response": "def create_thread(session, member_ids, context_type, context, message):\n    \"\"\"\n    Create a thread\n    \"\"\"\n    headers = {\n        'Content-Type': 'application/x-www-form-urlencoded'\n    }\n    thread_data = {\n        'members[]': member_ids,\n        'context_type': context_type,\n        'context': context,\n        'message': message,\n    }\n\n    # POST /api/messages/0.1/threads/\n    response = make_post_request(session, 'threads', headers,\n                                 form_data=thread_data)\n    json_data = response.json()\n    if response.status_code == 200:\n        return Thread(json_data['result'])\n    else:\n        raise ThreadNotCreatedException(message=json_data['message'],\n                                        error_code=json_data['error_code'],\n                                        request_id=json_data['request_id'])"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\ncreating a project thread", "response": "def create_project_thread(session, member_ids, project_id, message):\n    \"\"\"\n    Create a project thread\n    \"\"\"\n    return create_thread(session, member_ids, 'project', project_id, message)"}
{"SOURCE": "codesearchnet", "instruction": "Make a summary of the following Python 3 code\ndef post_message(session, thread_id, message):\n    headers = {\n        'Content-Type': 'application/x-www-form-urlencoded'\n    }\n    message_data = {\n        'message': message,\n    }\n\n    # POST /api/messages/0.1/threads/{thread_id}/messages/\n    endpoint = 'threads/{}/messages'.format(thread_id)\n    response = make_post_request(session, endpoint, headers,\n                                 form_data=message_data)\n    json_data = response.json()\n    if response.status_code == 200:\n        return Message(json_data['result'])\n    else:\n        raise MessageNotCreatedException(message=json_data['message'],\n                                         error_code=json_data['error_code'],\n                                         request_id=json_data['request_id'])", "response": "Add a message to a thread"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 function for\nadding a message with attachments to a thread", "response": "def post_attachment(session, thread_id, attachments):\n    \"\"\"\n    Add a message with attachments to a thread\n    \"\"\"\n    files = []\n    filenames = []\n    for attachment in attachments:\n        files.append(attachment['file'])\n        filenames.append(attachment['filename'])\n    message_data = {\n        'attachments[]': filenames,\n    }\n\n    # POST /api/messages/0.1/threads/{thread_id}/messages/\n    endpoint = 'threads/{}/messages'.format(thread_id)\n    response = make_post_request(session, endpoint,\n                                 form_data=message_data, files=files)\n    json_data = response.json()\n    if response.status_code == 200:\n        return Message(json_data['result'])\n    else:\n        raise MessageNotCreatedException(message=json_data['message'],\n                                         error_code=json_data['error_code'],\n                                         request_id=json_data['request_id'])"}
{"SOURCE": "codesearchnet", "instruction": "Write a Python 3 script for\ngetting one or more messages", "response": "def get_messages(session, query, limit=10, offset=0):\n    \"\"\"\n    Get one or more messages\n    \"\"\"\n    query['limit'] = limit\n    query['offset'] = offset\n\n    # GET /api/messages/0.1/messages\n    response = make_get_request(session, 'messages', params_data=query)\n    json_data = response.json()\n    if response.status_code == 200:\n        return json_data['result']\n    else:\n        raise MessagesNotFoundException(\n            message=json_data['message'],\n            error_code=json_data['error_code'],\n            request_id=json_data['request_id']\n        )"}
{"SOURCE": "codesearchnet", "instruction": "How would you implement a function in Python 3 that\nsearches for messages in a thread.", "response": "def search_messages(session, thread_id, query, limit=20,\n                    offset=0, message_context_details=None,\n                    window_above=None, window_below=None):\n    \"\"\"\n    Search for messages\n    \"\"\"\n    query = {\n        'thread_id': thread_id,\n        'query': query,\n        'limit': limit,\n        'offset': offset\n    }\n    if message_context_details:\n        query['message_context_details'] = message_context_details\n    if window_above:\n        query['window_above'] = window_above\n    if window_below:\n        query['window_below'] = window_below\n\n    # GET /api/messages/0.1/messages/search\n    response = make_get_request(session, 'messages/search', params_data=query)\n    json_data = response.json()\n    if response.status_code == 200:\n        return json_data['result']\n    else:\n        raise MessagesNotFoundException(\n            message=json_data['message'],\n            error_code=json_data['error_code'],\n            request_id=json_data['request_id']\n        )"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef get_threads(session, query):\n    # GET /api/messages/0.1/threads\n    response = make_get_request(session, 'threads', params_data=query)\n    json_data = response.json()\n    if response.status_code == 200:\n        return json_data['result']\n    else:\n        raise ThreadsNotFoundException(\n            message=json_data['message'],\n            error_code=json_data['error_code'],\n            request_id=json_data['request_id']\n        )", "response": "Get one or more threads"}
{"SOURCE": "codesearchnet", "instruction": "Implement a Python 3 function for\nreducing a zipcode to its five-digit form and validating it.", "response": "def _clean(zipcode, valid_length=_valid_zipcode_length):\n    \"\"\" Assumes zipcode is of type `str` \"\"\"\n    zipcode = zipcode.split(\"-\")[0]  # Convert #####-#### to #####\n\n    if len(zipcode) != valid_length:\n        raise ValueError(\n            'Invalid format, zipcode must be of the format: \"#####\" or \"#####-####\"'\n        )\n\n    if _contains_nondigits(zipcode):\n        raise ValueError('Invalid characters, zipcode may only contain digits and \"-\".')\n\n    return zipcode"}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nreturn a list of zipcode dicts whose zip_code prefix matches partial_zipcode", "response": "def similar_to(partial_zipcode, zips=_zips):\n    \"\"\" List of zipcode dicts where zipcode prefix matches `partial_zipcode` \"\"\"\n    return [z for z in zips if z[\"zip_code\"].startswith(partial_zipcode)]"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\nusing kwargs to select for desired attributes from list of zipcode dicts", "response": "def filter_by(zips=_zips, **kwargs):\n    \"\"\" Use `kwargs` to select for desired attributes from list of zipcode dicts \"\"\"\n    return [z for z in zips if all([k in z and z[k] == v for k, v in kwargs.items()])]"}
{"SOURCE": "codesearchnet", "instruction": "Implement a function in Python 3 to\nbuild a new instance of a PaletteEntry from a configuration string or a dictionary.", "response": "def from_config(\n            cls, cfg,\n            default_fg=DEFAULT_FG_16, default_bg=DEFAULT_BG_16,\n            default_fg_hi=DEFAULT_FG_256, default_bg_hi=DEFAULT_BG_256,\n            max_colors=2**24\n    ):\n        \"\"\"\n        Build a palette definition from either a simple string or a dictionary,\n        filling in defaults for items not specified.\n\n        e.g.:\n            \"dark green\"\n                dark green foreground, black background\n\n            {lo: dark gray, hi: \"#666\"}\n                dark gray on 16-color terminals, #666 for 256+ color\n\n        \"\"\"\n        # TODO: mono\n\n        e = PaletteEntry(mono = default_fg,\n                         foreground=default_fg,\n                         background=default_bg,\n                         foreground_high=default_fg_hi,\n                         background_high=default_bg_hi)\n\n        if isinstance(cfg, str):\n            e.foreground_high = cfg\n            if e.allowed(cfg, 16):\n                e.foreground = cfg\n            else:\n                rgb = AttrSpec(fg=cfg, bg=\"\", colors=max_colors).get_rgb_values()[0:3]\n                e.foreground = nearest_basic_color(rgb)\n\n        elif isinstance(cfg, dict):\n\n            bg = cfg.get(\"bg\", None)\n            if isinstance(bg, str):\n                e.background_high = bg\n                if e.allowed(bg, 16):\n                    e.background = bg\n                else:\n                    rgb = AttrSpec(fg=bg, bg=\"\", colors=max_colors).get_rgb_values()[0:3]\n                    e.background = nearest_basic_color(rgb)\n            elif isinstance(bg, dict):\n                e.background_high = bg.get(\"hi\", default_bg_hi)\n                if \"lo\" in bg:\n                    if e.allowed(bg[\"lo\"], 16):\n                    
    e.background = bg[\"lo\"]\n                    else:\n                        rgb = AttrSpec(fg=bg[\"lo\"], bg=\"\", colors=max_colors).get_rgb_values()[0:3]\n                        e.background = nearest_basic_color(rgb)\n\n            fg = cfg.get(\"fg\", cfg)\n            if isinstance(fg, str):\n                e.foreground_high = fg\n                if e.allowed(fg, 16):\n                    e.foreground = fg\n                else:\n                    rgb = AttrSpec(fg=fg, bg=\"\", colors=max_colors).get_rgb_values()[0:3]\n                    e.foreground = nearest_basic_color(rgb)\n\n            elif isinstance(fg, dict):\n                e.foreground_high = fg.get(\"hi\", default_fg_hi)\n                if \"lo\" in fg:\n                    if e.allowed(fg[\"lo\"], 16):\n                        e.foreground = fg[\"lo\"]\n                    else:\n                        rgb = AttrSpec(fg=fg[\"lo\"], bg=\"\", colors=max_colors).get_rgb_values()[0:3]\n                        e.foreground = nearest_basic_color(rgb)\n        return e"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef get_passphrase(passphrase=None):\n    for passphrase_file_path in POTENTIAL_PASSPHRASE_LOCATIONS:\n        if os.path.isfile(passphrase_file_path):\n            with open(passphrase_file_path) as passphrase_file:\n                return passphrase_file.read()\n    return passphrase", "response": "Return the contents of the first passphrase file found in the potential passphrase locations, or the given passphrase if no such file exists"}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nmigrates all keys in a source stash to a destination stash", "response": "def migrate(src_path,\n            src_passphrase,\n            src_backend,\n            dst_path,\n            dst_passphrase,\n            dst_backend):\n    \"\"\"Migrate all keys in a source stash to a destination stash\n\n    The migration process will decrypt all keys using the source\n    stash's passphrase and then encrypt them based on the destination\n    stash's passphrase.\n\n    re-encryption will take place only if the passphrases are differing\n    \"\"\"\n    src_storage = STORAGE_MAPPING[src_backend](**_parse_path_string(src_path))\n    dst_storage = STORAGE_MAPPING[dst_backend](**_parse_path_string(dst_path))\n    src_stash = Stash(src_storage, src_passphrase)\n    dst_stash = Stash(dst_storage, dst_passphrase)\n    # TODO: Test that re-encryption does not occur on similar\n    # passphrases\n    keys = src_stash.export()\n    dst_stash.load(src_passphrase, keys=keys)"}
{"SOURCE": "codesearchnet", "instruction": "Create a Python 3 function for\nreturning a random string `size` characters long made of lowercase, uppercase, and digit chars", "response": "def generate_passphrase(size=12):\n    \"\"\"Return a randomly generated string `size` long based on lowercase,\n    uppercase, and digit chars\n    \"\"\"\n    chars = string.ascii_lowercase + string.ascii_uppercase + string.digits\n    return str(''.join(random.choice(chars) for _ in range(size)))"}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\nbuilds a dict from a list of key = value pairs.", "response": "def _build_dict_from_key_value(keys_and_values):\n    \"\"\"Return a dict from a list of key=value pairs\n    \"\"\"\n    key_dict = {}\n    for key_value in keys_and_values:\n        if '=' not in key_value:\n            raise GhostError('Pair {0} is not of `key=value` format'.format(\n                key_value))\n        key, value = key_value.split('=', 1)\n        key_dict.update({str(key): str(value)})\n    return key_dict"}
{"SOURCE": "codesearchnet", "instruction": "Can you implement a function in Python 3 that\nreturns a human readable format of a dict.", "response": "def _prettify_dict(key):\n    \"\"\"Return a human readable format of a key (dict).\n\n    Example:\n\n    Description:   My Wonderful Key\n    Uid:           a54d6de1-922a-4998-ad34-cb838646daaa\n    Created_At:    2016-09-15T12:42:32\n    Metadata:      owner=me;\n    Modified_At:   2016-09-15T12:42:32\n    Value:         secret_key=my_secret_key;access_key=my_access_key\n    Name:          aws\n    \"\"\"\n    assert isinstance(key, dict)\n\n    pretty_key = ''\n    for key, value in key.items():\n        if isinstance(value, dict):\n            pretty_value = ''\n            for k, v in value.items():\n                pretty_value += '{0}={1};'.format(k, v)\n            value = pretty_value\n        pretty_key += '{0:15}{1}\\n'.format(key.title() + ':', value)\n    return pretty_key"}
{"SOURCE": "codesearchnet", "instruction": "Can you write a function in Python 3 where it\nreturns a human readable format of a list.", "response": "def _prettify_list(items):\n    \"\"\"Return a human readable format of a list.\n\n    Example:\n\n    Available Keys:\n      - my_first_key\n      - my_second_key\n    \"\"\"\n    assert isinstance(items, list)\n\n    keys_list = 'Available Keys:'\n    for item in items:\n        keys_list += '\\n  - {0}'.format(item)\n    return keys_list"}
{"SOURCE": "codesearchnet", "instruction": "Can you tell what is the following Python 3 function doing\ndef init_stash(stash_path, passphrase, passphrase_size, backend):\n    stash_path = stash_path or STORAGE_DEFAULT_PATH_MAPPING[backend]\n    click.echo('Stash: {0} at {1}'.format(backend, stash_path))\n    storage = STORAGE_MAPPING[backend](**_parse_path_string(stash_path))\n\n    try:\n        click.echo('Initializing stash...')\n        if os.path.isfile(PASSPHRASE_FILENAME):\n            raise GhostError(\n                '{0} already exists. Overwriting might prevent you '\n                'from accessing the stash it was generated for. '\n                'Please make sure to save and remove the file before '\n                'initializing another stash.'.format(PASSPHRASE_FILENAME))\n\n        stash = Stash(\n            storage,\n            passphrase=passphrase,\n            passphrase_size=passphrase_size)\n        passphrase = stash.init()\n\n        if not passphrase:\n            click.echo('Stash already initialized.')\n            sys.exit(0)\n\n        _write_passphrase_file(passphrase)\n    except GhostError as ex:\n        sys.exit(ex)\n    except (OSError, IOError) as ex:\n        click.echo(\"Seems like we've run into a problem.\")\n        file_path = _parse_path_string(stash_path)['db_path']\n        click.echo(\n            'Removing stale stash and passphrase: {0}. Note that any '\n            'directories created are not removed for safety reasons and you '\n            'might want to remove them manually.'.format(file_path))\n        if os.path.isfile(file_path):\n            os.remove(file_path)\n        sys.exit(ex)\n\n    click.echo('Initialized stash at: {0}'.format(stash_path))\n    click.echo(\n        'Your passphrase can be found under the `{0}` file in the '\n        'current directory.'.format(PASSPHRASE_FILENAME))\n    click.echo(\n        'Make sure you save your passphrase somewhere safe. 
'\n        'If lost, you will lose access to your stash.')", "response": "Initializes a stash."}
{"SOURCE": "codesearchnet", "instruction": "Here you have a function in Python 3, explain what it does\ndef put_key(key_name,\n            value,\n            description,\n            meta,\n            modify,\n            add,\n            lock,\n            key_type,\n            stash,\n            passphrase,\n            backend):\n    \"\"\"Insert a key to the stash\n\n    `KEY_NAME` is the name of the key to insert\n\n    `VALUE` is a key=value argument which can be provided multiple times.\n    it is the encrypted value of your key\n    \"\"\"\n    stash = _get_stash(backend, stash, passphrase)\n\n    try:\n        click.echo('Stashing {0} key...'.format(key_type))\n        stash.put(\n            name=key_name,\n            value=_build_dict_from_key_value(value),\n            modify=modify,\n            metadata=_build_dict_from_key_value(meta),\n            description=description,\n            lock=lock,\n            key_type=key_type,\n            add=add)\n        click.echo('Key stashed successfully')\n    except GhostError as ex:\n        sys.exit(ex)", "response": "Insert a new key into the stash"}