# Python interface to Moses

The idea is to expose some of Moses' internals to Python (inspired by pycdec).

## What's been interfaced?

* Binary tables:

        Moses::PhraseDictionaryTree
        OnDiskPt::OnDiskWrapper

## Building

1.  Build the Python extension:

    First compile Moses as a shared library with `link=shared`:

        ./bjam --libdir=path link=shared

    Then build the extension. If you used `--libdir=path` above, pass the same path as `--moses-lib=path` below:

        python setup.py build_ext -i [--with-cmph] [--moses-lib=PATH] [--cython] [--max-factors=NUM] [--max-kenlm-order=NUM]

    Use `--cython` only if you want to recompile the `.pyx` files. They ship already compiled, so you don't need Cython installed otherwise.

## Example

### Getting a phrase table

    cd examples
    export LC_ALL=C
    cat phrase-table.txt | sort | ../../../bin/processPhraseTable -ttable 0 0 - -nscores 5 -alignment-info -out phrase-table
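
The `export LC_ALL=C` line makes `sort` compare lines by raw byte value rather than by locale collation rules, presumably because `processPhraseTable` expects its input in that order. As a rough illustration of what that ordering means, here is a minimal Python sketch; the function name `c_locale_sort` and the sample lines are made up for this example and are not part of Moses.

```python
def c_locale_sort(lines):
    """Sort byte strings by raw byte value, mimicking `LC_ALL=C sort`.

    Trailing newlines are stripped first so that they do not take part
    in the comparison (sort compares line contents only).
    """
    return sorted(line.rstrip(b"\n") for line in lines)


# Illustrative phrase-table-like lines (invented data, not from the examples folder).
sample = [
    b"essa casa ||| this house ||| 0.5\n",
    b"casa ||| house ||| 0.8\n",
    b"a casa ||| the house ||| 0.7\n",
]

for line in c_locale_sort(sample):
    print(line.decode("utf-8"))
```

Because the comparison is byte-wise, multi-byte UTF-8 characters such as `é` sort after every ASCII letter, which is generally not what a language-aware locale would do.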

### Getting a rule table

    cd examples
    ../../../bin/CreateOnDiskPt 0 0 5 20 2 rule-table.txt rule-table

### Querying

1. Phrase-based
    
        echo "casa" | python example.py examples/phrase-table 5 20
        echo "essa casa" | python example.py examples/phrase-table 5 20

2. Hierarchical

        echo "i [X]" | python example.py examples/rule-table 5 20
        echo "have [X]" | python example.py examples/rule-table 5 20
        echo "[X][X] do not [X][X] [X]" | python example.py examples/rule-table 5 20

### Code

```python
from __future__ import print_function  # keeps the script valid on Python 2 and 3
import sys

# load() abstracts away the choice of implementation by checking the available files
from moses.dictree import load

if len(sys.argv) != 4:
    print("Usage: %s table nscores tlimit < query > result" % sys.argv[0])
    sys.exit(1)  # non-zero exit status signals incorrect usage

path = sys.argv[1]
nscores = int(sys.argv[2])
tlimit = int(sys.argv[3])

table = load(path, nscores, tlimit)

for line in sys.stdin:
    f = line.strip()
    result = table.query(f)
    # you could simply print the matches
    # print('\n'.join(' ||| '.join((f, str(e))) for e in result))
    # or you can use their attributes
    print(result.source)
    for e in result:
        if e.lhs:
            # hierarchical rule: the entry has a left-hand side
            print('\t%s -> %s ||| %s ||| %s' % (e.lhs,
                                                ' '.join(e.rhs),
                                                e.scores,
                                                e.alignment))
        else:
            # phrase pair: no left-hand side
            print('\t%s ||| %s ||| %s' % (' '.join(e.rhs),
                                          e.scores,
                                          e.alignment))
```
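
The formatting logic in the loop above can be tried without a compiled Moses by feeding it stand-in objects. The sketch below is only an illustration: `Entry` is a hypothetical namedtuple that mimics the four attributes the example reads (`lhs`, `rhs`, `scores`, `alignment`), `format_entry` is a helper name introduced here, and the sample values are invented. The real objects returned by `table.query` may expose more than this.

```python
from collections import namedtuple

# Hypothetical stand-in for the entries yielded by table.query(); the real
# objects come from moses.dictree and are only assumed to have these fields.
Entry = namedtuple("Entry", ["lhs", "rhs", "scores", "alignment"])


def format_entry(e):
    """Format one entry the same way the example script prints it."""
    body = " ||| ".join([" ".join(e.rhs), str(e.scores), str(e.alignment)])
    if e.lhs:
        # hierarchical rule: prefix the left-hand side
        body = "%s -> %s" % (e.lhs, body)
    return "\t" + body


# Invented sample values, for illustration only.
phrase = Entry(lhs=None, rhs=["house"], scores=[0.5, 0.2], alignment="0-0")
rule = Entry(lhs="X", rhs=["i", "[X]"], scores=[0.1], alignment="0-0")

print(format_entry(phrase))
print(format_entry(rule))
```

Keeping the formatting in a small function like this makes it easy to check the output shape before pointing the script at a real table.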


## Changing the code

If you modify the `.pyx` sources, you have to recompile the Cython code for your changes to take effect.

1.  Compile the Cython code:

        python setup.py build_ext -i --cython