A guide to creating an NLP framework for iOS (Part 1)

Taichi Kato



[image:1352FF89-F371-47D6-AE5A-0FB8E59D389E-88229-00009EA662FE2804/jRmhjo4 - Imgur (1).jpg]

One thing that has been on my bucket list for a while is implementing an NLP model that runs locally on an iOS device. I recalled the trouble I went through a few years back porting LSTM-based networks to run with CoreML, so I wanted to give it another shot.

The task was simple: my team at Questo wanted to move some of the computationally heavy processes running in the cloud to the edge. So naturally, we decided to port POS tagging and dependency parsing, two critical parts of the question generation pipeline, to run locally on any iOS device using CoreML. As we were using spaCy via Python to run these two processes on our servers, the idea to “port spaCy over to iOS” came easily.

After exporting the neural net portion of the dependency parsing algorithm from TensorFlow to .mlmodel using coremltools, it was then a simple matter of dragging and dropping the file into the project.
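Once the .mlmodel file is in the project, code inside the framework still has to find it at runtime. The sketch below shows one way this lookup might be done, assuming the model ends up in the framework's own bundle. `FrameworkBundleMarker`, `compiledModelURL`, and the model name `DependencyParser` are hypothetical names used only for illustration.

```swift
import Foundation

// Hypothetical marker class: any class compiled into the framework works,
// because Bundle(for:) returns the bundle that contains that class.
final class FrameworkBundleMarker {}

/// Returns the URL of a compiled Core ML model shipped with the framework,
/// or nil if no such resource exists.
func compiledModelURL(named name: String) -> URL? {
    let bundle = Bundle(for: FrameworkBundleMarker.self)
    // Xcode compiles a .mlmodel file into a .mlmodelc directory at build time.
    return bundle.url(forResource: name, withExtension: "mlmodelc")
}

// "DependencyParser" is a placeholder model name (assumption).
if let url = compiledModelURL(named: "DependencyParser") {
    // On Apple platforms this URL could be passed to MLModel(contentsOf:).
    print("Found model at \(url.path)")
} else {
    print("Model not found in bundle")
}
```

Looking the model up through the framework's bundle rather than `Bundle.main` matters because, inside a host app, `Bundle.main` refers to the app and not to the framework that carries the model.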

Create a new framework with Xcode

[image:8268EAF1-0446-4DDD-84B9-A0DA330A61A6-88229-0000A262438A2D92/AC30871E-17DE-4EA5-805C-7D1BCDAD86AC.png]

The steps for creating a new project for our framework are quite straightforward. It is important, however, that we create a new Project rather than a Swift Package.

Note: Swift packages are the recommended way to create Swift frameworks (not just for iOS, though you can set a package to use the iOS-specific SDK). However, in cases where we require an external asset, as here with the CoreML model, we found that the package was not able to load the model properly.

[image:38CF8C2E-441D-4586-8C73-9D2594AF7CC7-88229-0000A2A20B6D4BB8/Screenshot 2020-05-13 at 10.35.29 PM.png]

We then choose Framework as the template.

Writing your first framework

Frankly, the programming side of creating a framework is only as complicated as what you want your framework to do. In our case it was relatively difficult, because of the algorithmic complexity on top of the abstract and convoluted nature of NLP data. Nonetheless, our project structure is rather simple, with only a few files corresponding to the various classes.

[image:F4AE6B7D-B775-40A8-BFBA-B9296060E63F-88229-0000A3BFC38B9EA5/Screenshot 2020-05-13 at 10.56.09 PM.png]

One important thing in developing frameworks is to make the interface as simple as possible. This means limiting what is instantiated in the global scope and avoiding generic class names that can conflict with other libraries. With proper access control set, we shouldn’t face many conflicts with other libraries.

[image:41434CE7-3AE9-4F47-88E7-9D0E1F30A9B2-88229-0000A40FEDE5417D/Screenshot 2020-05-13 at 11.01.57 PM.png]

I found most other aspects of the implementation quite straightforward. It often helped to think of the classes in the framework as files that could simply be dragged into any Xcode project. After all, frameworks merely package these files into neat little binaries.
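The point about access control can be illustrated with a small sketch. It is not Kafka's actual API: `KafkaTokenizer` and `WordSplitter` are hypothetical names, chosen only to show one public, specifically named entry point sitting in front of an internal helper with a generic name.

```swift
// Hypothetical names throughout; this is not the framework's real interface.

/// The single public entry point. Its name is specific to the framework,
/// so it is unlikely to clash with types in the host app or other libraries.
public struct KafkaTokenizer {
    public init() {}

    /// Splits text into lowercase word tokens.
    public func tokens(in text: String) -> [String] {
        WordSplitter.split(text).map { $0.lowercased() }
    }
}

/// No access modifier means `internal`: usable anywhere inside the framework,
/// invisible to client code, so the generic name stays private to the module.
enum WordSplitter {
    static func split(_ text: String) -> [String] {
        // Treat anything that is not a letter or a number as a separator.
        text.split(whereSeparator: { !$0.isLetter && !$0.isNumber })
            .map(String.init)
    }
}

print(KafkaTokenizer().tokens(in: "Hello, iOS world!"))
```

A client that imports the framework sees only `KafkaTokenizer`; it can define its own `WordSplitter` without any clash.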

Testing the code

Unlike regular iOS apps, which have UIs, or Playground files, which have a dedicated button for running the program, a framework has none of that: there is no entry point and no initialisation script you can simply write. It is therefore standard practice to create test classes for debugging your code. (It’s also a great way to learn about testing, which you might not have done when writing iOS apps for fun.)
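As a rough illustration of a test class acting as the entry point, here is a minimal XCTest sketch. `WordCounter` is a hypothetical stand-in for a unit in the framework. In a real project the type would live in the framework target and the test file would import that module, typically with `@testable import`, instead of declaring the type inline. The file is meant to sit in the framework's test target, where Xcode runs each `test…` method on its own.

```swift
import XCTest

// Hypothetical unit under test, declared inline to keep the sketch self-contained.
struct WordCounter {
    func count(in text: String) -> Int {
        // split(separator:) drops empty pieces, so "" yields zero words.
        text.split(separator: " ").count
    }
}

final class WordCounterTests: XCTestCase {
    func testCountsSpaceSeparatedWords() {
        XCTAssertEqual(WordCounter().count(in: "port spaCy to iOS"), 4)
    }

    func testEmptyStringHasNoWords() {
        XCTAssertEqual(WordCounter().count(in: ""), 0)
    }
}
```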

When the framework is small enough, one test class is sufficient, but as your program grows it is often useful to have more. Since Kafka was relatively small, we had two test classes, each testing an individual unit within the framework.

[image:40BD0A0E-C766-4DDC-BC42-C0538B0C92D0-88229-0000A4741D24BDCA/Screenshot 2020-05-13 at 11.09.04 PM.png]
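The test that follows sets three fields on a `PartialParse`: `stack`, `next`, and `arcs`. The class itself isn't shown, so the sketch below is only a guess at the kind of state such a test checks, based on how a typical transition-based dependency parser works. `SketchPartialParse` and `isComplete` are hypothetical, and the completeness rule is an assumption rather than Kafka's code.

```swift
// Hypothetical stand-in for PartialParse. Field names mirror the test,
// but `isComplete` and its rule are assumptions for illustration.
final class SketchPartialParse {
    let sentence: [String]               // index 0 is an artificial ROOT token
    var stack: [Int] = [0]               // token indices currently on the stack
    var next: Int = 1                    // index of the next unread token
    var arcs: [(Int, Int, String)] = []  // (head, dependent, label)

    init(sentence: [String]) { self.sentence = sentence }

    /// Complete when only ROOT is left on the stack and every token was read.
    var isComplete: Bool {
        stack == [0] && next >= sentence.count
    }
}

let words = ["ROOT", "She", "reads"]

let finished = SketchPartialParse(sentence: words)
finished.next = 3
finished.arcs = [(2, 1, "nsubj"), (0, 2, "root")]
print(finished.isComplete)    // true

let unfinished = SketchPartialParse(sentence: words)
unfinished.stack = [0, 1]
unfinished.next = 2
print(unfinished.isComplete)  // false
```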

    func testPartialParse_complete() throws {
        // test complete should be true

        let pp_complete = PartialParse(sentence: self.testSentenceWithPos)

        pp_complete.stack = [0]
        pp_complete.next = 50
        pp_complete.arcs = [(5, 4, "nummod"), (5, 3, "compound"), (5, 2, "det"), (5, 1, "case"), (9, 8, "det"), (9, 7, "punct"), (9, 6, "case"), (9, 10, "punct"), (12, 13, "case"), (15, 14, "compound"), (15, 12, "nmod:poss"), (15, 11, "case"), (9, 15, "nmod"), (5, 9, "nmod"), (19, 18, "amod"), (20, 19, "nsubj"), (20, 17, "punct"), (20, 16, "punct"), (22, 21, "det"), (20, 22, "dobj"), (25, 24, "compound"), (25, 23, "case"), (20, 25, "nmod"), (20, 26, "punct"), (20, 27, "punct"), (28, 29, "cc"), (28, 30, "conj"), (20, 28, "dep"), (20, 31, "punct"), (5, 20, "dep"), (34, 33, "det"), (36, 35, "case"), (34, 36, "nmod"), (34, 37, "punct"), (41, 40, "compound"), (41, 39, "case"), (38, 41, "nmod"), (34, 38, "acl"), (34, 42, "punct"), (45, 44, "advmod"), (45, 43, "auxpass"), (45, 34, "nsubjpass"), (45, 32, "punct"), (45, 5, "nmod"), (48, 47, "compound"), (48, 46, "case"), (45, 48, "nmod"), (45, 49, "punct"), (0, 45, "root")]


        // test complete shouldn't be true

        let pp_incomplete = PartialParse(sentence: self.testSentenceWithPos)

        pp_incomplete.stack = [0, 5, 6]
        pp_incomplete.next = 7
        pp_incomplete.arcs = [(5, 4, "nummod"), (5, 3, "compound"), (5, 2, "det"), (5, 1, "case")]