Revision as of 17:45, 5 April 2019
Pix2code | |
---|---|
Project Information | |
Has title | Pix2code experimentation |
Has owner | Hiep Nguyen |
Has start date | |
Has deadline date | |
Has project status | |
Copyright © 2019 edegan.com. All Rights Reserved. |
Brief Introduction
Pix2code is an AI model that converts GUI images into code in a domain-specific language (DSL); a compiler then translates the DSL code into HTML, Android XML, or iOS Storyboard. More details can be found in the original paper (https://arxiv.org/pdf/1705.07962.pdf). Instructions for training and using the models are on the original GitHub page (https://github.com/tonybeltramelli/pix2code). An improved version, pix2code2 (https://github.com/fjbriones/pix2code2), first trains a Convolutional Neural Network (CNN) as an autoencoder on the GUI images, before the main model is trained. Its authors also provide a pre-trained model to experiment with. The version currently installed on the RDP is pix2code2.
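To make the second stage of this pipeline concrete, the sketch below compiles a small nested DSL string into HTML through a lookup table. It is only an illustration: the token names (body, row, title, text, button), the HTML templates, and the function names are all assumptions made for this example, not the vocabulary or mapping that ships with pix2code.

```python
# Toy DSL-to-HTML compiler. The token names and templates are hypothetical,
# chosen only to illustrate rendering a nested DSL through a lookup table.
import re

# Hypothetical mapping; container templates use "{}" as the slot for children.
TEMPLATES = {
    "body": "<main>{}</main>",
    "row": '<div class="row">{}</div>',
    "title": "<h4>Title</h4>",
    "text": "<p>Placeholder text</p>",
    "button": "<button>Button</button>",
}


def tokenize(dsl):
    """Split DSL source into element names and braces; commas and spaces separate."""
    return re.findall(r"[A-Za-z][\w-]*|[{}]", dsl)


def parse(tokens, pos=0):
    """Parse sibling elements into (name, children) tuples.

    Returns the parsed siblings and the index of the first unconsumed token.
    """
    nodes = []
    while pos < len(tokens) and tokens[pos] != "}":
        name = tokens[pos]
        if name == "{":
            raise ValueError("opening brace without an element name")
        pos += 1
        children = []
        if pos < len(tokens) and tokens[pos] == "{":
            children, pos = parse(tokens, pos + 1)
            if pos >= len(tokens) or tokens[pos] != "}":
                raise ValueError(f"missing closing brace for {name!r}")
            pos += 1  # consume the closing brace
        nodes.append((name, children))
    return nodes, pos


def render(nodes):
    """Fill each element's template with the rendered HTML of its children."""
    return "".join(
        TEMPLATES[name].replace("{}", render(children)) for name, children in nodes
    )


def compile_dsl(dsl):
    """Compile a DSL string into a single HTML string."""
    tokens = tokenize(dsl)
    nodes, pos = parse(tokens)
    if pos != len(tokens):
        raise ValueError("unbalanced closing brace")
    return "<html>" + render(nodes) + "</html>"


if __name__ == "__main__":
    print(compile_dsl("body { row { title, text, button } }"))
```

In this sketch the DSL carries only layout tokens, so every text element is rendered with fixed placeholder text.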
Usage of pix2code on RDP
The source code and pre-trained model for pix2code are currently located at
E:/projects/pix2code_test
To generate DSL code from a specific GUI image, first place the image in the pix2code_test directory, then run the following:
cd pix2code_test/model
./sample.py ../bin pix2code2 ../test_img.png ../code greedy
# The last argument selects the search method: "greedy" uses the greedy algorithm; replace it with an integer k (1, 2, 3, ...) to use beam search with size k.
The generated DSL code (a .gui file) is written to the pix2code_test/code directory.
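The last argument to sample.py chooses between greedy decoding and beam search. As a rough illustration of the difference (not pix2code's own implementation), the sketch below decodes from a small hand-written table of next-token probabilities. The table, the token names, and the function names are hypothetical. Greedy decoding keeps only the single most likely token at each step, whereas beam search with size k keeps the k best partial sequences and can therefore find a sequence with higher overall probability.

```python
import math

# Hypothetical next-token probabilities, keyed by the previous token only.
# A real model would condition on the input image and the full token history.
NEXT = {
    "<START>": {"row": 0.6, "header": 0.4},
    "row": {"text": 0.5, "button": 0.5},
    "header": {"button": 0.9, "text": 0.1},
    "text": {"<END>": 1.0},
    "button": {"<END>": 1.0},
}


def beam_search(beam_width, max_len=10):
    """Return (tokens, log_prob) of the best finished sequence found."""
    beams = [(["<START>"], 0.0)]
    finished = []
    for _ in range(max_len):
        # Extend every live partial sequence by every possible next token.
        candidates = []
        for tokens, score in beams:
            for tok, p in NEXT[tokens[-1]].items():
                candidates.append((tokens + [tok], score + math.log(p)))
        # Keep only the beam_width highest-scoring candidates.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for tokens, score in candidates[:beam_width]:
            if tokens[-1] == "<END>":
                finished.append((tokens, score))
            else:
                beams.append((tokens, score))
        if not beams:
            break
    return max(finished, key=lambda c: c[1])


def greedy():
    """Greedy decoding is beam search with a beam width of 1."""
    return beam_search(1)


if __name__ == "__main__":
    print("greedy:", greedy())
    print("beam 2:", beam_search(2))
```

With this table, greedy decoding commits to "row" at the first step because it is locally most likely, while a beam of size 2 also keeps "header" alive and ends with a more probable complete sequence.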
To compile the generated DSL code to HTML:
cd compiler
./web_compiler.py ./code/test_img.gui
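The two steps above can also be scripted. The helper below is a minimal sketch that only builds the two command lines as argument lists, so they can be inspected before being run. The function names are hypothetical, the relative paths are copied from the commands on this page and assume the same working directories, and the output file name assumes that the .gui file takes the image's base name (as test_img.png and test_img.gui do above).

```python
from pathlib import PurePosixPath


def sample_command(image_name, search="greedy"):
    """Build the sample.py command from this page for one image.

    search is either "greedy" or a positive integer beam size.
    """
    if search != "greedy":
        if isinstance(search, bool) or not isinstance(search, int) or search < 1:
            raise ValueError("search must be 'greedy' or a positive integer")
    return [
        "./sample.py",
        "../bin",
        "pix2code2",
        f"../{image_name}",
        "../code",
        str(search),
    ]


def compile_command(image_name):
    """Build the web_compiler.py command for the .gui file of image_name.

    Assumes the generated file keeps the image's base name with a .gui suffix.
    """
    gui_name = PurePosixPath(image_name).with_suffix(".gui").name
    return ["./web_compiler.py", f"./code/{gui_name}"]


if __name__ == "__main__":
    print(" ".join(sample_command("test_img.png", search=3)))
    print(" ".join(compile_command("test_img.png")))
```

Each list can then be passed to subprocess.run with check=True and with cwd set to the directory the corresponding command is run from on this page.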
Discussion
While pix2code preserves the structure of the HTML page quite well, it does not preserve the contents of the website: most of the text from the original page is distorted in the generated DSL. Moreover, pix2code is extremely expensive to train, and the current model only works for very simple GUIs that are similar to those in the training set. Hence, the pix2code model is not well suited for building an information extractor. However, its source code is still a useful reference for how to input and structure GUI data, and for how to construct LSTM networks that take GUI input and output DSL code.