Wei Wu(Work log)

From edegan.com

Notes from Ed

Details about the install/config/basic use of software on the db server are on the Database Server Documentation page.

Please build and link to project pages that describe what you have done to date!

Summer 2018

Wei Wu Work Logs (log page)

2018-06-11

  • Set up wiki page and RDP for work. Installed CUDA on dbserver. Waiting for Matlab and Gurobi to be installed on the dbserver (or I will do it myself later this week).

2018-06-12

  • Tested connection via localhost port.
  • Discussed the Matlab matching code with Ed, James, and Chenyu via Skype.

2018-06-13

  • Continued searching for a method to set up VNC for dbserver without ssh.
  • Started moving the selenium box, monitors, keyboards, etc., from Room 310.
  • Discussed the Matlab matching code with Ed, James, and Chenyu via Skype.

2018-06-18
Trained on using PostgreSQL for DBServer.

2018-06-19
Further training with SQL and SDC Platinum. Job assignment among team members.

2018-06-20

  • Started reading a short tutorial on GMM and its implementation[1][2]. Should have a good grasp before the end of the week.
  • Gurobi interface guide
  • It seems that Gurobi does not support GPGPU computation (see here, page 36). Here is a slightly more elaborate exchange between the engineering director of Gurobi and the community regarding GPGPU computation support. Need to figure out how to do parallel computation in Matlab[3][4], and where we need it in the Startup-VC Code.


2018-06-21

  • Huge problem with the code in gmm_2stage_estimated.m. On line 80, we compute W by taking the inverse of the matrix Om. We keep getting an ill-conditioned W whose entries are infinitely large. There might be a bug in the readjusted code. I will try to catch it by comparing with the original code. If I can't, I will try to set up another Skype call with Chenyu.

Update: This might be related to the bug reported in the Matlab Code page. I also don't think the fix was correct. I will look into that.
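A minimal sketch of why inverting a nearly singular matrix produces enormous entries. The 2x2 matrix here is purely illustrative and is not the Om from gmm_2stage_estimated.m; the helper names inverse_2x2 and max_abs_entry are hypothetical.

```python
def inverse_2x2(m):
    """Return the inverse of a 2x2 matrix given as [[a, b], [c, d]]."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ZeroDivisionError("matrix is exactly singular")
    return [[d / det, -b / det], [-c / det, a / det]]


def max_abs_entry(m):
    """Largest absolute value among the entries of a matrix."""
    return max(abs(x) for row in m for x in row)


# Illustrative stand-in for Om: the two rows become almost identical
# as eps shrinks, so the determinant (about eps) approaches zero.
for eps in (1e-2, 1e-6, 1e-10):
    om = [[1.0, 1.0], [1.0, 1.0 + eps]]
    w = inverse_2x2(om)
    print(f"eps={eps:g}  largest |W| entry = {max_abs_entry(w):.3g}")
```

The largest entry of the inverse grows roughly like 1/eps, so a weighting matrix built from an almost rank-deficient Om can blow up even when nothing in the code is wrong.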

2018-06-22

  • Want to test Matlab and its Parallel Computing Toolbox on DBServer. Cannot use the Matlab GUI remotely. This is possibly due to the environment variable settings for remote access. Update: now we have the Matlab GUI. Nvidia CUDA is configured correctly as well. Today is a good day for a Linux user.
  • Probably it's the right time to further configure the VNC server on DBServer. Documentation for TightVNC configuration. Done. Documented in the Database Server Documentation.

2018-06-25/26
Looked at quick tutorials for C/C++ and CUDA, in case I need to read CUDA code in the future.

2018-06-27
Sick. Working from home.

2018-06-28
Emailed Chenyu about the bug in gmm_2stage_estimated.m.

2018-06-29
Set up a meeting with Jeremy to talk about the Matlab code and paper.

2018-07-02

2018-07-03
Finally learned from Chenyu that the "bug" reported on June 21 was not a bug at all. We are getting singular matrices because the R and monte_M we are using are too small.
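One generic way that too few draws produce a singular matrix: a sample covariance of k-dimensional vectors estimated from n draws has rank at most n - 1, so it cannot be inverted when n <= k. The sketch below illustrates that fact only. It assumes, without confirmation from the code, that Om behaves like such a covariance; simulate_draws, sample_covariance, and numerical_rank are hypothetical helpers.

```python
import random


def simulate_draws(n_draws, k, seed=0):
    """Hypothetical stand-in for simulated moments: n_draws vectors of length k."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(k)] for _ in range(n_draws)]


def sample_covariance(draws):
    """Unbiased sample covariance matrix (k x k) of a list of k-vectors."""
    n, k = len(draws), len(draws[0])
    means = [sum(d[j] for d in draws) / n for j in range(k)]
    cov = [[0.0] * k for _ in range(k)]
    for d in draws:
        for i in range(k):
            for j in range(k):
                cov[i][j] += (d[i] - means[i]) * (d[j] - means[j])
    return [[c / (n - 1) for c in row] for row in cov]


def numerical_rank(matrix, tol=1e-9):
    """Rank via Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    rows, cols = len(m), len(m[0])
    rank = 0
    for col in range(cols):
        if rank == rows:
            break
        # Choose the remaining row with the largest entry in this column.
        pivot = max(range(rank, rows), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < tol:
            continue  # no usable pivot in this column
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, rows):
            factor = m[r][col] / m[rank][col]
            for c in range(col, cols):
                m[r][c] -= factor * m[rank][c]
        rank += 1
    return rank


k = 6
for n_draws in (4, 50):
    cov = sample_covariance(simulate_draws(n_draws, k))
    print(f"draws={n_draws:3d}  dimension={k}  rank={numerical_rank(cov)}")
```

With only 4 draws the 6x6 matrix is rank deficient, while 50 draws are enough for full rank, which is consistent with the advice to increase R and monte_M.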

2018-07-05

  • There are still problems with W being singular. I have changed R and monte_M to be as big as in the original code. Either I neglected something, or there is still a bug. Or perhaps it's just normal. See File:A copy of warning messages from Matlab command windows.pdf. When W is singular, the fitness function for the second-stage GA is minimized to negative infinity.
  • I am trying to put Gurobi into a parfor in Matlab. So far, not good. Wanted to figure out how to do CPU-based parallel computing with Gurobi. I cannot find a way to run Gurobi solvers inside a parfor. I believe Matlab's linprog can, but linprog is much slower than Gurobi. There will be some trade-off. I need to test this.

2018-07-06 I really need to understand the code better. Also, we can probably run Gurobi inside a parfor, but I need to wrap the call inside a function.
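The "wrap the solver call in a function" pattern can be sketched in Python as a rough analogue, not as the Matlab code itself. Everything here is an assumption made for illustration: a thread pool stands in for parfor workers, and a toy one-variable LP (minimize c*x subject to lo <= x <= hi) stands in for the Gurobi call. The names solve_bounded_lp and solve_all are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def solve_bounded_lp(problem):
    """Solve min c*x subject to lo <= x <= hi for a single scalar x.

    All solver state lives inside this function, so each call is
    independent and can be handed to any worker.
    """
    c, lo, hi = problem
    x = lo if c >= 0 else hi  # a linear objective is minimized at a bound
    return x, c * x


def solve_all(problems, workers=4):
    """Map the wrapped solver over independent problems in parallel."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves the input order of the problems.
        return list(pool.map(solve_bounded_lp, problems))


problems = [(1.0, 0.0, 5.0), (-2.0, 0.0, 5.0), (3.0, -1.0, 1.0)]
for problem, (x, value) in zip(problems, solve_all(problems)):
    print(f"c={problem[0]:+.1f}  x*={x:+.1f}  objective={value:+.1f}")
```

The design point is that the loop body only sees a plain function taking one problem and returning one result, which is the shape a parallel loop needs.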

2018-07-09 I have run profiling on the Matlab code several times. It seems that moments.m takes up as much time as calling Gurobi to solve LPs. Probably we should optimize moments.m instead.

Profiling.png

2018-07-10 Ran profiling again. With a big enough R, the parallel code is much faster. Documented in the project Matlab, CUDA, and GPU Computing.

2018-07-11 Croatia 2-1 England!!!!!!!!!!!!!!!!!!!!!

2018-07-12

  • Helped Minh install Tensorflow on DB Server.
  • Learned to use NOTS.

2018-07-13

  • Further parallelized the Matlab code (msmf_corr_coeff.m). Now on our 12-core server, one call to msmf_corr_coeff takes about 35 seconds for R=200, monte_M = 70, mktsize = 30.
  • Will try to parallelize moments.m. Currently it takes 10 seconds per call. This is included in the 35 seconds runtime of msmf_corr_coeff.m.
  • Reverted to using Matlab's linprog rather than Gurobi. In a parfor, Gurobi takes much longer than the native linprog to solve our LPs. I do not fully understand why this is happening. It might be that calling Gurobi inside a function increases the overhead, or it might be that Gurobi can't use the full power of our CPU since all 12 cores have been scheduled to work on 12 different LPs (some of Gurobi's LP algorithms are themselves parallel). Note that creating a model for Gurobi also takes time.
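A back-of-the-envelope estimate of what parallelizing moments.m could buy, using the timings above. It assumes that the 10 seconds are currently fully serial, that moments.m is called once per msmf_corr_coeff call, and that the work splits evenly over the 12 cores with no overhead. None of these is confirmed, so treat the result as an upper bound on the gain. The function name is hypothetical.

```python
def runtime_after_parallelizing(total_s, part_s, workers):
    """Estimated runtime if part_s of total_s is split evenly over workers."""
    if not 0 <= part_s <= total_s:
        raise ValueError("part_s must lie between 0 and total_s")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    # The rest of the call is unchanged; only the chosen part shrinks.
    return (total_s - part_s) + part_s / workers


total_s = 35.0    # one call to msmf_corr_coeff (from the log)
moments_s = 10.0  # time spent in moments.m (from the log)
cores = 12        # cores on the server (from the log)

best = runtime_after_parallelizing(total_s, moments_s, cores)
print(f"estimated runtime: {best:.1f} s, speedup: {total_s / best:.2f}x")
```

Under these assumptions the call drops from 35 s to about 25.8 s, a speedup of roughly 1.35x, and it can never go below 25 s (1.4x) however many cores are used.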

Msmf35seconds.png
2018-07-16

  • Ran monte_data mode with R=200, monte_M = 70, mktsize = 30. The msmf was computed fairly fast.

Msmf monte data.png

  • Ran data mode with R=200, monte_M = 70, mktsize = 30.

Msmf data.png
2018-07-18

  • Ran monte mode with R=200, monte_M = 200, mktsize = 30.
  • Helped Maxine with the industrial classifier.
  • Worked on documentation for NOTS and parallelization.
  • Worked on running the Matlab code on NOTS.

2018-07-19 ~ 30 Ran diagnostics requested by Chenyu

2018-08 Helped Marcus get familiar with the Matlab code for matching VCs to startups.