|
I am working on the UFLDL tutorial for softmax regression. Due to some bug in my implementation, my gradients are not matching. It may be due to an error in using one of those heavily vectorized equations, but I am having a hard time tracking it down. If someone has implemented it, any help would be greatly appreciated. I am getting a cost of around 0.0323, which seems reasonable to me.
hx is the hypothesis. |
|
Here is my code, but it doesn't work very well. The gradient-checking result is low enough (I got 3.7769e-10), but when I use it to train with minFunc, it stops with "Function Value changing by less than TolX" within 20 iterations. Could you help me?
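For anyone comparing numbers, the gradient check discussed here compares the analytic gradient against central differences and reports a relative difference. Below is a minimal sketch of that idea, written in plain Python rather than the MATLAB used in this thread. All names (`softmax_cost_and_grad`, `numerical_grad`, `relative_difference`) and the toy data are hypothetical, labels are assumed to be 0-based, and weight decay is left out to keep it short.

```python
import math

def softmax_cost_and_grad(theta, X, y, num_classes):
    """theta: K rows of n weights; X: m examples of n features; y: labels in 0..K-1."""
    m, n = len(X), len(X[0])
    cost = 0.0
    grad = [[0.0] * n for _ in range(num_classes)]
    for x, label in zip(X, y):
        scores = [sum(t * xj for t, xj in zip(row, x)) for row in theta]
        top = max(scores)  # subtract the max score for numerical stability
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        probs = [e / total for e in exps]
        cost -= math.log(probs[label]) / m
        for k in range(num_classes):
            indicator = 1.0 if k == label else 0.0
            for j in range(n):
                grad[k][j] -= x[j] * (indicator - probs[k]) / m
    return cost, grad

def numerical_grad(theta, X, y, num_classes, eps=1e-4):
    """Central-difference estimate of the gradient, one parameter at a time."""
    num = [[0.0] * len(row) for row in theta]
    for k in range(len(theta)):
        for j in range(len(theta[k])):
            old = theta[k][j]
            theta[k][j] = old + eps
            cost_plus, _ = softmax_cost_and_grad(theta, X, y, num_classes)
            theta[k][j] = old - eps
            cost_minus, _ = softmax_cost_and_grad(theta, X, y, num_classes)
            theta[k][j] = old  # restore the original value
            num[k][j] = (cost_plus - cost_minus) / (2 * eps)
    return num

def relative_difference(a, b):
    """norm(a - b) / norm(a + b) over all entries."""
    flat_a = [v for row in a for v in row]
    flat_b = [v for row in b for v in row]
    diff = math.sqrt(sum((p - q) ** 2 for p, q in zip(flat_a, flat_b)))
    summ = math.sqrt(sum((p + q) ** 2 for p, q in zip(flat_a, flat_b)))
    return diff / summ

# Toy problem (made-up numbers): 4 examples, 2 features, 3 classes.
X = [[1.0, 0.5], [0.2, -1.0], [-0.7, 0.3], [0.9, 0.9]]
y = [0, 1, 2, 1]
theta = [[0.1, -0.2], [0.0, 0.3], [-0.1, 0.05]]

cost, grad = softmax_cost_and_grad(theta, X, y, 3)
numgrad = numerical_grad(theta, X, y, 3)
rel_diff = relative_difference(numgrad, grad)
print(rel_diff)
```

A small relative difference only says the gradient is consistent with the cost function as coded. It does not rule out a cost that is itself wrong, which may be worth keeping in mind when the check passes but the optimizer stops early.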
|
|
I think some people might be getting confused when he starts talking about softmax reducing to logistic regression (1-y terms). The pure loop form should look like this:
Or using vectorized form:
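For reference, a sketch of both forms in the tutorial's notation, assuming $m$ examples $x^{(i)} \in \mathbb{R}^n$ with labels $y^{(i)} \in \{1,\dots,K\}$, one parameter vector $\theta_k$ per class, and no weight decay. The matrix names $X$, $Y$, $P$ and $\Theta$ are introduced here for illustration only. Note that no $(1-y)$ terms appear; those belong to the two-class logistic regression special case.

```latex
% Loop form: one term per example i and class k
\begin{align}
p_k^{(i)} &= \frac{\exp\!\big(\theta_k^\top x^{(i)}\big)}{\sum_{l=1}^{K} \exp\!\big(\theta_l^\top x^{(i)}\big)} \\
J(\theta) &= -\frac{1}{m} \sum_{i=1}^{m} \sum_{k=1}^{K} 1\{y^{(i)} = k\} \, \log p_k^{(i)} \\
\nabla_{\theta_k} J &= -\frac{1}{m} \sum_{i=1}^{m} x^{(i)} \Big( 1\{y^{(i)} = k\} - p_k^{(i)} \Big)
\end{align}

% Vectorized form (assumed shapes): X is n x m with examples as columns,
% Y is the K x m indicator matrix, P is the K x m matrix of p_k^{(i)},
% and Theta is K x n with theta_k^T as row k.
\begin{equation}
\nabla_{\Theta} J = -\frac{1}{m} \, (Y - P) \, X^\top
\end{equation}
```

Row $k$ of the vectorized expression is the transpose of the loop-form gradient for $\theta_k$, so the two should agree entry by entry.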
|
|
Thank you so much trailblazer1019... this means so much to me. I haven't yet looked at the code to try and understand what is happening, but I hope to today. I spent some time trying to get this working in Octave as I don't have MATLAB. My fixes to make it compatible, in case they are of any use to anybody else: 1) "sumsqr" needed replacing with "sumsq" (the Octave equivalent in the context it was used). 2) The minFunc library (written by Mark Schmidt) caused an error in the polyinterp.m file with the line:
I downloaded the latest version thinking it was "linsolve" causing the issue. I couldn't find where this function was, as it doesn't appear in the Octave help, so I assumed it was part of his library (I thought I had a file missing): http://www.di.ens.fr/~mschmidt/Software/minFunc_2012.zip Anyway, that still didn't work. I finally found that Octave doesn't seem to like the comma-tilde in [params,~]. I assume the comma-tilde is a way to use dummy variables when output variables have to be declared for a correct function call (maybe someone can confirm)? Anyway, replacing it with:
seemed to do the trick. Then your program worked in Octave :D |
|
https://www.dropbox.com/sh/p347fohdoby0zuv/hXTyhyipKc This should get you all the files. It's been a while since I looked at it, so it might contain some of the code I have written. Anyway, it should have all the default code that you get from UFLDL. Let me know if you need any help! |
|
Well, I think it should be OK since it's based on the UFLDL tutorials, which are basically for self-learning. By the way, here is my whole code sample.
The code above shows my implementation. I tried doing it the vectorized way (you can see that line is commented out), but the cost doesn't come out the same. Down below are my gradient results. They are off, so there is some tiny thing that I am not catching or haven't understood well. Any help would be really appreciated.
|
|
I've implemented the fully vectorized version (without any loops) following the guidelines on UFLDL but, as I'm new to this forum, I don't know if it's OK to publish the code. Let me know. BTW, the last line of your code shouldn't be there: you need to apply regularization to all parameters, since the form used is overparameterized. Hope it helps. |
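To spell out the regularization point, here is a sketch in the tutorial's notation, assuming $J_0(\theta)$ denotes the unregularized softmax cost, $\lambda$ the weight decay parameter, $K$ classes and $n$ input features (the symbols $J_0$ and $\psi$ are introduced here for illustration). The model is overparameterized because subtracting the same vector $\psi$ from every $\theta_k$ leaves the predicted probabilities unchanged, so no single row plays the role of an unpenalized bias.

```latex
\begin{align}
% Shifting every theta_k by the same psi cancels in the ratio
\frac{\exp\!\big((\theta_k - \psi)^\top x\big)}{\sum_{l=1}^{K} \exp\!\big((\theta_l - \psi)^\top x\big)}
  &= \frac{\exp\!\big(\theta_k^\top x\big)}{\sum_{l=1}^{K} \exp\!\big(\theta_l^\top x\big)} \\
% Weight decay over all K x n parameters
J(\theta) &= J_0(\theta) + \frac{\lambda}{2} \sum_{k=1}^{K} \sum_{j=1}^{n} \theta_{kj}^2 \\
% The penalty contributes lambda * theta_k to every row of the gradient
\nabla_{\theta_k} J &= \nabla_{\theta_k} J_0 + \lambda \, \theta_k \qquad \text{for all } k = 1, \dots, K
\end{align}
```

Under these assumptions, the gradient code should add $\lambda \theta_k$ to every row, which is why a final line that zeroes or skips the penalty for some parameters would be out of place.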