
Updates in the Trunk since the last release:

Bug fixes
  • Outputs of Scan nodes could contain corrupted values: some parts of the output were repeated instead of holding the correct values. It happened randomly and quite infrequently, but the bug had been present (in both the Python and Cython code paths) since April 2011. (Pascal L.)
  • In the sparse sandbox, fixed the grad of theano.sparse.sandbox.sp.row_scale; it did not return the right number of elements. (Frederic B.)
  • set_subtensor(x[int vector], new_value), when moved to the GPU, was incorrectly transformed into inc_subtensor. We now have a correct (but slow) GPU implementation; see the sketch after this list. Note 1: set_subtensor(x[slice[,...]], new_value) was working correctly in all cases, as was inc_subtensor(*, *). Note 2: if your code was affected by the incorrect behavior, we now print a warning by default. (Frederic B.)
  • Fixed an issue whereby config values were used as default arguments, so those defaults stayed stuck at old values even if the config variables were changed during program execution. (David W-F)
  • Fixed many subtle bugs involving mutable default arguments, which may have led to unexpected behaviour such as objects sharing instance variables they were not supposed to share (see the example after this list). (David W-F)
  • Correctly record the GPU device number used when we let the driver select it. (Frederic B.)
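For reference, a minimal sketch of the pattern the set_subtensor fix concerns (variable names are illustrative only):

```python
import numpy as np
import theano
import theano.tensor as T

x = theano.shared(np.zeros(5, dtype='float32'))
idx = T.ivector('idx')

# set_subtensor overwrites the indexed positions; the bug caused the GPU
# version of this graph to behave like inc_subtensor (an addition) instead.
new_x = T.set_subtensor(x[idx], np.float32(1.0))
f = theano.function([idx], new_x)
print(f(np.array([0, 2], dtype='int32')))  # -> [1. 0. 1. 0. 0.]
```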
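The mutable-default-argument fixes above address a well-known Python pitfall; a self-contained illustration (unrelated to Theano's internals):

```python
def append_bad(item, acc=[]):      # the default list is created only once,
    acc.append(item)               # so it is shared across all calls
    return acc

print(append_bad(1))  # [1]
print(append_bad(2))  # [1, 2]  -- state leaked from the previous call

def append_good(item, acc=None):   # idiomatic fix: default to None
    if acc is None:
        acc = []
    acc.append(item)
    return acc

print(append_good(1))  # [1]
print(append_good(2))  # [2]
```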
Documentation
Interface changes
  • In 0.5, we removed the deprecated sharedvar.value property. Now we raise an error if you access it. (Frederic B.)
  • theano.function does not accept duplicate inputs, so function([x, x], ...) does not work anymore. (Pascal L.)
  • theano.function now raises an error if some of the provided inputs are not part of the computational graph needed to compute the output, for instance function([x, y], [y]). You can use the kwarg on_unused_input={'raise', 'warn', 'ignore'} to control this (see the sketch after this list). (Pascal L.)
  • New Theano flag "on_unused_input" that defines the default behaviour for the previous point. (Frederic B.)
  • tensor.alloc() now raises an error at graph-build time when we try to allocate fewer dimensions than the provided value has. In the past, the error was raised at run time. (Frederic B.)
  • Removed theano.Value and related code (Ian G.). This was an early experiment on the way to what became SharedVariable.
  • Renamed Env to FunctionGraph, and the object attribute "env" to "fgraph" (Ian G.). A deprecation warning is printed when you access the "env" attribute.
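A minimal sketch of the new on_unused_input behaviour described above (names illustrative):

```python
import theano
import theano.tensor as T

x = T.scalar('x')
y = T.scalar('y')

# x is not needed to compute the output, so this would now raise an
# error by default; on_unused_input overrides that.
f = theano.function([x, y], y, on_unused_input='ignore')
print(f(0.0, 3.0))  # -> 3.0
```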
Deprecation
  • Deprecated the Module class (Ian G.). This was a predecessor of SharedVariable with a less pythonic philosophy.
Speed up
  • Convolution on the GPU now checks the generation of the card to make it faster in some cases (especially for medium/big output images). We previously hardcoded 512 as the maximum number of threads per block; newer cards support up to 1024 threads per block. (Frédéric B.)
  • CPU convolutions are now parallelized (Frédéric B.). By default, all cores/hyper-threads are used; to control this, set the OMP_NUM_THREADS=N environment variable, where N is the number of parallel threads to use. There is a new Theano flag, openmp, to allow/disallow OpenMP ops (see the sketch below). If your BLAS is parallelized, this flag won't affect it, but the environment variable will.
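One way to set these controls from Python, assuming they are set before theano is imported (a sketch; a .theanorc entry works as well):

```python
import os

# Both must be set before importing theano.
os.environ.setdefault("OMP_NUM_THREADS", "4")         # use 4 parallel threads
os.environ.setdefault("THEANO_FLAGS", "openmp=True")  # allow OpenMP ops

import theano
```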

Buildbot
  • Now we use http://travis-ci.org/ to run all CPU tests with the default mode on all Pull Requests. This should make the trunk more stable. (Frederic B.)
New Features
  • debugprint new param ids=["CHAR", "id", "int", ""]. This makes the printed identifier the Python id, a unique char, a unique int, or omits it. We changed the default to "CHAR" as this is more readable (see the sketch after this list). (Frederic B.)
  • debugprint new param stop_on_name=[False, True]. If True, we don't print anything below an intermediate variable that has a name. Defaults to False. (Frederic B.)
  • debugprint no longer prints the "|" symbol in a column after the last input. (Frederic B.)
  • If you use the Enthought Python Distribution (EPD), we now use its BLAS implementation by default (tested on Linux and Windows). (Frederic B., Simon McGregor)
  • MRG random now raises an error with a clear message when the passed shape contains dimensions with invalid values such as 0. (Frédéric B., reported by Ian G.)
  • "CudaNdarray[*] = ndarray" works in more cases (Frederic B.)
  • "CudaNdarray[*] += ndarray" works in more cases (Frederic B.)
  • We add dimensions to CudaNdarray to automatically broadcast more frequently. (Frederic B.)
  • theano.tensor.argsort that wraps numpy.argsort (Hani Almousli).
  • New Theano flag cmodule.warn_no_version. Default: False. If True, prints a warning when compiling one or more Ops with C code that can't be cached because at least one of those Ops has no c_code_cache_version() function (which forces recompiling it every time). (Frederic B.)
  • CPU alloc now always generates C code. (Pascal L.)
  • Made a few Ops with C code versioned to reduce compilation time. (Frédéric B., Pascal L.)
  • C code reuses preallocated outputs (only done by Scan) (Pascal L.)
  • Garbage collection of intermediate results during Theano function calls for Ops with C code (Pascal L.)
  • The Theano flag compiledir_format now supports the parameters "numpy_version" and "g++" (see the sketch after this list). (Frederic B.)
  • Theano GPU variables, shared variables and constants now support <, <=, > and >=, like their non-GPU counterparts.
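A small sketch of the new debugprint parameters described above (the graph is illustrative only):

```python
import theano
import theano.tensor as T

x = T.vector('x')
h = T.tanh(x)
h.name = 'hidden'        # a named intermediate variable
out = (h ** 2).sum()

# ids='CHAR' (the new default) labels nodes with unique characters;
# stop_on_name=True stops printing below the named variable 'hidden'.
theano.printing.debugprint(out, ids='CHAR', stop_on_name=True)
```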
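And a hypothetical use of the new compiledir_format parameters, assuming the usual %(...)s substitution syntax of that flag (the exact naming scheme here is made up for illustration):

```python
import os

# Include the numpy and g++ versions in the cache directory name so that
# compiled modules from different toolchains don't collide.
# Must be set before importing theano.
os.environ["THEANO_FLAGS"] = (
    "compiledir_format=compiledir_%(platform)s-%(numpy_version)s-%(g++)s"
)
import theano
```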
Sparse
  • Implement theano.sparse.mul(sparse1, sparse2) when the two inputs don't have the same sparsity pattern (see the sketch after this list). (Frederic B.)
  • sparse.Binomial
  • added sparse.{sqrt, sqr, log1p, floor, ceil, sgn, round_half_to_even, arctanh, tanh, arcsinh, sinh, arctan, arcsin, tan, sin} (Nicolas B.)
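A minimal sketch of theano.sparse.mul on inputs with different sparsity patterns (values illustrative):

```python
import numpy as np
import scipy.sparse as sp
import theano
import theano.sparse

a = theano.sparse.csr_matrix('a', dtype='float64')
b = theano.sparse.csr_matrix('b', dtype='float64')
c = theano.sparse.mul(a, b)   # elementwise product
f = theano.function([a, b], c)

va = sp.csr_matrix(np.array([[1, 0], [0, 2]], dtype='float64'))
vb = sp.csr_matrix(np.array([[3, 4], [0, 0]], dtype='float64'))
print(f(va, vb).toarray())    # -> [[3. 0.] [0. 0.]]
```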
Sparse Sandbox graduate (moved to theano.sparse)
  • sparse.sandbox.sp.Remove0 op: it removes stored elements with value 0. (Frederic B., Nicolas B.)
  • sparse.sandbox.sp.sp_sum(a, axis=None) (Nicolas B.). Bugfix: the non-structured grad was returning a structured grad.
  • sparse.sandbox.sp.{col_scale,row_scale,ensure_sorted_indices,clean} (Nicolas B.)
  • sparse.sandbox.sp.diag (made it work for csr matrices) (Nicolas B.)
  • sparse.sandbox.sp.square_diag (Nicolas B.)
Sparse Sandbox Additions (not reviewed/documented/tested, but used by some people)
  • They are all in the theano.sparse.sandbox.sp2 module.
  • Op class: Cast, Poisson, Multinomial, EliminateZeros, Sum, Binomial. Sum comes from sparse.sandbox.sp; bugfix: the non-structured grad was returning a structured grad.
  • Op class: SamplingDot, SamplingDotCsr (inserted automatically)
  • Op function: structured_sigmoid, structured_exp, structured_pow, structured_minimum
  • Op class: StructuredAddSV, StructuredAddSVCSR (inserted automatically)
  • opt: local_sampling_dot_csr, local_structured_add_s_v
  • new op MulSV (Yann D.): multiplication of a sparse matrix by a broadcasted vector
Internal changes
  • Define new exceptions MissingInputError and UnusedInputError, and use them in theano.function instead of TypeError and ValueError (see the sketch after this list). (Pascal L.)
  • Better handling of bitwidth and max values of integers and pointers across platforms (Pascal L.)
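A sketch of the new exception in use, assuming it is exposed as theano.gof.MissingInputError (the exact import location may differ across versions):

```python
import theano
import theano.tensor as T

x = T.scalar('x')
y = T.scalar('y')

try:
    # y is needed to compute the output but is not declared as an input.
    theano.function([x], x + y)
except theano.gof.MissingInputError as e:
    print('missing input:', e)
```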
Crash Fix
  • Do not try to use the BLAS library when blas.ldflags is manually set to an empty string (Frederic B.)
  • When importing theano on a computer without a GPU, with the Theano flags 'device' or 'init_gpu_device' set to gpu*. (Frederic B., reported by Luo Heng)
  • Optimization printed a useless error when scipy was not available. (Frederic B.)
  • GPU conv crash/slowdown on newer hardware (James B.)
  • Better error handling in GPU conv (Frederic B.)
  • GPU optimization that moves element-wise Ops to the GPU. Crash happened in a particular execution order of this optimization and the element-wise fusion optimization when upcasting some inputs to float32 (to compute them on the GPU). (Frederic B., reported by Sander Dieleman)
  • GpuReshape in some particular case when the input is not contiguous (Frederic B., reported by Sander Dieleman)
  • GpuSoftmaxWithBias with shape (0, N) with N > 1. (Frédéric B., reported by Razvan P.)
  • Fix crash under 64-bit Windows, when taking subtensors of the form a[n:] (Pascal L., reported by Simon McGregor)
  • Fixed issue with the MaxAndArgmax Op not properly preserving broadcastable dimensions, which could typically result in optimization crashes (Olivier D.)
  • Fixed crash when concatenating some arrays with specific broadcasting patterns (Olivier D.)
  • Work around a known issue with nvcc 4.1 on MacOS X. (Graham Taylor)
  • In advanced indexing, if some inputs are constant, no need to call constant(...) on their value any more. (Pascal L., reported by John Salvatier)
  • Fix crash on GPU when GpuSubtensor didn't set the right stride when the result tensor had a dimension of size 1. (Pascal L., reported by Graham T.)
  • Fix scan crash that made it not run on the GPU in one case. (Guillaume D.)