diff --git a/README.md b/README.md
index 7956390..f87012a 100644
--- a/README.md
+++ b/README.md
@@ -90,6 +90,15 @@ When using multiple style images, you can control the degree to which they are b
+### Relative style layer weights
+
+It is possible to adjust the relative weights of the style layers by giving a comma-separated list of multipliers (the actual style weight of each layer is then `style_layer_weight * style_weight`). For example, with the default `-style_weight` of 100, the multipliers below give effective per-layer weights of 100, 100, 10, 200, and 1500:
+
+```
+th neural_style.lua -gpu -1 -style_layer_weights 1,1,0.1,2,15
+```
+
+
 ### Transfer style but not color
 
 If you add the flag `-original_colors 1` then the output image will retain the colors of the original image;
@@ -188,6 +197,8 @@ path or a full absolute path. Default is `relu4_2`.
 * `-style_layers`: Comma-separated list of layer names to use for style reconstruction.
   Default is `relu1_1,relu2_1,relu3_1,relu4_1,relu5_1`.
+* `-style_layer_weights`: Comma-separated list of weight multipliers to adjust the relative weight of the style layers.
+  If this parameter is not given, all style layers have a relative weight of 1.
 
 **Other options**:
 * `-style_scale`: Scale at which to extract features from the style image. Default is 1.0.
@@ -235,7 +246,7 @@ If you are running on a GPU, you can also try running with `-backend cudnn` to r
 
 **Solution:** Update `torch.paths` package to the latest version: `luarocks install paths`
 
-**Problem:** NIN Imagenet model is not giving good results. 
+**Problem:** NIN Imagenet model is not giving good results.
 
 **Solution:** Make sure the correct `-proto_file` is selected. Also make sure the correct parameters for `-content_layers` and `-style_layers` are set. (See OpenCL usage example above.)
@@ -254,7 +265,7 @@ These give good results, but can both use a lot of memory. You can reduce memory
   This should work in both CPU and GPU modes.
 * **Reduce image size**: If the above tricks are not enough, you can reduce the size of the generated image;
   pass the flag `-image_size 256` to generate an image at half the default size.
-  
+
 With the default settings, `neural-style` uses about 3.5GB of GPU memory on my
 system; switching to ADAM and cuDNN reduces the GPU memory footprint to about 1GB.
@@ -267,7 +278,7 @@ Here are some times for running 500 iterations with `-image_size=512` on a Maxwe
 * `-backend cudnn -cudnn_autotune -optimizer lbfgs`: 58 seconds
 * `-backend cudnn -cudnn_autotune -optimizer adam`: 44 seconds
 * `-backend clnn -optimizer lbfgs`: 169 seconds
-* `-backend clnn -optimizer adam`: 106 seconds 
+* `-backend clnn -optimizer adam`: 106 seconds
 
 Here are the same benchmarks on a Pascal Titan X with cuDNN 5.0 on CUDA 8.0 RC:
 * `-backend nn -optimizer lbfgs`: 43 seconds
@@ -289,7 +300,7 @@ for your setup in order to achieve maximal resolution.
 
 We can achieve very high quality results at high resolution by combining multi-GPU processing
 with multiscale generation as described in the paper
-**Controlling Perceptual Factors in Neural Style Transfer** by Leon A. Gatys, 
+**Controlling Perceptual Factors in Neural Style Transfer** by Leon A. Gatys,
 Alexander S. Ecker, Matthias Bethge, Aaron Hertzmann and Eli Shechtman.
 
 Here is a 3620 x 1905 image generated on a server with four Pascal Titan X GPUs:
@@ -305,6 +316,21 @@ We perform style reconstructions using the `conv1_1`, `conv2_1`, `conv3_1`, `con
 and content reconstructions using the `conv4_2` layer. As in the paper, the five style
 reconstruction losses have equal weights.
 
+## Helper script
+
+The `run_neural_style.py` script (tested on Python 3.5.2) can be used to automate the generation of output file paths and names. For example:
+
+`python3 run_neural_style.py -style_image styles/style22_750.jpg -content_image input/sample1.jpg -optimizer adam -init image -output_path output -backend cudnn -style_weight 800 -content_weight 200 -style_scale 1.8 -learning_rate 3 -image_size 300 -tv_weight 0.001 -seed 150 -num_iterations 500` will create an `output/style22_750/sample1` folder, and the result file will be named `i500cw200_sw800_adam_lr3_sc1.8_tv0.001.jpg`.
+
+The arguments mostly follow those of `neural_style.lua`, with some exceptions:
+
+* `-style_blend_weights`: not supported yet
+* `-output_path`: Used instead of `-output_image`. Path for the output folder. Default is the current folder.
+* `-style_as_folder` and `-no-style_as_folder`: Use / do not use the style name as an additional folder for output images. Default is `-style_as_folder`.
+* `-input_file_as_folder` and `-no-input_file_as_folder`: Use / do not use the input image file name as an additional folder for output images. Default is `-input_file_as_folder`.
+* `-style_layer_weights`: not supported yet
+* `-time_markers`: add a time marker (formatted as `YYYYMMDD_HHMMSS`) to the end of the output file name
+
 ## Citation
 
 If you find this code useful for your research, please cite:
diff --git a/neural_style.lua b/neural_style.lua
index adc7621..5b419a0 100644
--- a/neural_style.lua
+++ b/neural_style.lua
@@ -47,6 +47,7 @@ cmd:option('-seed', -1)
 
 cmd:option('-content_layers', 'relu4_2', 'layers for content')
 cmd:option('-style_layers', 'relu1_1,relu2_1,relu3_1,relu4_1,relu5_1', 'layers for style')
+cmd:option('-style_layer_weights', 'nil', 'comma-separated list of relative weights for style layers')
 
 
 local function main(params)
@@ -104,6 +105,17 @@ local function main(params)
   local content_layers = params.content_layers:split(",")
   local style_layers = params.style_layers:split(",")
 
+  local style_layer_weights = {}
+  if params.style_layer_weights == 'nil' then
+    for i = 1, #style_layers do
+      table.insert(style_layer_weights, 1)
+    end
+  else
+    style_layer_weights = params.style_layer_weights:split(',')
+    assert(#style_layer_weights == #style_layers,
+      '-style_layer_weights and -style_layers must have the same number of elements')
+  end
+
   -- Set up the network, inserting style and content loss modules
   local content_losses, style_losses = {}, {}
   local next_content_idx, next_style_idx = 1, 1
@@ -140,7 +152,7 @@ local function main(params)
       if name == style_layers[next_style_idx] then
        print("Setting up style layer ", i, ":", layer.name)
        local norm = params.normalize_gradients
-       local loss_module = nn.StyleLoss(params.style_weight, norm):type(dtype)
+       local loss_module = nn.StyleLoss(params.style_weight * tonumber(style_layer_weights[next_style_idx]), norm):type(dtype)
        net:add(loss_module)
        table.insert(style_losses, loss_module)
        next_style_idx = next_style_idx + 1
diff --git a/run_neural_style.py b/run_neural_style.py
new file mode 100644
index 0000000..bc460b8
--- /dev/null
+++ b/run_neural_style.py
@@ -0,0 +1,275 @@
+#!/usr/bin/env python3
+# -*- coding: utf-8 -*-
+from __future__ import print_function
+import sys
+import subprocess
+from datetime import datetime
+from argparse import ArgumentParser
+from os.path import basename, splitext, expanduser, isfile, join, exists
+from os import listdir, getcwd, makedirs
+
+RUN_SCRIPT_NAME = "neural_style.lua"
+DEF_CONTENT_LAYERS = 'relu4_2'
+DEF_STYLE_LAYERS = 'relu1_1,relu2_1,relu3_1,relu4_1,relu5_1'
+
+
+def build_parser():
+    parser = ArgumentParser()
+
+    # Basic options
+    parser.add_argument('-style_image', type=str,
+                        default='examples/inputs/seated-nude.jpg',
+                        dest='style_image',
+                        help='Style target image',
+                        metavar='STYLE_IMAGE', required=True)
+    '''
+    parser.add_argument('-style_blend_weights', type=str,
+                        default=None,
+                        dest='style_blend_weights')
+    '''
+    parser.add_argument('-content_image', type=str,
+                        default='examples/inputs/tubingen.jpg',
+                        dest='content_image',
+                        help='Content target image or folder with images',
+                        metavar='CONTENT_IMAGE', required=True)
+
+    parser.add_argument('-image_size', type=int,
+                        default=512,
+                        dest='image_size',
+                        help='Maximum height / width of generated image',
+                        metavar='IMAGE_SIZE')
+
+    parser.add_argument('-gpu', default=0, type=int,
+                        dest='gpu',
+                        help='Zero-indexed ID of the GPU to use; for \
+                        CPU mode set -gpu = -1',
+                        metavar='GPU')
+    '''
+    parser.add_argument('-multigpu_strategy', type=str, default='',
+                        dest='multigpu_strategy',
+                        help='Index of layers to split the network\
+                        across GPUs',
+                        metavar='MULTI_GPU')
+    '''
+    # Optimization options
+    parser.add_argument('-content_weight', default=5, type=float,
+                        dest='content_weight')
+    parser.add_argument('-style_weight', default=100, type=float,
+                        dest='style_weight')
+    parser.add_argument('-tv_weight', default=0.0003, type=float,
+                        dest='tv_weight')
+    parser.add_argument('-num_iterations', default=1000, type=int,
+                        dest='num_iter')
+    parser.add_argument('-normalize_gradients', action='store_true',
+                        dest='normalize_gradients')
+    parser.add_argument('-init', default='random', dest='init',
+                        choices=['random', 'image'])
+    parser.add_argument('-optimizer', default='lbfgs', dest='optimizer',
+                        choices=['lbfgs', 'adam'])
+    parser.add_argument('-lbfgs_num_correction', default=0, type=int,
+                        dest='lbfgs_num_correction')
+    parser.add_argument('-learning_rate', default=10, type=float,
+                        dest='learning_rate')
+
+    # Output options
+    parser.add_argument('-print_iter', default=50, type=int,
+                        dest='print_iter')
+    parser.add_argument('-save_iter', default=100, type=int,
+                        dest='save_iter')
+    parser.add_argument('-output_path', default=None, type=str,
+                        dest='output_path')
+    parser.add_argument('-input_file_as_folder', dest='input_file_as_folder',
+                        action='store_true')
+    parser.add_argument('-no-input_file_as_folder', dest='input_file_as_folder',
+                        action='store_false')
+    parser.set_defaults(input_file_as_folder=True)
+    parser.add_argument('-style_as_folder', dest='style_as_folder',
+                        action='store_true')
+    parser.add_argument('-no-style_as_folder', dest='style_as_folder',
+                        action='store_false')
+    parser.set_defaults(style_as_folder=True)
+
+    # Other options
+    parser.add_argument('-style_scale', default=1.0, type=float,
+                        dest='style_scale')
+    parser.add_argument('-original_colors', default=0, type=int,
+                        dest='original_colors')
+    parser.add_argument('-pooling', default='max',
+                        choices=['max', 'avg'])
+    parser.add_argument('-proto_file', type=str,
+                        default='models/VGG_ILSVRC_19_layers_deploy.prototxt',
+                        dest='proto_file')
+    parser.add_argument('-model_file', type=str,
+                        default='models/VGG_ILSVRC_19_layers.caffemodel',
+                        dest='model_file')
+    parser.add_argument('-backend', default='nn',
+                        choices=['nn', 'cudnn', 'clnn'])
+    parser.add_argument('-cudnn_autotune', action='store_true',
+                        dest='cudnn_autotune')
+    parser.add_argument('-seed', default=-1, type=int,
+                        dest='seed')
+
+    # Layers options
+    parser.add_argument('-content_layers', type=str,
+                        help='layers for content',
+                        default=DEF_CONTENT_LAYERS,
+                        dest='content_layers')
+    parser.add_argument('-style_layers', type=str,
+                        help='layers for style',
+                        default=DEF_STYLE_LAYERS,
+                        dest='style_layers')
+
+    # Runner options
+    parser.add_argument('-time_markers', action='store_true',
+                        dest='time_markers')
+
+    # TODO: style_blend_weights, style_layer_weights
+
+    return parser
+
+
+def run_on_file(opts, input_file):
+    output_dir = expanduser(opts.output_path) if opts.output_path is not None else getcwd()
+    if opts.style_as_folder is True:
+        output_dir = join(output_dir, splitext(basename(opts.style_image))[0])
+    if opts.input_file_as_folder is True:
+        output_dir = join(output_dir, splitext(basename(input_file))[0])
+    if not exists(output_dir):
+        makedirs(output_dir)
+    optimizer_str = opts.optimizer
+    if opts.optimizer == 'adam':
+        optimizer_str += '_lr{0}'.format('%g' % (opts.learning_rate))
+    if opts.optimizer == 'lbfgs' and opts.lbfgs_num_correction > 0:
+        optimizer_str += '_numcorr{0}'.format('%g' % (opts.lbfgs_num_correction))
+    out_file_name = '{style_image}{content_image}{sep1}i{iter}cw{content_weight}_sw{style_weight}_\
+{optimizer}_sc{style_scale}_tv{tv_weight}'.format(
+        style_image=splitext(basename(opts.style_image))[0] if opts.style_as_folder is False else '',
+        content_image=('_' + splitext(basename(input_file))[0]) if opts.input_file_as_folder is False else '',
+        sep1='_' if opts.input_file_as_folder is False and opts.style_as_folder is False else '',
+        iter=opts.num_iter,
+        content_weight='%g' % (opts.content_weight),
+        style_weight='%g' % (opts.style_weight),
+        optimizer=optimizer_str,
+        style_scale='%g' % (opts.style_scale),
+        tv_weight=opts.tv_weight
+    )
+    if opts.normalize_gradients:
+        out_file_name += '_norm'
+    if opts.original_colors:
+        out_file_name += '_colors'
+    if opts.style_layers != DEF_STYLE_LAYERS:
+        # e.g. relu1_1,relu2_1,relu3_1,relu4_1,relu5_1 -> 11-21-31-41-51
+        style_layers = opts.style_layers.replace('relu', '') \
+            .replace('_', '').replace(',', '-')
+        out_file_name += '_sl{0}'.format(style_layers)
+    if opts.content_layers != DEF_CONTENT_LAYERS:
+        content_layers = opts.content_layers.replace('relu', '') \
+            .replace('_', '').replace(',', '-')
+        out_file_name += '_cl{0}'.format(content_layers)
+    if opts.time_markers:
+        now = datetime.now()
+        out_file_name += '_{0}'.format(now.strftime('%Y%m%d_%H%M%S'))
+    out_file_name += '.jpg'
+    out_file_path = join(output_dir, out_file_name)
+
+    run_script = 'th {script} -style_scale {style_scale} \
+-init {init} -style_image "{style_image}" \
+-content_image "{content_image}" -image_size {image_size} \
+-output_image "{output_image}" \
+-content_weight {content_weight} -style_weight {style_weight} \
+-save_iter {save_iter} \
+-print_iter {print_iter} \
+-num_iterations {num_iterations} -content_layers {content_layers} \
+-style_layers {style_layers} \
+-gpu {gpu} -optimizer {optimizer} -tv_weight {tv_weight} \
+-backend {backend} \
+-seed {seed} {normalize_gradients} \
+-learning_rate {learning_rate} \
+-original_colors {original_colors} {cudnn_autotune} \
+-lbfgs_num_correction {lbfgs_num_correction} \
+-pooling {pooling} -proto_file {proto_file} -model_file {model_file}'.format(
+        script=RUN_SCRIPT_NAME,
+        style_scale=opts.style_scale,
+        init=opts.init,
+        style_image=expanduser(opts.style_image),
+        content_image=input_file,
+        image_size=opts.image_size,
+        output_image=out_file_path,
+        content_weight=opts.content_weight,
+        style_weight=opts.style_weight,
+        save_iter=opts.save_iter,
+        print_iter=opts.print_iter,
+        num_iterations=opts.num_iter,
+        content_layers=opts.content_layers,
+        style_layers=opts.style_layers,
+        gpu=opts.gpu,
+        optimizer=opts.optimizer,
+        tv_weight=opts.tv_weight,
+        backend=opts.backend,
+        seed=opts.seed,
+        normalize_gradients=('-normalize_gradients' if opts.normalize_gradients else ''),
+        learning_rate=opts.learning_rate,
+        original_colors=('1' if opts.original_colors else '0'),
+        lbfgs_num_correction=opts.lbfgs_num_correction,
+        cudnn_autotune=('-cudnn_autotune' if opts.cudnn_autotune else ''),
+        pooling=opts.pooling,
+        proto_file=opts.proto_file,
+        model_file=opts.model_file)
+
+    print('Script \'{0}\''.format(run_script))
+    # stderr is merged into stdout so that the single read loop below drains both streams
+    with subprocess.Popen(run_script, shell=True,
+                          stdin=subprocess.PIPE,
+                          stdout=subprocess.PIPE,
+                          stderr=subprocess.STDOUT,
+                          universal_newlines=True) as proc:
+        for line in iter(proc.stdout.readline, ''):
+            sys.stdout.write(line)
+
+
+# Print iterations progress
+def print_progress(iteration, total, prefix='', suffix='', decimals=1,
+                   barLength=100):
+    """
+    Call in a loop to create terminal progress bar
+    @params:
+        iteration   - Required : current iteration (Int)
+        total       - Required : total iterations (Int)
+        prefix      - Optional : prefix string (Str)
+        suffix      - Optional : suffix string (Str)
+        decimals    - Optional : positive number of decimals in percent complete (Int)
+        barLength   - Optional : character length of bar (Int)
+    """
+    formatStr = "{0:." + str(decimals) + "f}"
+    percent = formatStr.format(100 * (iteration / float(total)))
+    filledLength = int(round(barLength * iteration / float(total)))
+    bar_str = '█' * filledLength + '-' * (barLength - filledLength)
+    text = '\r%s |%s| %s%s %s' % (prefix, bar_str, percent, '%', suffix)
+    sys.stdout.write(text)
+    if iteration == total:
+        sys.stdout.write('\n')
+    sys.stdout.flush()
+
+
+def main():
+    parser = build_parser()
+    opts = parser.parse_args()
+
+    input_path = expanduser(opts.content_image)
+    print('Input path is {0}'.format(input_path))
+    if isfile(input_path):
+        run_on_file(opts, input_path)
+    else:
+        input_files = [f for f in listdir(input_path) if isfile(join(input_path, f))]
+        count = len(input_files)
+        i = 1
+        for input_filename in input_files:
+            print('\nInput file is {0}'.format(input_filename))
+            run_on_file(opts, join(input_path, input_filename))
+            print_progress(i, count, prefix='Progress:',
+                           suffix='Complete', barLength=100)
+            i += 1
+
+if __name__ == '__main__':
+    main()