diff --git a/doc/README b/doc/README index 57a496e1..9eea1133 100644 --- a/doc/README +++ b/doc/README @@ -1,3 +1,27 @@ + ADDA 0.78 + ********* + "Amsterdam DDA" + + Maxim A. Yurkin(1,2) and Alfons G. Hoekstra(1) + + (1) Faculty of Science, Section Computational Science, + of the University of Amsterdam, + Kruislaan 403, 1098 SJ, Amsterdam, The Netherlands, + tel: +31-20-525-7530, fax: +31-20-525-7490 + + (2) Institute of Chemical Kinetics and Combustion, + Siberian Branch of the Russian Academy of Sciences, + Institutskaya 3, Novosibirsk, 630090, Russia, + tel: +7-383-333-3240, fax: +7-383-334-2350 + + email: adda@science.uva.nl + + last revised: 19 March 2008 + + Copyright (C) 2006-2008 University of Amsterdam + This software package is covered by the GNU General Public License. + + ## ##### ## ##### ## ## /#### /##### /## /##### /## /#### / ### // / / ### // / / ### / ### @@ -17,31 +41,6 @@ ## ## ## ## - - Maxim A. Yurkin - - Institute of Chemical Kinetics and Combustion, - Siberian Branch of the Russian Academy of Sciences, - Institutskaya 3, Novosibirsk, 630090, Russia, - tel: +7-383-333-3240, fax: +7-383-334-2350 - - Alfons G. Hoekstra - - Faculty of Science, Section Computational Science, - of the University of Amsterdam, - Kruislaan 403, 1098 SJ, Amsterdam, The Netherlands, - tel: +31-20-525-7530, fax: +31-20-525-7490 - - - email: adda@science.uva.nl - - $Date:: $ - - Copyright (C) 2006-2008 University of Amsterdam - This software package is covered by the GNU General Public License. - - - 1. INTRODUCTION *************** diff --git a/doc/faq b/doc/faq index 40fcc872..85e6d9ee 100644 --- a/doc/faq +++ b/doc/faq @@ -1,7 +1,7 @@ Frequently Asked Questions - about ADDA + about Amsterdam DDA - $Date:: $ + last revised: 3 June 2007 Q: I have found a bug in ADDA. What should I do? A: 1) Make sure you are using the latest version of ADDA (check the ADDA @@ -14,7 +14,6 @@ A: 1) Make sure you are using the latest version of ADDA (check the ADDA parameters as possible without removing the bug. Also try to use defautl versions of input files, that you have modified. 5) Send the results of (3) and (4) to the authors, together with all input - files and Makefiles that you have used for compilation. Do not forget to include all the relevant output files, at least 'log'. Please also include a brief description of your operation system and hardware. We will try to @@ -71,14 +70,5 @@ A: The simplest is to specify your particle by a shape file. However, if your source files to the authors so they would be incorporated in the next release for the benefit of the community. -Q: How is the Mueller matrix, produced by ADDA, defined and/or normalized? -A: It is defined as in Bohren & Huffman "Absorption and scattering of Light by - Small Particles" (1983), and it is not normalized. Some other codes may - compute Stokes scattering matrix, which is normalized so that 1,1-element - is equal to 1 after averaging over the whole solid angle. This matrix - should be multiplied by (pi*Csca/(lambda^2)) to get Mueller matrix. Csca is - the scattering cross section for unpolarized light, equal to average of - scattering cross sections for any two perpendicular incident polarizations. - This list is far from being complete. Please send your questions to adda@science.uva.nl diff --git a/doc/history b/doc/history index ee5ec513..9220661a 100644 --- a/doc/history +++ b/doc/history @@ -259,18 +259,3 @@ Ver. 0.78 - (19.03.08) + New argument 'auto' was added to '-sym' command line option to correspond to the general rule that one of the possible arguments is the default one. No functionality is changed for other arguments. - ----------------------------------- -Ver. 0.78.1 - 08.04.08 - -- Critical bug in cDiv and cDivSelf functions in source file cmplx.h was fixed - (introduced in version 0.78). This bug made BiCGSTAB and BiCG iterative - solvers to fail. However, even when other two iterative solvers were used, - erroneous results were calculated for certain values of refractive index. - ----------------------------------- -Ver. 0.78.2 - 11.04.08 - -- An implementation of a new FCD polarization prescription (-pol fcd) was - somewhat faulty. Correction terms of orders (kd)^2 and ln(...)(kd)^3 were - interchanged. Fixed. diff --git a/doc/manual.doc b/doc/manual.doc index e83a8fa1..d0c0b1de 100644 Binary files a/doc/manual.doc and b/doc/manual.doc differ diff --git a/doc/manual.pdf b/doc/manual.pdf index b6f42ceb..9ce0a0ea 100644 Binary files a/doc/manual.pdf and b/doc/manual.pdf differ diff --git a/doc/todo.pdf b/doc/todo.pdf index 2b3cfc37..77876bcf 100644 Binary files a/doc/todo.pdf and b/doc/todo.pdf differ diff --git a/doc/todo.xls b/doc/todo.xls index 51610c5d..802d96ad 100644 Binary files a/doc/todo.xls and b/doc/todo.xls differ diff --git a/sample/run000_sphere_g16m1_5/log b/sample/run000_sphere_g16m1_5/log index ab02904e..fe119bfa 100644 --- a/sample/run000_sphere_g16m1_5/log +++ b/sample/run000_sphere_g16m1_5/log @@ -1,4 +1,4 @@ -Generated by ADDA v.0.78.2 +Generated by ADDA v.0.78 The program was run on: dda command: './adda ' lambda: 6.283185307 diff --git a/src/ADDAmain.c b/src/ADDAmain.c index d20c5715..46f48442 100644 --- a/src/ADDAmain.c +++ b/src/ADDAmain.c @@ -3,7 +3,7 @@ * DESCR: Main. All the work moved to other modules. * * Previous versions were developed by Alfons Hoekstra. - * Sequential version, Michel Grimminck January 1995 + * Sequential version, Michel Grimminck Jan 1995 * * Copyright (C) 2006-2008 University of Amsterdam * This code is covered by the GNU General Public License. @@ -15,58 +15,59 @@ #include "debug.h" #include "io.h" -// EXTERNAL FUNCTIONS +/* EXTERNAL FUNCTIONS */ -// calculator.c +/* calculator.c */ void Calculator(void); -// make_particle.c +/* make_particle.c */ void InitShape(void); int MakeParticle(void); -// param.c +/* param.c */ void InitVariables(void); void ParseParameters(int argc,char **argv); void VariablesInterconnect(void); void DirectoryLog(int argc,char **argv); void PrintInfo(void); -//============================================================ +/*============================================================*/ int main(int argc,char **argv) { - // Initialize error handling and line wrapping - logfile=NULL; - term_width=DEF_TERM_WIDTH; - // Start global time - StartTime(); - // Initialize communications - InitComm(&argc,&argv); - // Initialize and parse input parameters - InitVariables(); - ParseParameters(argc,argv); - D("Reading command line finished"); - VariablesInterconnect(); // also initializes beam - // Initialize symmetries and box's; get number of dipoles; set some variables - InitShape(); - // !!! before this line errors should be printed in simple format, after - in advanced one - // Create directory and start logfile (print command line) - DirectoryLog(argc,argv); - // Initialize FFT grid and its subdivision over processors - ParSetup(); - // MakeParticle; initialize dpl and nlocalRows - MakeParticle(); - // Print info to stdout and logfile - PrintInfo(); - // Main calculation part - D("Calculator started"); - Calculator(); - D("Calculator finished"); - // Print timing and statistics; close logfile - FinalStatistics(); - // check error on stdout - if (ferror(stdout)) LogError(EC_WARN,ALL_POS, - "Some errors occurred while writing to stdout during the execution of ADDA"); - // finish execution normally - Stop(0); - // never actually reached; just to make the compiler happy - return 0; + /* initialize error handling and line wrapping */ + logfile=NULL; + term_width=DEF_TERM_WIDTH; + /* start global time */ + StartTime(); + /* initialize communications */ + InitComm(&argc,&argv); + /* initialize and parse input parameters */ + InitVariables(); + ParseParameters(argc,argv); + D("Reading command line finished"); + VariablesInterconnect(); /* also initializes beam */ + /* initialize symmetries and box's; get number of dipoles; set some variables */ + InitShape(); + /* !!! before errorrs should be printed in simple format, after in more advanced one !!! */ + /* Create directory and start logfile (print command line) */ + DirectoryLog(argc,argv); + /* initialize FFT grid and its subdivision over processors */ + ParSetup(); + /* MakeParticle; initialize dpl and nlocalRows */ + MakeParticle(); + /* print info to stdout and logfile */ + PrintInfo(); + /* initialize times and counters */ + /* Main calculation part */ + D("Calculator started"); + Calculator(); + D("Calculator finished"); + /* print timing and statistics; close logfile */ + FinalStatistics(); + /* check error on stdout */ + if (ferror(stdout)) LogError(EC_WARN,ALL_POS, + "Some errors occured while writing to stdout during the execution of ADDA"); + /* finish execution normally */ + Stop(0); + /* never actually reached; just to make the compiler happy */ + return 0; } diff --git a/src/CalculateE.c b/src/CalculateE.c index 9e1e4220..8569799f 100644 --- a/src/CalculateE.c +++ b/src/CalculateE.c @@ -9,7 +9,7 @@ * * Previous versions by Alfons Hoekstra * - * Copyright (C) 2006-2008 University of Amsterdam + * Copyright (C) 2006-2007 University of Amsterdam * This code is covered by the GNU General Public License. */ #include <stdlib.h> @@ -27,675 +27,722 @@ #include "timing.h" #include "function.h" -// SEMI-GLOBAL VARIABLES +/* SEMI-GLOBAL VARIABLES */ -// defined and initialized in calculator.c -extern double *muel_phi,*muel_phi_buf; +/* defined and initialized in calculator.c */ +extern double *muel_phi,*muel_phi1; extern doublecomplex *EplaneX, *EplaneY; extern double *Eplane_buffer; extern const double dtheta_deg,dtheta_rad; extern doublecomplex *ampl_alphaX,*ampl_alphaY; extern double *muel_alpha; -// defined and initialized in crosssec.c +/* defined and initialized in crosssec.c */ extern const Parms_1D phi_sg; -// defined and initialized in param.c +/* defined and initialized in param.c */ extern const int store_int_field,store_dip_pol,store_beam,store_scat_grid,calc_Cext,calc_Cabs, -calc_Csca,calc_vec,calc_asym,calc_mat_force,store_force,phi_int_type; -// defined and initialized in timing.c + calc_Csca,calc_vec,calc_asym,calc_mat_force,store_force,phi_int_type; +/* defined and initialized in timing.c */ extern TIME_TYPE Timing_EFieldPlane,Timing_comm_EField, -Timing_IntField,Timing_IntFieldOne,Timing_ScatQuan; + Timing_IntField,Timing_IntFieldOne,Timing_ScatQuan; extern unsigned long TotalEFieldPlane; -// used in iterative.c +/* used in iterative.c */ TIME_TYPE tstart_CE; -// EXTERNAL FUNCTIONS +/* EXTERNAL FUNCTIONS */ -// GenerateB.c +/* GenerateB.c */ void GenerateB(char which,doublecomplex *x); -// iterative.c +/* iterative.c */ int IterativeSolver(int method); -//============================================================ +/*============================================================*/ -static void ComputeMuellerMatrix(double matrix[4][4], const doublecomplex s1,const doublecomplex s2, - const doublecomplex s3,const doublecomplex s4) -/* computer mueller matrix from scattering matrix elements s1, s2, s3, s4, according to formula - * 3.16 from Bohren and Huffman - */ +static void ComputeMuellerMatrix(double matrix[4][4], const doublecomplex s1, + const doublecomplex s2,const doublecomplex s3,const doublecomplex s4) +/* computer mueller matrix from scattering matrix elements s1, s2, s3, s4, accoording + to formula 3.16 from Bohren and Huffman */ { - matrix[0][0] = 0.5*(cMultConRe(s1,s1)+cMultConRe(s2,s2)+cMultConRe(s3,s3)+cMultConRe(s4,s4)); - matrix[0][1] = 0.5*(cMultConRe(s2,s2)-cMultConRe(s1,s1)+cMultConRe(s4,s4)-cMultConRe(s3,s3)); - matrix[0][2] = cMultConRe(s2,s3)+cMultConRe(s1,s4); - matrix[0][3] = cMultConIm(s2,s3)-cMultConIm(s1,s4); - - matrix[1][0] = 0.5*(cMultConRe(s2,s2)-cMultConRe(s1,s1)+cMultConRe(s3,s3)-cMultConRe(s4,s4)); - matrix[1][1] = 0.5*(cMultConRe(s2,s2)+cMultConRe(s1,s1)-cMultConRe(s3,s3)-cMultConRe(s4,s4)); - matrix[1][2] = cMultConRe(s2,s3)-cMultConRe(s1,s4); - matrix[1][3] = cMultConIm(s2,s3)+cMultConIm(s1,s4); - - matrix[2][0] = cMultConRe(s2,s4)+cMultConRe(s1,s3); - matrix[2][1] = cMultConRe(s2,s4)-cMultConRe(s1,s3); - matrix[2][2] = cMultConRe(s1,s2)+cMultConRe(s3,s4); - matrix[2][3] = cMultConIm(s2,s1)+cMultConIm(s4,s3); - - matrix[3][0] = cMultConIm(s4,s2)+cMultConIm(s1,s3); - matrix[3][1] = cMultConIm(s4,s2)-cMultConIm(s1,s3); - matrix[3][2] = cMultConIm(s1,s2)-cMultConIm(s3,s4); - matrix[3][3] = cMultConRe(s1,s2)-cMultConRe(s3,s4); + matrix[0][0] = 0.5*(cMultConRe(s1,s1)+cMultConRe(s2,s2)+ + cMultConRe(s3,s3)+cMultConRe(s4,s4)); + matrix[0][1] = 0.5*(cMultConRe(s2,s2)-cMultConRe(s1,s1)+ + cMultConRe(s4,s4)-cMultConRe(s3,s3)); + matrix[0][2] = cMultConRe(s2,s3)+cMultConRe(s1,s4); + matrix[0][3] = cMultConIm(s2,s3)-cMultConIm(s1,s4); + + matrix[1][0] = 0.5*(cMultConRe(s2,s2)-cMultConRe(s1,s1)+ + cMultConRe(s3,s3)-cMultConRe(s4,s4)); + matrix[1][1] = 0.5*(cMultConRe(s2,s2)+cMultConRe(s1,s1)+ + -cMultConRe(s3,s3)-cMultConRe(s4,s4)); + matrix[1][2] = cMultConRe(s2,s3)-cMultConRe(s1,s4); + matrix[1][3] = cMultConIm(s2,s3)+cMultConIm(s1,s4); + + matrix[2][0] = cMultConRe(s2,s4)+cMultConRe(s1,s3); + matrix[2][1] = cMultConRe(s2,s4)-cMultConRe(s1,s3); + matrix[2][2] = cMultConRe(s1,s2)+cMultConRe(s3,s4); + matrix[2][3] = cMultConIm(s2,s1)+cMultConIm(s4,s3); + + matrix[3][0] = cMultConIm(s4,s2)+cMultConIm(s1,s3); + matrix[3][1] = cMultConIm(s4,s2)-cMultConIm(s1,s3); + matrix[3][2] = cMultConIm(s1,s2)-cMultConIm(s3,s4); + matrix[3][3] = cMultConRe(s1,s2)-cMultConRe(s3,s4); } -//============================================================ -// this function is currently not used -static void ComputeMuellerMatrixNorm(double [4][4],const doublecomplex,const doublecomplex, - const doublecomplex,const doublecomplex) ATT_UNUSED; +/*============================================================*/ + /* this function is currently not used */ +static void ComputeMuellerMatrixNorm(double [4][4],const doublecomplex, + const doublecomplex,const doublecomplex,const doublecomplex) ATT_UNUSED; static void ComputeMuellerMatrixNorm(double matrix[4][4],const doublecomplex s1, - const doublecomplex s2,const doublecomplex s3,const doublecomplex s4) -/* computer mueller matrix from scattering matrix elements s1, s2, s3, s4, according to formula - * 3.16 from Bohren and Huffman; normalize all elements to S11 (except itself) - */ -{ - matrix[0][0] = 0.5*(cMultConRe(s1,s1)+cMultConRe(s2,s2)+cMultConRe(s3,s3)+cMultConRe(s4,s4)); - matrix[0][1] = 0.5*(cMultConRe(s2,s2)-cMultConRe(s1,s1)+cMultConRe(s4,s4)-cMultConRe(s3,s3)) - / matrix[0][0]; - matrix[0][2] = (cMultConRe(s2,s3)+cMultConRe(s1,s4))/matrix[0][0]; - matrix[0][3] = (cMultConIm(s2,s3)-cMultConIm(s1,s4))/matrix[0][0]; - - matrix[1][0] = 0.5*(cMultConRe(s2,s2)-cMultConRe(s1,s1)+cMultConRe(s3,s3)-cMultConRe(s4,s4)) - / matrix[0][0]; - matrix[1][1] = 0.5*(cMultConRe(s2,s2)+cMultConRe(s1,s1)-cMultConRe(s3,s3)-cMultConRe(s4,s4)) - / matrix[0][0]; - matrix[1][2] = (cMultConRe(s2,s3)-cMultConRe(s1,s4))/matrix[0][0]; - matrix[1][3] = (cMultConIm(s2,s3)+cMultConIm(s1,s4))/matrix[0][0]; - - matrix[2][0] = (cMultConRe(s2,s4)+cMultConRe(s1,s3))/matrix[0][0]; - matrix[2][1] = (cMultConRe(s2,s4)-cMultConRe(s1,s3))/matrix[0][0]; - matrix[2][2] = (cMultConRe(s1,s2)+cMultConRe(s3,s4))/matrix[0][0]; - matrix[2][3] = (cMultConIm(s2,s1)+cMultConIm(s4,s3))/matrix[0][0]; - - matrix[3][0] = (cMultConIm(s4,s2)+cMultConIm(s1,s3))/matrix[0][0]; - matrix[3][1] = (cMultConIm(s4,s2)-cMultConIm(s1,s3))/matrix[0][0]; - matrix[3][2] = (cMultConIm(s1,s2)-cMultConIm(s3,s4))/matrix[0][0]; - matrix[3][3] = (cMultConRe(s1,s2)-cMultConRe(s3,s4))/matrix[0][0]; -} - -//============================================================== -INLINE void InitMuellerIntegrFile(const int type,const char *fname,FILE **file,char *buf, - double **mult) -/* If 'phi_int_type' matches 'type', appropriate file (name given by 'fname') is created (with - * handle '*file'), and heading line is put into it. String buffer 'buf' is used. Vector of - * multipliers '*mult' is allocated if its pointer is specified. - */ -{ - if (phi_int_type & type) { - sprintf(buf,"%s/%s",directory,fname); - (*file)=FOpenErr(buf,"w",ONE_POS); - fprintf(*file,"theta s11 s12 s13 s14 s21 s22 s23 s24 s31 s32 s33 s34 s41 s42 s43 s44 " - "RMSE(integr)\n"); - if (mult!=NULL) MALLOC_VECTOR(*mult,double,angles.phi.N,ALL); - } -} - -//============================================================== - -INLINE void PrintToIntegrFile(const int type,FILE *file,double *maxerr,const double *muel, - double *muel_buf,const double *mult,double matrix[4][4],const double theta) -/* If 'phi_int_type' matches 'type', array 'muel' is integrated over phi (possibly using multiplier - * 'mult' and buffer 'muel_buf') and saved to 'file' together with 'theta'. Maximum error '*maxerr' - * is updated, 'matrix' buffer is used. - */ + const doublecomplex s2,const doublecomplex s3,const doublecomplex s4) +/* computer mueller matrix from scattering matrix elements s1, s2, s3, s4, accoording to + formula 3.16 from Bohren and Huffman; normalize all elements to S11 (except itself)*/ { - int k; - size_t j; - double err; - - if (phi_int_type & type) { - if (mult==NULL) err=Romberg1D(phi_sg,16,muel,matrix[0]); - else { - for (j=0;j<angles.phi.N;j++) for(k=0;k<16;k++) muel_buf[16*j+k]=muel[16*j+k]*mult[j]; - err=Romberg1D(phi_sg,16,muel_buf,matrix[0]); - } - if (err>*maxerr) *maxerr=err; - fprintf(file,"%.2f %.10E %.10E %.10E %.10E %.10E %.10E %.10E %.10E %.10E %.10E %.10E %.10E" - " %.10E %.10E %.10E %.10E %.3E\n",theta,matrix[0][0],matrix[0][1],matrix[0][2], - matrix[0][3],matrix[1][0],matrix[1][1],matrix[1][2],matrix[1][3],matrix[2][0], - matrix[2][1],matrix[2][2],matrix[2][3],matrix[3][0],matrix[3][1],matrix[3][2], - matrix[3][3],err); - } + matrix[0][0] = 0.5*(cMultConRe(s1,s1)+cMultConRe(s2,s2)+ + cMultConRe(s3,s3)+cMultConRe(s4,s4)); + matrix[0][1] = 0.5*(cMultConRe(s2,s2)-cMultConRe(s1,s1)+ + cMultConRe(s4,s4)-cMultConRe(s3,s3))/matrix[0][0]; + matrix[0][2] = (cMultConRe(s2,s3)+cMultConRe(s1,s4))/matrix[0][0]; + matrix[0][3] = (cMultConIm(s2,s3)-cMultConIm(s1,s4))/matrix[0][0]; + + matrix[1][0] = 0.5*(cMultConRe(s2,s2)-cMultConRe(s1,s1)+ + cMultConRe(s3,s3)-cMultConRe(s4,s4))/matrix[0][0]; + matrix[1][1] = 0.5*(cMultConRe(s2,s2)+cMultConRe(s1,s1)+ + -cMultConRe(s3,s3)-cMultConRe(s4,s4))/matrix[0][0]; + matrix[1][2] = (cMultConRe(s2,s3)-cMultConRe(s1,s4))/matrix[0][0]; + matrix[1][3] = (cMultConIm(s2,s3)+cMultConIm(s1,s4))/matrix[0][0]; + + matrix[2][0] = (cMultConRe(s2,s4)+cMultConRe(s1,s3))/matrix[0][0]; + matrix[2][1] = (cMultConRe(s2,s4)-cMultConRe(s1,s3))/matrix[0][0]; + matrix[2][2] = (cMultConRe(s1,s2)+cMultConRe(s3,s4))/matrix[0][0]; + matrix[2][3] = (cMultConIm(s2,s1)+cMultConIm(s4,s3))/matrix[0][0]; + + matrix[3][0] = (cMultConIm(s4,s2)+cMultConIm(s1,s3))/matrix[0][0]; + matrix[3][1] = (cMultConIm(s4,s2)-cMultConIm(s1,s3))/matrix[0][0]; + matrix[3][2] = (cMultConIm(s1,s2)-cMultConIm(s3,s4))/matrix[0][0]; + matrix[3][3] = (cMultConRe(s1,s2)-cMultConRe(s3,s4))/matrix[0][0]; } -//============================================================== - -INLINE void CloseIntegrFile(const int type,FILE *file,const char *fname,double *mult) -/* If 'phi_int_type' matches 'type', appropriate 'file' (named 'fname') is closed and array 'mult' - * is freed. - */ -{ - if (phi_int_type & type) { - FCloseErr(file,fname,ONE_POS); - Free_general(mult); - } -} -//============================================================== +/*==============================================================*/ void MuellerMatrix(void) { - FILE *mueller,*mueller_int,*mueller_int_c2,*mueller_int_s2,*mueller_int_c4,*mueller_int_s4; - double *cos2,*sin2,*cos4,*sin4; - double matrix[4][4]; - double theta,phi,ph, - max_err,max_err_c2,max_err_s2,max_err_c4,max_err_s4; - doublecomplex s1,s2,s3,s4,s10,s20,s30,s40; - char fname[MAX_FNAME]; - int i; - size_t index,index1,k_or,j,n,ind; - double co,si; - double alph; - TIME_TYPE tstart; - - if (ringid!=ROOT) return; - - if (orient_avg) { // Amplitude matrix stored in ampl_alplha is - index1=index=0; // transformed into Mueller matrix stored in muel_alpha - for (k_or=0;k_or<alpha_int.N;k_or++) { - alph=Deg2Rad(alpha_int.val[k_or]); // read current alpha - co=cos(alph); - si=sin(alph); - for (i=0;i<nTheta;i++) { - // read amplitude matrix from memory - cEqual(ampl_alphaX[index],s10); - cEqual(ampl_alphaX[index+1],s30); - cEqual(ampl_alphaY[index],s40); - cEqual(ampl_alphaY[index+1],s20); - // transform it, multiplying by rotation matrix (-alpha) - cLinComb(s20,s30,co,si,s2); // s2 = co*s20 + si*s30 - cLinComb(s20,s30,-si,co,s3); // s3 = -si*s20 + co*s30 - cLinComb(s40,s10,co,si,s4); // s4 = co*s40 + si*s10 - cLinComb(s40,s10,-si,co,s1); // s1 = -si*s40 + co*s10 - - ComputeMuellerMatrix((double (*)[4])(muel_alpha+index1),s1,s2,s3,s4); - index+=2; - index1+=16; - } - } - } - else { - tstart=GET_TIME(); // here Mueller matrix is saved to file - if (yzplane) { - strcpy(fname,directory); - strcat(fname,"/" F_MUEL); - mueller=FOpenErr(fname,"w",ONE_POS); - fprintf(mueller, - "theta s11 s12 s13 s14 s21 s22 s23 s24 s31 s32 s33 s34 s41 s42 s43 s44\n"); - for (i=0;i<nTheta;i++) { - theta=i*dtheta_deg; - ComputeMuellerMatrix(matrix,EplaneX[2*i],EplaneY[2*i+1],EplaneX[2*i+1], - EplaneY[2*i]); - fprintf(mueller,"%.2f %.10E %.10E %.10E %.10E %.10E %.10E %.10E %.10E %.10E %.10E " - "%.10E %.10E %.10E %.10E %.10E %.10E\n",theta,matrix[0][0],matrix[0][1], - matrix[0][2],matrix[0][3],matrix[1][0],matrix[1][1],matrix[1][2],matrix[1][3], - matrix[2][0],matrix[2][1],matrix[2][2],matrix[2][3],matrix[3][0],matrix[3][1], - matrix[3][2],matrix[3][3]); - } - FCloseErr(mueller,F_MUEL,ONE_POS); - } - - if (scat_grid) { - /* compute Mueller Matrix in full space angle. - * E-fields are stored in arrays EgridX and EgridY for incoming X and Y polarized light. - * It is converted to the scattering matrix elements (see e.g Bohren and Huffman) : - * s2 = cos(phi)E'X'par + sin(phi)E'Y'par - * s3 = sin(phi)E'X'par - cos(phi)E'Y'par - * s4 = cos(phi)E'X'per + sin(phi)E'Y'per - * s1 = sin(phi)E'X'per - cos(phi)E'Y'per - * from these the mueller matrix elements are computed - */ - // open files for writing - if (store_scat_grid) { - strcpy(fname,directory); - strcat(fname,"/" F_MUEL_SG); - mueller=FOpenErr(fname,"w",ONE_POS); - fprintf(mueller, - "theta phi s11 s12 s13 s14 s21 s22 s23 s24 s31 s32 s33 s34 s41 s42 s43 s44\n"); - } - if (phi_integr) { // also initializes arrays of multipliers - InitMuellerIntegrFile(PHI_UNITY,F_MUEL_INT,&mueller_int,fname,NULL); - InitMuellerIntegrFile(PHI_COS2,F_MUEL_C2,&mueller_int_c2,fname,&cos2); - InitMuellerIntegrFile(PHI_SIN2,F_MUEL_S2,&mueller_int_s2,fname,&sin2); - InitMuellerIntegrFile(PHI_COS4,F_MUEL_C4,&mueller_int_c4,fname,&cos4); - InitMuellerIntegrFile(PHI_SIN4,F_MUEL_S4,&mueller_int_s4,fname,&sin4); - // fills arrays with multipliers (optimized) - for (j=0;j<angles.phi.N;j++) { - // prepare - ph=2*Deg2Rad(angles.phi.val[j]); - if (phi_int_type & (PHI_COS2|PHI_COS4|PHI_SIN4)) co=cos(ph); - if (phi_int_type & (PHI_SIN2|PHI_SIN4)) si=sin(ph); - // fill - if (phi_int_type & PHI_COS2) cos2[j]=co; - if (phi_int_type & PHI_SIN2) sin2[j]=si; - if (phi_int_type & PHI_COS4) cos4[j]=2*co*co-1; - if (phi_int_type & PHI_SIN4) sin4[j]=2*si*co; - } - } - // set type of cycling through angles - if (angles.type==SG_GRID) n=angles.phi.N; - else if (angles.type==SG_PAIRS) n=1; - // main cycle - index=0; - max_err=max_err_c2=max_err_s2=max_err_c4=max_err_s4=0; - for (ind=0;ind<angles.theta.N;++ind) { - index1=0; - theta=angles.theta.val[ind]; - for (j=0;j<n;++j) { - if (angles.type==SG_GRID) phi=angles.phi.val[j]; - else if (angles.type==SG_PAIRS) phi=angles.phi.val[ind]; - ph=Deg2Rad(phi); - co=cos(ph); - si=sin(ph); - // read amplitude matrix from memory - cEqual(EgridY[index],s10); - cEqual(EgridY[index+1],s30); - cEqual(EgridX[index],s40); - cEqual(EgridX[index+1],s20); - // transform it, multiplying by rotation matrix from per-par to X-Y - cLinComb(s20,s30,co,si,s2); // s2 = co*s20 + si*s30 - cLinComb(s20,s30,si,-co,s3); // s3 = si*s20 - co*s30 - cLinComb(s40,s10,co,si,s4); // s4 = co*s40 + si*s10 - cLinComb(s40,s10,si,-co,s1); // s1 = si*s40 - co*s10 - - ComputeMuellerMatrix(matrix,s1,s2,s3,s4); - index+=2; - if (phi_integr) { - memcpy(muel_phi+index1,matrix[0],16*sizeof(double)); - index1+=16; - } - if (store_scat_grid) - fprintf(mueller, - "%.2f %.2f %.10E %.10E %.10E %.10E %.10E %.10E %.10E %.10E"\ - " %.10E %.10E %.10E %.10E %.10E %.10E %.10E %.10E\n", - theta,phi,matrix[0][0],matrix[0][1],matrix[0][2],matrix[0][3], - matrix[1][0],matrix[1][1],matrix[1][2],matrix[1][3], - matrix[2][0],matrix[2][1],matrix[2][2],matrix[2][3], - matrix[3][0],matrix[3][1],matrix[3][2],matrix[3][3]); - } - if (phi_integr) { - PrintToIntegrFile(PHI_UNITY,mueller_int,&max_err,muel_phi,NULL, - NULL,matrix,theta); - PrintToIntegrFile(PHI_COS2,mueller_int_c2,&max_err_c2,muel_phi,muel_phi_buf, - cos2,matrix,theta); - PrintToIntegrFile(PHI_SIN2,mueller_int_s2,&max_err_s2,muel_phi,muel_phi_buf, - sin2,matrix,theta); - PrintToIntegrFile(PHI_COS4,mueller_int_c4,&max_err_c4,muel_phi,muel_phi_buf, - cos4,matrix,theta); - PrintToIntegrFile(PHI_SIN4,mueller_int_s4,&max_err_s4,muel_phi,muel_phi_buf, - sin4,matrix,theta); - } - } - if (phi_integr) { - fprintf(logfile,"\nMaximum relative mean-square error of Mueller integration:\n"); - if (phi_int_type & PHI_UNITY) fprintf(logfile," 1 -> %.3E\n",max_err); - if (phi_int_type & PHI_COS2) fprintf(logfile," cos(2*phi) -> %.3E\n",max_err_c2); - if (phi_int_type & PHI_SIN2) fprintf(logfile," cos(2*phi) -> %.3E\n",max_err_c2); - if (phi_int_type & PHI_COS4) fprintf(logfile," cos(2*phi) -> %.3E\n",max_err_c2); - if (phi_int_type & PHI_SIN4) fprintf(logfile," cos(2*phi) -> %.3E\n",max_err_c2); - } - // close files; free arrays - if (store_scat_grid) FCloseErr(mueller,F_MUEL_SG,ONE_POS); - if (phi_integr) { - CloseIntegrFile(PHI_UNITY,mueller_int,F_MUEL_INT,NULL); - CloseIntegrFile(PHI_COS2,mueller_int_c2,F_MUEL_C2,cos2); - CloseIntegrFile(PHI_SIN2,mueller_int_s2,F_MUEL_S2,sin2); - CloseIntegrFile(PHI_COS4,mueller_int_c4,F_MUEL_C4,cos4); - CloseIntegrFile(PHI_SIN4,mueller_int_s4,F_MUEL_S4,sin4); - } - } - Timing_FileIO += GET_TIME() - tstart; - } + FILE *mueller,*mueller_int,*mueller_int_c2,*mueller_int_s2,*mueller_int_c4,*mueller_int_s4; + double *cos2,*sin2,*cos4,*sin4; + double matrix[4][4]; + double theta,phi,ph,err, + max_err,max_err_c2,max_err_s2,max_err_c4,max_err_s4; + doublecomplex s1,s2,s3,s4,s10,s20,s30,s40; + char fname[MAX_FNAME]; + int i,k; + size_t index,index1,k_or,j,n,ind; + double co,si; + double alph; + TIME_TYPE tstart; + + if (ringid!=ROOT) return; + + if (orient_avg) { /* Amplitude matrix stored in ampl_alplha is */ + index1=index=0; /* transformed into Mueller matrix stored in muel_alpha */ + for (k_or=0;k_or<alpha_int.N;k_or++) { + alph=Deg2Rad(alpha_int.val[k_or]); /* read current alpha */ + co=cos(alph); + si=sin(alph); + for (i=0;i<nTheta;i++) { + /* read amplitude matrix from memory */ + cEqual(ampl_alphaX[index],s10); + cEqual(ampl_alphaX[index+1],s30); + cEqual(ampl_alphaY[index],s40); + cEqual(ampl_alphaY[index+1],s20); + /* transform it, multiplying by rotation matrix (-alpha) */ + cLinComb(s20,s30,co,si,s2); /* s2 = co*s20 + si*s30 */ + cLinComb(s20,s30,-si,co,s3); /* s3 = -si*s20 + co*s30 */ + cLinComb(s40,s10,co,si,s4); /* s4 = co*s40 + si*s10 */ + cLinComb(s40,s10,-si,co,s1); /* s1 = -si*s40 + co*s10 */ + + ComputeMuellerMatrix((double (*)[4])(muel_alpha+index1),s1,s2,s3,s4); + index+=2; + index1+=16; + } + } + } + else { + tstart=GET_TIME(); /* here Mueller matrix is saved to file */ + if (yzplane) { + strcpy(fname,directory); + strcat(fname,"/" F_MUEL); + mueller=FOpenErr(fname,"w",ONE_POS); + fprintf(mueller,"theta s11 s12 s13 s14 s21 s22 s23 s24 s31 s32 s33 s34 s41 s42 s43 s44\n"); + for (i=0;i<nTheta;i++) { + theta=i*dtheta_deg; + ComputeMuellerMatrix(matrix,EplaneX[2*i],EplaneY[2*i+1],EplaneX[2*i+1],EplaneY[2*i]); + fprintf(mueller, + "%.2f %.10E %.10E %.10E %.10E %.10E %.10E %.10E %.10E %.10E"\ + " %.10E %.10E %.10E %.10E %.10E %.10E %.10E\n", + theta,matrix[0][0],matrix[0][1],matrix[0][2],matrix[0][3], + matrix[1][0],matrix[1][1],matrix[1][2],matrix[1][3], + matrix[2][0],matrix[2][1],matrix[2][2],matrix[2][3], + matrix[3][0],matrix[3][1],matrix[3][2],matrix[3][3]); + } + FCloseErr(mueller,F_MUEL,ONE_POS); + } + + if (scat_grid) { + /* compute Mueller Matrix in full space angle. + * E-fields are stored in arrays EgridX and EgridY + * for incoming X and Y polarized light. From this compute the Mueller matrix. + * It is converted to the scattering matrix elements (see e.g Bohren and Huffman) : + * s2 = cos(phi)E'X'par + sin(phi)E'Y'par + * s3 = sin(phi)E'X'par - cos(phi)E'Y'par + * s4 = cos(phi)E'X'per + sin(phi)E'Y'per + * s1 = sin(phi)E'X'per - cos(phi)E'Y'per + * from this the mueller matrix elements are computed */ + + /* open files for writing */ + if (store_scat_grid) { + strcpy(fname,directory); + strcat(fname,"/" F_MUEL_SG); + mueller=FOpenErr(fname,"w",ONE_POS); + fprintf(mueller,"theta phi s11 s12 s13 s14 s21 s22 s23 s24"\ + " s31 s32 s33 s34 s41 s42 s43 s44\n"); + } + if (phi_integr) { /* also initializes arrays of multipliers */ + if (phi_int_type & PHI_UNITY) { + strcpy(fname,directory); + strcat(fname,"/" F_MUEL_INT); + mueller_int=FOpenErr(fname,"w",ONE_POS); + fprintf(mueller_int,"theta s11 s12 s13 s14 s21 s22 s23 s24"\ + " s31 s32 s33 s34 s41 s42 s43 s44 RMSE(integr)\n"); + } + if (phi_int_type & PHI_COS2) { + strcpy(fname,directory); + strcat(fname,"/" F_MUEL_C2); + mueller_int_c2=FOpenErr(fname,"w",ONE_POS); + fprintf(mueller_int_c2,"theta s11 s12 s13 s14 s21 s22 s23 s24"\ + " s31 s32 s33 s34 s41 s42 s43 s44 RMSE(integr)\n"); + MALLOC_VECTOR(cos2,double,angles.phi.N,ALL); + } + if (phi_int_type & PHI_SIN2) { + strcpy(fname,directory); + strcat(fname,"/" F_MUEL_S2); + mueller_int_s2=FOpenErr(fname,"w",ONE_POS); + fprintf(mueller_int_s2,"theta s11 s12 s13 s14 s21 s22 s23 s24"\ + " s31 s32 s33 s34 s41 s42 s43 s44 RMSE(integr)\n"); + MALLOC_VECTOR(sin2,double,angles.phi.N,ALL); + } + if (phi_int_type & PHI_COS4) { + strcpy(fname,directory); + strcat(fname,"/" F_MUEL_C4); + mueller_int_c4=FOpenErr(fname,"w",ONE_POS); + fprintf(mueller_int_c4,"theta s11 s12 s13 s14 s21 s22 s23 s24"\ + " s31 s32 s33 s34 s41 s42 s43 s44 RMSE(integr)\n"); + MALLOC_VECTOR(cos4,double,angles.phi.N,ALL); + } + if (phi_int_type & PHI_SIN4) { + strcpy(fname,directory); + strcat(fname,"/" F_MUEL_S4); + mueller_int_s4=FOpenErr(fname,"w",ONE_POS); + fprintf(mueller_int_s4,"theta s11 s12 s13 s14 s21 s22 s23 s24"\ + " s31 s32 s33 s34 s41 s42 s43 s44 RMSE(integr)\n"); + MALLOC_VECTOR(sin4,double,angles.phi.N,ALL); + } + /* fills arrays with multipliers (optimized) */ + for (j=0;j<angles.phi.N;j++) { + /* prepare */ + ph=2*Deg2Rad(angles.phi.val[j]); + if (phi_int_type & (PHI_COS2|PHI_COS4|PHI_SIN4)) co=cos(ph); + if (phi_int_type & (PHI_SIN2|PHI_SIN4)) si=sin(ph); + /* fill */ + if (phi_int_type & PHI_COS2) cos2[j]=co; + if (phi_int_type & PHI_SIN2) sin2[j]=si; + if (phi_int_type & PHI_COS4) cos4[j]=2*co*co-1; + if (phi_int_type & PHI_SIN4) sin4[j]=2*si*co; + } + } + /* set type of cycling through angles */ + if (angles.type==SG_GRID) n=angles.phi.N; + else if (angles.type==SG_PAIRS) n=1; + /* main cycle */ + index=0; + max_err=max_err_c2=max_err_s2=max_err_c4=max_err_s4=0; + for (ind=0;ind<angles.theta.N;++ind) { + index1=0; + theta=angles.theta.val[ind]; + for (j=0;j<n;++j) { + if (angles.type==SG_GRID) phi=angles.phi.val[j]; + else if (angles.type==SG_PAIRS) phi=angles.phi.val[ind]; + ph=Deg2Rad(phi); + co=cos(ph); + si=sin(ph); + /* read amplitude matrix from memory */ + cEqual(EgridY[index],s10); + cEqual(EgridY[index+1],s30); + cEqual(EgridX[index],s40); + cEqual(EgridX[index+1],s20); + /* transform it, multiplying by rotation matrix from per-par to X-Y */ + cLinComb(s20,s30,co,si,s2); /* s2 = co*s20 + si*s30 */ + cLinComb(s20,s30,si,-co,s3); /* s3 = si*s20 - co*s30 */ + cLinComb(s40,s10,co,si,s4); /* s4 = co*s40 + si*s10 */ + cLinComb(s40,s10,si,-co,s1); /* s1 = si*s40 - co*s10 */ + + ComputeMuellerMatrix(matrix,s1,s2,s3,s4); + index+=2; + if (phi_integr) { + memcpy(muel_phi+index1,matrix[0],16*sizeof(double)); + index1+=16; + } + if (store_scat_grid) + fprintf(mueller, + "%.2f %.2f %.10E %.10E %.10E %.10E %.10E %.10E %.10E %.10E"\ + " %.10E %.10E %.10E %.10E %.10E %.10E %.10E %.10E\n", + theta,phi,matrix[0][0],matrix[0][1],matrix[0][2],matrix[0][3], + matrix[1][0],matrix[1][1],matrix[1][2],matrix[1][3], + matrix[2][0],matrix[2][1],matrix[2][2],matrix[2][3], + matrix[3][0],matrix[3][1],matrix[3][2],matrix[3][3]); + } + if (phi_integr) { + if (phi_int_type & PHI_UNITY) { + err=Romberg1D(phi_sg,16,muel_phi,matrix[0]); + if (err>max_err) max_err=err; + fprintf(mueller_int, + "%.2f %.10E %.10E %.10E %.10E %.10E %.10E %.10E %.10E %.10E"\ + " %.10E %.10E %.10E %.10E %.10E %.10E %.10E %.3E\n", + theta,matrix[0][0],matrix[0][1],matrix[0][2],matrix[0][3], + matrix[1][0],matrix[1][1],matrix[1][2],matrix[1][3], + matrix[2][0],matrix[2][1],matrix[2][2],matrix[2][3], + matrix[3][0],matrix[3][1],matrix[3][2],matrix[3][3],err); + } + if (phi_int_type & PHI_COS2) { + for (j=0;j<angles.phi.N;j++) for(k=0;k<16;k++) + muel_phi1[16*j+k]=muel_phi[16*j+k]*cos2[j]; + err=Romberg1D(phi_sg,16,muel_phi1,matrix[0]); + if (err>max_err_c2) max_err_c2=err; + fprintf(mueller_int_c2, + "%.2f %.10E %.10E %.10E %.10E %.10E %.10E %.10E %.10E %.10E"\ + " %.10E %.10E %.10E %.10E %.10E %.10E %.10E %.3E\n", + theta,matrix[0][0],matrix[0][1],matrix[0][2],matrix[0][3], + matrix[1][0],matrix[1][1],matrix[1][2],matrix[1][3], + matrix[2][0],matrix[2][1],matrix[2][2],matrix[2][3], + matrix[3][0],matrix[3][1],matrix[3][2],matrix[3][3],err); + } + if (phi_int_type & PHI_SIN2) { + for (j=0;j<angles.phi.N;j++) for(k=0;k<16;k++) + muel_phi1[16*j+k]=muel_phi[16*j+k]*sin2[j]; + err=Romberg1D(phi_sg,16,muel_phi1,matrix[0]); + if (err>max_err_s2) max_err_s2=err; + fprintf(mueller_int_s2, + "%.2f %.10E %.10E %.10E %.10E %.10E %.10E %.10E %.10E %.10E"\ + " %.10E %.10E %.10E %.10E %.10E %.10E %.10E %.3E\n", + theta,matrix[0][0],matrix[0][1],matrix[0][2],matrix[0][3], + matrix[1][0],matrix[1][1],matrix[1][2],matrix[1][3], + matrix[2][0],matrix[2][1],matrix[2][2],matrix[2][3], + matrix[3][0],matrix[3][1],matrix[3][2],matrix[3][3],err); + } + if (phi_int_type & PHI_COS4) { + for (j=0;j<angles.phi.N;j++) for(k=0;k<16;k++) + muel_phi1[16*j+k]=muel_phi[16*j+k]*cos4[j]; + err=Romberg1D(phi_sg,16,muel_phi1,matrix[0]); + if (err>max_err_c4) max_err_c4=err; + fprintf(mueller_int_c4, + "%.2f %.10E %.10E %.10E %.10E %.10E %.10E %.10E %.10E %.10E"\ + " %.10E %.10E %.10E %.10E %.10E %.10E %.10E %.3E\n", + theta,matrix[0][0],matrix[0][1],matrix[0][2],matrix[0][3], + matrix[1][0],matrix[1][1],matrix[1][2],matrix[1][3], + matrix[2][0],matrix[2][1],matrix[2][2],matrix[2][3], + matrix[3][0],matrix[3][1],matrix[3][2],matrix[3][3],err); + } + if (phi_int_type & PHI_SIN4) { + for (j=0;j<angles.phi.N;j++) for(k=0;k<16;k++) + muel_phi1[16*j+k]=muel_phi[16*j+k]*sin4[j]; + err=Romberg1D(phi_sg,16,muel_phi1,matrix[0]); + if (err>max_err_s4) max_err_s4=err; + fprintf(mueller_int_s4, + "%.2f %.10E %.10E %.10E %.10E %.10E %.10E %.10E %.10E %.10E"\ + " %.10E %.10E %.10E %.10E %.10E %.10E %.10E %.3E\n", + theta,matrix[0][0],matrix[0][1],matrix[0][2],matrix[0][3], + matrix[1][0],matrix[1][1],matrix[1][2],matrix[1][3], + matrix[2][0],matrix[2][1],matrix[2][2],matrix[2][3], + matrix[3][0],matrix[3][1],matrix[3][2],matrix[3][3],err); + } + } + } + if (phi_integr) { + fprintf(logfile,"\nMaximum relative mean-square error of Mueller integration:\n"); + if (phi_int_type & PHI_UNITY) fprintf(logfile," 1 -> %.3E\n",max_err); + if (phi_int_type & PHI_COS2) fprintf(logfile," cos(2*phi) -> %.3E\n",max_err_c2); + if (phi_int_type & PHI_SIN2) fprintf(logfile," cos(2*phi) -> %.3E\n",max_err_c2); + if (phi_int_type & PHI_COS4) fprintf(logfile," cos(2*phi) -> %.3E\n",max_err_c2); + if (phi_int_type & PHI_SIN4) fprintf(logfile," cos(2*phi) -> %.3E\n",max_err_c2); + } + /* close files; free arrays */ + if (store_scat_grid) FCloseErr(mueller,F_MUEL_SG,ONE_POS); + if (phi_integr) { + if (phi_int_type & PHI_UNITY) FCloseErr(mueller_int,F_MUEL_INT,ONE_POS); + if (phi_int_type & PHI_COS2) { + FCloseErr(mueller_int_c2,F_MUEL_C2,ONE_POS); + Free_general(cos2); + } + if (phi_int_type & PHI_SIN2) { + FCloseErr(mueller_int_s2,F_MUEL_S2,ONE_POS); + Free_general(sin2); + } + if (phi_int_type & PHI_COS4) { + FCloseErr(mueller_int_c4,F_MUEL_C4,ONE_POS); + Free_general(cos4); + } + if (phi_int_type & PHI_SIN4) { + FCloseErr(mueller_int_s4,F_MUEL_S4,ONE_POS); + Free_general(sin4); + } + } + } + Timing_FileIO += GET_TIME() - tstart; + } } -//============================================================ +/*============================================================*/ static void CalcEplane(const char which,const int type) -// calculates scattered electric field in a plane + /* calculates scattered electric field in a plane */ { - double *incPol,*incPolper,*incPolpar; - // where to store calculated field for one plane (actually points to different other arrays) - doublecomplex *Eplane; - int i; - doublecomplex ebuff[3]; // small vector to hold E fields - double robserver[3]; // small vector for observer in E calculation - double epar[3]; // unit vector in direction of Epar - double theta; // scattering angle - double co,si; // temporary, cos and sin of some angle - double incPol_tmp1[3],incPol_tmp2[3]; // just allocated memory for incPolper, incPolpar - double alph; - TIME_TYPE tstart; - size_t k_or; - int orient,Norient; - char choice; - - incPolper=incPol_tmp1; // initialization of per and par polarizations - incPolpar=incPol_tmp2; - - if (type==CE_NORMAL) Norient=1; // initialize # orientations - else if (type==CE_PARPER) Norient=2; - - for (k_or=0;k_or<alpha_int.N;k_or++) { - // cycle over alpha - for orientation averaging - if (orient_avg) { - alph=Deg2Rad(alpha_int.val[k_or]); // rotate polarization basis vectors by -alpha - co=cos(alph); - si=sin(alph); - LinComb(incPolX,incPolY,co,-si,incPolper); // incPolper = co*incPolX - si*incPolY; - LinComb(incPolX,incPolY,si,co,incPolpar); // incPolpar = si*incPolX + co*incPolY; - } - else { - memcpy(incPolper,incPolX,3*sizeof(double)); // per <=> X - memcpy(incPolpar,incPolY,3*sizeof(double)); // par <=> Y - } - - for(orient=0;orient<Norient;orient++) { - // in case of Rotation symmetry - tstart = GET_TIME (); - if (orient==0) choice=which; - else if (orient==1) { - /* Rotation symmetry: calculate per-per from current data. CalculateE is called - * from calculator with 'Y' polarization - we now just assume that we have - * the x-z plane as the scattering plane, rotating in the negative x-direction. - * This mimics the real case of X polarization with the y-z plane as scattering plane - * Then IncPolY -> -IncPolX; incPolX -> IncPolY - * */ - if (which=='X') choice='Y'; - else if (which=='Y') choice='X'; - incPol=incPolper; - incPolper=incPolpar; - incPolpar=incPol; - MultScal(-1,incPolpar,incPolpar); - } - // initialize Eplane - if (orient_avg) { - if (choice=='X') Eplane=ampl_alphaX + 2*nTheta*k_or; - else if (choice=='Y') Eplane=ampl_alphaY + 2*nTheta*k_or; - } - else { - if (choice=='X') Eplane=EplaneX; - else if (choice=='Y') Eplane=EplaneY; - } - - for (i=0;i<nTheta;i++) { - theta = i * dtheta_rad; - co=cos(theta); - si=sin(theta); - LinComb(prop,incPolpar,co,si,robserver); // robserver = co*prop + si*incPolpar; - - CalcField(ebuff,robserver); - // convert to (l,r) frame - crDotProd(ebuff,incPolper,Eplane[2*i]); // Eper[i]=Esca.incPolper - LinComb(prop,incPolpar,-si,co,epar); // epar=-si*prop+co*incPolpar - crDotProd(ebuff,epar,Eplane[2*i+1]); // Epar[i]=Esca.epar - } // end for i - - // Accumulate Eplane to root and sum - D("Accumulating Eplane started"); - // accumulate only on processor 0 !, done in one operation - Accumulate((double *)Eplane,4*nTheta,Eplane_buffer,&Timing_comm_EField); - D("Accumulating Eplane finished"); - - Timing_EFieldPlane = GET_TIME() - tstart; - Timing_EField += Timing_EFieldPlane; - TotalEFieldPlane++; - } // end of orient loop - } // end of alpha loop + double *incPol,*incPolper,*incPolpar; + doublecomplex *Eplane; /* where to store calculated field for one plane + (actually points to different other arrays) */ + int i; + doublecomplex ebuff[3]; /* small vector to hold E fields */ + double robserver[3]; /* small vector for observer in E calculation */ + double epar[3]; /* unit vector in direction of Epar */ + double theta; /* scattering angle */ + double co,si; /* temporary, cos and sin of some angle */ + double incPol_tmp1[3],incPol_tmp2[3]; /* just allocateed memory for incPolper, incPolpar */ + double alph; + TIME_TYPE tstart; + size_t k_or; + int orient,Norient; + char choice; + + incPolper=incPol_tmp1; /* initialization of per and par polarizations */ + incPolpar=incPol_tmp2; + + if (type==CE_NORMAL) Norient=1; /* initialize # orientations */ + else if (type==CE_PARPER) Norient=2; + + for (k_or=0;k_or<alpha_int.N;k_or++) { + /* cycle over alpha - for orientation averaging */ + if (orient_avg) { + alph=Deg2Rad(alpha_int.val[k_or]); /* rotate polarization basis vectors by -alpha */ + co=cos(alph); + si=sin(alph); + LinComb(incPolX,incPolY,co,-si,incPolper); /* incPolper = co*incPolX - si*incPolY; */ + LinComb(incPolX,incPolY,si,co,incPolpar); /* incPolpar = si*incPolX + co*incPolY; */ + } + else { + memcpy(incPolper,incPolX,3*sizeof(double)); /* per <=> X */ + memcpy(incPolpar,incPolY,3*sizeof(double)); /* par <=> Y */ + } + + for(orient=0;orient<Norient;orient++) { + /* in case of Rotation symmetry */ + tstart = GET_TIME (); + if (orient==0) choice=which; + else if (orient==1) { + /* Rotation symmetry: calculate per-per from current data. CalculateE is called + * from calculator with 'Y' polarization - we now just assume that we have + * the x-z plane as the scatering plane, rotating in the negative x-direction. + * This mimics the real case of X polarization with the y-z plane as scattering plane + * Then IncPolY -> -IncPolX; incPolX -> IncPolY */ + if (which=='X') choice='Y'; + else if (which=='Y') choice='X'; + incPol=incPolper; + incPolper=incPolpar; + incPolpar=incPol; + MultScal(-1,incPolpar,incPolpar); + } + /* initialize Eplane */ + if (orient_avg) { + if (choice=='X') Eplane=ampl_alphaX + 2*nTheta*k_or; + else if (choice=='Y') Eplane=ampl_alphaY + 2*nTheta*k_or; + } + else { + if (choice=='X') Eplane=EplaneX; + else if (choice=='Y') Eplane=EplaneY; + } + + for (i=0;i<nTheta;i++) { + theta = i * dtheta_rad; + co=cos(theta); + si=sin(theta); + LinComb(prop,incPolpar,co,si,robserver); /* robserver = co*prop + si*incPolpar; */ + + CalcField(ebuff,robserver); + /* convert to (l,r) frame */ + crDotProd(ebuff,incPolper,Eplane[2*i]); /* Eper[i]=Esca.incPolper */ + LinComb(prop,incPolpar,-si,co,epar); /* epar=-si*prop+co*incPolpar */ + crDotProd(ebuff,epar,Eplane[2*i+1]); /* Epar[i]=Esca.epar */ + } /* end for i */ + + /* Accumulate Eplane to root and summate */ + D("Accumulating Eplane started"); + /* accumulate only on processor 0 !, done in one operation */ + Accumulate((double *)Eplane,4*nTheta,Eplane_buffer,&Timing_comm_EField); + D("Accumulating Eplane finished"); + + Timing_EFieldPlane = GET_TIME() - tstart; + Timing_EField += Timing_EFieldPlane; + TotalEFieldPlane++; + } /* end of orient loop */ + } /* end of alpha loop */ } -//============================================================ +/*============================================================*/ static void CalcIntegralScatQuantities(const char which) -/* calculates all the scattering cross sections, normalized and unnormalized asymmetry parameter, - * and force on the particle and each dipole. Cext and Cabs are averaged over orientation, - * if needed. - */ + /* calculates all the scattering crosssections, normalized and unnormalized + asymmetry parameter, and force on the particle and each dipole. + Cext and Cabs are overaged over orientation, if needed. */ { - double *Fsca,*Finc,*Frp; // Scattering force, extinction force and radiation pressure per dipole - double Cext,Cabs,Csca, // Cross sections - dummy[3], // asymmetry parameter*Csca - Fsca_tot[3], // total scattering force - Finc_tot[3], // total extinction force - Frp_tot[3], // total radiation pressure - Cnorm, // normalizing factor from force to cross section - Qnorm; // normalizing factor from force to efficiency - FILE *VisFrp,*CCfile; - TIME_TYPE tstart; - char fname_cs[MAX_FNAME],fname_frp[MAX_FNAME]; - size_t j; - double const *incPol; - char f_suf[MAX_WORD]; - - D("Calculation of cross sections started"); - tstart = GET_TIME(); - - strcpy(fname_cs,directory); - strcat(fname_cs,"/" F_CS); - if (which == 'X') { - strcpy(f_suf,F_XSUF); - incPol=incPolX; - } - if (which == 'Y') { - strcpy(f_suf,F_YSUF); - incPol=incPolY; - } - strcat(fname_cs,f_suf); - if (calc_Cext) Cext = ExtCross(incPol); - if (calc_Cabs) Cabs = AbsCross(); - D("Cext and Cabs calculated"); - if (orient_avg) { - if (ringid==ROOT) { - if (which == 'Y') { // assumed that first call of CalculateE is with 'Y' flag - muel_alpha[-2]=Cext; - muel_alpha[-1]=Cabs; - } - else if (which == 'X') { - muel_alpha[-2]=(muel_alpha[-2]+Cext)/2; - muel_alpha[-1]=(muel_alpha[-1]+Cabs)/2; - } - } - } - else { // not orient_avg - if (ringid==ROOT) { - CCfile=FOpenErr(fname_cs,"w",ONE_POS); - if (calc_Cext) PrintBoth(CCfile,"Cext\t= %.10g\nQext\t= %.10g\n",Cext,Cext*inv_G); - if (calc_Cabs) PrintBoth(CCfile,"Cabs\t= %.10g\nQabs\t= %.10g\n",Cabs,Cabs*inv_G); - if (all_dir) fprintf(CCfile,"\nIntegration\n\n"); - if (calc_Csca) { - Csca=ScaCross(f_suf); - PrintBoth(CCfile,"Csca\t= %.10g\nQsca\t= %.10g\n",Csca,Csca*inv_G); - } - if (calc_vec) { - AsymParm_x(dummy,f_suf); - AsymParm_y(dummy+1,f_suf); - AsymParm_z(dummy+2,f_suf); - PrintBoth(CCfile,"Csca.g\t= (%.10g,%.10g,%.10g)\n",dummy[0],dummy[1],dummy[2]); - if (calc_asym) PrintBoth(CCfile,"g\t= (%.10g,%.10g,%.10g)\n", - dummy[0]/Csca,dummy[1]/Csca,dummy[2]/Csca); - } - } // end of ROOT - if (calc_mat_force) { - MALLOC_VECTOR(Fsca,double,3*local_nvoid_Ndip,ALL); - MALLOC_VECTOR(Finc,double,3*local_nvoid_Ndip,ALL); - MALLOC_VECTOR(Frp,double,3*local_nvoid_Ndip,ALL); - for (j=0;j<3*local_nvoid_Ndip;j++) Fsca[j]=Finc[j]=Frp[j]=0; - PRINTZ("Calculating the force per dipole\n"); - // Calculate forces - Frp_mat(Fsca_tot,Fsca,Finc_tot,Finc,Frp_tot,Frp); - // Write Cross-Sections and Efficiencies to file - if (ringid==ROOT) { - Cnorm = EIGHT_PI; - Qnorm = EIGHT_PI*inv_G; - PrintBoth(CCfile,"\nMatrix\n"\ - "Cext\t= %.10g\nQext\t= %.10g\n"\ - "Csca.g\t= (%.10g,%.10g,%.10g)\n"\ - "Cpr\t= (%.10g,%.10g,%.10g)\n"\ - "Qpr\t= (%.10g,%.10g,%.10g)\n",Cnorm*Finc_tot[2],Qnorm*Finc_tot[2], - -Cnorm*Fsca_tot[0],-Cnorm*Fsca_tot[1],-Cnorm*Fsca_tot[2], - Cnorm*Frp_tot[0],Cnorm*Frp_tot[1],Cnorm*Frp_tot[2], - Qnorm*Frp_tot[0],Qnorm*Frp_tot[1],Qnorm*Frp_tot[2]); - if (store_force) { - // Write Radiation pressure per dipole to file - strcpy(fname_frp,directory); - strcat(fname_frp,"/" F_FRP); - strcat(fname_frp,f_suf); - strcat(fname_frp,".dat"); // TODO: should be removed in the future - VisFrp=FOpenErr(fname_frp,"w",ONE_POS); - fprintf(VisFrp,"#sphere x=%.10g m=%.10g%+.10gi\n"\ - "#number of real dipoles %.0f\n"\ - "#Forces per dipole\n"\ - "#r.x r.y r.z F.x F.y F.z\n", - ka_eq,ref_index[0][RE],ref_index[0][IM],nvoid_Ndip); - for (j=0;j<local_nvoid_Ndip;++j) fprintf(VisFrp, - "%.10g %.10g %.10g %.10g %.10g %.10g\n", - DipoleCoord[3*j],DipoleCoord[3*j+1], - DipoleCoord[3*j+2], - Frp[3*j],Frp[3*j+1],Frp[3*j+2]); - FCloseErr(VisFrp,fname_frp,ONE_POS); - } - } - Free_general(Fsca); - Free_general(Finc); - Free_general(Frp); - } - if (ringid==ROOT) FCloseErr(CCfile,fname_cs,ONE_POS); - } - D("Calculation of cross sections finished"); - Timing_ScatQuan += GET_TIME() - tstart; + double *Fsca,*Finc,*Frp; /* Scattering force, extinction force and + radiation pressure per dipole */ + double Cext,Cabs,Csca, /* Cross sections */ + dummy[3], /* asymmetry paramter*Csca */ + Fsca_tot[3], /* total scattering force */ + Finc_tot[3], /* total extinction force */ + Frp_tot[3], /* total radiation pressure */ + Cnorm, /* normalizing factor from force to cross section */ + Qnorm; /* normalizing factor from force to efficiency */ + FILE *VisFrp,*CCfile; + TIME_TYPE tstart; + char fname_cs[MAX_FNAME],fname_frp[MAX_FNAME]; + size_t j; + double const *incPol; + char f_suf[MAX_WORD]; + + D("Calculation of cross sections started"); + tstart = GET_TIME(); + + strcpy(fname_cs,directory); + strcat(fname_cs,"/" F_CS); + if (which == 'X') { + strcpy(f_suf,F_XSUF); + incPol=incPolX; + } + if (which == 'Y') { + strcpy(f_suf,F_YSUF); + incPol=incPolY; + } + strcat(fname_cs,f_suf); + if (calc_Cext) Cext = ExtCross(incPol); + if (calc_Cabs) Cabs = AbsCross(); + D("Cext and Cabs calculated"); + if (orient_avg) { + if (ringid==ROOT) { + if (which == 'Y') { /* assumed that first call of CalculateE is with 'Y' flag */ + muel_alpha[-2]=Cext; + muel_alpha[-1]=Cabs; + } + else if (which == 'X') { + muel_alpha[-2]=(muel_alpha[-2]+Cext)/2; + muel_alpha[-1]=(muel_alpha[-1]+Cabs)/2; + } + } + } + else { /* not orient_avg */ + if (ringid==ROOT) { + CCfile=FOpenErr(fname_cs,"w",ONE_POS); + if (calc_Cext) PrintBoth(CCfile,"Cext\t= %.10g\nQext\t= %.10g\n",Cext,Cext*inv_G); + if (calc_Cabs) PrintBoth(CCfile,"Cabs\t= %.10g\nQabs\t= %.10g\n",Cabs,Cabs*inv_G); + if (all_dir) fprintf(CCfile,"\nIntegration\n\n"); + if (calc_Csca) { + Csca=ScaCross(f_suf); + PrintBoth(CCfile,"Csca\t= %.10g\nQsca\t= %.10g\n",Csca,Csca*inv_G); + } + if (calc_vec) { + AsymParm_x(dummy,f_suf); + AsymParm_y(dummy+1,f_suf); + AsymParm_z(dummy+2,f_suf); + PrintBoth(CCfile,"Csca.g\t= (%.10g,%.10g,%.10g)\n",dummy[0],dummy[1],dummy[2]); + if (calc_asym) PrintBoth(CCfile,"g\t= (%.10g,%.10g,%.10g)\n", + dummy[0]/Csca,dummy[1]/Csca,dummy[2]/Csca); + } + } /* end of ROOT */ + if (calc_mat_force) { + MALLOC_VECTOR(Fsca,double,3*local_nvoid_Ndip,ALL); + MALLOC_VECTOR(Finc,double,3*local_nvoid_Ndip,ALL); + MALLOC_VECTOR(Frp,double,3*local_nvoid_Ndip,ALL); + for (j=0;j<3*local_nvoid_Ndip;j++) Fsca[j]=Finc[j]=Frp[j]=0; + + PRINTZ("Calculating the force per dipole\n"); + + /* Calculate forces */ + Frp_mat(Fsca_tot,Fsca,Finc_tot,Finc,Frp_tot,Frp); + + /* Write Cross-Sections and Efficiencies to file */ + if (ringid==ROOT) { + Cnorm = EIGHT_PI; + Qnorm = EIGHT_PI*inv_G; + PrintBoth(CCfile,"\nMatrix\n"\ + "Cext\t= %.10g\nQext\t= %.10g\n"\ + "Csca.g\t= (%.10g,%.10g,%.10g)\n"\ + "Cpr\t= (%.10g,%.10g,%.10g)\n"\ + "Qpr\t= (%.10g,%.10g,%.10g)\n", + Cnorm*Finc_tot[2],Qnorm*Finc_tot[2], + -Cnorm*Fsca_tot[0],-Cnorm*Fsca_tot[1],-Cnorm*Fsca_tot[2], + Cnorm*Frp_tot[0],Cnorm*Frp_tot[1],Cnorm*Frp_tot[2], + Qnorm*Frp_tot[0],Qnorm*Frp_tot[1],Qnorm*Frp_tot[2]); + if (store_force) { + /* Write Radiation pressure per dipole to file */ + strcpy(fname_frp,directory); + strcat(fname_frp,"/" F_FRP); + strcat(fname_frp,f_suf); + strcat(fname_frp,".dat"); /* should be removed in the future */ + VisFrp=FOpenErr(fname_frp,"w",ONE_POS); + fprintf(VisFrp,"#sphere x=%.10g m=%.10g%+.10gi\n"\ + "#number of real dipoles %.0f\n"\ + "#Forces per dipole\n"\ + "#r.x r.y r.z F.x F.y F.z\n", + ka_eq,ref_index[0][RE],ref_index[0][IM],nvoid_Ndip); + for (j=0;j<local_nvoid_Ndip;++j) fprintf(VisFrp, + "%.10g %.10g %.10g %.10g %.10g %.10g\n", + DipoleCoord[3*j],DipoleCoord[3*j+1], + DipoleCoord[3*j+2], + Frp[3*j],Frp[3*j+1],Frp[3*j+2]); + FCloseErr(VisFrp,fname_frp,ONE_POS); + } + } + Free_general(Fsca); + Free_general(Finc); + Free_general(Frp); + } + if (ringid==ROOT) FCloseErr(CCfile,fname_cs,ONE_POS); + } + D("Calculation of cross sections finished"); + Timing_ScatQuan += GET_TIME() - tstart; } -//============================================================ +/*============================================================*/ static void StoreFields(const char which,doublecomplex *field,const char *fname_preffix, - const char *tmpl,const char *field_name,const char *fullname) -/* Write any fields on each dipole to file (internal fields, incident beam, polarization, etc.). - * All processors should write the 'field' to temporary file. These files are named by template - * 'tmpl' and afterwards are concatenated into the file, which name is build by adding a small - * suffix to 'fname_preffix'. If CE_PARPER is employed then naturally saves only once; use '-sym no' - * if needed. 'field_name' is used to build column labels (i.e. there is difference in the first row - * between different fields). 'fullname' is for standard output. - */ + const char *tmpl,const char *field_name,const char *fullname) + /* Write any fields on each dipole to file (internal fields, incident beam, polarization, + etc...). All processors should write the 'field' to temporary file. + These files are named by template 'tmpl' and afterwards are concannetated into the file, + which name is build by adding a small suffix to 'fname_preffix'. + If CE_PARPER is employed then naturally saves only once; use '-sym no' if needed. + 'field_name' is used to build column labels (i.e. there is difference in the first row + between different fields). 'fullname' is for standard output. */ { - FILE *file; // file to store the fields - size_t i,j; - TIME_TYPE tstart; - char fname[MAX_FNAME],fname_sh[MAX_FNAME_SH]; - - tstart=GET_TIME(); - // build file name (without directory) - strcpy(fname_sh,fname_preffix); - if (which=='X') strcat(fname_sh,F_XSUF); - else if (which=='Y') strcat(fname_sh,F_YSUF); - // choose filename for direct saving + FILE *file; /* file to store the fields */ + size_t i,j; + TIME_TYPE tstart; + char fname[MAX_FNAME],fname_sh[MAX_FNAME_SH]; + + tstart=GET_TIME(); + /* build file name (without directory) */ + strcpy(fname_sh,fname_preffix); + if (which=='X') strcat(fname_sh,F_XSUF); + else if (which=='Y') strcat(fname_sh,F_YSUF); + /* choose filename for direct saving */ #ifdef PARALLEL - sprintf(fname,"%s/",directory); - sprintf(fname+strlen(fname),tmpl,ringid); + sprintf(fname,"%s/",directory); + sprintf(fname+strlen(fname),tmpl,ringid); #else - sprintf(fname,"%s/%s",directory,fname_sh); + sprintf(fname,"%s/%s",directory,fname_sh); #endif - file=FOpenErr(fname,"w",ALL_POS); - // print head of file + file=FOpenErr(fname,"w",ALL_POS); + /* print head of file */ #ifdef PARALLEL - if (ringid==0) { // this condition can be different from being ROOT + if (ringid==0) { /* this condition can be different from being ROOT */ #endif - fprintf(file,"x y z |%s|^2 %sx.r %sx.i %sy.r %sy.i %sz.r %sz.i\n",field_name,field_name, - field_name,field_name,field_name,field_name,field_name); + fprintf(file,"x y z |%s|^2 %sx.r %sx.i %sy.r %sy.i %sz.r %sz.i\n",field_name,field_name, + field_name,field_name,field_name,field_name,field_name); #ifdef PARALLEL - } // end of if + } /* end of if */ #endif - // saves fields to file - for (i=0;i<local_nvoid_Ndip;++i) { - j=3*i; - fprintf(file, - "%.10g %.10g %.10g %.10g %.10g %.10g %.10g %.10g %.10g %.10g\n", - DipoleCoord[j],DipoleCoord[j+1],DipoleCoord[j+2],cvNorm2(field+j), - field[j][RE],field[j][IM],field[j+1][RE],field[j+1][IM],field[j+2][RE],field[j+2][IM]); - } - FCloseErr(file,fname,ALL_POS); + /* saves fields to file */ + for (i=0;i<local_nvoid_Ndip;++i) { + j=3*i; + fprintf(file, + "%.10g %.10g %.10g %.10g %.10g %.10g %.10g %.10g %.10g %.10g\n", + DipoleCoord[j],DipoleCoord[j+1],DipoleCoord[j+2],cvNorm2(field+j), + field[j][RE],field[j][IM],field[j+1][RE],field[j+1][IM],field[j+2][RE],field[j+2][IM]); + } + FCloseErr(file,fname,ALL_POS); #ifdef PARALLEL - // wait for all processes to save their part of geometry - Synchronize(); - if (ringid==ROOT) CatNFiles(directory,tmpl,fname_sh); + /* wait for all processes to save their part of geometry */ + Synchronize(); + if (ringid==ROOT) CatNFiles(directory,tmpl,fname_sh); #endif - PRINTZ("%s saved to file\n",fullname); - Timing_FileIO += GET_TIME() - tstart; + PRINTZ("%s saved to file\n",fullname); + Timing_FileIO += GET_TIME() - tstart; } -//============================================================ +/*============================================================*/ static void StoreIntFields(const char which) -// Write actual internal fields (not exciting) on each dipole to file + /* Write actual internal fields (not exciting) on each dipole to file */ { - double V; - doublecomplex hi,hi_inv[MAX_NMAT]; - unsigned char mat; - size_t i; - int j; - - // calculate multipliers - V=gridspace*gridspace*gridspace; - for (j=0;j<Ncomp*Nmat;j++) { - // hi_inv=1/(V*hi)=4*PI/(V(m^2-1)); for anisotropic - by components - cSquare(ref_index[j],hi); - hi[RE]-=1; - cMultReal(V,hi,hi); - cInv(hi,hi_inv[j]); - cMultReal(FOUR_PI,hi_inv[j],hi_inv[j]); - } - // calculate fields - for (i=0;i<local_nvoid_Ndip;++i) { - mat=(unsigned char)(material[i]*Ncomp); - // e_field=P/(V*hi); for anisotropic - by components - for (j=0;j<3;j++) { - cMult(hi_inv[mat],pvec[3*i+j],xvec[3*i+j]); - if (anisotropy) mat++; - } - } - // save fields to file - StoreFields(which,xvec,F_INTFLD,F_INTFLD_TMP,"E","Internal fields"); + double V; + doublecomplex hi,hi_inv[MAX_NMAT]; + unsigned char mat; + size_t i; + int j; + + /* calculate multipliers */ + V=gridspace*gridspace*gridspace; + for (j=0;j<Ncomp*Nmat;j++) { + /* hi_inv=1/(V*hi)=4*PI/(V(m^2-1)); for anisotropic - by components */ + cSquare(ref_index[j],hi); + hi[RE]-=1; + cMultReal(V,hi,hi); + cInv(hi,hi_inv[j]); + cMultReal(FOUR_PI,hi_inv[j],hi_inv[j]); + } + /* calculate fields */ + for (i=0;i<local_nvoid_Ndip;++i) { + mat=(unsigned char)(material[i]*Ncomp); + /* e_field=P/(V*hi); for anisotropic - by components */ + for (j=0;j<3;j++) { + cMult(hi_inv[mat],pvec[3*i+j],xvec[3*i+j]); + if (anisotropy) mat++; + } + } + /* save fields to file */ + StoreFields(which,xvec,F_INTFLD,F_INTFLD_TMP,"E","Internal fields"); } -//============================================================ +/*============================================================*/ int CalculateE(const char which,const int type) -/* Calculate everything for x or y polarized incident light; or one and use symmetry to determine - * the rest (determined by type) - */ + /* Calculate everything for x or y polarized incident light; or one and use symmetry + to determine the rest (determined by type) */ { - int exit_status; - - tstart_CE=GET_TIME(); - // calculate the incident field Einc; vector b=Einc*cc_sqrt - D("Generating B"); - GenerateB (which, Einc); - if (store_beam) StoreFields(which,Einc,F_BEAM,F_BEAM_TMP,"Einc","Incident beam"); - // calculate solution vector x - D("Iterative solver started"); - exit_status=IterativeSolver(IterMethod); - D("Iterative solver finished"); - Timing_IntFieldOne = GET_TIME() - tstart_CE; - Timing_IntField += Timing_IntFieldOne; - // return if checkpoint (normal) occurred - if (exit_status==CHP_EXIT) return CHP_EXIT; - - if (yzplane) CalcEplane(which,type); //generally plane of incPolY and prop - // Calculate the scattered field for the whole solid-angle - if (all_dir) CalcAlldir(); - // Calculate the scattered field on the given grid of angles - if (scat_grid) CalcScatGrid(which); - /* Calculate integral scattering quantities (cross sections, asymmetry parameter, - * electric forces) - */ - if (calc_Cext || calc_Cabs || calc_Csca || calc_asym || calc_mat_force) - CalcIntegralScatQuantities(which); - // saves internal fields and/or dipole polarizations to text file - if (store_int_field) StoreIntFields(which); - if (store_dip_pol) StoreFields(which,pvec,F_DIPPOL,F_DIPPOL_TMP,"P","Dipole polarizations"); - return 0; + int exit_status; + + tstart_CE=GET_TIME(); + /* calculate the incident field Einc; vector b=Einc*cc_sqrt */ + D("Generating B"); + GenerateB (which, Einc); + if (store_beam) StoreFields(which,Einc,F_BEAM,F_BEAM_TMP,"Einc","Incident beam"); + /* calculate solution vector x */ + D("Iterative solver started"); + exit_status=IterativeSolver(IterMethod); + D("Iterative solver finished"); + Timing_IntFieldOne = GET_TIME() - tstart_CE; + Timing_IntField += Timing_IntFieldOne; + /* return if checkpoint (normal) occured */ + if (exit_status==CHP_EXIT) return CHP_EXIT; + + if (yzplane) CalcEplane(which,type); /*generally plane of incPolY and prop*/ + /* Calculate the scattered field for the whole solid-angle */ + if (all_dir) CalcAlldir(); + /* Calculate the scattered field on the given grid of angles */ + if (scat_grid) CalcScatGrid(which); + /* Calculate integral scattering quantities + (crosssections, asymmetry parameter, electric forces) */ + if (calc_Cext || calc_Cabs || calc_Csca || calc_asym || calc_mat_force) + CalcIntegralScatQuantities(which); + /* saves internal fields and/or dipole polarizations to text file */ + if (store_int_field) StoreIntFields(which); + if (store_dip_pol) StoreFields(which,pvec,F_DIPPOL,F_DIPPOL_TMP,"P","Dipole polarizations"); + return 0; } diff --git a/src/GenerateB.c b/src/GenerateB.c index ce6d0193..5593e0cb 100644 --- a/src/GenerateB.c +++ b/src/GenerateB.c @@ -15,10 +15,10 @@ * * Barton beam is based on: * J. P. Barton and D. R. Alexander, "Fifth-order corrected electromagnetic-field - * components for a fundamental Gaussian-beam," J.Appl.Phys. 66, 2800-2802 (1989). + * components for a fundamental gaussian-beam," J.Appl.Phys. 66, 2800-2802 (1989). * Eqs.(25)-(28) - complex conjugate * - * Copyright (C) 2006-2008 University of Amsterdam + * Copyright (C) 2006-2007 University of Amsterdam * This code is covered by the GNU General Public License. */ #include <stdio.h> @@ -28,197 +28,194 @@ #include "const.h" #include "comm.h" -// SEMI-GLOBAL VARIABLES +/* SEMI-GLOBAL VARIABLES */ -// defined and initialized in param.c +/* defined and initialized in param.c */ extern const int beam_Npars; extern const double beam_pars[]; -// used in crosssec.c -double beam_center_0[3]; // position of the beam center in laboratory reference frame -// used in param.c -char beam_descr[MAX_PARAGRAPH]; // string for log file with beam parameters +/* used in crosssec.c */ +double beam_center_0[3]; /* position of the beam center in laboratory reference frame */ +/* used in param.c */ +char beam_descr[MAX_PARAGRAPH]; /* string for log file with beam parameters */ -// LOCAL VARIABLES -double s,s2; // beam confinement factor and its square -double scale_x,scale_z; // multipliers for scaling coordinates +/* LOCAL VARIABLES */ +double s,s2; /* beam confinement factor and its square */ +double scale_x,scale_z; /* multipliers for scaling coordinates */ -//============================================================ +/*============================================================*/ void InitBeam(void) -// initialize beam; produce description string + /* initialize beam; produce dscription string */ { - double w0; // beam width + double w0; /* beam width */ - if (beamtype==B_PLANE) { - STRCPYZ(beam_descr,"Plane wave"); - beam_asym=FALSE; - } - else { - // initialize parameters - w0=beam_pars[0]; - beam_asym=(beam_Npars==4 && (beam_pars[1]!=0 || beam_pars[2]!=0 || beam_pars[3]!=0)); - if (beam_asym) { - memcpy(beam_center_0,beam_pars+1,3*sizeof(double)); - // if necessary break the symmetry of the problem - if (beam_center_0[0]!=0) symX=symR=FALSE; - if (beam_center_0[1]!=0) symY=symR=FALSE; - if (beam_center_0[2]!=0) symZ=FALSE; - } - else beam_center[0]=beam_center[1]=beam_center[2]=0; - s=1/(WaveNum*w0); - s2=s*s; - scale_x=1/w0; - scale_z=s*scale_x; // 1/(k*w0^2) - // beam info - if (ringid==ROOT) { - strcpy(beam_descr,"Gaussian beam ("); - if (beamtype==B_LMINUS) strcat(beam_descr,"L- approximation)\n"); - else if (beamtype==B_DAVIS3) strcat(beam_descr,"3rd order approximation, by Davis)\n"); - else if (beamtype==B_BARTON5) strcat(beam_descr,"5th order approximation, by Barton)\n"); - sprintf(beam_descr+strlen(beam_descr),"\tWidth=%g (confinement factor s=%g)\n",w0,s); - if (beam_asym) - sprintf(beam_descr+strlen(beam_descr),"\tCenter position: (%g,%g,%g)", - beam_center_0[0],beam_center_0[1],beam_center_0[2]); - else strcat(beam_descr,"\tCenter is in the origin"); - } - } + if (beamtype==B_PLANE) { + STRCPYZ(beam_descr,"Plane wave"); + beam_asym=FALSE; + } + else { + /* initialize parameters */ + w0=beam_pars[0]; + beam_asym=(beam_Npars==4 && (beam_pars[1]!=0 || beam_pars[2]!=0 || beam_pars[3]!=0)); + if (beam_asym) { + memcpy(beam_center_0,beam_pars+1,3*sizeof(double)); + /* if needed break the symmetry of the problem */ + if (beam_center_0[0]!=0) symX=symR=FALSE; + if (beam_center_0[1]!=0) symY=symR=FALSE; + if (beam_center_0[2]!=0) symZ=FALSE; + } + else beam_center[0]=beam_center[1]=beam_center[2]=0; + s=1/(WaveNum*w0); + s2=s*s; + scale_x=1/w0; + scale_z=s*scale_x; /* 1/(k*w0^2) */ + /* beam info */ + if (ringid==ROOT) { + strcpy(beam_descr,"Gaussian beam ("); + if (beamtype==B_LMINUS) strcat(beam_descr,"L- approximation)\n"); + else if (beamtype==B_DAVIS3) strcat(beam_descr,"3rd order approximation, by Davis)\n"); + else if (beamtype==B_BARTON5) strcat(beam_descr,"5th order approximation, by Barton)\n"); + sprintf(beam_descr+strlen(beam_descr),"\tWidth=%g (confinement factor s=%g)\n",w0,s); + if (beam_asym) + sprintf(beam_descr+strlen(beam_descr),"\tCenter position: (%g,%g,%g)", + beam_center_0[0],beam_center_0[1],beam_center_0[2]); + else strcat(beam_descr,"\tCenter is in the origin"); + } + } } -//============================================================ +/*============================================================*/ -void GenerateB (const char which, // x - or y polarized incident light - doublecomplex *b) // the b vector for the incident field -// generates incident beam at every dipole +void GenerateB (const char which, /* x - or y polarized incident light */ + doublecomplex *b) /* the b vector for the incident field */ + /* generates incident beam at every dipole */ { - size_t i,j; - doublecomplex psi0,Q,Q2; - doublecomplex v1[3],v2[3],v3[3]; - double ro2,ro4; - double x,y,z,x2_s,xy_s; - doublecomplex t1,t2,t3,t4,t5,t6,t7,t8,t0,ctemp; - double const *ex; // coordinate axis of the beam reference frame - double ey[3]; - double r1[3]; + size_t i,j; + doublecomplex psi0,Q,Q2; + doublecomplex v1[3],v2[3],v3[3]; + double ro2,ro4; + double x,y,z,x2_s,xy_s; + doublecomplex t1,t2,t3,t4,t5,t6,t7,t8,t0,ctemp; + double const *ex; /* coordinate axis of the beam reference frame */ + double ey[3]; + double r1[3]; - // set reference frame of the beam; ez=prop, ex - incident polarization - if (which=='Y') { - ex=incPolY; - memcpy(ey,incPolX,3*sizeof(double)); - MultScal(-1,ey,ey); - } - if (which=='X') { - ex=incPolX; - memcpy(ey,incPolY,3*sizeof(double)); - } - // plane is separate to be fast - if (beamtype==B_PLANE) - for (i=0;i<local_nvoid_Ndip;i++) { - j=3*i; - imExp(WaveNum*DotProd(DipoleCoord+j,prop),ctemp); // ctemp=exp(ik*r.a) - cScalMultRVec(ex,ctemp,b+j); // b[i]=ctemp*ex - } - else { // all other beam types - for (i=0;i<local_nvoid_Ndip;i++) { - j=3*i; - // set relative coordinates (in beam's coordinate system) - LinComb(DipoleCoord+j,beam_center,1,-1,r1); - x=DotProd(r1,ex)*scale_x; - y=DotProd(r1,ey)*scale_x; - z=DotProd(r1,prop)*scale_z; - ro2=x*x+y*y; - // calculate Q=1/(2z-i) - Q[IM]=1/(1+4*z*z); - Q[RE]=2*z*Q[IM]; - // calculate psi0=-iQexp(iQro^2) - cMult_i2(Q,t1); - cMultReal(ro2,t1,t2); - cExpSelf(t2); - cMult(t1,t2,psi0); - cInvSign(psi0); - // ctemp=exp(ik*z*scale_z)*psi0 - imExp(WaveNum*z/scale_z,ctemp); - cMultSelf(ctemp,psi0); - if (beamtype==B_LMINUS) { - cScalMultRVec(ex,ctemp,b+j); // b[i]=ctemp*ex - } - else { - x2_s=x*x/ro2; - cSquare(Q,Q2); - ro4=ro2*ro2; - // some combinations that are used more than once - cMultReal(s2*ro2,Q2,t4); // t4=(s*ro*Q)^2 - cMultReal(ro2,Q,t5); - cMult_i(t5); // t5=i*Q*ro^2 - cMultReal(ro4,Q2,t6); // t6=ro^4*Q^2 - cMultReal(x*s,Q,t7); // t7=x*s*Q - if (beamtype==B_DAVIS3) { - // t1=1+s^2(-4Q^2*x^2-iQ^3*ro^4)=1-t4(4x2_s+t5) - cEqual(t5,t1); - t1[RE]+=4*x2_s; - cMultSelf(t1,t4); - cMultReal(-1,t1,t1); - t1[RE]+=1; - // t2=0 - t2[RE]=t2[IM]=0; - // t3=-s(2Qx)+s^3(8Q^3*ro^2*x+2iQ^4*ro^4*x-4iQ^2x)=2t7[-1+iQ*s2*(-4t5+t6-2)] - cMultReal(-4,t5,t3); - cAdd(t3,t6,t3); - t3[RE]-=2; - cMultReal(s2,t3,t3); - cMultSelf(t3,Q); - cMult_i(t3); - t3[RE]-=1; - cMultSelf(t3,t7); - cMultReal(2,t3,t3); - } - else if (beamtype==B_BARTON5) { - xy_s=x*y/ro2; - cMultReal(2,t5,t8); - t8[RE]+=8; // t8=8+2i*Q*ro^2 - /* t1 = 1 + s^2(-ro^2*Q^2-i*ro^4*Q^3-2Q^2*x^2) - * + s^4[2ro^4*Q^4+3iro^6*Q^5-0.5ro^8*Q^6+x^2(8ro^2*Q^4+2iro^4*Q^5)] - * = 1 + t4*{-1-2xs2-t5+t4*[2+3t5-0.5t6+x2_s*t8]} - */ - cMultReal(x2_s,t8,t1); - cMultReal(-0.5,t6,t0); - cAdd(t1,t0,t1); - cMultReal(3,t5,t0); - cAdd(t1,t0,t1); - t1[RE]+=2; - cMultSelf(t1,t4); - cSubtr(t1,t5,t1); - t1[RE]-=1+2*x2_s; - cMultSelf(t1,t4); - t1[RE]+=1; - // t2=s^2(-2Q^2*xy)+s^4[xy(8ro^2*Q^4+2iro^4*Q^5)]=xy_s*t4(-2+t4*t8) - cMult(t4,t8,t2); - t2[RE]-=2; - cMultSelf(t2,t4); - cMultReal(xy_s,t2,t2); - /* t3 = s(-2Qx) + s^3[(6ro^2*Q^3+2iro^4*Q^4)x] - * + s^5[(-20ro^4*Q^5-10iro^6*Q^6+ro^8*Q^7)x] - * = t7{-2+t4[6+2t5+t4(-20-10t5+t6)]} - */ - cMultReal(-10,t5,t3); - cAdd(t3,t6,t3); - t3[RE]-=20; - cMultSelf(t3,t4); - cMultReal(2,t5,t0); - cAdd(t3,t0,t3); - t3[RE]+=6; - cMultSelf(t3,t4); - t3[RE]-=2; - cMultSelf(t3,t7); - } - // b[i]=ctemp(ex*t1+ey*t2+ez*t3) - cScalMultRVec(ex,t1,v1); - cScalMultRVec(ey,t2,v2); - cScalMultRVec(prop,t3,v3); - cvAdd(v1,v2,v1); - cvAdd(v1,v3,v1); - cvMultScal_cmplx(ctemp,v1,b+j); - } - } - } + /* set reference frame of the beam; ez=prop, ex - incident polarization */ + if (which=='Y') { + ex=incPolY; + memcpy(ey,incPolX,3*sizeof(double)); + MultScal(-1,ey,ey); + } + if (which=='X') { + ex=incPolX; + memcpy(ey,incPolY,3*sizeof(double)); + } + /* plane is separate to be fast */ + if (beamtype==B_PLANE) + for (i=0;i<local_nvoid_Ndip;i++) { + j=3*i; + imExp(WaveNum*DotProd(DipoleCoord+j,prop),ctemp); /* ctemp=exp(ik*r.a) */ + cScalMultRVec(ex,ctemp,b+j); /* b[i]=ctemp*ex */ + } + else { /* all other beam types */ + for (i=0;i<local_nvoid_Ndip;i++) { + j=3*i; + /* set relative coordinates (in beam's coordinate system) */ + LinComb(DipoleCoord+j,beam_center,1,-1,r1); + x=DotProd(r1,ex)*scale_x; + y=DotProd(r1,ey)*scale_x; + z=DotProd(r1,prop)*scale_z; + ro2=x*x+y*y; + /* calculate Q=1/(2z-i) */ + Q[IM]=1/(1+4*z*z); + Q[RE]=2*z*Q[IM]; + /* calculate psi0=-iQexp(iQro^2) */ + cMult_i2(Q,t1); + cMultReal(ro2,t1,t2); + cExpSelf(t2); + cMult(t1,t2,psi0); + cInvSign(psi0); + /* ctemp=exp(ik*z*scale_z)*psi0 */ + imExp(WaveNum*z/scale_z,ctemp); + cMultSelf(ctemp,psi0); + if (beamtype==B_LMINUS) { + cScalMultRVec(ex,ctemp,b+j); /* b[i]=ctemp*ex */ + } + else { + x2_s=x*x/ro2; + cSquare(Q,Q2); + ro4=ro2*ro2; + /* some combinations that are used more than once */ + cMultReal(s2*ro2,Q2,t4); /* t4=(s*ro*Q)^2 */ + cMultReal(ro2,Q,t5); + cMult_i(t5); /* t5=i*Q*ro^2 */ + cMultReal(ro4,Q2,t6); /* t6=ro^4*Q^2 */ + cMultReal(x*s,Q,t7); /* t7=x*s*Q */ + if (beamtype==B_DAVIS3) { + /* t1=1+s^2(-4Q^2*x^2-iQ^3*ro^4)=1-t4(4x2_s+t5) */ + cEqual(t5,t1); + t1[RE]+=4*x2_s; + cMultSelf(t1,t4); + cMultReal(-1,t1,t1); + t1[RE]+=1; + /* t2=0 */ + t2[RE]=t2[IM]=0; + /* t3=-s(2Qx)+s^3(8Q^3*ro^2*x+2iQ^4*ro^4*x-4iQ^2x)=2t7[-1+iQ*s2*(-4t5+t6-2)] */ + cMultReal(-4,t5,t3); + cAdd(t3,t6,t3); + t3[RE]-=2; + cMultReal(s2,t3,t3); + cMultSelf(t3,Q); + cMult_i(t3); + t3[RE]-=1; + cMultSelf(t3,t7); + cMultReal(2,t3,t3); + } + else if (beamtype==B_BARTON5) { + xy_s=x*y/ro2; + cMultReal(2,t5,t8); + t8[RE]+=8; /* t8=8+2i*Q*ro^2 */ + /* t1=1+s^2(-ro^2*Q^2-i*ro^4*Q^3-2Q^2*x^2)+ + +s^4[2ro^4*Q^4+3iro^6*Q^5-0.5ro^8*Q^6+x^2(8ro^2*Q^4+2iro^4*Q^5)]= + =1+t4*{-1-2xs2-t5+t4*[2+3t5-0.5t6+x2_s*t8]} */ + cMultReal(x2_s,t8,t1); + cMultReal(-0.5,t6,t0); + cAdd(t1,t0,t1); + cMultReal(3,t5,t0); + cAdd(t1,t0,t1); + t1[RE]+=2; + cMultSelf(t1,t4); + cSubtr(t1,t5,t1); + t1[RE]-=1+2*x2_s; + cMultSelf(t1,t4); + t1[RE]+=1; + /* t2=s^2(-2Q^2*xy)+s^4[xy(8ro^2*Q^4+2iro^4*Q^5)]=xy_s*t4(-2+t4*t8) */ + cMult(t4,t8,t2); + t2[RE]-=2; + cMultSelf(t2,t4); + cMultReal(xy_s,t2,t2); + /* t3=s(-2Qx)+s^3[(6ro^2*Q^3+2iro^4*Q^4)x]+s^5[(-20ro^4*Q^5-10iro^6*Q^6+ro^8*Q^7)x]= + =t7{-2+t4[6+2t5+t4(-20-10t5+t6)]} */ + cMultReal(-10,t5,t3); + cAdd(t3,t6,t3); + t3[RE]-=20; + cMultSelf(t3,t4); + cMultReal(2,t5,t0); + cAdd(t3,t0,t3); + t3[RE]+=6; + cMultSelf(t3,t4); + t3[RE]-=2; + cMultSelf(t3,t7); + } + /* b[i]=ctemp(ex*t1+ey*t2+ez*t3) */ + cScalMultRVec(ex,t1,v1); + cScalMultRVec(ey,t2,v2); + cScalMultRVec(prop,t3,v3); + cvAdd(v1,v2,v1); + cvAdd(v1,v3,v1); + cvMultScal_cmplx(ctemp,v1,b+j); + } + } + } } diff --git a/src/Makefile b/src/Makefile index a1e19fa8..2f2c6226 100644 --- a/src/Makefile +++ b/src/Makefile @@ -26,19 +26,27 @@ #CFLAGS += -DNOT_USE_LOCK #CFLAGS += -DONLY_LOCKFILE -# --Compilers-- Choose one of the following -# gnu - tested for gcc 3.2.3 - 3.4.6 -# intel - tested on icc 9.0 and 9.1 -# compaq - tested on Compaq C V6.5-303 (dtk) -# ibm - tested on xlc 8.0 -# other - +# --Compilers-- Choose (uncomment) one of the lines below + +# tested for gcc 3.2.3 and higher; COMPILER = gnu -# Compilation mode (bits): 32 or 64 -# Although all options in this Makefile are adjusted automatically, different -# modes may require different configurations (environments) of the system, -# especially if MPI mode is used. -BITS = 32 +# uses -fast option, which is general, but includes static linking, which may +# cause problems in some cases +#COMPILER = intel9.x + +# the same as intel9.x but do not use static linking; it may also not use all +# possible optimizations, depending on a particular processor +#COMPILER = intel9.x_ns + +# this is almost the same as intel9.x with few changes for old version, obsolete +#COMPILER = intel8.1 + +# tested on Compaq C V6.5-303 (dtk) +#COMPILER = compaq + +# All options should be implemented below and in sub-makefiles +#COMPILER = other # --Warnings-- If uncommented, all warning are suppressed RELEASE = on @@ -83,7 +91,6 @@ LASTMPI = .lastmpi SHELL = /bin/sh LDLIBS = -lm DEPFLAG = -M -DFFLAG = -MF MFILES = Makefile # Fortran sources generate a lot of warnings FWARN = -w @@ -104,57 +111,65 @@ ifeq ($(findstring -DFFT_TEMPERTON,$(CFLAGS)),) else FSOURCE += cfft99D.f endif -# compiler warnings and optimization flags; -# when release, warning are turned off below +# if 'release' turn off warningns +ifdef RELEASE + CWARN = -w + LWARN = -w +endif +# compiler warnings and optimization flags FOPT = $(COPT) ifeq ($(COMPILER),gnu) - # You can add option '-march=...' or 'mcpu=...' to COPT and FOPT in this section + # You can add option '-march=...' to COPT and FOPT in this section CC = gcc CF = g77 - COPT = -m$(BITS) -std=c99 -O3 -ffast-math -funroll-loops + COPT = -O3 -ffast-math -funroll-loops FOPT = -O - CWARN = -Wall -W -Wpointer-arith -Wcast-qual \ - -Wwrite-strings -Wstrict-prototypes -Wno-uninitialized \ - -Wno-unknown-pragmas -Wno-comment -Wno-unused-parameter + ifndef RELEASE + CWARN = -Wall -W -Winline -Wpointer-arith -Wcast-qual \ + -Wwrite-strings -Wstrict-prototypes -Wno-uninitialized \ + -Wno-unknown-pragmas -Wno-comment -Wno-unused-parameter \ + -Wno-long-long + endif endif -ifeq ($(COMPILER),intel) +ifeq ($(COMPILER),intel9.x) + CC = icc + CF = ifort + COPT = -fast -fp-model fast + FOPT = -fast + ifndef RELEASE + CWARN = -Wall -Wno-uninitialized -Wno-comment \ + -wd279,810,869,981,1418,1419,1572 + endif +endif +ifeq ($(COMPILER),intel9.x_ns) CC = icc CF = ifort COPT = -fp-model fast -O3 -ipo -no-prec-div -xP FOPT = -O3 -ipo -no-prec-div -xP - CWARN = -Wall -Wno-uninitialized -Wno-comment \ + ifndef RELEASE + CWARN = -Wall -Wno-uninitialized -Wno-comment \ -wd279,810,869,981,1418,1419,1572 + endif +endif +ifeq ($(COMPILER),intel8.1) + CC = icc + CF = ifort + COPT = -fast + ifndef RELEASE + CWARN = -Wall -wd9,279,810,869,981,1418,1419,1572 + endif endif ifeq ($(COMPILER),compaq) - # This compiler was not tested since 2007. In particular, warning options may - # not fit exactly the C99 standard, to which the code was transferred. - # Its support for 64 bit compilations is also undefined. - # If you happen to use this compiler, please report results to the authors. - # # You can add option '-arch host' to COPT and FOPT in this section CC = cc CF = f77 COPT = -fast - CWARN = -w0 -msg_disable nestedcomment,unknownpragma,unreachcode -endif -ifeq ($(COMPILER),ibm) - # -O5 implies "-arch=auto", which tunes compilation exclusively for the host - # machine. However, it will not work in some configurations. - # Then use '-O3 -qarch=... -qtune=...' instead - CC = xlc - CF = xlf - COPT = -O3 -qcache=auto -q$(BITS) -qlanglvl=extc99 - # DFREDIRECT = 2>nul -qipa=level=2 -qhot - DEPFLAG = -qmakedep=gcc -qsyntaxonly - CWARN = -qsuppress=1506-224:1506-342:1500-036 + ifndef RELEASE + CWARN = -w0 -msg_disable nestedcomment,unknownpragma,unreachcode + endif endif ifeq ($(COMPILER),other) endif -# if 'release' turn off warningns -ifdef RELEASE - CWARN = -w - LWARN = -w -endif # Finalize option flags (almost) CFLAGS += $(COPT) $(CWARN) FFLAGS += $(FOPT) $(FWARN) diff --git a/src/Romberg.c b/src/Romberg.c index 995aa847..fb950d4b 100644 --- a/src/Romberg.c +++ b/src/Romberg.c @@ -18,7 +18,7 @@ * Two instances of Romberg 2D should not be used in parallel (they use common storage). * E.g. calculation of Csca inside orientation averaging must not be done. * - * Integration parameters are described in a special structure Parms_1D defined in types.h + * Integration parameters are desribed in a special structure Parms_1D defined in types.h * They must be set outside of the Romberg routine. * * All the routines normalize the result on the interval width, i.e. @@ -39,495 +39,493 @@ #include "memory.h" #include "io.h" -// SEMI-GLOBAL VARIABLES +/* SEMI-GLOBAL VARIABLES */ -// defined and initialized in crosssec.c +/* defined and initialized in crossec.c */ extern const int full_al_range; -// LOCAL VARIABLES - -static int dim; // dimension of the data (integrated simultaneously) -static int N_eval; // number of function evaluations in inner cycle -static int N_tot_eval; // total number of function evaluation -static int no_convergence; // number of inner integrals that did not converge -static FILE *file; // file to print info -// used in inner loop -static int size_in; // size of M array -static double **M_in, // array of M values - *T_in, // T_m^0 - *dummy_in; // save function values -// used in outer loop +/* LOCAL VARIABLES */ + +static int dim; /* dimension of the data (integrated simultaneously) */ +static int N_eval; /* number of function evaluations in inner cycle */ +static int N_tot_eval; /* total number of function evaluation */ +static int no_convergence; /* number of inner integrals that did not converge */ +static FILE *file; /* file to print info */ +/* used in inner loop */ +static int size_in; /* size of M array */ +static double **M_in, /* array of M values */ + *T_in, /* T_m^0 */ + *dummy_in; /* save function values */ +/* used in outer loop */ static int size_out; -static double **M_out,*T_out,*dummy_out; // analogous to the above -// common arrays with frequently used values -static double *tv1, // 4^m - *tv2, // 1/(4^m-1) - *tv3; // 2*4^m-1 -// pointer to the function that is integrated +static double **M_out,*T_out,*dummy_out; /* analogous to the above */ +/* common arrays with frequently used values */ +static double *tv1, /* 4^m */ + *tv2, /* 1/(4^m-1) */ + *tv3; /* 2*4^m-1 */ +/* pointer to the function that is integrated */ static double (*func)(int theta,int phi,double *res); -static const Parms_1D *input; // parameters of integration - -//============================================================ - -double Romberg1D(const Parms_1D param, // parameters of integration - const int size, // size of block of data - const double *data, // written as sequential blocks - double *res) // where to put result -/* Performs integration of data. Since all values are already calculated, no adaptation is used - * (all data is used). Result is normalized on the interval width, i.e. actually averaging takes - * place. Returns relative mean-square error; if function is periodic then only the first column of - * the table is used - i.e. trapezoid rule. This function is completely independent of others. - */ +static const Parms_1D *input; /* parameters of integration */ + +/*============================================================*/ + +double Romberg1D(/* Performs integration of data */ + const Parms_1D param,/* parameters of integration */ + const int size, /* size of block of data */ + const double *data, /* written as sequential blocks */ + double *res) /* where to put result */ + + /* Since all values are already calculated, no adaptation is used (all data is used) + Result is normalized on the interval width, i.e. actually averaging takes place. + returns relative mean-square error; if function is periodic then only the first column + of the table is used - i.e. trapezoid rule; + This function is completely independent */ { - int m,m0,comp,i,step,index,Msize; - size_t j; - double abs_res,abs_err; // norms (squared) of result and error - double temp; - double **M1,*T1,*t1,*t2,*t3; // analogous to those used in 2D Romberg - - // allocate memory - Msize = param.periodic ? 0 : param.Jmax; - MALLOC_DMATRIX(M1,Msize+1,size,ONE); - MALLOC_VECTOR(T1,double,size,ONE); - // common to fasten calculations; needed only for really Romberg - if (Msize!=0) { - MALLOC_VECTOR(t1,double,Msize+1,ONE); - MALLOC_DVECTOR2(t2,1,Msize,ONE); - MALLOC_VECTOR(t3,double,Msize+1,ONE); - t1[0]=1; - for (i=1;i<Msize;i++) { - t1[i]=t1[i-1]*4; - t2[i]=1/(t1[i]-1); - t3[i-1]=2*t1[i-1]-1; - } - } - // integration - if (param.Grid_size==1) { // if only one point - memcpy(res,data,size*sizeof(double)); - abs_err=0; - abs_res=0; // that is not to do unnecessary calculations - } - else { - m0=0; // equals 0 for periodic, m otherwise - for (m=0;m<param.Jmax;m++) { - // calculate T_0^m - if (m==0) { - if (param.equival) memcpy(T1,data,size*sizeof(double)); - else { - index=(param.Grid_size-1)*size; - for (comp=0;comp<size;++comp) T1[comp]=0.5*(data[comp]+data[index+comp]); - } - } - else { - if (param.periodic) for (comp=0;comp<size;++comp) - T1[comp]=0.5*(T1[comp]+M1[0][comp]); - else { - for (comp=0;comp<size;++comp) - T1[comp]=t3[m-1]*t2[m]*(T1[comp]-M1[0][comp])+M1[0][comp]; - m0=m; - } - } - // get new integrand values (M_0^m) - step=(param.Grid_size-1)>>m; - for (comp=0;comp<size;++comp) M1[m0][comp]=0; - for (j=step>>1;j<param.Grid_size;j+=step) { - index=j*size; - for (comp=0;comp<size;++comp) M1[m0][comp]+=data[index+comp]; - } - temp=pow(2,-m); - for (comp=0;comp<size;++comp) M1[m0][comp]*=temp; - // generate M_1^(m-1), M_2^(m-2), ..., M_(m-1)^1, M_m^0 - if (m0!=0) for (i=m-1;i>=0;i--) for (comp=0;comp<size;comp++) - M1[i][comp]=t2[m-i]*(t1[m-i]*M1[i+1][comp]-M1[i][comp]); - } - // set result and error - abs_res=abs_err=0; - for (comp=0;comp<size;++comp) { - res[comp]=0.5*(M1[0][comp]+T1[comp]); - abs_res+=res[comp]*res[comp]; - temp=0.5*fabs(M1[0][comp]-T1[comp]); - abs_err+=temp*temp; - } - } - // free all - Free_dMatrix(M1,Msize+1); - Free_general(T1); - if (Msize!=0) { - Free_general(t1); - Free_dVector2(t2,1); - Free_general(t3); - } - // return - if (abs_res==0) return 0; - else return (sqrt(abs_err/abs_res)); + int m,m0,comp,i,step,index,Msize; + size_t j; + double abs_res,abs_err; /* norms (squared) of result and error */ + double temp; + double **M1,*T1,*t1,*t2,*t3; /* analogous to those used in 2D Romberg */ + + /* allocate memory */ + Msize = param.periodic ? 0 : param.Jmax; + MALLOC_DMATRIX(M1,Msize+1,size,ONE); + MALLOC_VECTOR(T1,double,size,ONE); + /* common to fasten calculations; needed only for really Romberg */ + if (Msize!=0) { + MALLOC_VECTOR(t1,double,Msize+1,ONE); + MALLOC_DVECTOR2(t2,1,Msize,ONE); + MALLOC_VECTOR(t3,double,Msize+1,ONE); + t1[0]=1; + for (i=1;i<Msize;i++) { + t1[i]=t1[i-1]*4; + t2[i]=1/(t1[i]-1); + t3[i-1]=2*t1[i-1]-1; + } + } + /* integration */ + if (param.Grid_size==1) { /* if only one point */ + memcpy(res,data,size*sizeof(double)); + abs_err=0; + abs_res=0; /* that is not to do unnecessary calculations */ + } + else { + m0=0; /* equals 0 for periodic, m otherwise */ + for (m=0;m<param.Jmax;m++) { + /* calculate T_0^m */ + if (m==0) { + if (param.equival) memcpy(T1,data,size*sizeof(double)); + else { + index=(param.Grid_size-1)*size; + for (comp=0;comp<size;++comp) T1[comp]=0.5*(data[comp]+data[index+comp]); + } + } + else { + if (param.periodic) for (comp=0;comp<size;++comp) + T1[comp]=0.5*(T1[comp]+M1[0][comp]); + else { + for (comp=0;comp<size;++comp) T1[comp]=t3[m-1]*t2[m]*(T1[comp]-M1[0][comp])+M1[0][comp]; + m0=m; + } + } + /* get new integrand values (M_0^m) */ + step=(param.Grid_size-1)>>m; + for (comp=0;comp<size;++comp) M1[m0][comp]=0; + for (j=step>>1;j<param.Grid_size;j+=step) { + index=j*size; + for (comp=0;comp<size;++comp) M1[m0][comp]+=data[index+comp]; + } + temp=pow(2,-m); + for (comp=0;comp<size;++comp) M1[m0][comp]*=temp; + /* generate M_1^(m-1), M_2^(m-2), ..., M_(m-1)^1, M_m^0 */ + if (m0!=0) for (i=m-1;i>=0;i--) for (comp=0;comp<size;comp++) + M1[i][comp]=t2[m-i]*(t1[m-i]*M1[i+1][comp]-M1[i][comp]); + } + /* set result and error */ + abs_res=abs_err=0; + for (comp=0;comp<size;++comp) { + res[comp]=0.5*(M1[0][comp]+T1[comp]); + abs_res+=res[comp]*res[comp]; + temp=0.5*fabs(M1[0][comp]-T1[comp]); + abs_err+=temp*temp; + } + } + /* free all */ + Free_dMatrix(M1,Msize+1); + Free_general(T1); + if (Msize!=0) { + Free_general(t1); + Free_dVector2(t2,1); + Free_general(t3); + } + /* return */ + if (abs_res==0) return 0; + else return (sqrt(abs_err/abs_res)); } -//============================================================ +/*============================================================*/ static void AllocateAll(void) -// allocates all needed memory + /* allocates all needed memory */ { - int i,maxdim; - - // inner loop vectors - size_in=input[PHI].periodic ? 0 : input[PHI].Jmax; - MALLOC_DMATRIX(M_in,size_in+1,dim,ONE); - MALLOC_VECTOR(T_in,double,dim,ONE); - MALLOC_VECTOR(dummy_in,double,dim,ONE); - // outer loop vectors - size_out=input[THETA].periodic ? 0 : input[THETA].Jmax; - MALLOC_DMATRIX(M_out,size_out+1,dim,ONE); - MALLOC_VECTOR(T_out,double,dim,ONE); - MALLOC_VECTOR(dummy_out,double,dim,ONE); - // common to fasten calculations; needed only for really Romberg - maxdim=MAX(size_in,size_out); - if (maxdim!=0) { - MALLOC_VECTOR(tv1,double,maxdim+1,ONE); - MALLOC_DVECTOR2(tv2,1,maxdim,ONE); - MALLOC_VECTOR(tv3,double,maxdim+1,ONE); - tv1[0]=1; - for (i=1;i<maxdim;i++) { - tv1[i]=tv1[i-1]*4; - tv2[i]=1/(tv1[i]-1); - tv3[i-1]=2*tv1[i-1]-1; - } - } + int i,maxdim; + + /* inner loop vectors */ + size_in=input[PHI].periodic ? 0 : input[PHI].Jmax; + MALLOC_DMATRIX(M_in,size_in+1,dim,ONE); + MALLOC_VECTOR(T_in,double,dim,ONE); + MALLOC_VECTOR(dummy_in,double,dim,ONE); + /* outer loop vectors */ + size_out=input[THETA].periodic ? 0 : input[THETA].Jmax; + MALLOC_DMATRIX(M_out,size_out+1,dim,ONE); + MALLOC_VECTOR(T_out,double,dim,ONE); + MALLOC_VECTOR(dummy_out,double,dim,ONE); + /* common to fasten calculations; needed only for really Romberg */ + maxdim=MAX(size_in,size_out); + if (maxdim!=0) { + MALLOC_VECTOR(tv1,double,maxdim+1,ONE); + MALLOC_DVECTOR2(tv2,1,maxdim,ONE); + MALLOC_VECTOR(tv3,double,maxdim+1,ONE); + tv1[0]=1; + for (i=1;i<maxdim;i++) { + tv1[i]=tv1[i-1]*4; + tv2[i]=1/(tv1[i]-1); + tv3[i-1]=2*tv1[i-1]-1; + } + } } -//============================================================ +/*============================================================*/ static void FreeAll(void) -// frees all memory + /* frees all memory */ { - // inner - Free_dMatrix(M_in,size_in+1); - Free_general(T_in); - Free_general(dummy_in); - // outer - Free_dMatrix(M_out,size_out+1); - Free_general(T_out); - Free_general(dummy_out); - // common - if (size_in!=0 || size_out!=0) { - Free_general(tv1); - Free_dVector2(tv2,1); - Free_general(tv3); - } + /* inner */ + Free_dMatrix(M_in,size_in+1); + Free_general(T_in); + Free_general(dummy_in); + /* outer */ + Free_dMatrix(M_out,size_out+1); + Free_general(T_out); + Free_general(dummy_out); + /* common */ + if (size_in!=0 || size_out!=0) { + Free_general(tv1); + Free_dVector2(tv2,1); + Free_general(tv3); + } } -//============================================================ +/*============================================================*/ -static void RombergIterate(double **M, // array of M values - const int m) // maximum order -/* performs one Romberg iteration; transforms previous array of M into a new one - * M_m^k=((4^m)*M_(m-1)^(k+1)-M_(m-1)^k)/(4^m-1); our storage implies - * M_(m-1-k)^k -old-> M[k] -new-> M_(m-k)^k - */ +static void RombergIterate(double **M, /* array of M values */ + const int m) /* maximum order */ + /* performs one Romberg iteration; transforms previous array of M into a new one + M_m^k=((4^m)*M_(m-1)^(k+1)-M_(m-1)^k)/(4^m-1); + our storage implies M_(m-1-k)^k -old-> M[k] -new-> M_(m-k)^k */ { - int k,comp; + int k,comp; - for (k=m-1;k>=0;k--) for (comp=0;comp<dim;comp++) - M[k][comp]=tv2[m-k]*(tv1[m-k]*M[k+1][comp]-M[k][comp]); + for (k=m-1;k>=0;k--) + for (comp=0;comp<dim;comp++) + M[k][comp]=tv2[m-k]*(tv1[m-k]*M[k+1][comp]-M[k][comp]); } -//============================================================ +/*============================================================*/ static double InnerInitT(const int fixed,double *res) -/* Calculate term T_0^0 for the inner integration of func over phi_min < phi < phi_max for fixed - * theta = th_f - */ + /* Calculate term T_0^0 for the inner integration of func + over phi_min < phi < phi_max for fixed theta = th_f */ { - int comp; - double err; - - // calculate first point - err=(*func)(fixed,0,res); - N_eval++; - - if (!input[PHI].equival) { - // calculate last point - err=0.5*(err+(*func)(fixed,input[PHI].Grid_size-1,dummy_in)); - N_eval++; - for (comp=0;comp<dim;++comp) res[comp] = 0.5*(dummy_in[comp]+res[comp]); - } - return err; + int comp; + double err; + + /* calculate first point */ + err=(*func)(fixed,0,res); + N_eval++; + + if (!input[PHI].equival) { + /* calculate last point */ + err=0.5*(err+(*func)(fixed,input[PHI].Grid_size-1,dummy_in)); + N_eval++; + for (comp=0;comp<dim;++comp) + res[comp] = 0.5*(dummy_in[comp]+res[comp]); + } + return err; } -//============================================================ +/*============================================================*/ static double InnerTrapzd(const int fixed,double *res,const int n) -/* Calculate n'th refinement (term M_0^n) for the inner integration of func over - * phi_min < phi < phi_max for fixed theta = th_f - */ + /* Calculate n'th refinement (term M_0^n) for the inner integration + of func over phi_min < phi < phi_max for fixed theta = th_f */ { - int comp,step; - size_t j; - double temp,err; - - step=(input[PHI].Grid_size-1)>>n; - // init sum - for (comp=0;comp<dim;++comp) res[comp]=0; - err=0; - // accumulate sum - for (j=step>>1;j<input[PHI].Grid_size;j+=step) { - err+=(*func)(fixed,j,dummy_in); - N_eval++; - for (comp=0;comp<dim;++comp) res[comp]+=dummy_in[comp]; - } - // scale it - temp=pow(2,-n); - for (comp=0;comp<dim;++comp) res[comp]*=temp; - return (err*temp); + int comp,step; + size_t j; + double temp,err; + + step=(input[PHI].Grid_size-1)>>n; + /* init sum */ + for (comp=0;comp<dim;++comp) res[comp]=0; + err=0; + /* accumulate sum */ + for (j=step>>1;j<input[PHI].Grid_size;j+=step) { + err+=(*func)(fixed,j,dummy_in); + N_eval++; + for (comp=0;comp<dim;++comp) res[comp]+=dummy_in[comp]; + } + /* scale it */ + temp=pow(2,-n); + for (comp=0;comp<dim;++comp) res[comp]*=temp; + return (err*temp); } -//============================================================ +/*============================================================*/ static double InnerRomberg(const int fixed,double *res,const int onepoint) -/* Integrate (average) func for fixed theta=fixed; returns estimate of the absolute error (for the - * first element); if function is periodic then only the first column of the table is used - i.e. - * trapezoid rule; if 'onepoint' is TRUE only one point is used for evaluation, this is used e.g. - * for theta==0, when all phi points are equivalent - * */ + /* Integrate (average) func for fixed theta=fixed; returns estimate of the absolute error + (for the first element); if function is periodic then only the first column of the table + is used - i.e. trapezoid rule; if onepoint is TRUE only one point is used for evaluation, + this is used e.g. for theta==0, when all phi points are equivalent */ { - int m,m0,comp; - double abs_res,abs_err; // norms of result and error - double int_err; // absolute error of previous layer integration - double err; - - if (input[PHI].Grid_size==1 || onepoint) { // if only one point (really or assumed) - int_err=(*func)(fixed,0,res); - N_eval++; - return int_err; - } - m0=0; // equals 0 for periodic, m otherwise - for (m=0;m<input[PHI].Jmax;m++) { - // calculate T_0^m - if (m==0) int_err=InnerInitT(fixed,T_in); - else { - if (input[PHI].periodic) for (comp=0;comp<dim;++comp) - T_in[comp]=0.5*(T_in[comp]+M_in[0][comp]); - else { - for (comp=0;comp<dim;++comp) - T_in[comp]=tv3[m-1]*tv2[m]*(T_in[comp]-M_in[0][comp])+M_in[0][comp]; - m0=m; - } - } - // get new integrand values (M_0^m) - int_err=0.5*(int_err+InnerTrapzd(fixed,M_in[m0],m)); - // generate M_1^(m-1), M_2^(m-2), ..., M_(m-1)^1, M_m^0 - if (m0!=0) RombergIterate(M_in,m); - // get error and check for convergence - if (m>=input[PHI].Jmin-1) { - abs_res=0.5*fabs(M_in[0][0]+T_in[0]); - abs_err=0.5*fabs(M_in[0][0]-T_in[0])+int_err; - if (abs_res==0) err=0; - else err=abs_err/abs_res; - if (err<input[PHI].eps) break; - } - } - // set result - for (comp=0;comp<dim;++comp) - res[comp]=0.5*(M_in[0][comp]+T_in[comp]); - // set no_convergence - if (err>=input[PHI].eps) { - fprintf(file,"Inner_qromb converged only to d=%g for cosine value #%d\n",err,fixed); - fflush(file); - no_convergence++; - } - return (abs_err); + int m,m0,comp; + double abs_res,abs_err; /* norms of result and error */ + double int_err; /* absolute error of previous layer integration */ + double err; + + if (input[PHI].Grid_size==1 || onepoint) { /* if only one point (really or assumed) */ + int_err=(*func)(fixed,0,res); + N_eval++; + return int_err; + } + m0=0; /* equals 0 for periodic, m otherwise */ + for (m=0;m<input[PHI].Jmax;m++) { + /* calculate T_0^m */ + if (m==0) int_err=InnerInitT(fixed,T_in); + else { + if (input[PHI].periodic) for (comp=0;comp<dim;++comp) + T_in[comp]=0.5*(T_in[comp]+M_in[0][comp]); + else { + for (comp=0;comp<dim;++comp) + T_in[comp]=tv3[m-1]*tv2[m]*(T_in[comp]-M_in[0][comp])+M_in[0][comp]; + m0=m; + } + } + /* get new integrand values (M_0^m) */ + int_err=0.5*(int_err+InnerTrapzd(fixed,M_in[m0],m)); + /* generate M_1^(m-1), M_2^(m-2), ..., M_(m-1)^1, M_m^0 */ + if (m0!=0) RombergIterate(M_in,m); + /* get error and check for convergence */ + if (m>=input[PHI].Jmin-1) { + abs_res=0.5*fabs(M_in[0][0]+T_in[0]); + abs_err=0.5*fabs(M_in[0][0]-T_in[0])+int_err; + if (abs_res==0) err=0; + else err=abs_err/abs_res; + if (err<input[PHI].eps) break; + } + } + /* set result */ + for (comp=0;comp<dim;++comp) + res[comp]=0.5*(M_in[0][comp]+T_in[comp]); + /* set no_convergence */ + if (err>=input[PHI].eps) { + fprintf(file,"Inner_qromb converged only to d=%g for cosine value #%d\n",err,fixed); + fflush(file); + no_convergence++; + } + return (abs_err); } -//============================================================ +/*============================================================*/ static double OuterInitT(double *res) -/* Calculate term T_0^0 for the outer integration of func; - * returns absolute error of the inner integration - */ + /* Calculate term T_0^0 for the outer integration of func; + returns absolute error of the inner integration */ { - int comp; - double err; - - // calculate first point - err=InnerRomberg(0,res,input[THETA].min==-1 && full_al_range); - - if (!input[THETA].equival) { - // calculate last point - err=0.5*(err - +InnerRomberg(input[THETA].Grid_size-1,dummy_out,input[THETA].max==1 && full_al_range)); - for (comp=0;comp<dim;++comp) - res[comp] = 0.5*(dummy_out[comp]+res[comp]); - } - return err; + int comp; + double err; + + /* calculate first point */ + err=InnerRomberg(0,res,input[THETA].min==-1 && full_al_range); + + if (!input[THETA].equival) { + /* calculate last point */ + err=0.5*(err + +InnerRomberg(input[THETA].Grid_size-1,dummy_out,input[THETA].max==1 && full_al_range)); + for (comp=0;comp<dim;++comp) + res[comp] = 0.5*(dummy_out[comp]+res[comp]); + } + return err; } -//============================================================ +/*============================================================*/ static double OuterTrapzd(double *res,const int n) -/* Calculate n'th refinement for the outer integration of func (term M_0^n); - * returns absolute error of the inner integration - */ + /* Calculate n'th refinement for the outer integration of func (term M_0^n); + returns absolute error of the inner integration */ { - int comp,step; - size_t j; - double temp,err; - - step=(input[THETA].Grid_size-1)>>n; - // init sum - for (comp=0;comp<dim;++comp) res[comp]=0; - err=0; - // accumulate sum - for (j=step>>1;j<input[THETA].Grid_size;j+=step) { - err+=InnerRomberg(j,dummy_out,FALSE); - for (comp=0;comp<dim;++comp) res[comp]+=dummy_out[comp]; - } - // scale it - temp=pow(2,-n); - for (comp=0;comp<dim;++comp) res[comp]*=temp; - return (err*temp); + int comp,step; + size_t j; + double temp,err; + + step=(input[THETA].Grid_size-1)>>n; + /* init sum */ + for (comp=0;comp<dim;++comp) res[comp]=0; + err=0; + /* accumulate sum */ + for (j=step>>1;j<input[THETA].Grid_size;j+=step) { + err+=InnerRomberg(j,dummy_out,FALSE); + for (comp=0;comp<dim;++comp) res[comp]+=dummy_out[comp]; + } + /* scale it */ + temp=pow(2,-n); + for (comp=0;comp<dim;++comp) res[comp]*=temp; + return (err*temp); } -//============================================================ +/*============================================================*/ static double OuterRomberg(double *res) -/* Performs outer integration (averaging). Returns relative error of integration (for the first - * element). If function is periodic then only the first column of the table is used - i.e. - * trapezoid rule - */ + /* Performs outer integration (averaging). Returns relative error of integration + (for the first element). If function is periodic then only the first column of the table + is used - i.e. trapezoid rule */ { - int m,m0,comp; - double abs_res,abs_err; // norms of result and error - double int_err; // absolute error of previous layer integration - double err; - - if (input[THETA].Grid_size==1) { // if only one point - N_eval=0; - int_err=InnerRomberg(0,res,FALSE); - fprintf(file,"single\t\t%d integrand-values were used.\n",N_eval); - N_tot_eval+=N_eval; - return ((res[0]==0) ? 0 : (int_err/fabs(res[0]))); - } - m0=0; // equals 0 for periodic, m otherwise - for (m=0;m<input[THETA].Jmax;m++) { - // calculate T_0^m - if (m==0) { - N_eval=0; - int_err=OuterInitT(T_out); - fprintf(file,"init\t\t%d integrand-values were used.\n",N_eval); - N_tot_eval+=N_eval; - } - else { - if (input[THETA].periodic) for (comp=0;comp<dim;++comp) - T_out[comp]=0.5*(T_out[comp]+M_out[0][comp]); - else { - for (comp=0;comp<dim;++comp) - T_out[comp]=tv3[m-1]*tv2[m]*(T_out[comp]-M_out[0][comp])+M_out[0][comp]; - m0=m; - } - } - // get new integrand values (M_0^m) - N_eval=0; - int_err=0.5*(int_err+OuterTrapzd(M_out[m0],m)); - fprintf(file,"%d\t\t%d integrand-values were used.\n",m+1,N_eval); - fflush(file); - N_tot_eval+=N_eval; - // generate M_1^(m-1), M_2^(m-2), ..., M_(m-1)^1, M_m^0 - if (m0!=0) RombergIterate(M_out,m); - // get error and check for convergence - if (m>=input[THETA].Jmin-1) { - abs_res=0.5*fabs(M_out[0][0]+T_out[0]); - // absolute error is sum of the errors for current integration and accumulated inner error - abs_err=0.5*fabs(M_out[0][0]-T_out[0])+int_err; - if (abs_res==0) err=0; - else err=abs_err/abs_res; - if (err<input[THETA].eps) break; - } - } - // set result - for (comp=0;comp<dim;++comp) res[comp]=0.5*(M_out[0][comp]+T_out[comp]); - - return (err); + int m,m0,comp; + double abs_res,abs_err; /* norms of result and error */ + double int_err; /* absolute error of previous layer integration */ + double err; + + if (input[THETA].Grid_size==1) { /* if only one point */ + N_eval=0; + int_err=InnerRomberg(0,res,FALSE); + fprintf(file,"single\t\t%d integrand-values were used.\n",N_eval); + N_tot_eval+=N_eval; + return ((res[0]==0) ? 0 : (int_err/fabs(res[0]))); + } + m0=0; /* equals 0 for periodic, m otherwise */ + for (m=0;m<input[THETA].Jmax;m++) { + /* calculate T_0^m */ + if (m==0) { + N_eval=0; + int_err=OuterInitT(T_out); + fprintf(file,"init\t\t%d integrand-values were used.\n",N_eval); + N_tot_eval+=N_eval; + } + else { + if (input[THETA].periodic) for (comp=0;comp<dim;++comp) + T_out[comp]=0.5*(T_out[comp]+M_out[0][comp]); + else { + for (comp=0;comp<dim;++comp) + T_out[comp]=tv3[m-1]*tv2[m]*(T_out[comp]-M_out[0][comp])+M_out[0][comp]; + m0=m; + } + } + /* get new integrand values (M_0^m) */ + N_eval=0; + int_err=0.5*(int_err+OuterTrapzd(M_out[m0],m)); + fprintf(file,"%d\t\t%d integrand-values were used.\n",m+1,N_eval); + fflush(file); + N_tot_eval+=N_eval; + /* generate M_1^(m-1), M_2^(m-2), ..., M_(m-1)^1, M_m^0 */ + if (m0!=0) RombergIterate(M_out,m); + /* get error and check for convergence */ + if (m>=input[THETA].Jmin-1) { + abs_res=0.5*fabs(M_out[0][0]+T_out[0]); + /* absolute error is sum of the errors for current integration and accumulated inner error */ + abs_err=0.5*fabs(M_out[0][0]-T_out[0])+int_err; + if (abs_res==0) err=0; + else err=abs_err/abs_res; + if (err<input[THETA].eps) break; + } + } + /* set result */ + for (comp=0;comp<dim;++comp) + res[comp]=0.5*(M_out[0][comp]+T_out[comp]); + + return (err); } -//============================================================ +/*============================================================*/ -void Romberg2D(const Parms_1D parms_input[2],double (*func_input)(int theta,int phi,double *res), - const int dim_input,double *res,const char *fname) -/* Integrate 2D func with Romberg's method according to input's parameters. Function func_input - * returns the estimate of the absolute error. Argument dim_input gives the number of components - * of (double *). Consistency between 'func' and 'dim_input' is the user's responsibility. Result is - * normalized on the interval widths, i.e. actually averaging takes place. - */ +void Romberg2D(const Parms_1D parms_input[2], + double (*func_input)(int theta,int phi,double *res), + const int dim_input,double *res,const char *fname) +/* Integrate 2D func with Romberg's method according to input's parameters. + function func_input returns the estimate of the absolute error; + Argument dim_input gives the number of components of (double *). + Consistency between func and dim_input is the user's responsibilty. + Result is normalized on the interval widths, i.e. actually averaging takes place. */ { - double error; - char buf1[MAX_WORD],buf2[MAX_WORD]; - const char true_s[]="true",false_s[]="false"; - const char *se1,*se2,*sp1,*sp2; - - // initialize global values - dim = dim_input; - func = func_input; - input = parms_input; - file=FOpenErr(fname,"w",ONE_POS); - no_convergence = 0; - N_tot_eval=0; - - AllocateAll(); // allocate memory - - if (orient_avg) { - strcpy(buf1,"BETA"); - strcpy(buf2,"GAMMA"); - } - else { - strcpy(buf1,"THETA"); - strcpy(buf2,"PHI"); - } - if (input[THETA].equival) se1=true_s; - else se1=false_s; - if (input[PHI].equival) se2=true_s; - else se2=false_s; - if (input[THETA].periodic) sp1=true_s; - else sp1=false_s; - if (input[PHI].periodic) sp2=true_s; - else sp2=false_s; - // print info - fprintf(file, - " %4s(rad) cos(%s)\n" - "EPS %-7g %g\n" - "Refinement stages:\n" - "Minimum %-7d %d\n" - "Maximum %-7d %d\n" - "lower boundary %-7g %g\n" - "upper boundary %-7g %g\n" - "equivalent min&max %-7s %s\n" - "periodic %-7s %s\n", - buf2,buf1, - input[PHI].eps,input[THETA].eps, - input[PHI].Jmin,input[THETA].Jmin, - input[PHI].Jmax,input[THETA].Jmax, - input[PHI].min,input[THETA].min, - input[PHI].max,input[THETA].max, - se2,se1,sp2,sp1); - fprintf(file,"\n\nOuter-Loop\tInner Loop\n"); - fflush(file); - - error=OuterRomberg(res); // main calculation - - // finalize log - if (error<input[THETA].eps) { - if (no_convergence==0) PrintBoth(file,"All inner integrations converged\n" - "The outer integration converged\n"); - else PrintBoth(file,"%d inner integrations did not converge.\n" - "The outer integration converged\n",no_convergence); - } - else { - if (no_convergence==0) PrintBoth(file,"Only the outer integration did not converge \n" - "It reached d=%g\n",error); - else PrintBoth(file,"%d inner integrations did not converge.\n" - "The outer integration did not converge\n" - "The outer integration reached d=%g\n",no_convergence,error); - } - PrintBoth(file,"In total %d evaluations were used\n",N_tot_eval); - FCloseErr(file,fname,ONE_POS); - - FreeAll(); // free all memory + double error; + char buf1[MAX_WORD],buf2[MAX_WORD]; + const char true_s[]="true",false_s[]="false"; + const char *se1,*se2,*sp1,*sp2; + + /* initialize global values */ + dim = dim_input; + func = func_input; + input = parms_input; + file=FOpenErr(fname,"w",ONE_POS); + no_convergence = 0; + N_tot_eval=0; + + AllocateAll(); /* allocate memory */ + + if (orient_avg) { + strcpy(buf1,"BETA"); + strcpy(buf2,"GAMMA"); + } + else { + strcpy(buf1,"THETA"); + strcpy(buf2,"PHI"); + } + if (input[THETA].equival) se1=true_s; + else se1=false_s; + if (input[PHI].equival) se2=true_s; + else se2=false_s; + if (input[THETA].periodic) sp1=true_s; + else sp1=false_s; + if (input[PHI].periodic) sp2=true_s; + else sp2=false_s; + /* print info */ + fprintf(file," %4s(rad) cos(%s)\n"\ + "EPS %-7g %g\n"\ + "Refinement stages:\n"\ + "Minimum %-7d %d\n"\ + "Maximum %-7d %d\n"\ + "lower boundary %-7g %g\n"\ + "upper boundary %-7g %g\n"\ + "equivalent min&max %-7s %s\n"\ + "periodic %-7s %s\n", + buf2,buf1, + input[PHI].eps,input[THETA].eps, + input[PHI].Jmin,input[THETA].Jmin, + input[PHI].Jmax,input[THETA].Jmax, + input[PHI].min,input[THETA].min, + input[PHI].max,input[THETA].max, + se2,se1,sp2,sp1); + fprintf(file,"\n\nOuter-Loop\tInner Loop\n"); + fflush(file); + + error=OuterRomberg(res); /* main calculation */ + + /* finalize log */ + if (error<input[THETA].eps) { + if (no_convergence==0) PrintBoth(file,"All inner integrations converged\n"\ + "The outer integration converged\n"); + else PrintBoth(file,"%d inner integrations did not converge.\n"\ + "The outer integration converged\n",no_convergence); + } + else { + if (no_convergence==0) PrintBoth(file,"Only the outer integration did not converge \n"\ + "It reached d=%g\n",error); + else PrintBoth(file, + "%d inner integrations did not converge.\n"\ + "The outer integration did not converge\n"\ + "The outer integration reached d=%g\n",no_convergence,error); + } + PrintBoth(file,"In total %d evaluations were used\n",N_tot_eval); + FCloseErr(file,fname,ONE_POS); + + FreeAll(); /* free all memory */ } diff --git a/src/Romberg.h b/src/Romberg.h index 5370dc7d..8fe148c9 100644 --- a/src/Romberg.h +++ b/src/Romberg.h @@ -2,14 +2,14 @@ * AUTH: Maxim Yurkin * DESCR: definitions of Romberg routines * - * Copyright (C) 2006,2008 University of Amsterdam + * Copyright (C) 2006 University of Amsterdam * This code is covered by the GNU General Public License. */ #ifndef __Romberg_h #define __Romberg_h -#include "types.h" // needed for Parms_1D -// indexes of array +#include "types.h" /* needed fot Parms_1D */ +/* indexes of array */ #define THETA 0 #define PHI 1 @@ -18,4 +18,4 @@ double Romberg1D(Parms_1D param,int size,const double *data,double *ss); void Romberg2D(const Parms_1D parms_input[2],double (*func_input)(int theta,int phi,double *res), int dim_input, double *res, const char *fname); -#endif // __Romberg_h +#endif /*__Romberg_h*/ diff --git a/src/calculator.c b/src/calculator.c index 2e0508ea..f7b31c01 100644 --- a/src/calculator.c +++ b/src/calculator.c @@ -23,603 +23,601 @@ #include "fft.h" #include "timing.h" -// SEMI-GLOBAL VARIABLES +/* SEMI-GLOBAL VARIABLES */ -// defined and initialized in crosssec.c +/* defined and initialized in crosssec.c */ extern const Parms_1D parms[2],parms_alpha; extern const angle_set beta_int,gamma_int,theta_int,phi_int; -// defined and initialized in param.c +/* defined and initialized in param.c */ extern const int avg_inc_pol; extern const char alldir_parms[],scat_grid_parms[]; -// defined and initialized in timing.c +/* defined and initialized in timing.c */ extern TIME_TYPE Timing_Init; extern unsigned long TotalEval; -// used in CalculateE.c -double *muel_phi; // used to store values of Mueller matrix for different phi (to integrate) -double *muel_phi_buf; // additional for integrating with different multipliers - // scattered E (for scattering in one plane) for two incident polarizations -doublecomplex *EplaneX, *EplaneY; -double *Eplane_buffer; // buffer to accumulate Eplane -double dtheta_deg,dtheta_rad; // delta theta in degrees and radians -doublecomplex *ampl_alphaX,*ampl_alphaY; // storing amplitude matrix for different values of alpha -double *muel_alpha; // storing mueller matrix for different values of alpha -// used in fft.c -double *tab1,*tab2,*tab3,*tab4,*tab5,*tab6,*tab7,*tab8,*tab9,*tab10; // tables of integrals -int **tab_index; // matrix for indexing of table arrays -// used in crosssec.c -double *E2_alldir; // square of E, calculated for alldir -double *E2_alldir_buffer; // buffer to accumulate E2_alldir -doublecomplex cc[MAX_NMAT][3]; // couple constants -doublecomplex *expsX,*expsY,*expsZ; // arrays of exponents along 3 axes (for calc_field) -// used in iterative.c -doublecomplex *rvec,*vec1,*vec2,*vec3,*Avecbuffer; // vectors for iterative solvers - -// LOCAL VARIABLES - -static size_t block_theta; // size of one block of mueller matrix - 16*nTheta -static int finish_avg; // whether to stop orientation averaging - // used to collect both mueller matrix and integral scattering quantities when orient_avg -static double *out; - -// EXTERNAL FUNCTIONS - -// CalculateE.c +/* used in CalculateE.c */ +double *muel_phi; /* used to store values of Mueller matrix for different phi (to integrate) */ +double *muel_phi1; /* additional for integrating with different multipliers */ +doublecomplex *EplaneX, *EplaneY; /* scattered E (for scattering in one plane) + for two incident polarizations */ +double *Eplane_buffer; /* buffer to accumulate Eplane */ +double dtheta_deg,dtheta_rad; /* delta theta in deg and radians */ +doublecomplex *ampl_alphaX,*ampl_alphaY; /* storing amplitude matrix for + different values of alpha */ +double *muel_alpha; /* storing mueller matrix for different values of alpha */ +/* used in fft.c */ +double *tab1,*tab2,*tab3,*tab4,*tab5,*tab6,*tab7,*tab8,*tab9,*tab10; /* tables of integrals */ +int **tab_index; /* matrix for indexation of table arrays */ +/* used in crosssec.c */ +double *E2_alldir; /* square of E, calculated for alldir */ +double *E2_alldir_buffer; /* buffer to accumulate E2_alldir */ +doublecomplex cc[MAX_NMAT][3]; /* couple constants */ +doublecomplex *expsX,*expsY,*expsZ; /* arrays of exponents along 3 axes (for calc_field) */ +/* used in iterative.c */ +doublecomplex *rvec,*vec1,*vec2,*vec3,*Avecbuffer; /* vectors for iterative solvers */ + +/* LOCAL VARIABLES */ + +static size_t block_theta; /* size of one block of mueller matrix - 16*nTheta */ +static int finish_avg; /* whether to stop orientation averaging */ +static double *out; /* used to collect both mueller matrix and integral + scattering quantities when orient_avg */ +/* EXTERNAL FUNCTIONS */ + +/* CalculateE.c */ extern int CalculateE(char which,int type); extern void MuellerMatrix(void); -//============================================================ +/*============================================================*/ static void CoupleConstant(doublecomplex *mrel,const char which,doublecomplex *res) { - doublecomplex coup_con[3]; - doublecomplex tempa,tempb,cm,m2,t1; - double temp,V,b1,b2,b3; - int i,imax,j,jmax; // counters: i is for 'asym', j is for 'anysotropy' - double S,prop2[3]; - int asym; // whether polarizability is asymmetric (for isotropic m) - const double *incPol; - int pol_avg=TRUE; - - asym = (PolRelation==POL_CLDR || PolRelation==POL_SO); - // !!! this should never happen - if (asym && anisotropy) LogError(EC_ERROR,ONE_POS,"Incompatibility error in CoupleConstant"); - if (asym) imax=3; - else imax=1; - if (anisotropy) jmax=3; - else jmax=1; - if (PolRelation==POL_LDR || PolRelation==POL_CLDR) { - b1=LDR_B1; - b2=LDR_B2; - b3=LDR_B3; - } - else if (PolRelation==POL_SO) { - b1=SO_B1; - b2=SO_B2; - b3=SO_B3; - } - // calculate the CM couple constant CC=(3V/4pi)*(m^2-1)/(m^2+2) - V=gridspace*gridspace*gridspace; // volume of one dipole - temp = (3*V)/(4*PI); - for (j=0;j<jmax;j++) { - cSquare(mrel[j],m2); // m2=m^2 - tempa[RE] = m2[RE] - 1.0; - tempa[IM] = tempb[IM] = m2[IM]; - tempb[RE] = m2[RE] + 2.0; - cDiv(tempa,tempb,coup_con[j]); - coup_con[j][RE] *= temp; - coup_con[j][IM] *= temp; - - if (PolRelation!=POL_CM) { - if (PolRelation==POL_LDR || PolRelation==POL_CLDR || PolRelation==POL_SO) { - // set prop_i^2 - for (i=0;i<3;i++) { - if (pol_avg && PolRelation==POL_SO) prop2[i]=ONE_THIRD; - else prop2[i]=prop[i]*prop[i]; - } - // determine S coefficient for LDR - if (PolRelation==POL_LDR) { - if (avg_inc_pol) S=0.5*(1-DotProd(prop2,prop2)); - else { - if (which=='X') incPol=incPolX; - else if (which=='Y') incPol=incPolY; - S = prop2[0]*incPol[0]*incPol[0] + prop2[1]*incPol[1]*incPol[1] - + prop2[2]*incPol[2]*incPol[2]; - } - } - } - cEqual(coup_con[j],cm); - for (i=0;i<imax;i++) { - // RR correction - t1[RE]=0.0; - t1[IM]=2*kd*kd*kd/3; // t1=2/3*i*kd^3 - // plus more advanced corrections - if (PolRelation==POL_FCD) // t1+=((4/3)kd^2+(2/3pi)*log((pi-kd)/(pi+kd))*kd^3) - t1[RE]+=2*ONE_THIRD*kd*kd*(2+kd*INV_PI*log((PI-kd)/(PI+kd))); - else if (PolRelation==POL_LDR || PolRelation==POL_CLDR || PolRelation==POL_SO) { - if (PolRelation!=POL_LDR) S=prop2[i]; - t1[RE]+=(b1+(b2+b3*S)*m2[RE])*kd*kd; // t1+=(b1+(b2+b3*S)*m^2)*kd^2 - t1[IM]+=(b2+b3*S)*m2[IM]*kd*kd; - } - // CC[i]=cm/(1-(cm/V)*t1) - cMultReal(1.0/V,t1,t1); - cMultSelf(t1,cm); - t1[RE]=1-t1[RE]; - t1[IM]=-t1[IM]; - // 'i+j' is not robust. It assumes that only one counter is used - cDiv(cm,t1,coup_con[i+j]); - } - } - } - if (asym || anisotropy) { - if (!orient_avg) { - PRINTBOTHZ(logfile, "CoupleConstant:(%.10g%+.10gi,%.10g%+.10gi,%.10g%+.10gi)\n", - coup_con[0][RE],coup_con[0][IM],coup_con[1][RE], - coup_con[1][IM],coup_con[2][RE],coup_con[2][IM]); - } - } - else { - cEqual(coup_con[0],coup_con[1]); - cEqual(coup_con[0],coup_con[2]); - if (!orient_avg) { - PRINTBOTHZ(logfile,"CoupleConstant:%.10g%+.10gi\n", - coup_con[0][RE],coup_con[0][IM]); - } - } - memcpy(res,coup_con,3*sizeof(doublecomplex)); + doublecomplex coup_con[3]; + doublecomplex tempa,tempb,cm,m2,t1; + double temp,V,b1,b2,b3; + int i,imax,j,jmax; /* counters: i is for 'asym', j is for 'anysotropy' */ + double S,prop2[3]; + int asym; /* whether polarizability is asymmetric (for isotropic m) */ + const double *incPol; + int pol_avg=TRUE; + + asym = (PolRelation==POL_CLDR || PolRelation==POL_SO); + /* this should never happen !!! */ + if (asym && anisotropy) LogError(EC_ERROR,ONE_POS,"Incompatibility error in CoupleConstant"); + if (asym) imax=3; + else imax=1; + if (anisotropy) jmax=3; + else jmax=1; + if (PolRelation==POL_LDR || PolRelation==POL_CLDR) { + b1=LDR_B1; + b2=LDR_B2; + b3=LDR_B3; + } + else if (PolRelation==POL_SO) { + b1=SO_B1; + b2=SO_B2; + b3=SO_B3; + } + /* calculate the CM couple constant CC=(3V/4pi)*(m^2-1)/(m^2+2) */ + V=gridspace*gridspace*gridspace; /* volume of one dipole */ + temp = (3*V)/(4*PI); + for (j=0;j<jmax;j++) { + cSquare(mrel[j],m2); /* m2=m^2 */ + tempa[RE] = m2[RE] - 1.0; + tempa[IM] = tempb[IM] = m2[IM]; + tempb[RE] = m2[RE] + 2.0; + cDiv(tempa,tempb,coup_con[j]); + coup_con[j][RE] *= temp; + coup_con[j][IM] *= temp; + + if (PolRelation!=POL_CM) { + if (PolRelation==POL_LDR || PolRelation==POL_CLDR || PolRelation==POL_SO) { + /* set prop_i^2 */ + for (i=0;i<3;i++) { + if (pol_avg && PolRelation==POL_SO) prop2[i]=ONE_THIRD; + else prop2[i]=prop[i]*prop[i]; + } + /* determine S coefficient for LDR */ + if (PolRelation==POL_LDR) { + if (avg_inc_pol) S=0.5*(1-DotProd(prop2,prop2)); + else { + if (which=='X') incPol=incPolX; + else if (which=='Y') incPol=incPolY; + S=prop2[0]*incPol[0]*incPol[0]+prop2[1]*incPol[1]*incPol[1] + +prop2[2]*incPol[2]*incPol[2]; + } + } + } + cEqual(coup_con[j],cm); + for (i=0;i<imax;i++) { + /* RR correction */ + t1[RE]=0.0; + t1[IM]=2*kd*kd*kd/3; /* t1=2/3*i*kd^3 */ + /* plus more advanced corrections */ + if (PolRelation==POL_FCD) /* t1+=((4/3)kd^2+(2/3pi)*log((pi-kd)/(pi+kd))*kd^3) */ + t1[RE]+=2*ONE_THIRD*kd*kd*(2*kd+INV_PI*log((PI-kd)/(PI+kd))); + else if (PolRelation==POL_LDR || PolRelation==POL_CLDR || PolRelation==POL_SO) { + if (PolRelation!=POL_LDR) S=prop2[i]; + t1[RE]+=(b1+(b2+b3*S)*m2[RE])*kd*kd; /* t1+=(b1+(b2+b3*S)*m^2)*kd^2 */ + t1[IM]+=(b2+b3*S)*m2[IM]*kd*kd; + } + cMultReal(1.0/V,t1,t1); + cMultSelf(t1,cm); + t1[RE]=1-t1[RE]; + t1[IM]=-t1[IM]; + /* 'i+j' is not robust. It assumes that only one counter is used */ + cDiv(cm,t1,coup_con[i+j]); /* CC[i]=cm/(1-(cm/V)*t1) */ + } + } + } + if (asym || anisotropy) { + if (!orient_avg) { + PRINTBOTHZ(logfile, "CoupleConstant:(%.10g%+.10gi,%.10g%+.10gi,%.10g%+.10gi)\n", + coup_con[0][RE],coup_con[0][IM],coup_con[1][RE], + coup_con[1][IM],coup_con[2][RE],coup_con[2][IM]); + } + } + else { + cEqual(coup_con[0],coup_con[1]); + cEqual(coup_con[0],coup_con[2]); + if (!orient_avg) { + PRINTBOTHZ(logfile,"CoupleConstant:%.10g%+.10gi\n", + coup_con[0][RE],coup_con[0][IM]); + } + } + memcpy(res,coup_con,3*sizeof(doublecomplex)); } -//============================================================ +/*============================================================*/ static void InitCC(const char which) -// calculate cc and cc_sqrt + /* calculate cc and cc_sqrt */ { - int i,j; + int i,j; - for(i=0;i<Nmat;i++) { - CoupleConstant(ref_index+Ncomp*i,which,cc[i]); - for(j=0;j<3;j++) cSqrt(cc[i][j],cc_sqrt[i][j]); - } + for(i=0;i<Nmat;i++) { + CoupleConstant(ref_index+Ncomp*i,which,cc[i]); + for(j=0;j<3;j++) cSqrt(cc[i][j],cc_sqrt[i][j]); + } } -//============================================================ +/*============================================================*/ static double *ReadTableFile(const char *sh_fname,const int size_multiplier) { - FILE *ftab; - double *tab_n; - int size; - char fname[MAX_FNAME]; - int i; - - size=TAB_SIZE*size_multiplier; - memory+=size*sizeof(double); - if (!prognose) { - // allocate memory for tab_n - MALLOC_VECTOR(tab_n,double,size,ALL); - // open file - strcpy(fname,TAB_PATH); - strcat(fname,sh_fname); - ftab=FOpenErr(fname,"r",ALL_POS); - // scan file - for (i=0; i<size; i++) if (fscanf(ftab,"%lf\t",&(tab_n[i]))!=1) - LogError(EC_ERROR,ALL_POS,"Scan error in file '%s'. Probably file is too small",fname); - if (!feof(ftab)) - LogError(EC_WARN,ONE_POS,"File '%s' is longer than specified size (%d)",fname,size); - // close file - FCloseErr(ftab,fname,ALL_POS); - } - return tab_n; + FILE *ftab; + double *tab_n; + int size; + char fname[MAX_FNAME]; + int i; + + size=TAB_SIZE*size_multiplier; + memory+=size*sizeof(double); + if (!prognose) { + /* allocate memory for tab_n */ + MALLOC_VECTOR(tab_n,double,size,ALL); + /* open file */ + strcpy(fname,TAB_PATH); + strcat(fname,sh_fname); + ftab=FOpenErr(fname,"r",ALL_POS); + /* scan file */ + for (i=0; i<size; i++) if (fscanf(ftab,"%lf\t",&(tab_n[i]))!=1) + LogError(EC_ERROR,ALL_POS,"Scan error in file '%s'. Probably file is too small",fname); + if (!feof(ftab)) + LogError(EC_WARN,ONE_POS,"File '%s' is longer than specified size (%d)",fname,size); + /* close file */ + FCloseErr(ftab,fname,ALL_POS); + } + return tab_n; } -//============================================================ +/*============================================================*/ static void ReadTables(void) { - int i, j, ymax, Rm2, Rm2x; - - tab1=ReadTableFile(TAB_FNAME(1),1); - tab2=ReadTableFile(TAB_FNAME(2),6); - tab3=ReadTableFile(TAB_FNAME(3),3); - tab4=ReadTableFile(TAB_FNAME(4),18); - tab5=ReadTableFile(TAB_FNAME(5),6); - tab6=ReadTableFile(TAB_FNAME(6),36); - tab7=ReadTableFile(TAB_FNAME(7),1); - tab8=ReadTableFile(TAB_FNAME(8),6); - tab9=ReadTableFile(TAB_FNAME(9),1); - tab10=ReadTableFile(TAB_FNAME(10),6); - - if (!prognose) { - // allocate memory for tab_index - MALLOC_IMATRIX(tab_index,1,TAB_RMAX,0,TAB_RMAX,ALL); - // fill tab_index - Rm2=TAB_RMAX*TAB_RMAX; - tab_index[1][0] = 0; - for (i=1; i<=TAB_RMAX; i++) { - Rm2x=Rm2-i*i; - ymax = MIN(i,(int)floor(sqrt(Rm2x))); - for (j=0; j<ymax; j++) { - tab_index[i][j+1] = tab_index[i][j] + MIN(j,(int)floor(sqrt(Rm2x-j*j)))+1; - } - if (i<TAB_RMAX) tab_index[i+1][0] = tab_index[i][ymax] - + MIN(ymax,(int)floor(sqrt(Rm2x-ymax*ymax)))+1; - } - } - // PRINTZ("P[5,3]=%d (should be 41)\n",tab_index[5][3]); + int i, j, ymax, Rm2, Rm2x; + + tab1=ReadTableFile(TAB_FNAME(1),1); + tab2=ReadTableFile(TAB_FNAME(2),6); + tab3=ReadTableFile(TAB_FNAME(3),3); + tab4=ReadTableFile(TAB_FNAME(4),18); + tab5=ReadTableFile(TAB_FNAME(5),6); + tab6=ReadTableFile(TAB_FNAME(6),36); + tab7=ReadTableFile(TAB_FNAME(7),1); + tab8=ReadTableFile(TAB_FNAME(8),6); + tab9=ReadTableFile(TAB_FNAME(9),1); + tab10=ReadTableFile(TAB_FNAME(10),6); + + if (!prognose) { + /* allocate memory for tab_index */ + MALLOC_IMATRIX(tab_index,1,TAB_RMAX,0,TAB_RMAX,ALL); + /* fill tab_index */ + Rm2=TAB_RMAX*TAB_RMAX; + tab_index[1][0] = 0; + for (i=1; i<=TAB_RMAX; i++) { + Rm2x=Rm2-i*i; + ymax = MIN(i,(int)floor(sqrt(Rm2x))); + for (j=0; j<ymax; j++) { + tab_index[i][j+1] = tab_index[i][j] + MIN(j,(int)floor(sqrt(Rm2x-j*j)))+1; + } + if (i<TAB_RMAX) tab_index[i+1][0] = tab_index[i][ymax] + + MIN(ymax,(int)floor(sqrt(Rm2x-ymax*ymax)))+1; + } + } + /* PRINTZ("P[5,3]=%d (should be 41)\n",tab_index[5][3]); */ } -//============================================================ +/*============================================================*/ static void FreeTables(void) { - Free_iMatrix(tab_index,1,TAB_RMAX,0); - Free_general(tab1); - Free_general(tab2); - Free_general(tab3); - Free_general(tab4); - Free_general(tab5); - Free_general(tab6); - Free_general(tab7); - Free_general(tab8); - Free_general(tab9); - Free_general(tab10); + Free_iMatrix(tab_index,1,TAB_RMAX,0); + Free_general(tab1); + Free_general(tab2); + Free_general(tab3); + Free_general(tab4); + Free_general(tab5); + Free_general(tab6); + Free_general(tab7); + Free_general(tab8); + Free_general(tab9); + Free_general(tab10); } -//============================================================ +/*============================================================*/ static void SaveMueller(double *muel) -// saves Mueller matrix (averaged) to file + /* saves mueller matrix (averaged) to file */ { - FILE *mueller; - char fname[MAX_FNAME]; - int i,j; - double theta; - TIME_TYPE tstart; - - tstart=GET_TIME(); - - strcpy(fname,directory); - strcat(fname,"/" F_MUEL); - mueller=FOpenErr(fname,"w",ONE_POS); - fprintf(mueller,"theta s11 s12 s13 s14 s21 s22 s23 s24 s31 s32 s33 s34 s41 s42 s43 s44\n"); - for (i=0;i<nTheta;i++) { - theta=i*dtheta_deg; - fprintf(mueller,"%.2f",theta); - for (j=0;j<16;j++) fprintf(mueller," %.10E",muel[16*i+j]); - fprintf(mueller,"\n"); - } - FCloseErr(mueller,F_MUEL,ONE_POS); - - Timing_FileIO += GET_TIME() - tstart; + FILE *mueller; + char fname[MAX_FNAME]; + int i,j; + double theta; + TIME_TYPE tstart; + + tstart=GET_TIME(); + + strcpy(fname,directory); + strcat(fname,"/" F_MUEL); + + mueller=FOpenErr(fname,"w",ONE_POS); + + fprintf(mueller,"theta s11 s12 s13 s14 s21 s22 s23 s24 s31 s32 s33 s34 s41 s42 s43 s44\n"); + + for (i=0;i<nTheta;i++) { + theta=i*dtheta_deg; + fprintf(mueller,"%.2f",theta); + for (j=0;j<16;j++) fprintf(mueller," %.10E",muel[16*i+j]); + fprintf(mueller,"\n"); + } + FCloseErr(mueller,F_MUEL,ONE_POS); + + Timing_FileIO += GET_TIME() - tstart; } -//============================================================== +/*==============================================================*/ static void SaveCS(const double Cext,const double Cabs) -// save calculated cross sections (averaged) to file + /* save calculated crossections (averaged) to file */ { - FILE *CCfile; - char fname[MAX_FNAME]; - TIME_TYPE tstart; + FILE *CCfile; + char fname[MAX_FNAME]; + TIME_TYPE tstart; - tstart=GET_TIME(); + tstart=GET_TIME(); - strcpy(fname,directory); - strcat(fname,"/" F_CS); + strcpy(fname,directory); + strcat(fname,"/" F_CS); - CCfile=FOpenErr(fname,"w",ONE_POS); + CCfile=FOpenErr(fname,"w",ONE_POS); - PrintBoth(CCfile,"Cext\t= %.10g\nQext\t= %.10g\n",Cext,Cext*inv_G); - PrintBoth(CCfile,"Cabs\t= %.10g\nQabs\t= %.10g\n",Cabs,Cabs*inv_G); + PrintBoth(CCfile,"Cext\t= %.10g\nQext\t= %.10g\n",Cext,Cext*inv_G); + PrintBoth(CCfile,"Cabs\t= %.10g\nQabs\t= %.10g\n",Cabs,Cabs*inv_G); - FCloseErr(CCfile,F_CS,ONE_POS); + FCloseErr(CCfile,F_CS,ONE_POS); - Timing_FileIO += GET_TIME() - tstart; + Timing_FileIO += GET_TIME() - tstart; } -//============================================================ +/*============================================================*/ static void calculate_one_orientation(double *res) -// performs calculation for one orientation; may do orientation averaging and put the result in res + /* performs calculation for one orientation; may do orientation averaging and put + the result in res */ { - TIME_TYPE tstart; - - if (orient_avg) { - alph_deg=0; - InitRotation(); - PRINTBOTHZ(logfile,"\nORIENTATION STEP beta=%g gamma=%g\n",bet_deg,gam_deg); - } - - // calculate scattered field for y - polarized incident light - PRINTZ("\nhere we go, calc Y\n\n"); - if (!orient_avg) FPRINTZ(logfile,"\nhere we go, calc Y\n\n"); - InitCC('Y'); - if (symR && !scat_grid) { - if (CalculateE('Y',CE_PARPER)==CHP_EXIT) return; - } - else { // no rotational symmetry - /* TODO: in case of scat_grid we run twice to get the full electric field with incoming - * light polarized in X and Y direction. In case of rotational symmetry this is not needed - * but requires lots more programming so we leave this optimization to a later time. - */ - if(CalculateE('Y',CE_NORMAL)==CHP_EXIT) return; - - PRINTZ("\nhere we go, calc X\n\n"); - if (!orient_avg) FPRINTZ(logfile,"\nhere we go, calc X\n\n"); - if(PolRelation==POL_LDR && !avg_inc_pol) InitCC('X'); - - if(CalculateE('X',CE_NORMAL)==CHP_EXIT) return; - } - D("CalculateE finished"); - MuellerMatrix(); - D("MuellerMatrix finished"); - if (ringid==ROOT && orient_avg) { - tstart=GET_TIME(); - printf("\nError of alpha integration (Mueller) is %g\n", - Romberg1D(parms_alpha,block_theta,muel_alpha,res+2)); - memcpy(res,muel_alpha-2,2*sizeof(double)); - D("Integration over alpha completed on ROOT"); - Timing_Integration += GET_TIME() - tstart; - } - TotalEval++; + TIME_TYPE tstart; + + if (orient_avg) { + alph_deg=0; + InitRotation(); + PRINTBOTHZ(logfile,"\nORIENTATION STEP beta=%g gamma=%g\n",bet_deg,gam_deg); + } + + /* calculate scattered field for y - polarized incident light */ + PRINTZ("\nhere we go, calc Y\n\n"); + if (!orient_avg) FPRINTZ(logfile,"\nhere we go, calc Y\n\n"); + InitCC('Y'); + if (symR && !scat_grid) { + if (CalculateE('Y',CE_PARPER)==CHP_EXIT) return; + } + else { /* no rotational symmetry */ + /* in case of scat_grid we run twice to get the full electric field */ + /* with incoming light in X and Y direction. In case of rotational */ + /* symmetry this is not needed but requires lots more programming */ + /* so we leave this optimization to a later time. */ + if(CalculateE('Y',CE_NORMAL)==CHP_EXIT) return; + + PRINTZ("\nhere we go, calc X\n\n"); + if (!orient_avg) FPRINTZ(logfile,"\nhere we go, calc X\n\n"); + if(PolRelation==POL_LDR && !avg_inc_pol) InitCC('X'); + + if(CalculateE('X',CE_NORMAL)==CHP_EXIT) return; + } + D("CalculateE finished"); + MuellerMatrix(); + D("MuellerMatrix finished"); + if (ringid==ROOT && orient_avg) { + tstart=GET_TIME(); + printf("\nError of alpha integration (Mueller) is %g\n", + Romberg1D(parms_alpha,block_theta,muel_alpha,res+2)); + memcpy(res,muel_alpha-2,2*sizeof(double)); + D("Integration over alpha completed on ROOT"); + Timing_Integration += GET_TIME() - tstart; + } + TotalEval++; } -//============================================================ +/*============================================================*/ static double orient_integrand(int beta_i,int gamma_i, double *res) -// function that provides interface with Romberg integration + /* function that provides interface with Romberg integration */ { - BcastOrient(&beta_i,&gamma_i,&finish_avg); - if (finish_avg) return 0; + BcastOrient(&beta_i,&gamma_i,&finish_avg); + if (finish_avg) return 0; - bet_deg=beta_int.val[beta_i]; - gam_deg=gamma_int.val[gamma_i]; - calculate_one_orientation(res); - return 0; + bet_deg=beta_int.val[beta_i]; + gam_deg=gamma_int.val[gamma_i]; + calculate_one_orientation(res); + return 0; } -//============================================================ +/*============================================================*/ static void AllocateEverything(void) -// allocates a lot of arrays and performs memory analysis + /* allocates a lot of arrays and performs memory analysis */ { - double tmp; - size_t temp_int; - double memmax; - - // allocate all the memory - tmp=sizeof(doublecomplex)*(double)nlocalRows; - if (!prognose) { - MALLOC_VECTOR(xvec,complex,nlocalRows,ALL); - MALLOC_VECTOR(rvec,complex,nlocalRows,ALL); - MALLOC_VECTOR(pvec,complex,nlocalRows,ALL); - MALLOC_VECTOR(Einc,complex,nlocalRows,ALL); - MALLOC_VECTOR(Avecbuffer,complex,nlocalRows,ALL); - } - memory+=5*tmp; - if (IterMethod==IT_BICGSTAB || IterMethod==IT_QMR_CS) { - // additional vectors for iterative methods - if (!prognose) { - MALLOC_VECTOR(vec1,complex,nlocalRows,ALL); - MALLOC_VECTOR(vec2,complex,nlocalRows,ALL); - MALLOC_VECTOR(vec3,complex,nlocalRows,ALL); - } - memory+=3*tmp; - } - MALLOC_VECTOR(expsX,complex,boxX,ALL); - MALLOC_VECTOR(expsY,complex,boxY,ALL); - MALLOC_VECTOR(expsZ,complex,local_Nz_unif,ALL); - if (yzplane) { - tmp=2*(double)nTheta; - if (!prognose) { - CheckOverflow(2*tmp,ONE_POS,"AllocateEverything()"); - temp_int=tmp; - MALLOC_VECTOR(EplaneX,complex,temp_int,ALL); - MALLOC_VECTOR(EplaneY,complex,temp_int,ALL); - } - memory+=2*tmp*sizeof(doublecomplex); + double tmp; + size_t temp_int; + double memmax; + + /* allocate all the memory */ + tmp=sizeof(doublecomplex)*(double)nlocalRows; + if (!prognose) { + MALLOC_VECTOR(xvec,complex,nlocalRows,ALL); + MALLOC_VECTOR(rvec,complex,nlocalRows,ALL); + MALLOC_VECTOR(pvec,complex,nlocalRows,ALL); + MALLOC_VECTOR(Einc,complex,nlocalRows,ALL); + MALLOC_VECTOR(Avecbuffer,complex,nlocalRows,ALL); + } + memory+=5*tmp; + if (IterMethod==IT_BICGSTAB || IterMethod==IT_QMR_CS) { + /* additional vectors for iterative methods */ + if (!prognose) { + MALLOC_VECTOR(vec1,complex,nlocalRows,ALL); + MALLOC_VECTOR(vec2,complex,nlocalRows,ALL); + MALLOC_VECTOR(vec3,complex,nlocalRows,ALL); + } + memory+=3*tmp; + } + MALLOC_VECTOR(expsX,complex,boxX,ALL); + MALLOC_VECTOR(expsY,complex,boxY,ALL); + MALLOC_VECTOR(expsZ,complex,local_Nz_unif,ALL); + if (yzplane) { + tmp=2*(double)nTheta; + if (!prognose) { + CheckOverflow(2*tmp,ONE_POS,"AllocateEverything()"); + temp_int=tmp; + MALLOC_VECTOR(EplaneX,complex,temp_int,ALL); + MALLOC_VECTOR(EplaneY,complex,temp_int,ALL); + } + memory+=2*tmp*sizeof(doublecomplex); #ifdef PARALLEL - if (ringid==ROOT) { // buffer for accumulate operation - if (!prognose) MALLOC_VECTOR(Eplane_buffer,double,2*temp_int,ONE); - memory+=2*tmp*sizeof(double); - } + if (ringid==ROOT) { /* buffer for accumulate operation */ + if (!prognose) MALLOC_VECTOR(Eplane_buffer,double,2*temp_int,ONE); + memory+=2*tmp*sizeof(double); + } #endif - } - if (all_dir) { - ReadAlldirParms(alldir_parms); - /* calculate size of vectors; 4 - because first it is used to store per and par components - * of the field, and only afterwards - squares. - */ - tmp=4*((double)theta_int.N)*phi_int.N; - if (!prognose) { - CheckOverflow(tmp,ONE_POS,"AllocateEverything()"); - temp_int=tmp; - MALLOC_VECTOR(E2_alldir,double,temp_int,ALL); - } - memory+=tmp*sizeof(double); + } + if (all_dir) { + ReadAlldirParms(alldir_parms); + /* calculate size of vectors; 4 - because first it is used to store + per and par components of the field, and only afterwards squares */ + tmp=4*((double)theta_int.N)*phi_int.N; + if (!prognose) { + CheckOverflow(tmp,ONE_POS,"AllocateEverything()"); + temp_int=tmp; + MALLOC_VECTOR(E2_alldir,double,temp_int,ALL); + } + memory+=tmp*sizeof(double); #ifdef PARALLEL - if (ringid==ROOT) { // buffer for accumulate operation - if (!prognose) MALLOC_VECTOR(E2_alldir_buffer,double,temp_int,ONE); - memory+=tmp*sizeof(double); - } + if (ringid==ROOT) { /* buffer for accumulate operation */ + if (!prognose) MALLOC_VECTOR(E2_alldir_buffer,double,temp_int,ONE); + memory+=tmp*sizeof(double); + } #endif - } - if (scat_grid) { - ReadScatGridParms(scat_grid_parms); - // calculate size of vectors - holds all per-par combinations - tmp=2*(double)angles.N; - if (!prognose) { - CheckOverflow(2*tmp,ONE_POS,"AllocateEverything()"); - temp_int=tmp; - MALLOC_VECTOR(EgridX,complex,temp_int,ALL); - MALLOC_VECTOR(EgridY,complex,temp_int,ALL); - } - memory+=2*tmp*sizeof(doublecomplex); + } + if (scat_grid) { + ReadScatGridParms(scat_grid_parms); + /* calculate size of vectors - holds all per-par combinations*/ + tmp=2*(double)angles.N; + if (!prognose) { + CheckOverflow(2*tmp,ONE_POS,"AllocateEverything()"); + temp_int=tmp; + MALLOC_VECTOR(EgridX,complex,temp_int,ALL); + MALLOC_VECTOR(EgridY,complex,temp_int,ALL); + } + memory+=2*tmp*sizeof(doublecomplex); #ifdef PARALLEL - if (ringid==ROOT) { // buffer for accumulate operation - if (!prognose) MALLOC_VECTOR(Egrid_buffer,double,2*temp_int,ONE); - memory+=2*tmp*sizeof(double); - } + if (ringid==ROOT) { /* buffer for accumulate operation */ + if (!prognose) MALLOC_VECTOR(Egrid_buffer,double,2*temp_int,ONE); + memory+=2*tmp*sizeof(double); + } #endif - if (phi_integr && ringid==ROOT) { - tmp=16*(double)angles.phi.N; - if (!prognose) { - CheckOverflow(tmp,ONE_POS,"AllocateEverything()"); - temp_int=tmp; - MALLOC_VECTOR(muel_phi,double,temp_int,ONE); - MALLOC_VECTOR(muel_phi_buf,double,temp_int,ONE); - } - memory+=2*tmp*sizeof(double); - } - } - if (orient_avg) { - tmp=2*((double)nTheta)*alpha_int.N; - if (!prognose) { - // this covers these 2 and next 2 malloc calls - CheckOverflow(8*tmp+2,ONE_POS,"AllocateEverything()"); - temp_int=tmp; - MALLOC_VECTOR(ampl_alphaX,complex,temp_int,ONE); - MALLOC_VECTOR(ampl_alphaY,complex,temp_int,ONE); - } - memory += 2*tmp*sizeof(doublecomplex); - if (ringid==ROOT) { - if (!prognose) { - MALLOC_VECTOR(muel_alpha,double,block_theta*alpha_int.N+2,ONE); - muel_alpha+=2; - MALLOC_VECTOR(out,double,block_theta+2,ONE); - } - memory += (8*tmp*(1+1.0/alpha_int.N)+4)*sizeof(double); - } - } - /* estimate of the memory (only the fastest scaling part): - * MatVec - (288+384nprocs/boxX [+192/nprocs])*Ndip - * more exactly: gridX*gridY*gridZ*(36+48nprocs/boxX [+24/nprocs]) value in [] is only - * for parallel mode - * others - nvoid_Ndip*271(+144 for BiCGStab and QMR_CS) - * PARALLEL: above is total; division over processors of MatVec is uniform, - * others - according to local_nvoid_Ndip - */ - memory/=MBYTE; - AccumulateMax(&memory,&memmax); - PRINTBOTHZ(logfile,"Total memory usage: %.1f Mb\n",memory); + if (phi_integr && ringid==ROOT) { + tmp=16*(double)angles.phi.N; + if (!prognose) { + CheckOverflow(tmp,ONE_POS,"AllocateEverything()"); + temp_int=tmp; + MALLOC_VECTOR(muel_phi,double,temp_int,ONE); + MALLOC_VECTOR(muel_phi1,double,temp_int,ONE); + } + memory+=2*tmp*sizeof(double); + } + } + if (orient_avg) { + tmp=2*((double)nTheta)*alpha_int.N; + if (!prognose) { + /* this covers these 2 and next 2 mallocs */ + CheckOverflow(8*tmp+2,ONE_POS,"AllocateEverything()"); + temp_int=tmp; + MALLOC_VECTOR(ampl_alphaX,complex,temp_int,ONE); + MALLOC_VECTOR(ampl_alphaY,complex,temp_int,ONE); + } + memory += 2*tmp*sizeof(doublecomplex); + if (ringid==ROOT) { + if (!prognose) { + MALLOC_VECTOR(muel_alpha,double,block_theta*alpha_int.N+2,ONE); + muel_alpha+=2; + MALLOC_VECTOR(out,double,block_theta+2,ONE); + } + memory += (8*tmp*(1+1.0/alpha_int.N)+4)*sizeof(double); + } + } + /* estimate of the memory (only the fastest scaling part): + MatVec - (288+384nprocs/boxX [+192/nprocs])*Ndip + more exactly: gridX*gridY*gridZ*(36+48nprocs/boxX [+24/nprocs]) + value in [] is only for parallel mode + others - nvoid_Ndip*271(+144 for BiCGStab and QMR_CS) + PARALLEL: above is total; division over processors of MatVec is uniform, + others - according to local_nvoid_Ndip */ + memory/=MBYTE; + AccumulateMax(&memory,&memmax); + PRINTBOTHZ(logfile,"Total memory usage: %.1f Mb\n",memory); #ifdef PARALLEL - PRINTBOTHZ(logfile,"Maximum memory usage of single processor: %.1f Mb\n",memmax); + PRINTBOTHZ(logfile,"Maximum memory usage of single processor: %.1f Mb\n",memmax); #endif } -//============================================================ +/*============================================================*/ static void FreeEverything(void) -/* frees all allocated vectors; should not be called in prognosis mode, since arrays are not - * actually allocated. - */ + /* frees all allocated vectors; should not be called in prognose mode, + since arrays are not actually allocated */ { - if (IntRelation == G_SO) FreeTables(); - Free_FFT_Dmat(); - Free_cVector(xvec); - Free_cVector(rvec); - Free_cVector(pvec); - Free_cVector(Einc); - Free_cVector(Avecbuffer); - if (IterMethod==IT_BICGSTAB || IterMethod==IT_QMR_CS) { - Free_cVector(vec1); - Free_cVector(vec2); - Free_cVector(vec3); - } - Free_cVector(expsX); - Free_cVector(expsY); - Free_cVector(expsZ); - if (yzplane) { - Free_cVector(EplaneX); - Free_cVector(EplaneY); + if (IntRelation == G_SO) FreeTables(); + Free_FFT_Dmat(); + Free_cVector(xvec); + Free_cVector(rvec); + Free_cVector(pvec); + Free_cVector(Einc); + Free_cVector(Avecbuffer); + if (IterMethod==IT_BICGSTAB || IterMethod==IT_QMR_CS) { + Free_cVector(vec1); + Free_cVector(vec2); + Free_cVector(vec3); + } + Free_cVector(expsX); + Free_cVector(expsY); + Free_cVector(expsZ); + if (yzplane) { + Free_cVector(EplaneX); + Free_cVector(EplaneY); #ifdef PARALLEL - Free_general(Eplane_buffer); + Free_general(Eplane_buffer); #endif - } - if (all_dir) { - Free_general(theta_int.val); - Free_general(phi_int.val); - Free_general(E2_alldir); + } + if (all_dir) { + Free_general(theta_int.val); + Free_general(phi_int.val); + Free_general(E2_alldir); #ifdef PARALLEL - Free_general(E2_alldir_buffer); + Free_general(E2_alldir_buffer); #endif - } - if (scat_grid) { - Free_general(angles.theta.val); - Free_general(angles.phi.val); - Free_cVector(EgridX); - Free_cVector(EgridY); - if (phi_integr && ringid==ROOT) { - Free_general(muel_phi); - Free_general(muel_phi_buf); - } + } + if (scat_grid) { + Free_general(angles.theta.val); + Free_general(angles.phi.val); + Free_cVector(EgridX); + Free_cVector(EgridY); + if (phi_integr && ringid==ROOT) { + Free_general(muel_phi); + Free_general(muel_phi1); + } #ifdef PARALLEL - Free_general(Egrid_buffer); + Free_general(Egrid_buffer); #endif - } - // these 3 were allocated in MakeParticle - Free_general(DipoleCoord); - Free_general(position); - Free_general(material); - - if (orient_avg) { - if (ringid==ROOT) { - Free_cVector(ampl_alphaX); - Free_cVector(ampl_alphaY); - Free_general(muel_alpha-2); - Free_general(out); - } - Free_general(alpha_int.val); - Free_general(beta_int.val); - Free_general(gamma_int.val); - } + } + /* these 3 were allocated in MakeParticle */ + Free_general(DipoleCoord); + Free_general(position); + Free_general(material); + + if (orient_avg) { + if (ringid==ROOT) { + Free_cVector(ampl_alphaX); + Free_cVector(ampl_alphaY); + Free_general(muel_alpha-2); + Free_general(out); + } + Free_general(alpha_int.val); + Free_general(beta_int.val); + Free_general(gamma_int.val); + } } -//============================================================ +/*============================================================*/ void Calculator (void) { - char fname[MAX_FNAME]; - - // initialize variables - dtheta_deg = 180.0 / ((double)(nTheta-1)); - dtheta_rad = Deg2Rad(dtheta_deg); - block_theta= 16*(size_t)nTheta; - // if not enough symmetry, calculate for +- theta (for one plane) - if (!(symY || orient_avg)) nTheta=2*(nTheta-1); - finish_avg=FALSE; - // read tables if needed - if (IntRelation == G_SO) ReadTables(); - // initialize D matrix (for matrix-vector multiplication) - D("InitDmatrix started"); - InitDmatrix(); - D("InitDmatrix finished"); - // allocate most (that is not already allocated; perform memory analysis - AllocateEverything(); - // finish initialization - if (!orient_avg) alpha_int.N=1; - Timing_Init = GET_TIME() - tstart_main; - // prognosis stops here - if (prognose) return; - // main calculation part - if (orient_avg) { - if (ringid==ROOT) { - sprintf(fname,"%s/" F_LOG_ORAVG,directory); - D("Romberg2D started on ROOT"); - Romberg2D(parms,orient_integrand,block_theta+2,out,fname); - D("Romberg2D finished on ROOT"); - finish_avg=TRUE; - /* first two are dummy variables; this call corresponds to similar call in - * orient_integrand by other processors; - * TODO: replace by a call without unnecessary overhead - */ - BcastOrient(&finish_avg,&finish_avg,&finish_avg); - SaveMueller(&out[2]); - SaveCS(out[0],out[1]); - } - else while (!finish_avg) orient_integrand(0,0,NULL); - } - else calculate_one_orientation(NULL); - // cleaning - FreeEverything(); + char fname[MAX_FNAME]; + + /* initialize variables */ + dtheta_deg = 180.0 / ((double)(nTheta-1)); + dtheta_rad = Deg2Rad(dtheta_deg); + block_theta= 16*(size_t)nTheta; + /* if not enough symmetry, calculate for +- theta (for one plane) */ + if (!(symY || orient_avg)) nTheta=2*(nTheta-1); + finish_avg=FALSE; + /* read tables if needed */ + if (IntRelation == G_SO) ReadTables(); + /* initialize D matrix (for matvec) */ + D("InitDmatrix started"); + InitDmatrix(); + D("InitDmatrix finished"); + /* allocate most (that is not already allocated; perform memory analysis */ + AllocateEverything(); + /* finish init */ + if (!orient_avg) alpha_int.N=1; + Timing_Init = GET_TIME() - tstart_main; + /* prognose stops here */ + if (prognose) return; + /* main calculation part */ + if (orient_avg) { + if (ringid==ROOT) { + sprintf(fname,"%s/" F_LOG_ORAVG,directory); + D("Romberg2D started on ROOT"); + Romberg2D(parms,orient_integrand,block_theta+2,out,fname); + D("Romberg2D finished on ROOT"); + finish_avg=TRUE; + /* first two are dummy variables; this call corresponds to similar call in orient_integrand + by other processors */ + BcastOrient(&finish_avg,&finish_avg,&finish_avg); + SaveMueller(&out[2]); + SaveCS(out[0],out[1]); + } + else while (!finish_avg) orient_integrand(0,0,NULL); + } + else calculate_one_orientation(NULL); + /* cleaning */ + FreeEverything(); } diff --git a/src/cmplx.h b/src/cmplx.h index 47dd7ec6..6c814393 100644 --- a/src/cmplx.h +++ b/src/cmplx.h @@ -8,7 +8,7 @@ * to be a principal limitation of C standard (some compilers may work, some produce * warnings) * A few changes for reliability and stability were made according to the ideas of section 5.5 - * of the Numerical Recipes, 3rd edition + * of the Numerical Recipes, 3rd ed. * * Copyright (C) 2006-2008 University of Amsterdam * This code is covered by the GNU General Public License. @@ -16,727 +16,728 @@ #ifndef __cmplx_h #define __cmplx_h -#include <string.h> // for memcpy -#include <math.h> // for cos, sin -#include "const.h" // for math constants -#include "types.h" // for doublecomplex -#include "function.h" // for INLINE +#include <string.h> /* for memcpy */ +#include <math.h> /* for cos, sin */ +#include "const.h" /* for math constants */ +#include "types.h" /* for doublecomplex */ +#include "function.h" /* for INLINE */ -//============================================================ -// operations on complex numbers +/*============================================================*/ +/* operations on complex numbers */ INLINE void cEqual(const doublecomplex a,doublecomplex b) -// performs b=a + /* performs b=a */ { - memcpy(b,a,sizeof(doublecomplex)); + memcpy(b,a,sizeof(doublecomplex)); } -//============================================================ +/*============================================================*/ INLINE double cAbs2(const doublecomplex a) -// square of absolute value of complex number; |a|^2 + /* square of absolute value of complex number; |a|^2 */ { - return (a[RE]*a[RE] + a[IM]*a[IM]); + return (a[RE]*a[RE] + a[IM]*a[IM]); } -//============================================================ +/*============================================================*/ INLINE double cAbs(const doublecomplex a) -// absolute value of complex number |a|, specially designed to avoid overflow + /* absolute value of complex number |a|, specially designed to avoid overflow */ { - double u,v,w; - u=fabs(a[RE]); - v=fabs(a[IM]); + double u,v,w; + u=fabs(a[RE]); + v=fabs(a[IM]); - if (u==0 && v==0) return 0; - else { - if (u>=v) { - w=v/u; - return (u*sqrt(1+w*w)); - } - else { - w=u/v; - return (v*sqrt(1+w*w)); - } - } + if (u==0 && v==0) return 0; + else { + if (u>=v) { + w=v/u; + return (u*sqrt(1+w*w)); + } + else { + w=u/v; + return (v*sqrt(1+w*w)); + } + } } -//============================================================ +/*============================================================*/ INLINE void cConj(const doublecomplex a,doublecomplex b) -// complex conjugate; b=a* + /* complex conjugate; b=a* */ { - b[RE] = a[RE]; - b[IM] = - a[IM]; + b[RE] = a[RE]; + b[IM] = - a[IM]; } -//============================================================ +/*============================================================*/ INLINE void cAdd(const doublecomplex a,const doublecomplex b,doublecomplex c) -// add two complex numbers; c=a+b + /* add two complex numbers; c=a+b */ { - c[RE] = a[RE] + b[RE]; - c[IM] = a[IM] + b[IM]; + c[RE] = a[RE] + b[RE]; + c[IM] = a[IM] + b[IM]; } -//============================================================ +/*============================================================*/ INLINE void cSubtr(const doublecomplex a,const doublecomplex b,doublecomplex c) -// subtract two complex numbers; c=a-b + /* subtract two complex numbers; c=a-b */ { - c[RE] = a[RE] - b[RE]; - c[IM] = a[IM] - b[IM]; + c[RE] = a[RE] - b[RE]; + c[IM] = a[IM] - b[IM]; } -//============================================================ +/*============================================================*/ INLINE void cSquare(const doublecomplex a,doublecomplex b) -// square of complex number; b=a^2 + /* square of complex number; b=a^2 */ { - b[RE]=a[RE]*a[RE] - a[IM]*a[IM]; - b[IM]=2*a[IM]*a[RE]; + b[RE]=a[RE]*a[RE] - a[IM]*a[IM]; + b[IM]=2*a[IM]*a[RE]; } -//============================================================ +/*============================================================*/ INLINE void cMultReal(const double a,const doublecomplex b,doublecomplex c) -// complex multiplication by real; c=ab + /* complex multiplication by real; c=ab */ { - c[RE]=a*b[RE]; - c[IM]=a*b[IM]; + c[RE]=a*b[RE]; + c[IM]=a*b[IM]; } -//============================================================ +/*============================================================*/ INLINE void cMult_i(doublecomplex c) -// complex multiplication by i; c=i*c + /* complex multiplication by i; c=i*c */ { - double tmp; - tmp=c[RE]; - c[RE]=-c[IM]; - c[IM]=tmp; + double tmp; + tmp=c[RE]; + c[RE]=-c[IM]; + c[IM]=tmp; } -//============================================================ +/*============================================================*/ INLINE void cMult_i2(doublecomplex a,doublecomplex b) -// complex multiplication by i; b=i*a; !!! b and c should be different !!! + /* complex multiplication by i; b=i*a + !!! b and c should be different !!! */ { - b[RE]=-a[IM]; - b[IM]=a[RE]; + b[RE]=-a[IM]; + b[IM]=a[RE]; } -//============================================================ +/*============================================================*/ INLINE void cMult(const doublecomplex a,const doublecomplex b,doublecomplex c) -// complex multiplication; c=ab; !!! c should be different from a and b !!! + /* complex multiplication; c=ab */ + /* !!! c should be different from a and b !!! */ { - c[RE]=a[RE]*b[RE] - a[IM]*b[IM]; - c[IM]=a[IM]*b[RE] + a[RE]*b[IM]; + c[RE]=a[RE]*b[RE] - a[IM]*b[IM]; + c[IM]=a[IM]*b[RE] + a[RE]*b[IM]; } -//============================================================ +/*============================================================*/ INLINE void cMultSelf(doublecomplex a,const doublecomplex b) -// complex multiplication; a*=b + /* complex multiplication; a*=b */ { - double tmp; + double tmp; - tmp=a[RE]; - a[RE]=a[RE]*b[RE] - a[IM]*b[IM]; - a[IM]=a[IM]*b[RE] + tmp*b[IM]; + tmp=a[RE]; + a[RE]=a[RE]*b[RE] - a[IM]*b[IM]; + a[IM]=a[IM]*b[RE] + tmp*b[IM]; } -//============================================================ +/*============================================================*/ INLINE double cMultConRe(const doublecomplex a,const doublecomplex b) -// complex multiplication; returns real(a*b_conjugated) + /* complex multiplication; returns real(a*b_conjugated) */ { - return (a[RE]*b[RE] + a[IM]*b[IM]); + return (a[RE]*b[RE] + a[IM]*b[IM]); } -//============================================================ +/*============================================================*/ INLINE double cMultConIm(const doublecomplex a,const doublecomplex b) -// complex multiplication; returns imag(a*b_conjugated) + /* complex multiplication; returns imag(a*b_conjugated) */ { - return (a[IM]*b[RE] - a[RE]*b[IM]); + return (a[IM]*b[RE] - a[RE]*b[IM]); } -//============================================================ +/*============================================================*/ INLINE void cLinComb(const doublecomplex a,const doublecomplex b, const double c1,const double c2,doublecomplex c) -// linear combination of two complex numbers; c=c1*a+c2*b + /* linear combination of two complex numbers; c=c1*a+c2*b */ { - c[RE]=c1*a[RE]+c2*b[RE]; - c[IM]=c1*a[IM]+c2*b[IM]; + c[RE]=c1*a[RE]+c2*b[RE]; + c[IM]=c1*a[IM]+c2*b[IM]; } -//============================================================ +/*============================================================*/ INLINE void cInvSign(doublecomplex a) -// change sign of complex number; a*=-1; + /* change sign of complex number; a*=-1; */ { - a[RE] = - a[RE]; - a[IM] = - a[IM]; + a[RE] = - a[RE]; + a[IM] = - a[IM]; } -//============================================================ +/*============================================================*/ INLINE void cInvSign2(const doublecomplex a,doublecomplex b) -// change sign of complex number and store to different address; b=-a; + /* change sign of complex number and store to different address; b=-a; */ { - b[RE] = - a[RE]; - b[IM] = - a[IM]; + b[RE] = - a[RE]; + b[IM] = - a[IM]; } -//============================================================ +/*============================================================*/ INLINE void cInv(const doublecomplex a,doublecomplex b) -// complex inversion; b=1/a; designed to avoid under and overflows + /* complex inversion; b=1/a; designed to avoid under and overflows */ { - double tmp; + double tmp; - if (fabs(a[RE])>=fabs(a[IM])) { - tmp=a[IM]/a[RE]; - b[RE]=1/(a[RE]+a[IM]*tmp); - b[IM]=-b[RE]*tmp; - } - else { - tmp=a[RE]/a[IM]; - b[IM]=-1/(a[RE]*tmp+a[IM]); - b[RE]=-b[IM]*tmp; - } + if (fabs(a[RE])>=fabs(a[IM])) { + tmp=a[IM]/a[RE]; + b[RE]=1/(a[RE]+a[IM]*tmp); + b[IM]=-b[RE]*tmp; + } + else { + tmp=a[RE]/a[IM]; + b[IM]=-1/(a[RE]*tmp+a[IM]); + b[RE]=-b[IM]*tmp; + } } -//============================================================ +/*============================================================*/ INLINE double cInvIm(const doublecomplex a) -// returns Im of inverse of a; designed to avoid under and overflows + /* returns Im of inverse of a; designed to avoid under and overflows */ { - double tmp; + double tmp; - if (fabs(a[RE])>=fabs(a[IM])) { - tmp=a[IM]/a[RE]; - return (-tmp/(a[RE]+a[IM]*tmp)); - } - else { - tmp=a[RE]/a[IM]; - return (-1/(a[RE]*tmp+a[IM])); - } + if (fabs(a[RE])>=fabs(a[IM])) { + tmp=a[IM]/a[RE]; + return (-tmp/(a[RE]+a[IM]*tmp)); + } + else { + tmp=a[RE]/a[IM]; + return (-1/(a[RE]*tmp+a[IM])); + } } -//============================================================ +/*============================================================*/ INLINE void cDiv(const doublecomplex a,const doublecomplex b,doublecomplex c) -/* complex division; c=a/b; designed to avoid under and overflows - * !!! c should be different from a !!! - */ + /* complex division; c=a/b; designed to avoid under and overflows */ + /* !!! c should be different from a !!! */ { - double u,v; + double u,v; - if (fabs(b[RE])>=fabs(b[IM])) { - u=b[IM]/b[RE]; - v=1/(b[RE]+b[IM]*u); - c[RE]=(a[RE]+a[IM]*u)*v; - c[IM]=(a[IM]-a[RE]*u)*v; - } - else { - u=b[RE]/b[IM]; - v=1/(b[RE]*u+b[IM]); - c[RE]=(a[RE]*u+a[IM])*v; - c[IM]=(a[IM]*u-a[RE])*v; - } + if (fabs(b[RE])>=fabs(b[IM])) { + u=b[IM]/b[RE]; + v=1/(b[RE]+b[IM]*u); + c[RE]=(a[RE]+a[IM]*u)*v; + c[IM]=(a[IM]-a[RE]*u)*v; + } + else { + u=b[RE]/b[IM]; + v=1/(a[RE]*u+a[IM]); + c[RE]=(a[RE]*u+a[IM])*v; + c[IM]=(a[IM]*u-a[RE])*v; + } } -//============================================================ +/*============================================================*/ INLINE void cDivSelf(doublecomplex a,const doublecomplex b) -// complex division; a/=b; designed to avoid under and overflows + /* complex division; a/=b; designed to avoid under and overflows */ { - double u,v,w; + double u,v,w; - w=a[RE]; - if (fabs(b[RE])>=fabs(b[IM])) { - u=b[IM]/b[RE]; - v=1/(b[RE]+b[IM]*u); - a[RE]=(w+a[IM]*u)*v; - a[IM]=(a[IM]-w*u)*v; - } - else { - u=b[RE]/b[IM]; - v=1/(b[RE]*u+b[IM]); - a[RE]=(w*u+a[IM])*v; - a[IM]=(a[IM]*u-w)*v; - } + w=a[RE]; + if (fabs(b[RE])>=fabs(b[IM])) { + u=b[IM]/b[RE]; + v=1/(b[RE]+b[IM]*u); + a[RE]=(w+a[IM]*u)*v; + a[IM]=(a[IM]-w*u)*v; + } + else { + u=b[RE]/b[IM]; + v=1/(a[RE]*u+a[IM]); + a[RE]=(w*u+a[IM])*v; + a[IM]=(a[IM]*u-w)*v; + } } -//============================================================ +/*============================================================*/ INLINE void cSqrt(const doublecomplex a,doublecomplex b) -/* complex square root; b=sqrt(a); designed to avoid under and overflows; - * branch cut discontinuity is (-inf,0) - b[RE]>=0 - */ -{ - double u,v,w,r; - - u=fabs(a[RE]); - v=fabs(a[IM]); - if (u==0 && v==0) b[RE]=b[IM]=0; - else { - // first determine w - if (u>=v) { - r=v/u; - w=sqrt(u)*sqrt((1+sqrt(1+r*r))/2); - } - else { - r=u/v; - w=sqrt(v)*sqrt((r+sqrt(1+r*r))/2); - } - // compute the result - if (a[RE]>=0) { - b[RE]=w; - b[IM]=a[IM]/(2*w); - } - else { - b[RE]=v/(2*w); - if (a[IM]>=0) b[IM]=w; - else b[IM]=-w; - } - } -} - -//============================================================ + /* complex square root; b=sqrt(a); designed to avoid under and overflows + branch cut discontinuity is (-inf,0) - b[RE]>=0 */ +{ + double u,v,w,r; + + u=fabs(a[RE]); + v=fabs(a[IM]); + if (u==0 && v==0) b[RE]=b[IM]=0; + else { + /* first determine w */ + if (u>=v) { + r=v/u; + w=sqrt(u)*sqrt((1+sqrt(1+r*r))/2); + } + else { + r=u/v; + w=sqrt(v)*sqrt((r+sqrt(1+r*r))/2); + } + /* compute the result */ + if (a[RE]>=0) { + b[RE]=w; + b[IM]=a[IM]/(2*w); + } + else { + b[RE]=v/(2*w); + if (a[IM]>=0) b[IM]=w; + else b[IM]=-w; + } + } +} + +/*============================================================*/ INLINE void imExp(const double arg,doublecomplex c) -// exponent of imaginary argument c=Exp(i*arg); optimization is performed by compiler + /* exponent of imaginary argument c=Exp(i*arg) + Optimization is performed by compiler */ { - c[RE]=cos(arg); - c[IM]=sin(arg); + c[RE]=cos(arg); + c[IM]=sin(arg); } -//============================================================ +/*============================================================*/ INLINE void imExp_arr(const double arg,const int size,doublecomplex *c) -/* construct an array of exponent of imaginary argument c=Exp(i*k*arg) - * where k=0,1,...,size-1. Uses stable recurrence from Numerical Recipes. - * Optimization of the initial simultaneous calculation of sin and cos is performed - * by compiler; It is assumed that size is at least 1 - */ -{ - int k; - double a,b; - - c[0][RE]=1; - c[0][IM]=0; - if (size>1) { - // set a=2*sin^2(arg/2), b=sin(arg) - a=sin(arg/2); - b=cos(arg/2); - b*=2*a; - a*=2*a; - // this a bit faster than in the main cycle - c[1][RE]=1-a; - c[1][IM]=b; - // main cycle - for (k=2;k<size;k++) { - /* potentially compiler may open brackets to accelerate calculation but lose significant - * digits. We hope it doesn't happen, but it should not be a big problem anyway - */ - c[k][RE]=c[k-1][RE]-(a*c[k-1][RE]+b*c[k-1][IM]); - c[k][IM]=c[k-1][IM]-(a*c[k-1][IM]-b*c[k-1][RE]); - } - } -} - -//============================================================ + /* construct an array of exponent of imaginary argument c=Exp(i*k*arg) + where k=0,1,...,size-1. Uses stable recurrence from Numerical Recipes. + Optimization of the initial simultaneous calculation of sin and cos is performed + by compiler; It is assumed that size is at least 1 */ +{ + int k; + double a,b; + + c[0][RE]=1; + c[0][IM]=0; + if (size>1) { + /* set a=2*sin^2(arg/2), b=sin(arg) */ + a=sin(arg/2); + b=cos(arg/2); + b*=2*a; + a*=2*a; + /* this a bit faster than in the main cycle */ + c[1][RE]=1-a; + c[1][IM]=b; + /* main cycle */ + for (k=2;k<size;k++) { + /* potentially compiler may open brackets to accelerate calculation but lose significant + digits. We hope it doesn't happen, but even if it does, it should not be a big problem */ + c[k][RE]=c[k-1][RE]-(a*c[k-1][RE]+b*c[k-1][IM]); + c[k][IM]=c[k-1][IM]-(a*c[k-1][IM]-b*c[k-1][RE]); + } + } +} + +/*============================================================*/ INLINE void cExp(const doublecomplex arg,doublecomplex c) -/* complex exponent of complex argument c=Exp(arg); optimization is performed by compiler; - * !!! c should be different from arg !!! - */ + /* complex exponent of complex argument c=Exp(arg) + Optimization is performed by compiler + !!! c should be different from arg !!! */ { - c[RE]=c[IM]=exp(arg[RE]); - c[RE]*=cos(arg[IM]); - c[IM]*=sin(arg[IM]); + c[RE]=c[IM]=exp(arg[RE]); + c[RE]*=cos(arg[IM]); + c[IM]*=sin(arg[IM]); } -//============================================================ +/*============================================================*/ INLINE void cExpSelf(doublecomplex arg) -/* complex exponent of complex argument arg=Exp(arg); result is stored in the argument itself - * Optimization is performed by compiler - */ + /* complex exponent of complex argument arg=Exp(arg); result stored in the argument itself + Optimization is performed by compiler */ { - double tmp; + double tmp; - tmp=arg[IM]; - arg[RE]=arg[IM]=exp(arg[RE]); - arg[RE]*=cos(tmp); - arg[IM]*=sin(tmp); + tmp=arg[IM]; + arg[RE]=arg[IM]=exp(arg[RE]); + arg[RE]*=cos(tmp); + arg[IM]*=sin(tmp); } -//============================================================ -// operations on complex vectors +/*============================================================*/ +/* operations on complex vectors */ INLINE void cvMultScal(const double a,doublecomplex *b,doublecomplex *c) -// multiplication of vector by real scalar; c=ab + /* multiplication of vector by real scalar; c=ab */ { - c[0][RE] = a*b[0][RE]; - c[0][IM] = a*b[0][IM]; - c[1][RE] = a*b[1][RE]; - c[1][IM] = a*b[1][IM]; - c[2][RE] = a*b[2][RE]; - c[2][IM] = a*b[2][IM]; + c[0][RE] = a*b[0][RE]; + c[0][IM] = a*b[0][IM]; + c[1][RE] = a*b[1][RE]; + c[1][IM] = a*b[1][IM]; + c[2][RE] = a*b[2][RE]; + c[2][IM] = a*b[2][IM]; } -//============================================================ +/*============================================================*/ INLINE void cScalMultRVec(const double *a,const doublecomplex b,doublecomplex *c) -// complex scalar- real vector[3] multiplication; c=b*a + /* complex scalar- real vector[3] multiplication; c=b*a */ { - c[0][RE] = b[RE]*a[0]; - c[0][IM] = b[IM]*a[0]; - c[1][RE] = b[RE]*a[1]; - c[1][IM] = b[IM]*a[1]; - c[2][RE] = b[RE]*a[2]; - c[2][IM] = b[IM]*a[2]; + c[0][RE] = b[RE]*a[0]; + c[0][IM] = b[IM]*a[0]; + c[1][RE] = b[RE]*a[1]; + c[1][IM] = b[IM]*a[1]; + c[2][RE] = b[RE]*a[2]; + c[2][IM] = b[IM]*a[2]; } -//============================================================ +/*============================================================*/ INLINE void cvMultScal_cmplx(const doublecomplex a,doublecomplex *b,doublecomplex *c) -// multiplication of vector[3] by complex scalar; c=ab + /* multiplication of vector[3] by complex scalar; c=ab */ { - c[0][RE] = a[RE]*b[0][RE] - a[IM]*b[0][IM]; - c[0][IM] = a[RE]*b[0][IM] + a[IM]*b[0][RE]; - c[1][RE] = a[RE]*b[1][RE] - a[IM]*b[1][IM]; - c[1][IM] = a[RE]*b[1][IM] + a[IM]*b[1][RE]; - c[2][RE] = a[RE]*b[2][RE] - a[IM]*b[2][IM]; - c[2][IM] = a[RE]*b[2][IM] + a[IM]*b[2][RE]; -} + c[0][RE] = a[RE]*b[0][RE] - a[IM]*b[0][IM]; + c[0][IM] = a[RE]*b[0][IM] + a[IM]*b[0][RE]; + c[1][RE] = a[RE]*b[1][RE] - a[IM]*b[1][IM]; + c[1][IM] = a[RE]*b[1][IM] + a[IM]*b[1][RE]; + c[2][RE] = a[RE]*b[2][RE] - a[IM]*b[2][IM]; + c[2][IM] = a[RE]*b[2][IM] + a[IM]*b[2][RE]; +} -//============================================================ +/*============================================================*/ INLINE double cvNorm2(doublecomplex *a) -// square of the norm of a complex vector[3] + /* square of the norm of a complex vector[3] */ { - return ( a[0][RE]*a[0][RE] + a[0][IM]*a[0][IM] - + a[1][RE]*a[1][RE] + a[1][IM]*a[1][IM] - + a[2][RE]*a[2][RE] + a[2][IM]*a[2][IM] ); + return (a[0][RE]*a[0][RE] + a[0][IM]*a[0][IM] + + a[1][RE]*a[1][RE] + a[1][IM]*a[1][IM] + + a[2][RE]*a[2][RE] + a[2][IM]*a[2][IM]); } -//============================================================ +/*============================================================*/ INLINE void cDotProd(doublecomplex *a,doublecomplex *b,doublecomplex c) -// conjugate dot product of two complex vector[3]; c=a.b = a[0]*b*[0]+...+a[2]*b*[2]*/ + /* conjugate dot product of two complex vector[3]; c=a.b = a[0]*b*[0]+...+a[2]*b*[2]*/ { - c[RE] = a[0][RE]*b[0][RE] + a[0][IM]*b[0][IM] - + a[1][RE]*b[1][RE] + a[1][IM]*b[1][IM] - + a[2][RE]*b[2][RE] + a[2][IM]*b[2][IM]; - c[IM] = a[0][IM]*b[0][RE] - a[0][RE]*b[0][IM] - + a[1][IM]*b[1][RE] - a[1][RE]*b[1][IM] - + a[2][IM]*b[2][RE] - a[2][RE]*b[2][IM]; + c[RE] = a[0][RE]*b[0][RE] + a[0][IM]*b[0][IM] + + a[1][RE]*b[1][RE] + a[1][IM]*b[1][IM] + + a[2][RE]*b[2][RE] + a[2][IM]*b[2][IM]; + c[IM] = a[0][IM]*b[0][RE] - a[0][RE]*b[0][IM] + + a[1][IM]*b[1][RE] - a[1][RE]*b[1][IM] + + a[2][IM]*b[2][RE] - a[2][RE]*b[2][IM]; } -//============================================================ +/*============================================================*/ INLINE double cDotProd_Re(doublecomplex *a,doublecomplex *b) -// real part of dot product of two complex vector[3]; c=Re(a.b) + /* real part of dot product of two complex vector[3]; c=Re(a.b) */ { - return ( a[0][RE]*b[0][RE] + a[0][IM]*b[0][IM] - + a[1][RE]*b[1][RE] + a[1][IM]*b[1][IM] - + a[2][RE]*b[2][RE] + a[2][IM]*b[2][IM] ); + return (a[0][RE]*b[0][RE] + a[0][IM]*b[0][IM] + + a[1][RE]*b[1][RE] + a[1][IM]*b[1][IM] + + a[2][RE]*b[2][RE] + a[2][IM]*b[2][IM]); } -//============================================================ +/*============================================================*/ INLINE double cDotProd_Im(doublecomplex *a,doublecomplex *b) -// imaginary part of dot product of two complex vector[3]; c=Im(a.b) + /* imaginary part of dot product of two complex vector[3]; c=Im(a.b) */ { - return ( a[0][IM]*b[0][RE] - a[0][RE]*b[0][IM] - + a[1][IM]*b[1][RE] - a[1][RE]*b[1][IM] - + a[2][IM]*b[2][RE] - a[2][RE]*b[2][IM] ); + return (a[0][IM]*b[0][RE] - a[0][RE]*b[0][IM] + + a[1][IM]*b[1][RE] - a[1][RE]*b[1][IM] + + a[2][IM]*b[2][RE] - a[2][RE]*b[2][IM]); } -//============================================================ +/*============================================================*/ INLINE void cDotProd_conj(doublecomplex *a,doublecomplex *b,doublecomplex c) -// dot product of two complex vector[3]; c=a.b* = a[0]*b[0]+...+a[2]*b[2] + /* dot product of two complex vector[3]; c=a.b* = a[0]*b[0]+...+a[2]*b[2] */ { - c[RE] = a[0][RE]*b[0][RE] - a[0][IM]*b[0][IM] - + a[1][RE]*b[1][RE] - a[1][IM]*b[1][IM] - + a[2][RE]*b[2][RE] - a[2][IM]*b[2][IM]; - c[IM] = a[0][IM]*b[0][RE] + a[0][RE]*b[0][IM] - + a[1][IM]*b[1][RE] + a[1][RE]*b[1][IM] - + a[2][IM]*b[2][RE] + a[2][RE]*b[2][IM]; + c[RE] = a[0][RE]*b[0][RE] - a[0][IM]*b[0][IM] + + a[1][RE]*b[1][RE] - a[1][IM]*b[1][IM] + + a[2][RE]*b[2][RE] - a[2][IM]*b[2][IM]; + c[IM] = a[0][IM]*b[0][RE] + a[0][RE]*b[0][IM] + + a[1][IM]*b[1][RE] + a[1][RE]*b[1][IM] + + a[2][IM]*b[2][RE] + a[2][RE]*b[2][IM]; } -//============================================================ +/*============================================================*/ INLINE double cDotProd_conj_Re(doublecomplex *a,doublecomplex *b) -// real part of dot product of two complex vector[3]; c=Re(a.b*) + /* real part of dot product of two complex vector[3]; c=Re(a.b*) */ { - return ( a[0][RE]*b[0][RE] - a[0][IM]*b[0][IM] - + a[1][RE]*b[1][RE] - a[1][IM]*b[1][IM] - + a[2][RE]*b[2][RE] - a[2][IM]*b[2][IM] ); + return (a[0][RE]*b[0][RE] - a[0][IM]*b[0][IM] + + a[1][RE]*b[1][RE] - a[1][IM]*b[1][IM] + + a[2][RE]*b[2][RE] - a[2][IM]*b[2][IM]); } -//============================================================ +/*============================================================*/ INLINE double cDotProd_conj_Im(doublecomplex *a,doublecomplex *b) -// imaginary part of dot product of two complex vector[3]; c=Im(a.b*) + /* imaginary part of dot product of two complex vector[3]; c=Im(a.b*) */ { - return ( a[0][IM]*b[0][RE] + a[0][RE]*b[0][IM] - + a[1][IM]*b[1][RE] + a[1][RE]*b[1][IM] - + a[2][IM]*b[2][RE] + a[2][RE]*b[2][IM] ); + return (a[0][IM]*b[0][RE] + a[0][RE]*b[0][IM] + + a[1][IM]*b[1][RE] + a[1][RE]*b[1][IM] + + a[2][IM]*b[2][RE] + a[2][RE]*b[2][IM]); } -//============================================================ +/*============================================================*/ INLINE void cvAdd(doublecomplex *a,doublecomplex *b,doublecomplex *c) -// add two complex vector[3]; c=a+b + /* add two complex vector[3]; c=a+b */ { - c[0][RE] = a[0][RE] + b[0][RE]; - c[0][IM] = a[0][IM] + b[0][IM]; - c[1][RE] = a[1][RE] + b[1][RE]; - c[1][IM] = a[1][IM] + b[1][IM]; - c[2][RE] = a[2][RE] + b[2][RE]; - c[2][IM] = a[2][IM] + b[2][IM]; + c[0][RE] = a[0][RE] + b[0][RE]; + c[0][IM] = a[0][IM] + b[0][IM]; + c[1][RE] = a[1][RE] + b[1][RE]; + c[1][IM] = a[1][IM] + b[1][IM]; + c[2][RE] = a[2][RE] + b[2][RE]; + c[2][IM] = a[2][IM] + b[2][IM]; } -//============================================================ +/*============================================================*/ INLINE void cvSubtr(doublecomplex *a,doublecomplex *b,doublecomplex *c) -// subtraction of two complex vector[3]; c=a-b + /* subtraction of two complex vector[3]; c=a-b */ { - c[0][RE] = a[0][RE] - b[0][RE]; - c[0][IM] = a[0][IM] - b[0][IM]; - c[1][RE] = a[1][RE] - b[1][RE]; - c[1][IM] = a[1][IM] - b[1][IM]; - c[2][RE] = a[2][RE] - b[2][RE]; - c[2][IM] = a[2][IM] - b[2][IM]; + c[0][RE] = a[0][RE] - b[0][RE]; + c[0][IM] = a[0][IM] - b[0][IM]; + c[1][RE] = a[1][RE] - b[1][RE]; + c[1][IM] = a[1][IM] - b[1][IM]; + c[2][RE] = a[2][RE] - b[2][RE]; + c[2][IM] = a[2][IM] - b[2][IM]; } -//============================================================ +/*============================================================*/ INLINE void crDotProd(doublecomplex *a,const double *b,doublecomplex c) -// dot product of complex and real vectors[3]; c=a.b + /* dot product of complex and real vectors[3]; c=a.b */ { - c[RE] = a[0][RE]*b[0] + a[1][RE]*b[1] + a[2][RE]*b[2]; - c[IM] = a[0][IM]*b[0] + a[1][IM]*b[1] + a[2][IM]*b[2]; + c[RE] = a[0][RE]*b[0] + a[1][RE]*b[1] + a[2][RE]*b[2]; + c[IM] = a[0][IM]*b[0] + a[1][IM]*b[1] + a[2][IM]*b[2]; } -//============================================================ +/*============================================================*/ INLINE double crDotProd_Re(doublecomplex *a,const double *b) -// real part of dot product of complex and real vectors[3]; c=Re(a.b) + /* real part of dot product of complex and real vectors[3]; c=Re(a.b) */ { - return (a[0][RE]*b[0] + a[1][RE]*b[1] + a[2][RE]*b[2]); + return (a[0][RE]*b[0] + a[1][RE]*b[1] + a[2][RE]*b[2]); } -//============================================================ +/*============================================================*/ INLINE double crDotProd_Im(doublecomplex *a,const double *b) -// imaginary part of dot product of complex and real vectors[3]; c=Im(a.b) + /* imaginary part of dot product of complex and real vectors[3]; c=Im(a.b) */ { - return (a[0][IM]*b[0] + a[1][IM]*b[1] + a[2][IM]*b[2]); + return (a[0][IM]*b[0] + a[1][IM]*b[1] + a[2][IM]*b[2]); } -//============================================================ +/*============================================================*/ INLINE void cvIncremScaled_cmplx(doublecomplex *a,const doublecomplex b,doublecomplex *c) -// increment of complex vectors[3] by complex-scaled other vector; c+=b*a + /* increment of complex vectors[3] by complex-scaled other vector; c+=b*a */ { - c[0][RE] += b[RE]*a[0][RE] - b[IM]*a[0][IM]; - c[0][IM] += b[RE]*a[0][IM] + b[IM]*a[0][RE]; - c[1][RE] += b[RE]*a[1][RE] - b[IM]*a[1][IM]; - c[1][IM] += b[RE]*a[1][IM] + b[IM]*a[1][RE]; - c[2][RE] += b[RE]*a[2][RE] - b[IM]*a[2][IM]; - c[2][IM] += b[RE]*a[2][IM] + b[IM]*a[2][RE]; + c[0][RE] += b[RE]*a[0][RE] - b[IM]*a[0][IM]; + c[0][IM] += b[RE]*a[0][IM] + b[IM]*a[0][RE]; + c[1][RE] += b[RE]*a[1][RE] - b[IM]*a[1][IM]; + c[1][IM] += b[RE]*a[1][IM] + b[IM]*a[1][RE]; + c[2][RE] += b[RE]*a[2][RE] - b[IM]*a[2][IM]; + c[2][IM] += b[RE]*a[2][IM] + b[IM]*a[2][RE]; } -//============================================================ +/*============================================================*/ INLINE void cvMultAdd(doublecomplex *a,const doublecomplex b,doublecomplex *c) -// multiply complex vectors[3] with complex constant and add another vector; c=b*c+a + /* multiply complex vectors[3] with complex constant and add another vector; + c=b*c+a */ { - double tmp; - tmp=c[0][RE]; - c[0][RE] = c[0][RE]*b[RE] - c[0][IM]*b[IM] + a[0][RE]; - c[0][IM] = tmp*b[IM] + c[0][IM]*b[RE] + a[0][IM]; - tmp=c[1][RE]; - c[1][RE] = c[1][RE]*b[RE] - c[1][IM]*b[IM] + a[1][RE]; - c[1][IM] = tmp*b[IM] + c[1][IM]*b[RE] + a[1][IM]; - tmp=c[2][RE]; - c[2][RE] = c[2][RE]*b[RE] - c[2][IM]*b[IM] + a[2][RE]; - c[2][IM] = tmp*b[IM] + c[2][IM]*b[RE] + a[2][IM]; + double tmp; + tmp=c[0][RE]; + c[0][RE] = c[0][RE]*b[RE] - c[0][IM]*b[IM] + a[0][RE]; + c[0][IM] = tmp*b[IM] + c[0][IM]*b[RE] + a[0][IM]; + tmp=c[1][RE]; + c[1][RE] = c[1][RE]*b[RE] - c[1][IM]*b[IM] + a[1][RE]; + c[1][IM] = tmp*b[IM] + c[1][IM]*b[RE] + a[1][IM]; + tmp=c[2][RE]; + c[2][RE] = c[2][RE]*b[RE] - c[2][IM]*b[IM] + a[2][RE]; + c[2][IM] = tmp*b[IM] + c[2][IM]*b[RE] + a[2][IM]; } -//============================================================ +/*============================================================*/ INLINE void cvLinComb1(doublecomplex *a,doublecomplex *b, const double c1,doublecomplex *c) -// linear combination of complex vectors[3]; second coefficient is unity; c=c1*a+b + /* linear combination of complex vectors[3]; second coef is unity; c=c1*a+b */ { - c[0][RE] = c1*a[0][RE] + b[0][RE]; - c[0][IM] = c1*a[0][IM] + b[0][IM]; - c[1][RE] = c1*a[1][RE] + b[1][RE]; - c[1][IM] = c1*a[1][IM] + b[1][IM]; - c[2][RE] = c1*a[2][RE] + b[2][RE]; - c[2][IM] = c1*a[2][IM] + b[2][IM]; + c[0][RE] = c1*a[0][RE] + b[0][RE]; + c[0][IM] = c1*a[0][IM] + b[0][IM]; + c[1][RE] = c1*a[1][RE] + b[1][RE]; + c[1][IM] = c1*a[1][IM] + b[1][IM]; + c[2][RE] = c1*a[2][RE] + b[2][RE]; + c[2][IM] = c1*a[2][IM] + b[2][IM]; } -//============================================================ +/*============================================================*/ INLINE void cvLinComb1_cmplx(doublecomplex *a,doublecomplex *b, const doublecomplex c1,doublecomplex *c) -/* linear combination of complex vectors[3] with complex coefficients; - * second coefficient is unity; c=c1*a+b !!! c!=a - */ + /* linear combination of complex vectors[3] with complex coefficients; + second coef is unity; c=c1*a+b !!! c!=a */ { - c[0][RE] = a[0][RE]*c1[RE] - a[0][IM]*c1[IM] + b[0][RE]; - c[0][IM] = a[0][RE]*c1[IM] + a[0][IM]*c1[RE] + b[0][IM]; - c[1][RE] = a[1][RE]*c1[RE] - a[1][IM]*c1[IM] + b[1][RE]; - c[1][IM] = a[1][RE]*c1[IM] + a[1][IM]*c1[RE] + b[1][IM]; - c[2][RE] = a[2][RE]*c1[RE] - a[2][IM]*c1[IM] + b[2][RE]; - c[2][IM] = a[2][RE]*c1[IM] + a[2][IM]*c1[RE] + b[2][IM]; + c[0][RE] = a[0][RE]*c1[RE] - a[0][IM]*c1[IM] + b[0][RE]; + c[0][IM] = a[0][RE]*c1[IM] + a[0][IM]*c1[RE] + b[0][IM]; + c[1][RE] = a[1][RE]*c1[RE] - a[1][IM]*c1[IM] + b[1][RE]; + c[1][IM] = a[1][RE]*c1[IM] + a[1][IM]*c1[RE] + b[1][IM]; + c[2][RE] = a[2][RE]*c1[RE] - a[2][IM]*c1[IM] + b[2][RE]; + c[2][IM] = a[2][RE]*c1[IM] + a[2][IM]*c1[RE] + b[2][IM]; } -//============================================================ +/*============================================================*/ INLINE void cSymMatrVec(doublecomplex *matr,doublecomplex *vec,doublecomplex *res) -// multiplication of complex symmetric matrix[6] by complex vec[3]; res=matr.vec -{ - res[0][RE] = matr[0][RE]*vec[0][RE] - matr[0][IM]*vec[0][IM] - + matr[1][RE]*vec[1][RE] - matr[1][IM]*vec[1][IM] - + matr[2][RE]*vec[2][RE] - matr[2][IM]*vec[2][IM]; - res[0][IM] = matr[0][RE]*vec[0][IM] + matr[0][IM]*vec[0][RE] - + matr[1][RE]*vec[1][IM] + matr[1][IM]*vec[1][RE] - + matr[2][RE]*vec[2][IM] + matr[2][IM]*vec[2][RE]; - - res[1][RE] = matr[1][RE]*vec[0][RE] - matr[1][IM]*vec[0][IM] - + matr[3][RE]*vec[1][RE] - matr[3][IM]*vec[1][IM] - + matr[4][RE]*vec[2][RE] - matr[4][IM]*vec[2][IM]; - res[1][IM] = matr[1][RE]*vec[0][IM] + matr[1][IM]*vec[0][RE] - + matr[3][RE]*vec[1][IM] + matr[3][IM]*vec[1][RE] - + matr[4][RE]*vec[2][IM] + matr[4][IM]*vec[2][RE]; - - res[2][RE] = matr[2][RE]*vec[0][RE] - matr[2][IM]*vec[0][IM] - + matr[4][RE]*vec[1][RE] - matr[4][IM]*vec[1][IM] - + matr[5][RE]*vec[2][RE] - matr[5][IM]*vec[2][IM]; - res[2][IM] = matr[2][RE]*vec[0][IM] + matr[2][IM]*vec[0][RE] - + matr[4][RE]*vec[1][IM] + matr[4][IM]*vec[1][RE] - + matr[5][RE]*vec[2][IM] + matr[5][IM]*vec[2][RE]; -} - -//============================================================ -// operations on real vectors + /* multiplication of complex symmetric matrix[6] by complex vec[3] + res=matr.vec */ +{ + res[0][RE] = matr[0][RE] * vec[0][RE] - matr[0][IM] * vec[0][IM] + + matr[1][RE] * vec[1][RE] - matr[1][IM] * vec[1][IM] + + matr[2][RE] * vec[2][RE] - matr[2][IM] * vec[2][IM]; + res[0][IM] = matr[0][RE] * vec[0][IM] + matr[0][IM] * vec[0][RE] + + matr[1][RE] * vec[1][IM] + matr[1][IM] * vec[1][RE] + + matr[2][RE] * vec[2][IM] + matr[2][IM] * vec[2][RE]; + + res[1][RE] = matr[1][RE] * vec[0][RE] - matr[1][IM] * vec[0][IM] + + matr[3][RE] * vec[1][RE] - matr[3][IM] * vec[1][IM] + + matr[4][RE] * vec[2][RE] - matr[4][IM] * vec[2][IM]; + res[1][IM] = matr[1][RE] * vec[0][IM] + matr[1][IM] * vec[0][RE] + + matr[3][RE] * vec[1][IM] + matr[3][IM] * vec[1][RE] + + matr[4][RE] * vec[2][IM] + matr[4][IM] * vec[2][RE]; + + res[2][RE] = matr[2][RE] * vec[0][RE] - matr[2][IM] * vec[0][IM] + + matr[4][RE] * vec[1][RE] - matr[4][IM] * vec[1][IM] + + matr[5][RE] * vec[2][RE] - matr[5][IM] * vec[2][IM]; + res[2][IM] = matr[2][RE] * vec[0][IM] + matr[2][IM] * vec[0][RE] + + matr[4][RE] * vec[1][IM] + matr[4][IM] * vec[1][RE] + + matr[5][RE] * vec[2][IM] + matr[5][IM] * vec[2][RE]; +} + +/*============================================================*/ +/* operations on real vectors */ INLINE void MultScal(const double a,const double *b,double *c) -// multiplication of vector by scalar; c=a*b + /* multiplication of vector by scalar; c=a*b */ { - c[0]=a*b[0]; - c[1]=a*b[1]; - c[2]=a*b[2]; + c[0]=a*b[0]; + c[1]=a*b[1]; + c[2]=a*b[2]; } -//============================================================ +/*============================================================*/ INLINE void vMult(const double *a,const double *b,double *c) -// multiplication of two vectors (by elements); c[i]=a[i]*b[i] + /* multiplication of two vectors (by elements); c[i]=a[i]*b[i] */ { - c[0]=a[0]*b[0]; - c[1]=a[1]*b[1]; - c[2]=a[2]*b[2]; + c[0]=a[0]*b[0]; + c[1]=a[1]*b[1]; + c[2]=a[2]*b[2]; } -//============================================================ +/*============================================================*/ INLINE double DotProd(const double *a,const double *b) -// dot product of two real vectors[3] + /* dot product of two real vectors[3] */ { - return (a[0]*b[0]+a[1]*b[1]+a[2]*b[2]); + return (a[0]*b[0]+a[1]*b[1]+a[2]*b[2]); } -//============================================================ +/*============================================================*/ INLINE void LinComb(const double *a,const double *b, - const double c1,const double c2, double *c) -// linear combination of real vectors[3]; c=c1*a+c2*b + const double c1,const double c2, double *c) + /* linear combination of real vectors[3]; c=c1*a+c2*b */ { - c[0]=c1*a[0]+c2*b[0]; - c[1]=c1*a[1]+c2*b[1]; - c[2]=c1*a[2]+c2*b[2]; + c[0]=c1*a[0]+c2*b[0]; + c[1]=c1*a[1]+c2*b[1]; + c[2]=c1*a[2]+c2*b[2]; } -//============================================================ +/*============================================================*/ INLINE double TrSym(const double *a) -// trace of a symmetric matrix stored as a vector of size 6 + /* trace of a symmetric matrix stored as a vector of size 6 */ { - return (a[0]+a[2]+a[5]); + return (a[0]+a[2]+a[5]); } -//============================================================ +/*============================================================*/ INLINE double QuadForm(const double *matr,const double *vec) -// value of a quadratic form matr (symmetric matrix stored as a vector of size 6) over a vector vec + /* value of a quadratic form matr (symmetric matrix stored as + a vector of size 6) over a vector vec */ { - return ( vec[0]*vec[0]*matr[0] + vec[1]*vec[1]*matr[2] + vec[2]*vec[2]*matr[5] - + 2*(vec[0]*vec[1]*matr[1] + vec[0]*vec[2]*matr[3] + vec[1]*vec[2]*matr[4]) ); + return (vec[0]*vec[0]*matr[0]+vec[1]*vec[1]*matr[2]+vec[2]*vec[2]*matr[5]+ + 2*(vec[0]*vec[1]*matr[1]+vec[0]*vec[2]*matr[3]+vec[1]*vec[2]*matr[4])); } -//============================================================ +/*============================================================*/ INLINE void MatrVec(double matr[3][3],const double *vec, double *res) -// multiplication of matrix[3][3] by vec[3] (all real); res=matr.vec + /* multiplication of matrix[3][3] by vec[3] (all real) + res=matr.vec */ { - res[0]=matr[0][0]*vec[0]+matr[0][1]*vec[1]+matr[0][2]*vec[2]; - res[1]=matr[1][0]*vec[0]+matr[1][1]*vec[1]+matr[1][2]*vec[2]; - res[2]=matr[2][0]*vec[0]+matr[2][1]*vec[1]+matr[2][2]*vec[2]; + res[0]=matr[0][0]*vec[0]+matr[0][1]*vec[1]+matr[0][2]*vec[2]; + res[1]=matr[1][0]*vec[0]+matr[1][1]*vec[1]+matr[1][2]*vec[2]; + res[2]=matr[2][0]*vec[0]+matr[2][1]*vec[1]+matr[2][2]*vec[2]; } -//============================================================ +/*============================================================*/ INLINE void Permutate(double *vec,const int *ord) -// permutate double vector vec using permutation ord + /* permutate double vector vec using permutation ord */ { - double buf[3]; + double buf[3]; - memcpy(buf,vec,3*sizeof(double)); - vec[0]=buf[ord[0]]; - vec[1]=buf[ord[1]]; - vec[2]=buf[ord[2]]; + memcpy(buf,vec,3*sizeof(double)); + vec[0]=buf[ord[0]]; + vec[1]=buf[ord[1]]; + vec[2]=buf[ord[2]]; } -//============================================================ +/*============================================================*/ INLINE void Permutate_i(int *vec,const int *ord) -// permutate int vector vec using permutation ord + /* permutate int vector vec using permutation ord */ { - int buf[3]; + int buf[3]; - memcpy(buf,vec,3*sizeof(int)); - vec[0]=buf[ord[0]]; - vec[1]=buf[ord[1]]; - vec[2]=buf[ord[2]]; + memcpy(buf,vec,3*sizeof(int)); + vec[0]=buf[ord[0]]; + vec[1]=buf[ord[1]]; + vec[2]=buf[ord[2]]; } -//============================================================ -// Auxiliary functions +/*=====================================================================*/ +/* Auxillary functions */ INLINE double Deg2Rad(const double deg) -// transforms angle in degrees to radians + /* transforms angle in degrees to radians */ { - return (PI_OVER_180*deg); + return (PI_OVER_180*deg); } -//============================================================ +/*=====================================================================*/ INLINE double Rad2Deg(const double rad) -// transforms angle in radians to degrees + /* transforms angle in radians to degrees */ { - return (INV_PI_180*rad); + return (INV_PI_180*rad); } -#endif // __cmplx_h +#endif /*__cmplx_h*/ diff --git a/src/comm.c b/src/comm.c index cd24b65d..8cc04000 100644 --- a/src/comm.c +++ b/src/comm.c @@ -1,8 +1,8 @@ /* FILE : comm.c * AUTH : Maxim Yurkin * DESCR: The main intention of this library is to incorporate all - * parallelization related code, so most of it is directly - * involved in or closely related to interprocess communication, + * parallelisation related code, so most of it is directly + * involved in or closely related to inter-process communication, * hence its name. * * Previous versions were by Martijn Frijlink @@ -23,647 +23,633 @@ #include "function.h" #include "parbas.h" -// for getch in Stop +/* for getch in Stop */ #ifdef RUN_BCB # include <conio.h> #endif #ifdef MPI -MPI_Datatype mpi_dcomplex; + MPI_Datatype mpi_dcomplex; #endif /* whether a synchronize call should be performed before parallel timing. It makes communication - * timing more accurate, but may deteriorate overall performance by introducing unnecessary - * delays (test showed only slight difference for granule generator) */ + timing more accurate, but may deteriorate overall performance by introducing unnecessary + delays (test showed only slight difference for granule generator) */ #define SYNCHRONIZE_TIMING -// SEMI-GLOBAL VARIABLES +/* SEMI-GLOBAL VARIABLES */ -// defined and allocated in fft.c +/* defined and allocated in fft.c */ extern double *BT_buffer, *BT_rbuffer; -// defined and initialized in timing.c +/* defined and initialized in timing.c */ extern TIME_TYPE Timing_Dm_Init_comm; -// LOCAL VARIABLES +/* LOCAL VARIABLES */ #ifdef PARALLEL -static int Ntrans; // number of transmissions; used in CalcPartner -static int *gr_comm_size; // sizes of transmissions for granule generator communications -static int *gr_comm_overl; // shows whether two sequential transmissions overlap -static unsigned char *gr_comm_ob; // buffer for overlaps -static void *gr_comm_buf; // buffer for MPI transfers - -// First several functions are defined only in parallel mode -//=========================================== +static int Ntrans; /* number of transmissions; used in CalcPartner */ +static int *gr_comm_size; /* sizes of transmissions for granule generator communications */ +static int *gr_comm_overl; /* shows whether two sequential transmissions overlap */ +static unsigned char *gr_comm_ob; /* buffer for overlaps */ +static void *gr_comm_buf; /* buffer for MPI transfers */ +/* First funtions that are defined only in parallel mode */ +/*===========================================*/ static void RecoverCommandLine(int *argc_p,char ***argv_p) -/* eliminate all NULL pointers from argv, shift the rest, and adjust argc accordingly. - * Used in InitComm - * */ + /* eliminate all NULL pointers from argv, shift the rest, and + adjust argc accordingly. Used in InitComm */ { - int i,j; + int i,j; - for (i=0,j=0;i<(*argc_p);i++) { - if ((*argv_p)[i]==NULL) j++; - else if (j!=0) (*argv_p)[i-j]=(*argv_p)[i]; - } - (*argc_p)-=j; + for (i=0,j=0;i<(*argc_p);i++) { + if ((*argv_p)[i]==NULL) j++; + else if (j!=0) (*argv_p)[i-j]=(*argv_p)[i]; + } + (*argc_p)-=j; } -//============================================================ +/*============================================================*/ INLINE size_t IndexBlock(const size_t x,const size_t y,const size_t z,const size_t lengthY) -// index block; used in BlockTranspose + /* index block; used in BlockTranspose */ { - return((z*lengthY+y)*gridX+x); + return((z*lengthY+y)*gridX+x); } -//============================================================ +/*============================================================*/ INLINE int CalcPartner(const int tran) -/* calculate ringid of partner processor for current transmission; used in BlockTranspose. Many - * different implementations are possible; the only requirements are - * 1) f(tran,f(tran,ringid))=ringid - * 2) f({1,2,Ntrans},ringid)={0,1,Ntrans}\ringid - * where f=nprocs is equivalent to skipping this transmission (relevant for odd nprocs) - */ + /* calculate ringid of partner processor for current transmission; + used in BlockTranspose + many different implementations are possible; the only requirements are + 1) f(tran,f(tran,ringid))=ringid + 2) f({1,2,Ntrans},ringid)={0,1,Ntrans}\ringid + where f=nprocs <=> skip this transmission (for odd nprocs) */ { - int part; - - if (ringid==0) part=tran; - else if (ringid==tran) part=0; - else { - part=2*tran-ringid; - if (part<=0) part+=Ntrans; - else if (part>Ntrans) part-=Ntrans; - } - return part; + int part; + + if (ringid==0) part=tran; + else if (ringid==tran) part=0; + else { + part=2*tran-ringid; + if (part<=0) part+=Ntrans; + else if (part>Ntrans) part-=Ntrans; + } + return part; } -//============================================================ +/*===========================================*/ void CatNFiles(const char *dir,const char *tmpl,const char *dest) -/* cat several temporary files (one for each processor, names defined by the template 'tmpl' that - * should contain %d to be replaced by ringid). Files are located in directory 'dir'. Combined into - * 'dest' in the same directory. Afterwards temporary files are removed. - */ + /* cat several temporary files (one for each processor, names defines by the template 'temp' + that should contain %d to be replaced by ringid). Files are located in directory 'dir'. + Combined into 'dest' in the same directory. Afterwards temporary files are removed. */ { - int i,c; - FILE *in,*out; - char fname_out[MAX_TMP_FNAME],fname_in[MAX_TMP_FNAME]; - - // produce full path of destination file and open it - sprintf(fname_out,"%s/%s",directory,dest); - out=FOpenErr(fname_out,"w",ONE_POS); - for (i=0;i<nprocs;i++) { - // produce full path of tmp file and open it - sprintf(fname_in,"%s/",directory); - sprintf(fname_in+strlen(fname_in),tmpl,i); - in=FOpenErr(fname_in,"r",ONE_POS); - // copy file in to out - while((c=getc(in))!=EOF) putc(c,out); - // close and remove tmp file - FCloseErr(in,fname_in,ONE_POS); - RemoveErr(fname_in,ONE_POS); - } - // close destination file - FCloseErr(out,fname_out,ONE_POS); + int i,c; + FILE *in,*out; + char fname_out[MAX_TMP_FNAME],fname_in[MAX_TMP_FNAME]; + + /* produce full path of destination file and open it */ + sprintf(fname_out,"%s/%s",directory,dest); + out=FOpenErr(fname_out,"w",ONE_POS); + for (i=0;i<nprocs;i++) { + /* produce full path of tmp file and open it */ + sprintf(fname_in,"%s/",directory); + sprintf(fname_in+strlen(fname_in),tmpl,i); + in=FOpenErr(fname_in,"r",ONE_POS); + /* copy file in to out */ + while((c=getc(in))!=EOF) putc(c,out); + /* close and remove tmp file */ + FCloseErr(in,fname_in,ONE_POS); + RemoveErr(fname_in,ONE_POS); + } + /* close destination file */ + FCloseErr(out,fname_out,ONE_POS); } #endif -//============================================================ +/*===========================================*/ void InitComm(int *argc_p,char ***argv_p) -// initialize communications in the beginning of the program + /* initialize communications in the beginning of the program */ { #ifdef MPI - int dcmplx_blocklength[2]={1,1},ver,subver; - MPI_Aint dcmplx_displs[2] = {0,1}; - MPI_Datatype dcmplx_type[2]; - - /* MPI_Init may alter argc and argv and interfere with normal parsing of command line - * parameters. The way of altering is implementation depending. MPI searches for MPI parameters - * in the command line and removes them (we assume some kind of removing does take place - - * otherwise ADDA will produce error 'unknown parameter'). The best would be to change argc and - * argv so that they look like no special command line arguments are present. However, - * MPICH 1.2.5, for example, just replaces corresponding parameters by NULLs. To incorporate it - * we introduce special function to restore the command line - */ - MPI_Init(argc_p,argv_p); - tstart_main = GET_TIME(); // initialize program time - RecoverCommandLine(argc_p,argv_p); - // initialize ringid and nprocs - MPI_Comm_rank(MPI_COMM_WORLD,&ringid); - MPI_Comm_size(MPI_COMM_WORLD,&nprocs); - // initialize Ntrans - if (IS_EVEN(nprocs)) Ntrans=nprocs-1; - else Ntrans=nprocs; - // Create MPI-type for sending dcomplex-variables - dcmplx_type[0] = MPI_DOUBLE; dcmplx_type[1] = MPI_DOUBLE; - MPI_Type_struct(2,dcmplx_blocklength,dcmplx_displs,dcmplx_type,&mpi_dcomplex); - MPI_Type_commit(&mpi_dcomplex); - // check MPI version at runtime - MPI_Get_version(&ver,&subver); - if ((ver<MPI_VER_REQ) || ((ver==MPI_VER_REQ) && (subver<MPI_SUBVER_REQ))) LogError(EC_ERROR, - ONE_POS,"MPI version (%d.%d) is too old. Version %d.%d or newer is required",ver,subver, - MPI_VER_REQ,MPI_SUBVER_REQ); - // if MPI crashes, it happens here - Synchronize(); + int dcmplx_blocklength[2]={1,1},ver,subver; + MPI_Aint dcmplx_displs[2] = {0,1}; + MPI_Datatype dcmplx_type[2]; + + /* MPI_Init may alter argc and argv and interfere with normal parsing of command + line parameters. The way of altering is implementation depending. MPI searches for + mpi parameters in command line and removes them (we assume some kind of removing does + take place - otherwise ADDA will produce error 'unknown parameter'). The best would be + to change argc and argv so that they look like no special command line arguments are + present. However, MPICH 1.2.5, for example, just replaces corresponding parameters by + NULLs. To incorporate it we introduce special function to restore the command line */ + MPI_Init(argc_p,argv_p); + tstart_main = GET_TIME(); /* initialize program time */ + RecoverCommandLine(argc_p,argv_p); + /* initialize ringid and nprocs */ + MPI_Comm_rank(MPI_COMM_WORLD,&ringid); + MPI_Comm_size(MPI_COMM_WORLD,&nprocs); + /* initialize Ntrans */ + if ((nprocs%2)==0) Ntrans=nprocs-1; + else Ntrans=nprocs; + /* Create MPI-type for sending dcomplex-variables */ + dcmplx_type[0] = MPI_DOUBLE; dcmplx_type[1] = MPI_DOUBLE; + MPI_Type_struct(2,dcmplx_blocklength,dcmplx_displs,dcmplx_type,&mpi_dcomplex); + MPI_Type_commit(&mpi_dcomplex); + /* check MPI version at runtime */ + MPI_Get_version(&ver,&subver); + if ((ver<MPI_VER_REQ) || ((ver==MPI_VER_REQ) && (subver<MPI_SUBVER_REQ))) + LogError(EC_ERROR,ONE_POS,"MPI version (%d.%d) is too old. Version %d.%d or newer is required", + ver,subver,MPI_VER_REQ,MPI_SUBVER_REQ); + /* if MPI crashes, it happens here */ + Synchronize(); #elif !defined(PARALLEL) - nprocs=1; - ringid=ROOT; + nprocs=1; + ringid=ROOT; #endif - /* check if weird number of processors is specified; called even in sequential mode to - * initialize weird_nprocs - */ - CheckNprocs(); + /* check if wierd number of processors is specified + called even in sequenatial mode to initialize weird_nprocs */ + CheckNprocs(); } -//============================================================ +/*===========================================*/ void Stop(const int code) -// stops the program with exit 'code' + /* stops the programm with code */ { #ifdef MPI - if (code) { // error occurred - fflush(stdout); - fprintf(stderr,"Aborting process %d\n",ringid); - fflush(stderr); - MPI_Abort(MPI_COMM_WORLD,code); - } - else { // regular termination - // wait for all processors - fflush(stdout); - Synchronize(); - // finalize MPI communications - MPI_Finalize(); - } + if (code) { /* error occured */ + fflush(stdout); + fprintf(stderr,"Aborting process %d\n",ringid); + fflush(stderr); + MPI_Abort(MPI_COMM_WORLD,code); + } + else { /* regular termination */ + /* wait for all processors */ + fflush(stdout); + Synchronize(); + /* finalize MPI communications */ + MPI_Finalize(); + } #endif - // if run under Borland C++ Builder, don't close the window automatically +/* if run under Borland C++ Builder, don't close the window automatically */ #ifdef RUN_BCB - PRINTZ("\nProgram has finished execution.\nPress any key to close window..."); - // waits for pressed key - getch(); + PRINTZ("\nProgram has finished execution.\nPress any key to close window..."); + /* waits for pressed key */ + getch(); #endif - // the code '0' corresponds to success, and '1' to failure - if (code==0) exit(EXIT_SUCCESS); - else if (code==1) exit(EXIT_FAILURE); - else exit(code); + /* in the code '0' corresponds to success, and '1' to failure */ + if (code==0) exit(EXIT_SUCCESS); + else if (code==1) exit(EXIT_FAILURE); + else exit(code); } -//============================================================ +/*===========================================*/ void Synchronize(void) -// synchronizes all processes + /* synchronizes all processes */ { #ifdef MPI - MPI_Barrier(MPI_COMM_WORLD); + MPI_Barrier(MPI_COMM_WORLD); #endif } -//============================================================ +/*===========================================*/ void MyBcast(void *data,const var_type type,size_t n_elem,TIME_TYPE *timing) -/* casts values stored in '*data' from ROOT processor to all other; works for all types; increments - * 'timing' (if not NULL) by the time used - */ + /* casts values stored in *data from ROOT processor to all other; works for all types + increments timing (if not NULL) by the time used */ { #ifdef MPI - MPI_Datatype mes_type; - TIME_TYPE tstart; + MPI_Datatype mes_type; + TIME_TYPE tstart; - if (timing!=NULL) { + if (timing!=NULL) { #ifdef SYNCHRONIZE_TIMING - MPI_Barrier(MPI_COMM_WORLD); // synchronize to get correct timing + MPI_Barrier(MPI_COMM_WORLD); /* synchronize to get correct timing */ #endif - tstart=GET_TIME(); - } - if (type==char_type) mes_type=MPI_CHAR; - if (type==int_type) mes_type=MPI_INT; - else if (type==double_type) mes_type=MPI_DOUBLE; - else if (type==cmplx_type) { - mes_type=MPI_DOUBLE; - n_elem*=2; - } - else LogError(EC_ERROR,ONE_POS,"MyBcast: variable type %u is not supported",type); - - MPI_Bcast(data,n_elem,mes_type,ROOT,MPI_COMM_WORLD); - if (timing!=NULL) (*timing)+=GET_TIME()-tstart; + tstart=GET_TIME(); + } + if (type==char_type) mes_type=MPI_CHAR; + if (type==int_type) mes_type=MPI_INT; + else if (type==double_type) mes_type=MPI_DOUBLE; + else if (type==cmplx_type) { + mes_type=MPI_DOUBLE; + n_elem*=2; + } + else LogError(EC_ERROR,ONE_POS,"MyBcast: variable type %u is not supported",type); + + MPI_Bcast(data,n_elem,mes_type,ROOT,MPI_COMM_WORLD); + if (timing!=NULL) (*timing)+=GET_TIME()-tstart; #endif } -//============================================================ +/*===========================================*/ void BcastOrient(int *i, int *j, int *k) -// cast current orientation angle (in orientation averaging) to all processes from root + /* cast current orientation angle (in orientation averaging) + to all processes from root */ { #ifdef MPI - int buf[3]; - - if (ringid==ROOT) { - buf[0]=*i; - buf[1]=*j; - buf[2]=*k; - } - MPI_Bcast(buf,3,MPI_INT,ROOT,MPI_COMM_WORLD); - if (ringid!=ROOT) { - *i=buf[0]; - *j=buf[1]; - *k=buf[2]; - } + int buf[3]; + + if (ringid==ROOT) { + buf[0]=*i; + buf[1]=*j; + buf[2]=*k; + } + + MPI_Bcast(buf,3,MPI_INT,ROOT,MPI_COMM_WORLD); + + if (ringid!=ROOT) { + *i=buf[0]; + *j=buf[1]; + *k=buf[2]; + } #endif } -//============================================================ +/*===========================================*/ void AccumulateMax(double *data,double *max) -// given a single double on each processor, accumulates their sum and maximum on ROOT processor + /* given element, accumulates their sum and maximum on ROOT processor */ { #ifdef MPI - double buf; - // potentially can be optimized by combining into one operation - MPI_Reduce(data,&buf,1,MPI_DOUBLE,MPI_SUM,ROOT,MPI_COMM_WORLD); - MPI_Reduce(data,max,1,MPI_DOUBLE,MPI_MAX,ROOT,MPI_COMM_WORLD); - if (ringid==ROOT) *data=buf; + double buf; + /* potentially can be optimized by combining into one operation */ + MPI_Reduce(data,&buf,1,MPI_DOUBLE,MPI_SUM,ROOT,MPI_COMM_WORLD); + MPI_Reduce(data,max,1,MPI_DOUBLE,MPI_MAX,ROOT,MPI_COMM_WORLD); + if (ringid==ROOT) *data=buf; #endif } -//============================================================ +/*===========================================*/ void Accumulate(double *vector,const size_t n,double *buf,TIME_TYPE *timing) -// gather and add double vector on processor ROOT; total time is saved in timing (NOT incremented) + /* gather and add double vector on processor ROOT; + total time is saved in timing (NOT incremented) */ { #ifdef MPI - TIME_TYPE tstart; + TIME_TYPE tstart; #ifdef SYNCHRONIZE_TIMING - MPI_Barrier(MPI_COMM_WORLD); // synchronize to get correct timing + MPI_Barrier(MPI_COMM_WORLD); /* synchronize to get correct timing */ #endif - tstart=GET_TIME(); - MPI_Reduce(vector,buf,n,MPI_DOUBLE,MPI_SUM,ROOT,MPI_COMM_WORLD); - if (ringid==ROOT) memcpy(vector,buf,n*sizeof(double)); - (*timing)=GET_TIME()-tstart; + tstart=GET_TIME(); + MPI_Reduce(vector,buf,n,MPI_DOUBLE,MPI_SUM,ROOT,MPI_COMM_WORLD); + if (ringid==ROOT) memcpy(vector,buf,n*sizeof(double)); + (*timing)=GET_TIME()-tstart; #endif } -//============================================================ +/*===========================================*/ void MyInnerProduct(void *data,const var_type type,size_t n_elem,TIME_TYPE *timing) -/* gather values stored in *data, add them and return them in *data; works for all types; increments - * 'timing' (if not NULL) by the time used; not optimized for long data (allocates memory at every - * call) - */ + /* gather values stored in *data, add them and return them in *data + works for all types; increments timing (if not NULL) by the time used + not optimized for long data (allocates memory at every call) */ { #ifdef MPI - size_t size; - MPI_Datatype mes_type; - void *temp; - TIME_TYPE tstart; + size_t size; + MPI_Datatype mes_type; + void *temp; + TIME_TYPE tstart; - if (timing!=NULL) { + if (timing!=NULL) { #ifdef SYNCHRONIZE_TIMING - MPI_Barrier(MPI_COMM_WORLD); // synchronize to get correct timing + MPI_Barrier(MPI_COMM_WORLD); /* synchronize to get correct timing */ #endif - tstart=GET_TIME(); - } - if (type==int_type) { - mes_type=MPI_INT; - size=n_elem*sizeof(int); - } - else if (type==double_type) { - mes_type=MPI_DOUBLE; - size=n_elem*sizeof(double); - } - else if (type==cmplx_type) { - mes_type=MPI_DOUBLE; - n_elem*=2; - size=n_elem*sizeof(double); - } - else LogError(EC_ERROR,ONE_POS,"MyInnerProduct: variable type %u is not supported",type); - - MALLOC_VECTOR(temp,void,size,ALL); - MPI_Allreduce(data,temp,n_elem,mes_type,MPI_SUM,MPI_COMM_WORLD); - memcpy(data,temp,size); - Free_general(temp); - if (timing!=NULL) (*timing)+=GET_TIME()-tstart; + tstart=GET_TIME(); + } + if (type==int_type) { + mes_type=MPI_INT; + size=n_elem*sizeof(int); + } + else if (type==double_type) { + mes_type=MPI_DOUBLE; + size=n_elem*sizeof(double); + } + else if (type==cmplx_type) { + mes_type=MPI_DOUBLE; + n_elem*=2; + size=n_elem*sizeof(double); + } + else LogError(EC_ERROR,ONE_POS,"MyInnerProduct: variable type %u is not supported",type); + + MALLOC_VECTOR(temp,void,size,ALL); + MPI_Allreduce(data,temp,n_elem,mes_type,MPI_SUM,MPI_COMM_WORLD); + memcpy(data,temp,size); + Free_general(temp); + if (timing!=NULL) (*timing)+=GET_TIME()-tstart; #endif } -//============================================================ +/*============================================================*/ void BlockTranspose(doublecomplex *X) -/* do the data-transposition, i.e. exchange, between fftX and fftY&fftZ; specializes at Xmatrix; - * do 3 components in one message; - */ + /* do the data-transposition, i.e. exchange, between fftX and fftY&fftZ + specializes at Xmatrix; do 3 components in one message */ { #ifdef MPI - TIME_TYPE tstart; - size_t bufsize,msize,posit,step,y,z; - int transmission,part,Xpos,Xcomp; - MPI_Status status; + TIME_TYPE tstart; + size_t bufsize,msize,posit,step,y,z; + int transmission,part,Xpos,Xcomp; + MPI_Status status; #ifdef SYNCHRONIZE_TIMING - MPI_Barrier(MPI_COMM_WORLD); // synchronize to get correct timing + MPI_Barrier(MPI_COMM_WORLD); /* synchronize to get correct timing */ #endif - tstart=GET_TIME(); - step=2*local_Nx; - msize=local_Nx*sizeof(doublecomplex); - bufsize=6*local_Nz*smallY*local_Nx; - - for(transmission=1;transmission<=Ntrans;transmission++) { - // if part==nprocs then skip this transmission - if ((part=CalcPartner(transmission))!=nprocs) { - posit=0; - Xpos=local_Nx*part; - for(Xcomp=0;Xcomp<3;Xcomp++) for(z=0;z<local_Nz;z++) for(y=0;y<smallY;y++) { - memcpy(BT_buffer+posit,X+Xcomp*local_Nsmall+IndexBlock(Xpos,y,z,smallY),msize); - posit+=step; - } - - MPI_Sendrecv(BT_buffer, bufsize, MPI_DOUBLE, part, 0, - BT_rbuffer, bufsize, MPI_DOUBLE, part, 0, - MPI_COMM_WORLD,&status); - - posit=0; - Xpos=local_Nx*part; - for(Xcomp=0;Xcomp<3;Xcomp++) for(z=0;z<local_Nz;z++) for(y=0;y<smallY;y++) { - memcpy(X+Xcomp*local_Nsmall+IndexBlock(Xpos,y,z,smallY),BT_rbuffer+posit,msize); - posit+=step; - } - } - } - Timing_OneIterComm += GET_TIME() - tstart; + tstart=GET_TIME(); + step=2*local_Nx; + msize=local_Nx*sizeof(doublecomplex); + bufsize=6*local_Nz*smallY*local_Nx; + + for(transmission=1;transmission<=Ntrans;transmission++) { + /* if part==nprocs then scip this transmission */ + if ((part=CalcPartner(transmission))!=nprocs) { + posit=0; + Xpos=local_Nx*part; + for(Xcomp=0;Xcomp<3;Xcomp++) for(z=0;z<local_Nz;z++) for(y=0;y<smallY;y++) { + memcpy(BT_buffer+posit,X+Xcomp*local_Nsmall+IndexBlock(Xpos,y,z,smallY),msize); + posit+=step; + } + + MPI_Sendrecv(BT_buffer, bufsize, MPI_DOUBLE, part, 0, + BT_rbuffer, bufsize, MPI_DOUBLE, part, 0, + MPI_COMM_WORLD,&status); + + posit=0; + Xpos=local_Nx*part; + for(Xcomp=0;Xcomp<3;Xcomp++) for(z=0;z<local_Nz;z++) for(y=0;y<smallY;y++) { + memcpy(X+Xcomp*local_Nsmall+IndexBlock(Xpos,y,z,smallY),BT_rbuffer+posit,msize); + posit+=step; + } + } + } + Timing_OneIterComm += GET_TIME() - tstart; #endif } -//============================================================ +/*===========================================*/ void BlockTranspose_Dm(doublecomplex *X,const size_t lengthY,const size_t lengthZ) -// do the data-transposition, i.e. exchange, between fftX and fftY&fftZ; specialized for D matrix + /* do the data-transposition, i.e. exchange, between fftX and fftY&fftZ + specialized for D matrix */ { #ifdef MPI - TIME_TYPE tstart; - size_t bufsize,msize,posit,step,y,z; - int transmission,part,Xpos; - MPI_Status status; + TIME_TYPE tstart; + size_t bufsize,msize,posit,step,y,z; + int transmission,part,Xpos; + MPI_Status status; #ifdef SYNCHRONIZE_TIMING - MPI_Barrier(MPI_COMM_WORLD); // synchronize to get correct timing + MPI_Barrier(MPI_COMM_WORLD); /* synchronize to get correct timing */ #endif - tstart=GET_TIME(); - step=2*local_Nx; - msize=local_Nx*sizeof(doublecomplex); - bufsize = 2*lengthZ*lengthY*local_Nx; - - for(transmission=1;transmission<=Ntrans;transmission++) { - if ((part=CalcPartner(transmission))!=nprocs) { - posit=0; - Xpos=local_Nx*part; - for(z=0;z<lengthZ;z++) for(y=0;y<lengthY;y++) { - memcpy(BT_buffer+posit,X+IndexBlock(Xpos,y,z,lengthY),msize); - posit+=step; - } - - MPI_Sendrecv(BT_buffer,bufsize,MPI_DOUBLE,part,0, - BT_rbuffer,bufsize,MPI_DOUBLE,part,0, - MPI_COMM_WORLD,&status); - - posit=0; - Xpos=local_Nx*part; - for(z=0;z<lengthZ;z++) for(y=0;y<lengthY;y++) { - memcpy(X+IndexBlock(Xpos,y,z,lengthY),BT_rbuffer+posit,msize); - posit+=step; - } - } - } - Timing_Dm_Init_comm += GET_TIME() - tstart; + tstart=GET_TIME(); + step=2*local_Nx; + msize=local_Nx*sizeof(doublecomplex); + bufsize = 2*lengthZ*lengthY*local_Nx; + + for(transmission=1;transmission<=Ntrans;transmission++) { + if ((part=CalcPartner(transmission))!=nprocs) { + posit=0; + Xpos=local_Nx*part; + for(z=0;z<lengthZ;z++) for(y=0;y<lengthY;y++) { + memcpy(BT_buffer+posit,X+IndexBlock(Xpos,y,z,lengthY),msize); + posit+=step; + } + + MPI_Sendrecv(BT_buffer,bufsize,MPI_DOUBLE,part,0, + BT_rbuffer,bufsize,MPI_DOUBLE,part,0, + MPI_COMM_WORLD,&status); + + posit=0; + Xpos=local_Nx*part; + for(z=0;z<lengthZ;z++) for(y=0;y<lengthY;y++) { + memcpy(X+IndexBlock(Xpos,y,z,lengthY),BT_rbuffer+posit,msize); + posit+=step; + } + } + } + Timing_Dm_Init_comm += GET_TIME() - tstart; #endif } -//============================================================ +/*===========================================*/ void ParSetup(void) -// initialize common parameters; need to do in the beginning to enable call to MakeParticle + /* initialize common parameters; need to do in the beginning to enable call to MakeParticle */ { #ifdef PARALLEL - int unitZ,unitX; + int unitZ,unitX; #endif - // calculate size of 3D grid - gridX=fftFit(2*boxX,nprocs); - gridY=fftFit(2*boxY,1); - gridZ=fftFit(2*boxZ,2*nprocs); - // initialize some variables - smallY=gridY/2; - smallZ=gridZ/2; - /* if this check is passed then all other multiplications of 2 grids are OK, - * except for XY values, used in granule generator - */ - gridYZ=MultOverflow(gridY,gridZ,ALL_POS,"gridYZ"); + /* calculate size of 3d grid */ + gridX=fftFit(2*boxX,nprocs); + gridY=fftFit(2*boxY,1); + gridZ=fftFit(2*boxZ,2*nprocs); + /* initialise some variables */ + smallY=gridY/2; + smallZ=gridZ/2; + /* if this check is passed then all other multiplications of 2 grids are OK, + except for XY values, used in granule generator */ + gridYZ=MultOverflow(gridY,gridZ,ALL_POS,"gridYZ"); #ifdef PARALLEL - unitZ=smallZ/nprocs; // this should always be an exact division - local_z0=ringid*unitZ; - local_z1=(ringid+1)*unitZ; - if (local_z1 > boxZ) local_z1_coer=boxZ; - else local_z1_coer=local_z1; - unitX=gridX/nprocs; - local_x0=ringid*unitX; - local_x1=(ringid+1)*unitX; + unitZ=smallZ/nprocs; /* this should always be an exact division */ + local_z0=ringid*unitZ; + local_z1=(ringid+1)*unitZ; + if (local_z1 > boxZ) local_z1_coer=boxZ; + else local_z1_coer=local_z1; + unitX=gridX/nprocs; + local_x0=ringid*unitX; + local_x1=(ringid+1)*unitX; #else - local_z0=0; - local_z1=smallZ; - local_z1_coer=boxZ; - local_x0=0; - local_x1=gridX; + local_z0=0; + local_z1=smallZ; + local_z1_coer=boxZ; + local_x0=0; + local_x1=gridX; #endif - if (local_z1_coer<=local_z0) { - LogError(EC_INFO,ALL_POS,"No real dipoles are assigned"); - local_z1_coer=local_z0; - } - local_Nz=local_z1-local_z0; - local_Nx=local_x1-local_x0; - local_Ndip=MultOverflow(boxX*(size_t)boxY,local_z1_coer-local_z0,ALL_POS,"local_Ndip"); - /* conversions to (unsigned long) are needed (to remove warnings) because %z printf argument is - * not yet supported by all target compiler environments - */ - printf("%i : %i %i %i %lu %lu \n", - ringid,local_z0,local_z1_coer,local_z1,(unsigned long)local_Ndip,(unsigned long)local_Nx); + if (local_z1_coer<=local_z0) { + LogError(EC_INFO,ALL_POS,"No real dipoles are assigned"); + local_z1_coer=local_z0; + } + local_Nz=local_z1-local_z0; + local_Nx=local_x1-local_x0; + local_Ndip=MultOverflow(boxX*(size_t)boxY,local_z1_coer-local_z0,ALL_POS,"local_Ndip"); + printf("%i : %i %i %i %u %u \n", + ringid,local_z0,local_z1_coer,local_z1,local_Ndip,local_Nx); } -//============================================================ +/*===========================================*/ void AllGather(void *x_from,void *x_to,const var_type type,size_t n_elem) -// Gather distributed arrays; works for all types + /* Gather distributed arrays; works for all types */ { #ifdef MPI - // TODO: need to be rewritten when n_elem are unequal on each processor - MPI_Datatype mes_type; - - if (type==char_type) mes_type = MPI_CHAR; - else if (type==int_type) mes_type = MPI_INT; - else if (type==double_type) mes_type = MPI_DOUBLE; - else if (type==cmplx_type) { - mes_type = MPI_DOUBLE; - n_elem *= 2; - } - else LogError(EC_ERROR,ONE_POS,"AllGather: variable type %u is not supported",type); - - MPI_Allgather(x_from,n_elem,mes_type,x_to,n_elem,mes_type,MPI_COMM_WORLD); + /* need to be rewritten when n_elem are unequal on each processor */ + MPI_Datatype mes_type; + + if (type==char_type) mes_type = MPI_CHAR; + else if (type==int_type) mes_type = MPI_INT; + else if (type==double_type) mes_type = MPI_DOUBLE; + else if (type==cmplx_type) { + mes_type = MPI_DOUBLE; + n_elem *= 2; + } + else LogError(EC_ERROR,ONE_POS,"AllGather: variable type %u is not supported",type); + + MPI_Allgather(x_from,n_elem,mes_type,x_to,n_elem,mes_type,MPI_COMM_WORLD); #endif } -//============================================================ +/*===========================================*/ #ifdef PARALLEL void CalcLocalGranulGrid(const double z0,const double z1,const double gdZ,const int gZ, - const int id,int *lz0,int *lz1) -/* calculates starting and ending (+1) cell of granule grid (lz0 & lz1) on a processor with - * ringid=id - */ + const int id,int *lz0,int *lz1) + /* calculates starting and ending (+1) cell of granule grid (lz0 & lz1) + on a processor with ringid=id */ { - int dzl,dzh; // similar to local_z0 and local_z1 - - dzl=local_Nz*id; - // should not be coerced because the result differs only for dzh>boxZ, then dzh-1>z0 - dzh=dzl+local_Nz; - if (dzl>z1) *lz0=*lz1=gZ; - else { - if (dzl>z0) *lz0=(int)floor((dzl-z0)/gdZ); - else *lz0=0; - if (dzh>z1) *lz1=gZ; - else if (dzh-1>z0) *lz1=(int)floor((dzh-z0-1)/gdZ)+1; - else *lz1=0; - } + int dzl,dzh; /* similar to local_z0 and local_z1 */ + + dzl=local_Nz*id; + dzh=dzl+local_Nz; /* should not be coerced because the result differs only for dzh>boxZ, + then dzh-1>z0 */ + if (dzl>z1) *lz0=*lz1=gZ; + else { + if (dzl>z0) *lz0=(int)floor((dzl-z0)/gdZ); + else *lz0=0; + if (dzh>z1) *lz1=gZ; + else if (dzh-1>z0) *lz1=(int)floor((dzh-z0-1)/gdZ)+1; + else *lz1=0; + } } #endif -//============================================================ +/*===========================================*/ void SetGranulComm(const double z0,const double z1,const double gdZ,const int gZ,const size_t gXY, - size_t max_gran,int *lz0,int *lz1,const int sm_gr) -/* sets communication for granule generator; max_gran - maximum number of granules in one set - * (used to allocate buffer); sm_gr - whether granules are small (simpler) - */ + size_t max_gran,int *lz0,int *lz1,const int sm_gr) + /* sets communication for granule generator + max_gran - maximum number of granules in one set (used to allocate buffer) + sm_gr - whether granules are small (simpler) */ { #ifdef PARALLEL - int i,loc0,loc1,loc1_prev=0; - - MALLOC_VECTOR(gr_comm_buf,void,max_gran*sizeof(char),ALL); - if (!sm_gr) { - if (ringid==ROOT) { - MALLOC_VECTOR(gr_comm_size,int,nprocs,ONE); - MALLOC_VECTOR(gr_comm_overl,int,nprocs-1,ONE); - // always allocated, not to mess with its freeing - MALLOC_VECTOR(gr_comm_ob,uchar,gXY,ONE); - /* The following is very inefficient (may be significantly optimized), but using one - * common function is more robust. - */ - for (i=0;i<nprocs;i++) { - CalcLocalGranulGrid(z0,z1,gdZ,gZ,i,&loc0,&loc1); - if (i==ROOT) { - *lz0=loc0; - *lz1=loc1; - } - gr_comm_size[i]=loc1-loc0; - if (i!=0) gr_comm_overl[i-1]=(loc0<loc1_prev); - loc1_prev=loc1;; - } - } - else CalcLocalGranulGrid(z0,z1,gdZ,gZ,ringid,lz0,lz1); - } + int i,loc0,loc1,loc1_prev=0; + + MALLOC_VECTOR(gr_comm_buf,void,max_gran*sizeof(char),ALL); + if (!sm_gr) { + if (ringid==ROOT) { + MALLOC_VECTOR(gr_comm_size,int,nprocs,ONE); + MALLOC_VECTOR(gr_comm_overl,int,nprocs-1,ONE); + MALLOC_VECTOR(gr_comm_ob,uchar,gXY,ONE); /* always allocated, not to mess with its freeing */ + /* that is a very inefficient way to do it (it may be significantly optiimized), but + using one common function is more robust */ + for (i=0;i<nprocs;i++) { + CalcLocalGranulGrid(z0,z1,gdZ,gZ,i,&loc0,&loc1); + if (i==ROOT) { + *lz0=loc0; + *lz1=loc1; + } + gr_comm_size[i]=loc1-loc0; + if (i!=0) gr_comm_overl[i-1]=(loc0<loc1_prev); + loc1_prev=loc1;; + } + } + else CalcLocalGranulGrid(z0,z1,gdZ,gZ,ringid,lz0,lz1); + } #else - *lz0=0; - *lz1=gZ; + *lz0=0; + *lz1=gZ; #endif } -//============================================================ +/*===========================================*/ void CollectDomainGranul(unsigned char *dom,const size_t gXY,const int lz0, - const int locgZ,TIME_TYPE *timing) -/* collects the map of domain for granule generator on the ROOT processor; - * timing is incremented by the total time used - */ + const int locgZ,TIME_TYPE *timing) + /* collects the map of domain for granule generator on the ROOT processor; + timing is incremented by the total time used */ { #ifdef PARALLEL - int i,unit,index; - size_t j; - MPI_Status status; - TIME_TYPE tstart; + int i,unit,index; + size_t j; + MPI_Status status; + TIME_TYPE tstart; #ifdef SYNCHRONIZE_TIMING - MPI_Barrier(MPI_COMM_WORLD); // synchronize to get correct timing + MPI_Barrier(MPI_COMM_WORLD); /* synchronize to get correct timing */ #endif - tstart=GET_TIME(); - unit=gXY*sizeof(char); - if (ringid==ROOT) { - index=(lz0+gr_comm_size[ROOT])*gXY; - for (i=ROOT+1;i<nprocs;i++) { - if (gr_comm_size[i]!=0) { - if (gr_comm_overl[i-1]) { - index-=gXY; - memcpy(gr_comm_ob,dom+index,unit); - } - MPI_Recv(dom+index,unit*gr_comm_size[i],MPI_UNSIGNED_CHAR,i,0,MPI_COMM_WORLD, - &status); - if (gr_comm_overl[i-1]) for (j=0;j<gXY;j++) dom[index+j]|=gr_comm_ob[j]; - index+=gXY*gr_comm_size[i]; - } - } - // that is only needed when ROOT!=0; kind of weird but should work - index=lz0*gXY; - for (i=ROOT-1;i>=0;i--) { - if (gr_comm_size[i]!=0) { - if (gr_comm_overl[i]) { - memcpy(gr_comm_ob,dom+index,unit); - index+=gXY; - } - MPI_Recv(dom+index-gXY*gr_comm_size[i],unit*gr_comm_size[i],MPI_UNSIGNED_CHAR,i,0, - MPI_COMM_WORLD,&status); - if (gr_comm_overl[i]) for (j=0;j<gXY;j++) dom[index-gXY+j]|=gr_comm_ob[j]; - index-=gXY*gr_comm_size[i]; - } - } - } - else if (locgZ!=0) { - MPI_Send(dom,unit*locgZ,MPI_UNSIGNED_CHAR,ROOT,0,MPI_COMM_WORLD); - } - (*timing)+=GET_TIME()-tstart; + tstart=GET_TIME(); + unit=gXY*sizeof(char); + if (ringid==ROOT) { + index=(lz0+gr_comm_size[ROOT])*gXY; + for (i=ROOT+1;i<nprocs;i++) { + if (gr_comm_size[i]!=0) { + if (gr_comm_overl[i-1]) { + index-=gXY; + memcpy(gr_comm_ob,dom+index,unit); + } + MPI_Recv(dom+index,unit*gr_comm_size[i],MPI_UNSIGNED_CHAR,i,0,MPI_COMM_WORLD,&status); + if (gr_comm_overl[i-1]) for (j=0;j<gXY;j++) dom[index+j]|=gr_comm_ob[j]; + index+=gXY*gr_comm_size[i]; + } + } + /* that is only needed when ROOT!=0; kind of weird but should work */ + index=lz0*gXY; + for (i=ROOT-1;i>=0;i--) { + if (gr_comm_size[i]!=0) { + if (gr_comm_overl[i]) { + memcpy(gr_comm_ob,dom+index,unit); + index+=gXY; + } + MPI_Recv(dom+index-gXY*gr_comm_size[i],unit*gr_comm_size[i],MPI_UNSIGNED_CHAR,i,0, + MPI_COMM_WORLD,&status); + if (gr_comm_overl[i]) for (j=0;j<gXY;j++) dom[index-gXY+j]|=gr_comm_ob[j]; + index-=gXY*gr_comm_size[i]; + } + } + } + else if (locgZ!=0) { + MPI_Send(dom,unit*locgZ,MPI_UNSIGNED_CHAR,ROOT,0,MPI_COMM_WORLD); + } + (*timing)+=GET_TIME()-tstart; #endif } -//============================================================ +/*===========================================*/ void FreeGranulComm(const int sm_gr) -/* frees all additional memory used for communications of granule generator; - * simpler if small granules - */ + /* frees all additional memory used for communications of granule generator; + simpler if small granules */ { #ifdef PARALLEL - Free_general(gr_comm_buf); - if (!sm_gr && ringid==ROOT) { - Free_general(gr_comm_size); - Free_general(gr_comm_overl); - Free_general(gr_comm_ob); - } + Free_general(gr_comm_buf); + if (!sm_gr && ringid==ROOT) { + Free_general(gr_comm_size); + Free_general(gr_comm_overl); + Free_general(gr_comm_ob); + } #endif } -//============================================================ +/*===========================================*/ void ExchangeFits(char *data,const size_t n,TIME_TYPE *timing) -/* performs a collective AND operation on the (vector) data; timing is incremented by the total - * time used - */ + /* performs a collective AND operation on the (vector) data; + timing is incremented by the total time used */ { #ifdef MPI - TIME_TYPE tstart; + TIME_TYPE tstart; #ifdef SYNCHRONIZE_TIMING - MPI_Barrier(MPI_COMM_WORLD); // synchronize to get correct timing + MPI_Barrier(MPI_COMM_WORLD); /* synchronize to get correct timing */ #endif - tstart=GET_TIME(); - MPI_Allreduce(data,gr_comm_buf,n,MPI_CHAR,MPI_LAND,MPI_COMM_WORLD); - memcpy(data,gr_comm_buf,n*sizeof(char)); - (*timing)+=GET_TIME()-tstart; + tstart=GET_TIME(); + MPI_Allreduce(data,gr_comm_buf,n,MPI_CHAR,MPI_LAND,MPI_COMM_WORLD); + memcpy(data,gr_comm_buf,n*sizeof(char)); + (*timing)+=GET_TIME()-tstart; #endif } diff --git a/src/comm.h b/src/comm.h index 191eead0..4939cf59 100644 --- a/src/comm.h +++ b/src/comm.h @@ -3,15 +3,15 @@ * DESCR: definitions of communication global variables * and routines * - * Copyright (C) 2006-2008 University of Amsterdam + * Copyright (C) 2006-2007 University of Amsterdam * This code is covered by the GNU General Public License. */ #ifndef __comm_h #define __comm_h -#include "types.h" // needed for doublecomplex -#include "function.h" // for function attributes -#include "timing.h" // for TIME_TYPE +#include "types.h" /* needed for doublecomplex */ +#include "function.h" /* for function attributes */ +#include "timing.h" /* for TIME_TYPE */ typedef enum {char_type,int_type,double_type,cmplx_type} var_type; @@ -27,7 +27,7 @@ void InitComm(int *argc_p,char ***argv_p); void ParSetup(void); void MyBcast(void *data,var_type type,size_t n_elem,TIME_TYPE *timing); void BcastOrient(int *i,int *j,int *k); -// used by granule generator + /* used by granule generator */ void SetGranulComm(double z0,double z1,double gdZ,int gZ,size_t gXY,size_t buf_size,int *lz0, int *lz1,int sm_gr); void CollectDomainGranul(unsigned char *dom,size_t gXY,int lz0,int locgZ,TIME_TYPE *timing); @@ -35,27 +35,26 @@ void FreeGranulComm(int sm_gr); void ExchangeFits(char *data,const size_t n,TIME_TYPE *timing); #ifdef PARALLEL -// this functions are defined only in parallel mode +/* this functions are defined only in parallel mode */ void CatNFiles(const char *dir,const char *tmpl,const char *dest); -/* analogs of frequently used functions that should be executed only by the ROOT processor - * !!! not safe if used in constructions like { if (...) PRINTZ(...); else } - */ -# define PRINTZ if (ringid==ROOT) printf -# define FPRINTZ if (ringid==ROOT) fprintf -# define SPRINTZ if (ringid==ROOT) sprintf -# define STRCPYZ if (ringid==ROOT) strcpy -# define FCLOSEZ if (ringid==ROOT) fclose -# define FFLUSHZ if (ringid==ROOT) fflush -# define PRINTBOTHZ if (ringid==ROOT) PrintBoth +/* analogs of frequently used functions that should be executed only by the ROOT processor */ +/* not safe if used in constructions like { if PRINTZ(...); else } */ +# define PRINTZ if (ringid==ROOT) printf +# define FPRINTZ if (ringid==ROOT) fprintf +# define SPRINTZ if (ringid==ROOT) sprintf +# define STRCPYZ if (ringid==ROOT) strcpy +# define FCLOSEZ if (ringid==ROOT) fclose +# define FFLUSHZ if (ringid==ROOT) fflush +# define PRINTBOTHZ if (ringid==ROOT) PrintBoth #else -# define PRINTZ printf -# define FPRINTZ fprintf -# define SPRINTZ sprintf -# define STRCPYZ strcpy -# define FCLOSEZ fclose -# define FFLUSHZ fflush -# define PRINTBOTHZ PrintBoth +# define PRINTZ printf +# define FPRINTZ fprintf +# define SPRINTZ sprintf +# define STRCPYZ strcpy +# define FCLOSEZ fclose +# define FFLUSHZ fflush +# define PRINTBOTHZ PrintBoth #endif -#endif // __comm_h +#endif /*__comm_h*/ diff --git a/src/const.h b/src/const.h index af84c70e..d2b234aa 100644 --- a/src/const.h +++ b/src/const.h @@ -9,33 +9,31 @@ #ifndef __const_h #define __const_h -// version number (string) -#define ADDA_VERSION "0.79a3" +/* version number (string) */ +#define ADDA_VERSION "0.78" -// basic constants -#define FALSE 0 -#define TRUE 1 // do not use for comparison, since true may be != TRUE; use only for assignment -#define UNDEF -1 // should be used only for variables, which are naturally non-negative - // denotes that shape accepts single filename argument; used in definitions of suboptions -#define FNAME_ARG -2 +/* basic constants */ +#define FALSE 0 +#define TRUE 1 /* better not use for comparison, since true may be != TRUE + use only for assignment */ +#define UNDEF -1 /* should be used only for variables, + which are naturally non-negative */ -// simple functions +/* simple functions */ #define MIN(A,B) (((A) > (B)) ? (B) : (A)) #define MAX(A,B) (((A) < (B)) ? (B) : (A)) -#define IS_EVEN(A) (((A)%2) == 0) -#define LENGTH(A) ((int)(sizeof(A)/sizeof(A[0]))) // length of any array (converted to int) +#define LENGTH(A) ((int)(sizeof(A)/sizeof(A[0]))) /* length of any array (converted to int) */ -// parallel definitions +/* parallel definitions */ #ifdef MPI #define PARALLEL #endif -/* ringid of root processor. Using ROOT!=0 should work, however it was not thoroughly tested. - * Hence do not change without necessity. - */ -#define ROOT 0 +#define ROOT 0 /* ringid of root processor */ + /* Using ROOT!=0 should work, however it was not thoroughly tested. + Hence do not change without necessity */ -// math constants rounded for 32 decimals +/* math constants rounded for 32 decimals */ #define PI 3.1415926535897932384626433832795 #define TWO_PI 6.283185307179586476925286766559 #define FOUR_PI 12.566370614359172953850573533118 @@ -54,158 +52,158 @@ #define EULER 0.57721566490153286060651209008241 #define FULL_ANGLE 360.0 -// determines the maximum number representable by size_t +/* determines the maximum number representable by size_t */ #ifndef SIZE_MAX #define SIZE_MAX ((size_t)-1) #endif -// sets the maximum box size; otherwise 'position' should be changed +/* sets the maximum box size; otherwise 'position' should be changed */ #define BOX_MAX USHRT_MAX -// sizes of some arrays -#define MAX_NMAT 15 // maximum number of different refractive indices (<256) -#define MAX_N_SH_PARMS 25 // maximum number of shape parameters -#define MAX_N_BEAM_PARMS 4 // maximum number of beam parameters - -// sizes of filenames and other strings -#define MAX_DIRNAME 300 // maximum length of dirname; increase THIS if any errors appear -#define MAX_FNAME_SH 100 // maximum length of filename (used for known names) -#define MAX_TMP_FNAME_SH 15 // maximum length of names of temporary files (short) -#define MAX_SYSTEM_CALL 10 // maximum string length of system call (itself) -#define MAX_WORD 10 // maximum length of a short word -#define MAX_LINE 50 // maximum length of a line - // size of buffer for reading lines (longer lines are handled robustly) -#define BUF_LINE 150 -#define MAX_PARAGRAPH 600 // maximum length of a paragraph (few lines) - -// derived sizes - // maximum string to create directory +/* sizes of some arrays */ +#define MAX_NMAT 15 /* maximum number of different refractive indices (<256) */ +#define MAX_N_SH_PARMS 25 /* maximum number of shape parameters */ +#define MAX_N_BEAM_PARMS 4 /* maximum number of beam parameters */ + +/* sizes of filenames and other strings */ + /* maximum length of dirname; increase THIS if any errors appear */ +#define MAX_DIRNAME 300 +#define MAX_FNAME_SH 100 /* maximum length of filename (used for known names) */ +#define MAX_TMP_FNAME_SH 15 /* maximum length of names of temporary files (short) */ +#define MAX_SYSTEM_CALL 10 /* maximum string length of system call (itself) */ +#define MAX_WORD 10 /* maximum length of a short word */ +#define MAX_LINE 50 /* maximum length of a line */ + /* size of buffer for reading lines (longer lines are handled robustly) */ +#define BUF_LINE 150 +#define MAX_PARAGRAPH 600 /* maximum length of a paragraph (few lines) */ +/* derived sizes */ + /* maximum string to create directory */ #define MAX_DIRSYS (MAX_DIRNAME + MAX_SYSTEM_CALL) - // maximum length of filename (including directory name) + /* maximum length of filename (including directory name) */ #define MAX_FNAME (MAX_DIRNAME + MAX_FNAME_SH) - // maximum length of temporary filename (including directory name) + /* maximum length of temporary filename (including directory name) */ #define MAX_TMP_FNAME (MAX_DIRNAME + MAX_TMP_FNAME_SH) - // maximum message that may include a filename (for PrintError) + /* maximum message that may include a filename (for PrintError) */ #define MAX_MESSAGE (MAX_FNAME + MAX_PARAGRAPH) - // maximum message that may include 2 filenames (for LogError) + /* maximum message that may include 2 filenames (for LogError) */ #define MAX_MESSAGE2 (2*MAX_FNAME + MAX_PARAGRAPH) -// widths of terminal used for output -#define DEF_TERM_WIDTH 80 // default -#define MIN_TERM_WIDTH 20 // ADDA never takes value less than that from environmental variables - -// shape types -#define SH_SPHERE 0 // sphere -#define SH_BOX 1 // box (may be rectangular) -#define SH_PRISMA 2 // prisma (triangular) -- not operational -#define SH_LINE 3 // line with width of one dipole -#define SH_COATED 4 // coated sphere -#define SH_SPHEREBOX 5 // sphere in a box -#define SH_RBC 6 // Red Blood Cell -#define SH_ELLIPSOID 7 // general ellipsoid -#define SH_SDISK_ROT 8 // disc cut of a sphere -- not operational -#define SH_CYLINDER 9 // cylinder -#define SH_READ 10 // read from file -#define SH_EGG 11 // egg -#define SH_CAPSULE 12 // capsule -#define SH_AXISYMMETRIC 13 // axisymmetric +/* widths of terminal used for output */ +#define DEF_TERM_WIDTH 80 /* default */ +#define MIN_TERM_WIDTH 20 /* no lesser value is accepted by ADDA from environmental variables */ + +/* shape types */ +#define SH_SPHERE 0 /* sphere */ +#define SH_BOX 1 /* box (may be rectangular) */ +#define SH_PRISMA 2 /* prisma (triangular) -- not operational */ +#define SH_LINE 3 /* line with width of one dipole */ +#define SH_COATED 4 /* coated sphere */ +#define SH_SPHEREBOX 5 /* sphere in a box */ +#define SH_RBC 6 /* Red Blood Cell */ +#define SH_ELLIPSOID 7 /* general ellipsoid */ +#define SH_SDISK_ROT 8 /* disc cut of a sphere -- not operational */ +#define SH_CYLINDER 9 /* cylinder */ +#define SH_READ 10 /* read from file */ +#define SH_EGG 11 /* egg */ +#define SH_CAPSULE 12 /* capsule */ /* TO ADD NEW SHAPE - * add a define starting with 'SH_' here; the number should be different from any others in this - * list. Add a descriptive comment. - */ - -// which way to calculate coupleconstant -#define POL_CM 0 // Clausius-Mossotti -#define POL_RR 1 // Radiative Reaction correction -#define POL_LDR 2 // Lattice Dispersion Relation -#define POL_CLDR 3 // Corrected Lattice Dispersion Relation -#define POL_FCD 4 // Filtered Coupled Dipoles -#define POL_SO 5 // Second Order formulation - -// how to calculate scattering quantities -#define SQ_DRAINE 0 // classical, as Draine -#define SQ_SO 1 // Second Order formulation - -// how to calculate interaction term -#define G_POINT_DIP 0 // as point dipoles -#define G_FCD 1 // Filtered Green's tensor (Filtered Coupled Dipoles) -#define G_FCD_ST 2 // quasi-static version of FCD -#define G_SO 3 // Second Order formulation - -// ldr constants -#define LDR_B1 1.8915316 -#define LDR_B2 -0.1648469 -#define LDR_B3 1.7700004 - -// 2nd_order constants -#define SO_B1 1.5867182 -#define SO_B2 0.13488017 -#define SO_B3 0.11895826 - -// two boundaries for separation between G_SO 'close', 'median', and 'far' -#define G_BOUND_CLOSE 1 // k*R^2/d < GB_CLOSE => 'close' -#define G_BOUND_MEDIAN 1 // k*R < GB_MEDIAN => 'median' - -// iterative methods; see iterative.c for info -#define IT_CGNR 0 -#define IT_BICGSTAB 1 -#define IT_BICG_CS 2 -#define IT_QMR_CS 3 - -// type of E field calculation -#define CE_NORMAL 0 // normal -#define CE_PARPER 1 // use symmetry to calculate both incident polarizations - // from one calculation of internal fields - -// path and size of tables + add a define starting with 'SH_' here; the number should be different from + any others in this list. Add a descriptive comment. */ + + +/* which way to calculate coupleconstant */ +#define POL_CM 0 /* Clausius Mossotti */ +#define POL_RR 1 /* Radiative Reaction correction */ +#define POL_LDR 2 /* Lattice Dispersion Relation */ +#define POL_CLDR 3 /* Corrected Lattice Dispersion Relation */ +#define POL_FCD 4 /* Filtered Coupled Dipoles */ +#define POL_SO 5 /* Second Order formulation */ + +/* how to calculate scattering quantities */ +#define SQ_DRAINE 0 /* classical, as Draine */ +#define SQ_SO 1 /* Second Order formulation */ + +/* how to calculate interaction term */ +#define G_POINT_DIP 0 /* as point dipoles */ +#define G_FCD 1 /* Filtered Green's tensor (Filtered Coupled Dipoles) */ +#define G_FCD_ST 2 /* Quasistatis version of FCD */ +#define G_SO 3 /* Second Order formulation */ + +/* ldr constants */ +#define LDR_B1 1.8915316 +#define LDR_B2 -0.1648469 +#define LDR_B3 1.7700004 + +/* 2nd_order constants */ +#define SO_B1 1.5867182 +#define SO_B2 0.13488017 +#define SO_B3 0.11895826 + +/* two boundaries for separation between G_SO 'close', 'median', and 'far' */ +#define G_BOUND_CLOSE 1 /* k*R^2/d < GB_CLOSE => 'close' */ +#define G_BOUND_MEDIAN 1 /* k*R < GB_MEDIAN => 'median' */ + +/* iterative methods; see iterative.c for info */ +#define IT_CGNR 0 +#define IT_BICGSTAB 1 +#define IT_BICG_CS 2 +#define IT_QMR_CS 3 + +/* type of E field calculation */ +#define CE_NORMAL 0 /* normal */ +#define CE_PARPER 1 /* use symmetry to calculate both incident polarizations + from one calculation of internal fields */ + +/* path and size of tables */ #define TAB_PATH "tables/" -#define TAB_FNAME(a) "t" #a "f.dat" // a is a number, e.g. TAB_FNAME(2) -> "t2f.dat" +#define TAB_FNAME(a) "t" #a "f.dat" /* a is a number, e.g. TAB_FNAME(2) -> "t2f.dat" */ #define TAB_SIZE 142 #define TAB_RMAX 10 -// beam types -#define B_PLANE 0 -#define B_LMINUS 1 -#define B_DAVIS3 2 -#define B_BARTON5 3 - -// types of scattering grid -#define SG_GRID 0 // grid of angles -#define SG_PAIRS 1 // set of independent pairs -// types of angles set -#define SG_RANGE 0 // range with uniformly spaced points -#define SG_VALUES 1 // any set of values - -// types of phi_integr (should be different one-bit numbers) -#define PHI_UNITY 1 // just integrate -#define PHI_COS2 2 // integrate with cos(2*phi) -#define PHI_SIN2 4 // integrate with sin(2*phi) -#define PHI_COS4 8 // integrate with cos(4*phi) -#define PHI_SIN4 16 // integrate with sin(4*phi) - -// ways to treat particle symmetries -#define SYM_AUTO 0 // automatic -#define SYM_NO 1 // do not take into account -#define SYM_ENF 2 // enforce - -// types of checkpoint (to save) -#define CHP_NONE 0 // do not save checkpoint -#define CHP_NORMAL 1 // save checkpoint if not finished in time and exit -#define CHP_REGULAR 2 // save checkpoints in regular time intervals (until finished or halted) -#define CHP_ALWAYS 3 /* save checkpoint either if finished or time elapsed - * and calculate all scattering quantities - */ -// return values for functions -#define CHP_EXIT -2 // exit after saving checkpoint - -// default values; other are specified in InitVariables (param.c) +/* beam types */ +#define B_PLANE 0 +#define B_LMINUS 1 +#define B_DAVIS3 2 +#define B_BARTON5 3 + +/* types of scattering grid */ +#define SG_GRID 0 /* grid of angles */ +#define SG_PAIRS 1 /* set of independent pairs */ +/* types of angles set */ +#define SG_RANGE 0 /* range with uniformly spaced points */ +#define SG_VALUES 1 /* any set of values */ + +/* types of phi_integr (should be different one-bit numbers) */ +#define PHI_UNITY 1 /* just integrate */ +#define PHI_COS2 2 /* integrate with cos(2*phi) */ +#define PHI_SIN2 4 /* integrate with sin(2*phi) */ +#define PHI_COS4 8 /* integrate with cos(4*phi) */ +#define PHI_SIN4 16 /* integrate with sin(4*phi) */ + +/* ways to treat particle symmetries */ +#define SYM_AUTO 0 /* automatic */ +#define SYM_NO 1 /* do not take into account */ +#define SYM_ENF 2 /* enforce */ + +/* types of checkpoint (to save) */ +#define CHP_NONE 0 /* do not save checkpoint */ +#define CHP_NORMAL 1 /* save checkpoint if not finished in time and exit */ +#define CHP_REGULAR 2 /* save checkpoints in regular time intervals + (until finished or halted) */ +#define CHP_ALWAYS 3 /* save checkpoint either if finished or time elapsed + and calculate all the scattering quantities */ + +/* return values for functions */ +#define CHP_EXIT -2 /* exit after saving checkpoint */ + +/* default values; other are specified in InitVariables (param.c) */ #define DEF_GRID (16*jagged) -#define MIN_AUTO_GRID 16 // minimum grid, when set from default dpl +#define MIN_AUTO_GRID 16 /* minimum grid, when set from default dpl */ -// numbers less than this value (compared to unity) are considered to be zero +/* numbers less than this value (compared to unity) are considered to be zero */ #define ROUND_ERR 1E-15 -// output and input file and directory names (can only be changed at compile time) +/* output and input file and dir names (can only be changed at compile time) */ #define F_EXPCOUNT "ExpCount" #define F_EXPCOUNT_LCK F_EXPCOUNT ".lck" #define F_CS "CrossSec" @@ -213,21 +211,20 @@ #define F_INTFLD "IntField" #define F_DIPPOL "DipPol" #define F_BEAM "IncBeam" -#define F_GRANS "granules" - // suffixes + /* suffixes */ #define F_XSUF "-X" #define F_YSUF "-Y" - // logs + /* logs */ #define F_LOG "log" -#define F_LOG_ERR "logerr.%d" // ringid as argument +#define F_LOG_ERR "logerr.%d" /* ringid as argument */ #define F_LOG_ORAVG "log_orient_avg" #define F_LOG_INT_CSCA "log_int_Csca" #define F_LOG_INT_ASYM "log_int_asym" - // log suffixes + /* log suffixes */ #define F_LOG_X "_x" #define F_LOG_Y "_y" #define F_LOG_Z "_z" - // Mueller files + /* mueller files */ #define F_MUEL "mueller" #define F_MUEL_SG "mueller_scatgrid" #define F_MUEL_INT "mueller_integr" @@ -235,42 +232,41 @@ #define F_MUEL_S2 "mueller_integr_s2" #define F_MUEL_C4 "mueller_integr_c4" #define F_MUEL_S4 "mueller_integr_s4" - // temporary files; used in printf with ringid as argument + /* temporary files; used in printf with ringid as argument */ #define F_BEAM_TMP "b%d.tmp" #define F_INTFLD_TMP "f%d.tmp" #define F_DIPPOL_TMP "p%d.tmp" #define F_GEOM_TMP "g%d.tmp" - // checkpoint files + /* checkpoint files */ #define F_CHP_LOG "chp.log" -#define F_CHP "chp.%d" // ringid as argument +#define F_CHP "chp.%d" /* ringid as argument */ -// default file and directory names; can be changed by command line options +/* default file and dir names; can be changed by command line options */ #define FD_ALLDIR_PARMS "alldir_params.dat" #define FD_AVG_PARMS "avg_params.dat" #define FD_SCAT_PARMS "scat_params.dat" #define FD_CHP_DIR "chpoint" -// number of components of D +/* number of components of D */ #define NDCOMP 6 -// shape formats; numbers should be nonnegative -#define SF_TEXT 0 // ADDA text format for one-domain particles -#define SF_TEXT_EXT 1 // ADDA text format for multi-domain particles -#define SF_DDSCAT 2 // DDSCAT 6.1 format (FRMFIL), produced by calltarget - -//************* Global Defines and Data structures (all for LogError) ***************** - -#define POSIT __FILE__,__LINE__ // position of the error in source code -// who definitions -#define ALL 0 // each processor may report this error -#define ONE 1 // only root processor reports an error - -// derived; for simplicity -#define ALL_POS ALL,POSIT -#define ONE_POS ONE,POSIT -// error codes -#define EC_ERROR 1 // error -#define EC_WARN 2 // warning -#define EC_INFO 3 // slight warning, that does not interfere at all with normal execution - -#endif // __const_h +/* shape formats; numbers should be nonnegative */ +#define SF_TEXT 0 /* ADDA text format for one-domain particles */ +#define SF_TEXT_EXT 1 /* ADDA text format for multi-domain particles */ +#define SF_DDSCAT 2 /* DDSCAT 6.1 format (FRMFIL), produced by calltarget */ + +/************** Global Defines and Data structures (all for LogError) *****************/ + +#define POSIT __FILE__,__LINE__ /* position of the error in source code */ +/* who definitions */ +#define ALL 0 /* each processor may report this error */ +#define ONE 1 /* only root processor reports an error */ + /* derived; for simplicity */ +#define ALL_POS ALL,POSIT +#define ONE_POS ONE,POSIT +/* error codes */ +#define EC_ERROR 1 /* error */ +#define EC_WARN 2 /* warning */ +#define EC_INFO 3 /* slight warning, that does not interfere at all with normal execution */ + +#endif /*__const_h*/ diff --git a/src/crosssec.c b/src/crosssec.c index b8a23081..07fb2d1c 100644 --- a/src/crosssec.c +++ b/src/crosssec.c @@ -1,6 +1,6 @@ /* FILE : crosssec.c * AUTH : Maxim Yurkin - * DESCR: All the functions to calculate scattering quantities (except Mueller matrix). + * DESCR: All the functions to calculate scattering qunatities (except Mueller matrix). * Functions to read different parameters from files. * Initialization of orientation of the particle. * @@ -25,1042 +25,1042 @@ #include "timing.h" #include "function.h" -// SEMI-GLOBAL VARIABLES +/* SEMI-GLOBAL VARIABLES */ -// defined and initialized in calculator.c +/* defined and initialized in calculator.c */ extern double *E2_alldir,*E2_alldir_buffer; extern const doublecomplex cc[][3]; extern doublecomplex *expsX,*expsY,*expsZ; -// defined and initialized in GenerateB.c +/* defined and initialized in GenerateB.c */ extern const double beam_center_0[3]; -// defined and initialized in param.c +/* defined and initialized in param.c */ extern const double prop_0[3],incPolX_0[3],incPolY_0[3]; extern const int ScatRelation; -// defined and initialized in timing.c -extern TIME_TYPE Timing_EField_ad,Timing_comm_EField_ad,Timing_EField_sg,Timing_comm_EField_sg, -Timing_ScatQuan_comm; +/* defined and initialized in timing.c */ +extern TIME_TYPE Timing_EField_ad,Timing_comm_EField_ad, + Timing_EField_sg,Timing_comm_EField_sg,Timing_ScatQuan_comm; -// used in CalculateE.c +/* used in CalculateE.c */ Parms_1D phi_sg; -// used in calculator.c -Parms_1D parms_alpha; // parameters of integration over alpha -Parms_1D parms[2]; // parameters for integration over theta,phi or beta,gamma -angle_set beta_int,gamma_int,theta_int,phi_int; // sets of angles -// used in param.c -char avg_string[MAX_PARAGRAPH]; // string for output of function that reads averaging parameters -// used in Romberg.c -int full_al_range; // whether full range of alpha angle is used +/* used in calculator.c */ +Parms_1D parms_alpha; /* parameters of integration over alpha */ +Parms_1D parms[2]; /* parameters for integration over theta,phi or beta,gamma */ +angle_set beta_int,gamma_int,theta_int,phi_int; /* sets of angles */ +/* used in param.c */ +char avg_string[MAX_PARAGRAPH]; /* string for output of function that reads averaging parameters */ +/* used in Romberg.c */ +int full_al_range; /* whether full range of alpha angle is used */ -//===================================================================== +/*=====================================================================*/ INLINE int AlldirIndex(const int theta,const int phi) -// Convert the (theta,phi) couple into a linear array index + /* Convert the (theta,phi) couple into a linear array index */ { - return (theta*phi_int.N + phi); + return (theta*phi_int.N + phi); } -//===================================================================== +/*=====================================================================*/ void InitRotation (void) -/* initialize matrices used for reference frame transformation; based on Mishchenko M.I. - * "Calculation of the amplitude matrix for a nonspherical particle in a fixed orientation", - * Applied Optics 39(6):1026-1031. This is so-called zyz-notation or y-convention. - */ + /* initialize matrices used for reference frame transformation + based on Mishchenko,M.I. "Calculation of the amplitude matrix + for a nonspherical particle in a fixed orientation", + Applied Optics 39(6):1026-1031. This is so-called zyz-notation + or y-convention. */ { - double ca,sa,cb,sb,cg,sg; - double beta_matr[3][3]; - double alph,bet,gam; // in radians - - // initialization of angle values in radians - alph=Deg2Rad(alph_deg); - bet=Deg2Rad(bet_deg); - gam=Deg2Rad(gam_deg); - // calculation of rotation matrix - ca=cos(alph); - sa=sin(alph); - cb=cos(bet); - sb=sin(bet); - cg=cos(gam); - sg=sin(gam); - - beta_matr[0][0]=ca*cb*cg-sa*sg; - beta_matr[0][1]=sa*cb*cg+ca*sg; - beta_matr[0][2]=-sb*cg; - beta_matr[1][0]=-ca*cb*sg-sa*cg; - beta_matr[1][1]=-sa*cb*sg+ca*cg; - beta_matr[1][2]=sb*sg; - beta_matr[2][0]=ca*sb; - beta_matr[2][1]=sa*sb; - beta_matr[2][2]=cb; - // rotation of incident field - MatrVec(beta_matr,prop_0,prop); - MatrVec(beta_matr,incPolY_0,incPolY); - MatrVec(beta_matr,incPolX_0,incPolX); - if (beam_asym) MatrVec(beta_matr,beam_center_0,beam_center); + double ca,sa,cb,sb,cg,sg; + double beta_matr[3][3]; + double alph,bet,gam; /* in radians */ + + /* initialization of angle values in radians */ + alph=Deg2Rad(alph_deg); + bet=Deg2Rad(bet_deg); + gam=Deg2Rad(gam_deg); + /* calculation of rotation matrix */ + ca=cos(alph); + sa=sin(alph); + cb=cos(bet); + sb=sin(bet); + cg=cos(gam); + sg=sin(gam); + + beta_matr[0][0]=ca*cb*cg-sa*sg; + beta_matr[0][1]=sa*cb*cg+ca*sg; + beta_matr[0][2]=-sb*cg; + beta_matr[1][0]=-ca*cb*sg-sa*cg; + beta_matr[1][1]=-sa*cb*sg+ca*cg; + beta_matr[1][2]=sb*sg; + beta_matr[2][0]=ca*sb; + beta_matr[2][1]=sa*sb; + beta_matr[2][2]=cb; + /* rotation of incident field */ + MatrVec(beta_matr,prop_0,prop); + MatrVec(beta_matr,incPolY_0,incPolY); + MatrVec(beta_matr,incPolX_0,incPolX); + if (beam_asym) MatrVec(beta_matr,beam_center_0,beam_center); } -//===================================================================== -// currently not used +/*=====================================================================*/ + /* currently not used */ static int ReadLine(FILE *,const char *,char *,const int) ATT_UNUSED; -static int ReadLine(FILE *file,const char *fname, // opened file and filename - char *buf,const int buf_size) // buffer for line and its size -// reads the first uncommented line; returns 1 if EOF reached +static int ReadLine(FILE *file,const char *fname, /* opened file and filename */ + char *buf,const int buf_size) /* buffer for line and its size */ + /* reads the first uncommented line; returns 1 if EOF reached */ { - while (!feof(file)) { - fgets(buf,buf_size,file); - if (*buf!='#') { // if uncommented - if (strstr(buf,"\n")==NULL && !feof(file)) LogError(EC_ERROR,ONE_POS, - "Buffer overflow while reading '%s' (size of uncommented line > %d)", - fname,buf_size-1); - else return 0; // complete line is read - } // finish reading the commented line - else while (strstr(buf,"\n")==NULL && !feof(file)) fgets(buf,buf_size,file); - } - return 1; + while (!feof(file)) { + fgets(buf,buf_size,file); + if (*buf!='#') { /* if uncommented */ + if (strstr(buf,"\n")==NULL && !feof(file)) LogError(EC_ERROR,ONE_POS, + "Buffer overflow while reading '%s' (size of uncommented line > %d)", + fname,buf_size-1); + else return 0; /* complete line is read */ + } /* finish reading the commented line */ + else while (strstr(buf,"\n")==NULL && !feof(file)) fgets(buf,buf_size,file); + } + return 1; } -//===================================================================== +/*=====================================================================*/ -static void ReadLineStart(FILE *file,const char *fname, // opened file and filename - char *buf,const int buf_size, // buffer for line and its size - const char *start) // beginning of the line to search -// reads the first line that starts with 'start' +static void ReadLineStart(FILE *file,const char *fname, /* opened file and filename */ + char *buf,const int buf_size, /* buffer for line and its size */ + const char *start) /* beginning of the line to search */ + /* reads the first line that starts with 'start' */ { - while (!feof(file)) { - fgets(buf,buf_size,file); - if (strstr(buf,start)==buf) { // if correct beginning - if (strstr(buf,"\n")==NULL && !feof(file)) LogError(EC_ERROR,ONE_POS, - "Buffer overflow while reading '%s' (size of essential line > %d)", - fname,buf_size-1); - else return; // line found and fits into buffer - } // finish reading unmatched line - else while (strstr(buf,"\n")==NULL && !feof(file)) fgets(buf,buf_size,file); - } - LogError(EC_ERROR,ONE_POS,"String '%s' is not found (in correct place) in file '%s'", - start,fname); + while (!feof(file)) { + fgets(buf,buf_size,file); + if (strstr(buf,start)==buf) { /* if correct beginning */ + if (strstr(buf,"\n")==NULL && !feof(file)) LogError(EC_ERROR,ONE_POS, + "Buffer overflow while reading '%s' (size of essential line > %d)", + fname,buf_size-1); + else return; /* line found and fits into buffer */ + } /* finish reading unmatched line */ + else while (strstr(buf,"\n")==NULL && !feof(file)) fgets(buf,buf_size,file); + } + LogError(EC_ERROR,ONE_POS, + "String '%s' is not found (in correct place) in file '%s'",start,fname); } -//===================================================================== +/*=====================================================================*/ -INLINE void ScanDouble(FILE *file,const char *fname,char *buf,const int buf_size,const char *start, - double *res) -/* scans double value from a line starting with exactly 'start'; contains the same arguments as - * ReadLineStart function, plus pointer to where the result should be placed - */ +INLINE void ScanDouble(FILE *file,const char *fname, /* arguments of the ReadLineStart function */ + char *buf,const int buf_size,const char *start, /* ... */ + double *res) /* result */ + /* scans double value from a line starting with exactly 'start' */ { - ReadLineStart(file,fname,buf,buf_size,start); - if (sscanf(buf+strlen(start),"%lf",res)!=1) LogError(EC_ERROR,ONE_POS, - "Error reading value after '%s' in file '%s'",start,fname); + ReadLineStart(file,fname,buf,buf_size,start); + if (sscanf(buf+strlen(start),"%lf",res)!=1) LogError(EC_ERROR,ONE_POS, + "Error reading value after '%s' in file '%s'",start,fname); } -//===================================================================== +/*=====================================================================*/ -INLINE void ScanInt(FILE *file,const char *fname,char *buf,const int buf_size,const char *start, - int *res) -/* scans integer value from a line starting with exactly 'start'; contains the same arguments as - * ReadLineStart function, plus pointer to where the result should be placed - */ +INLINE void ScanInt(FILE *file,const char *fname, /* arguments of the ReadLineStart function */ + char *buf,const int buf_size,const char *start, /* ... */ + int *res) /* result */ + /* scans integer value from a line starting with exactly 'start' */ { - double tmp; - - ReadLineStart(file,fname,buf,buf_size,start); - if (sscanf(buf+strlen(start),"%lf",&tmp)!=1) LogError(EC_ERROR,ONE_POS, - "Error reading value after '%s' in file '%s'",start,fname); - if (tmp<INT_MIN || tmp>INT_MAX) LogError(EC_ERROR,ONE_POS, - "Value after '%s' in file '%s' is out of integer bounds",start,fname); - if (sscanf(buf+strlen(start),"%d",res)!=1) LogError(EC_ERROR,ONE_POS, - "Error reading value after '%s' in file '%s'",start,fname); -} - -//===================================================================== + double tmp; -INLINE void ScanSizet(FILE *file,const char *fname,char *buf,const int buf_size,const char *start, - size_t *res) -/* scans large integer value from a line starting with exactly 'start'; contains the same arguments - * as ReadLineStart function, plus pointer to where the result should be placed. Conversion from - * (unsigned long) is needed (to remove warnings) because %z printf argument is not yet supported by - * all target compiler environments - */ + ReadLineStart(file,fname,buf,buf_size,start); + if (sscanf(buf+strlen(start),"%lf",&tmp)!=1) LogError(EC_ERROR,ONE_POS, + "Error reading value after '%s' in file '%s'",start,fname); + if (tmp <INT_MIN || tmp>INT_MAX) LogError(EC_ERROR,ONE_POS, + "Value after '%s' in file '%s' is out of integer bounds",start,fname); + if (sscanf(buf+strlen(start),"%d",res)!=1) LogError(EC_ERROR,ONE_POS, + "Error reading value after '%s' in file '%s'",start,fname); -{ - double tmp; - unsigned long res_tmp; - - ReadLineStart(file,fname,buf,buf_size,start); - if (sscanf(buf+strlen(start),"%lf",&tmp)!=1) LogError(EC_ERROR,ONE_POS, - "Error reading value after '%s' in file '%s'",start,fname); - if (tmp<0 || tmp>SIZE_MAX) LogError(EC_ERROR,ONE_POS, - "Value after '%s' in file '%s' is out of size_t bounds",start,fname); - if (sscanf(buf+strlen(start),"%lu",&res_tmp)!=1) LogError(EC_ERROR,ONE_POS, - "Error reading value after '%s' in file '%s'",start,fname); - *res=(size_t)res_tmp; } -//===================================================================== +/*=====================================================================*/ -INLINE void ScanString(FILE *file,const char *fname,char *buf,const int buf_size,const char *start, - char *res) -/* scans string value from a line starting with exactly 'start'; contains the same arguments as - * ReadLineStart function, plus pointer to where the result should be placed - */ +INLINE void ScanString(FILE *file,const char *fname, /* arguments of the ReadLineStart function */ + char *buf,const int buf_size,const char *start, /* ... */ + char *res) /* result */ + /* scans string value from a line starting with exactly 'start' */ { - ReadLineStart(file,fname,buf,buf_size,start); - if (sscanf(buf+strlen(start),"%s",res)!=1) LogError(EC_ERROR,ONE_POS, - "Error reading value after '%s' in file '%s'",start,fname); + ReadLineStart(file,fname,buf,buf_size,start); + if (sscanf(buf+strlen(start),"%s",res)!=1) LogError(EC_ERROR,ONE_POS, + "Error reading value after '%s' in file '%s'",start,fname); } -//===================================================================== +/*=====================================================================*/ static void ScanIntegrParms( - FILE *file,const char *fname, // opened file and filename - angle_set *a, // pointer to angle set - Parms_1D *b, // pointer to parameters of integration - const int ifcos, // if space angles equally in cos - char *buf,char* temp, // 2 buffers - const int buf_size) // and their size -// scan integration parameters for angles from file + FILE *file,const char *fname, /* opened file and filename */ + angle_set *a, /* pointer to angle set */ + Parms_1D *b, /* pointer to parameters of integration */ + const int ifcos, /* if space angles equally in cos */ + char *buf,char* temp, /* 2 buffers */ + const int buf_size) /* and their size */ + /* scan integration parameters for angles from file */ { - size_t i; - double unit; - - // scan file - ScanDouble(file,fname,buf,buf_size,"min=",&(a->min)); - ScanDouble(file,fname,buf,buf_size,"max=",&(a->max)); - ScanInt(file,fname,buf,buf_size,"Jmin=",&(b->Jmin)); - ScanInt(file,fname,buf,buf_size,"Jmax=",&(b->Jmax)); - ScanDouble(file,fname,buf,buf_size,"eps=",&(b->eps)); - ScanString(file,fname,buf,buf_size,"equiv=",temp); - if (strcmp(temp,"true")==0) b->equival=TRUE; - else if (strcmp(temp,"false")==0) b->equival=FALSE; - else LogError(EC_ERROR,ONE_POS,"Wrong argument of 'equiv' option in file %s",fname); - ScanString(file,fname,buf,buf_size,"periodic=",temp); - if (strcmp(temp,"true")==0) b->periodic=TRUE; - else if (strcmp(temp,"false")==0) b->periodic=FALSE; - else LogError(EC_ERROR,ONE_POS,"Wrong argument of 'periodic' option in file %s",fname); - - // fill all parameters - if (a->min==a->max) { - a->N=b->Grid_size=1; - b->Jmax=1; - } - else { - // consistency check - if (a->min>a->max) LogError(EC_ERROR,ONE_POS, - "Wrong range (min=%g, max=%g) in file %s (max must be >= min)",a->min,a->max,fname); - if (b->Jmax<b->Jmin) LogError(EC_ERROR,ONE_POS, - "Wrong Jmax (%d) in file %s; it must be >= Jmin (%d)",b->Jmax,fname,b->Jmin); - if (b->Jmin<1) LogError(EC_ERROR,ONE_POS, - "Wrong Jmin (%d) in file %s (must be >=1)",b->Jmin,fname); - if (b->eps<0) LogError(EC_ERROR,ONE_POS, - "Wrong eps (%g) in file %s (must be >=0)",b->eps,fname); - if (b->Jmax >= (int)(8*sizeof(int))) LogError(EC_ERROR,ONE_POS, - "Too large Jmax(%d) in file %s, it will cause integer overflow",b->Jmax,fname); - - a->N=b->Grid_size=(1 << b->Jmax) + 1; - if (b->equival && a->N>1) (a->N)--; - } - // initialize points of integration - MALLOC_VECTOR(a->val,double,a->N,ALL); - memory += a->N*sizeof(double); - - if (ifcos) { // make equal intervals in cos(angle) - // consistency check - if (a->min<0) LogError(EC_ERROR,ONE_POS, - "Wrong min (%g) in file %s (must be >=0 for this angle)",a->min,fname); - if (a->max>180) LogError(EC_ERROR,ONE_POS, - "Wrong max (%g) in file %s (must be <=180 for this angle)",a->max,fname); - b->min=cos(Deg2Rad(a->max)); - b->max=cos(Deg2Rad(a->min)); - if (fabs(b->min)<ROUND_ERR) b->min=0; // just for convenience of display in log file - if (fabs(b->max)<ROUND_ERR) b->max=0; - if (b->Grid_size==1) a->val[0]=a->min; - else { - unit = (b->max - b->min)/(b->Grid_size-1); - for (i=0;i<a->N;i++) a->val[i] = Rad2Deg(acos(b->min+unit*i)); - } - } - else { // make equal intervals in angle - b->min=Deg2Rad(a->min); - b->max=Deg2Rad(a->max); - if (b->Grid_size==1) a->val[0]=a->min; - else { - unit = (a->max - a->min)/(b->Grid_size-1); - for (i=0;i<a->N;i++) a->val[i] = a->min + unit*i; - } - } + size_t i; + double unit; + + /* scan file */ + ScanDouble(file,fname,buf,buf_size,"min=",&(a->min)); + ScanDouble(file,fname,buf,buf_size,"max=",&(a->max)); + ScanInt(file,fname,buf,buf_size,"Jmin=",&(b->Jmin)); + ScanInt(file,fname,buf,buf_size,"Jmax=",&(b->Jmax)); + ScanDouble(file,fname,buf,buf_size,"eps=",&(b->eps)); + ScanString(file,fname,buf,buf_size,"equiv=",temp); + if (strcmp(temp,"true")==0) b->equival=TRUE; + else if (strcmp(temp,"false")==0) b->equival=FALSE; + else LogError(EC_ERROR,ONE_POS,"Wrong argument of 'equiv' option in file %s",fname); + ScanString(file,fname,buf,buf_size,"periodic=",temp); + if (strcmp(temp,"true")==0) b->periodic=TRUE; + else if (strcmp(temp,"false")==0) b->periodic=FALSE; + else LogError(EC_ERROR,ONE_POS,"Wrong argument of 'periodic' option in file %s",fname); + + /* fill all parameters */ + if (a->min==a->max) { + a->N=b->Grid_size=1; + b->Jmax=1; + } + else { + /* consistency check */ + if (a->min>a->max) LogError(EC_ERROR,ONE_POS, + "Wrong range (min=%g, max=%g) in file %s (max must be >= min)",a->min,a->max,fname); + if (b->Jmax<b->Jmin) LogError(EC_ERROR,ONE_POS, + "Wrong Jmax (%d) in file %s; it must be >= Jmin (%d)",b->Jmax,fname,b->Jmin); + if (b->Jmin<1) LogError(EC_ERROR,ONE_POS, + "Wrong Jmin (%d) in file %s (must be >=1)",b->Jmin,fname); + if (b->eps<0) LogError(EC_ERROR,ONE_POS, + "Wrong eps (%g) in file %s (must be >=0)",b->eps,fname); + if (b->Jmax >= (int)(8*sizeof(int))) LogError(EC_ERROR,ONE_POS, + "Too large Jmax(%d) in file %s, it will cause integer overflow",b->Jmax,fname); + + a->N=b->Grid_size=(1 << b->Jmax) + 1; + if (b->equival && a->N>1) (a->N)--; + } + /* initialize points of integration */ + MALLOC_VECTOR(a->val,double,a->N,ALL); + memory += a->N*sizeof(double); + + if (ifcos) { /* make equal intervals in cos(angle) */ + /* consistency check */ + if (a->min<0) LogError(EC_ERROR,ONE_POS, + "Wrong min (%g) in file %s (must be >=0 for this angle)",a->min,fname); + if (a->max>180) LogError(EC_ERROR,ONE_POS, + "Wrong max (%g) in file %s (must be <=180 for this angle)",a->max,fname); + b->min=cos(Deg2Rad(a->max)); + b->max=cos(Deg2Rad(a->min)); + if (fabs(b->min)<ROUND_ERR) b->min=0; /* just for convenience of display in log file */ + if (fabs(b->max)<ROUND_ERR) b->max=0; + if (b->Grid_size==1) a->val[0]=a->min; + else { + unit = (b->max - b->min)/(b->Grid_size-1); + for (i=0;i<a->N;i++) a->val[i] = Rad2Deg(acos(b->min+unit*i)); + } + } + else { /* make equal intervals in angle */ + b->min=Deg2Rad(a->min); + b->max=Deg2Rad(a->max); + if (b->Grid_size==1) a->val[0]=a->min; + else { + unit = (a->max - a->min)/(b->Grid_size-1); + for (i=0;i<a->N;i++) a->val[i] = a->min + unit*i; + } + } } -//===================================================================== +/*=====================================================================*/ -static int ScanAngleSet(FILE *file,const char *fname, // opened file and filename - angle_set *a, // pointers to angle set - char *buf,char *temp, // 2 buffers - const int buf_size) // and their size -// scan range or set of angles (theta or phi) from file (used for scat_grid) +static int ScanAngleSet(FILE *file,const char *fname, /* opened file and filename */ + angle_set *a, /* pointers to angle set */ + char *buf,char *temp, /* 2 buffers */ + const int buf_size) /* and their size */ + /* scan range or set of angles (theta or phi) from file (used for scat_grid) */ { - size_t i; - double unit; - - ScanString(file,fname,buf,buf_size,"type=",temp); - ScanSizet(file,fname,buf,buf_size,"N=",&(a->N)); - // initialize angle array - MALLOC_VECTOR(a->val,double,a->N,ALL); - memory += a->N*sizeof(double); - - if (strcmp(temp,"range")==0) { - ScanDouble(file,fname,buf,buf_size,"min=",&(a->min)); - ScanDouble(file,fname,buf,buf_size,"max=",&(a->max)); - if (a->min>a->max) LogError(EC_ERROR,ONE_POS, - "Wrong range (min=%g, max=%g) in file %s (max must be >= min)",a->min,a->max,fname); - if (a->N==1) a->val[0]=(a->max + a->min)/2; - else { - unit = (a->max - a->min)/(a->N - 1); - for (i=0;i<a->N;i++) a->val[i] = a->min + unit*i; - } - return SG_RANGE; - } - else if (strcmp(temp,"values")==0) { - ReadLineStart(file,fname,buf,buf_size,"values="); - for (i=0;i<a->N;i++) { - fgets(buf,buf_size,file); - if (strstr(buf,"\n")==NULL && !feof(file)) LogError(EC_ERROR,ONE_POS, - "Buffer overflow while scanning lines in file '%s' (line size > %d)", - fname,buf_size-1); - if (sscanf(buf,"%lf\n",a->val+i)!=1) LogError(EC_ERROR,ONE_POS, - "Failed scanning values from line '%s' in file '%s'",buf,fname); - } - return SG_VALUES; - } - else LogError(EC_ERROR,ONE_POS,"Unknown type '%s' in file '%s'",temp,fname); - // not actually reached - return -1; + size_t i; + double unit; + int value; + + ScanString(file,fname,buf,buf_size,"type=",temp); + ScanInt(file,fname,buf,buf_size,"N=",&value); + if (value<=0) LogError(EC_ERROR,ONE_POS, + "Number of angles in file '%s' (after 'N=') must be positive",fname); + else a->N=value; + /* initialize angle array */ + MALLOC_VECTOR(a->val,double,a->N,ALL); + memory += a->N*sizeof(double); + + if (strcmp(temp,"range")==0) { + ScanDouble(file,fname,buf,buf_size,"min=",&(a->min)); + ScanDouble(file,fname,buf,buf_size,"max=",&(a->max)); + if (a->min>a->max) LogError(EC_ERROR,ONE_POS, + "Wrong range (min=%g, max=%g) in file %s (max must be >= min)",a->min,a->max,fname); + if (a->N==1) a->val[0]=(a->max + a->min)/2; + else { + unit = (a->max - a->min)/(a->N - 1); + for (i=0;i<a->N;i++) a->val[i] = a->min + unit*i; + } + return SG_RANGE; + } + else if (strcmp(temp,"values")==0) { + ReadLineStart(file,fname,buf,buf_size,"values="); + for (i=0;i<a->N;i++) { + fgets(buf,buf_size,file); + if (strstr(buf,"\n")==NULL && !feof(file)) LogError(EC_ERROR,ONE_POS, + "Buffer overflow while scanning lines in file '%s' (line size > %d)",fname,buf_size-1); + if (sscanf(buf,"%lf\n",a->val+i)!=1) LogError(EC_ERROR,ONE_POS, + "Failed scanning values from line '%s' in file '%s'",buf,fname); + } + return SG_VALUES; + } + else LogError(EC_ERROR,ONE_POS,"Unknown type '%s' in file '%s'",temp,fname); + /* not actually reached */ + return -1; } -//===================================================================== +/*=====================================================================*/ void ReadAvgParms(const char *fname) -// read parameters of orientation averaging from a file + /* read parameters of orientation averaging from a file */ { - FILE *input; - char buf[BUF_LINE],temp[BUF_LINE]; - - // open file - input=FOpenErr(fname,"r",ALL_POS); - //scan file - ReadLineStart(input,fname,buf,BUF_LINE,"alpha:"); - ScanIntegrParms(input,fname,&alpha_int,&parms_alpha,FALSE,buf,temp,BUF_LINE); - full_al_range=fabs(alpha_int.max-alpha_int.min-FULL_ANGLE)<FULL_ANGLE*ROUND_ERR; - ReadLineStart(input,fname,buf,BUF_LINE,"beta:"); - ScanIntegrParms(input,fname,&beta_int,&parms[THETA],TRUE,buf,temp,BUF_LINE); - ReadLineStart(input,fname,buf,BUF_LINE,"gamma:"); - ScanIntegrParms(input,fname,&gamma_int,&parms[PHI],FALSE,buf,temp,BUF_LINE); - // close file - FCloseErr(input,fname,ALL_POS); - // print info to string - SPRINTZ(avg_string, - "alpha: from %g to %g in %lu steps\n"\ - "beta: from %g to %g in (up to) %lu steps (equally spaced in cosine values)\n"\ - "gamma: from %g to %g in (up to) %lu steps\n"\ - "see file 'log_orient_avg' for details\n", - alpha_int.min,alpha_int.max,(unsigned long)alpha_int.N,beta_int.min,beta_int.max, - (unsigned long)beta_int.N,gamma_int.min,gamma_int.max,(unsigned long)gamma_int.N); - /* conversions to (unsigned long) are needed (to remove warnings) because %z printf argument is - * not yet supported by all target compiler environmets - */ - D("ReadAvgParms finished"); + FILE *input; + char buf[BUF_LINE],temp[BUF_LINE]; + + /* open file */ + input=FOpenErr(fname,"r",ALL_POS); + /*scan file */ + ReadLineStart(input,fname,buf,BUF_LINE,"alpha:"); + ScanIntegrParms(input,fname,&alpha_int,&parms_alpha,FALSE,buf,temp,BUF_LINE); + full_al_range=fabs(alpha_int.max-alpha_int.min-FULL_ANGLE)<FULL_ANGLE*ROUND_ERR; + ReadLineStart(input,fname,buf,BUF_LINE,"beta:"); + ScanIntegrParms(input,fname,&beta_int,&parms[THETA],TRUE,buf,temp,BUF_LINE); + ReadLineStart(input,fname,buf,BUF_LINE,"gamma:"); + ScanIntegrParms(input,fname,&gamma_int,&parms[PHI],FALSE,buf,temp,BUF_LINE); + /* close file */ + FCloseErr(input,fname,ALL_POS); + /* print info to string */ + SPRINTZ(avg_string, + "alpha: from %g to %g in %u steps\n"\ + "beta: from %g to %g in (up to) %u steps (equally spaced in cosine values)\n"\ + "gamma: from %g to %g in (up to) %u steps\n"\ + "see file 'log_orient_avg' for details\n", + alpha_int.min,alpha_int.max,alpha_int.N,beta_int.min,beta_int.max,beta_int.N, + gamma_int.min,gamma_int.max,gamma_int.N); + + D("ReadAvgParms finished"); } -//===================================================================== +/*=====================================================================*/ void ReadAlldirParms(const char *fname) -/* read integration parameters for asymmetry-parameter & C_sca; should not be used together with - * orientation averaging because they use the same storage space - parms - */ + /* read integration parameters for asymmetry-paramter & C_sca + should not be used together with orientation averaging because + they use the same storage space - parms */ { - FILE *input; - char buf[BUF_LINE],temp[BUF_LINE]; - - // open file - input=FOpenErr(fname,"r",ALL_POS); - //scan file - ReadLineStart(input,fname,buf,BUF_LINE,"theta:"); - ScanIntegrParms(input,fname,&theta_int,&parms[THETA],TRUE,buf,temp,BUF_LINE); - ReadLineStart(input,fname,buf,BUF_LINE,"phi:"); - ScanIntegrParms(input,fname,&phi_int,&parms[PHI],FALSE,buf,temp,BUF_LINE); - // close file - FCloseErr(input,fname,ALL_POS); - // print info - FPRINTZ(logfile, - "\nScattered field is calculated for all directions (for integrated scattering quantities)\n"\ - "theta: from %g to %g in (up to) %lu steps (equally spaced in cosine values)\n"\ - "phi: from %g to %g in (up to) %lu steps\n"\ - "see files 'log_int_***' for details\n\n", - theta_int.min,theta_int.max,(unsigned long)theta_int.N,phi_int.min,phi_int.max, - (unsigned long)phi_int.N); - /* conversions to (unsigned long) are needed (to remove warnings) because %z printf argument is - * not yet supported by all target compiler environmets - */ - - D("ReadAlldirParms finished"); + FILE *input; + char buf[BUF_LINE],temp[BUF_LINE]; + + /* open file */ + input=FOpenErr(fname,"r",ALL_POS); + /*scan file */ + ReadLineStart(input,fname,buf,BUF_LINE,"theta:"); + ScanIntegrParms(input,fname,&theta_int,&parms[THETA],TRUE,buf,temp,BUF_LINE); + ReadLineStart(input,fname,buf,BUF_LINE,"phi:"); + ScanIntegrParms(input,fname,&phi_int,&parms[PHI],FALSE,buf,temp,BUF_LINE); + /* close file */ + FCloseErr(input,fname,ALL_POS); + /* print info */ + FPRINTZ(logfile, + "\nScattered field is calculated for all directions (for integrated scattering quantities)\n"\ + "theta: from %g to %g in (up to) %u steps (equally spaced in cosine values)\n"\ + "phi: from %g to %g in (up to) %u steps\n"\ + "see files 'log_int_***' for details\n\n", + theta_int.min,theta_int.max,theta_int.N,phi_int.min,phi_int.max,phi_int.N); + + D("ReadAlldirParms finished"); } -//===================================================================== +/*=====================================================================*/ void ReadScatGridParms(const char *fname) -// read parameters of the grid on which to calculate scattered field + /* read parameters of the grid on which to calculate scattered field */ { - FILE *input; - char buf[BUF_LINE],temp[BUF_LINE]; - int theta_type,phi_type; - size_t i; - - // open file - input=FOpenErr(fname,"r",ALL_POS); - // scan file - ScanString(input,fname,buf,BUF_LINE,"global_type=",temp); - if (strcmp(temp,"grid")==0) { - angles.type = SG_GRID; - ReadLineStart(input,fname,buf,BUF_LINE,"theta:"); - theta_type=ScanAngleSet(input,fname,&(angles.theta),buf,temp,BUF_LINE); - if (phi_integr) { - ReadLineStart(input,fname,buf,BUF_LINE,"phi_integr:"); - ScanIntegrParms(input,fname,&(angles.phi),&phi_sg,FALSE,buf,temp,BUF_LINE); - phi_type = SG_RANGE; - } - else { - ReadLineStart(input,fname,buf,BUF_LINE,"phi:"); - phi_type=ScanAngleSet(input,fname,&(angles.phi),buf,temp,BUF_LINE); - } - angles.N=MultOverflow(angles.theta.N,angles.phi.N,ONE_POS,"angles.N");; - } - else if (strcmp(temp,"pairs")==0) { - if (phi_integr) - LogError(EC_ERROR,ONE_POS,"Integration over phi can't be done with 'global_type=pairs'"); - angles.type = SG_PAIRS; - ScanSizet(input,fname,buf,BUF_LINE,"N=",&(angles.N)); - angles.theta.N=angles.phi.N=angles.N; - // malloc angle arrays - MALLOC_VECTOR(angles.theta.val,double,angles.N,ALL); - MALLOC_VECTOR(angles.phi.val,double,angles.N,ALL); - memory += 2*angles.N*sizeof(double); - - ReadLineStart(input,fname,buf,BUF_LINE,"pairs="); - for (i=0;i<angles.N;i++) { - fgets(buf,BUF_LINE,input); - if (strstr(buf,"\n")==NULL && !feof(input)) LogError(EC_ERROR,ONE_POS, - "Buffer overflow while scanning lines in file '%s' (line size > %d)", - fname,BUF_LINE-1); - if (sscanf(buf,"%lf %lf\n",angles.theta.val+i,angles.phi.val+i)!=2) LogError(EC_ERROR, - ONE_POS,"Failed scanning values from line '%s' in file '%s'",buf,fname); - } - } - else LogError(EC_ERROR,ONE_POS,"Unknown global_type '%s' in file '%s'",temp,fname); - // close file - FCloseErr(input,fname,ALL_POS); - /* print info; conversions to (unsigned long) are needed (to remove warnings) because %z printf - * argument is not yet supported by all target compiler environmets - */ - if (ringid==ROOT) { - fprintf(logfile,"\nScattered field is calculated for multiple directions\n"); - if (angles.type==SG_GRID) { - if (theta_type==SG_RANGE) - fprintf(logfile,"theta: from %g to %g in %lu steps\n",angles.theta.min, - angles.theta.max,(unsigned long)angles.theta.N); - else if (theta_type==SG_VALUES) - fprintf(logfile,"theta: %lu given values\n",(unsigned long)angles.theta.N); - if (phi_type==SG_RANGE) { - fprintf(logfile,"phi: from %g to %g in %lu steps\n",angles.phi.min,angles.phi.max, - (unsigned long)angles.phi.N); - if (phi_integr) fprintf(logfile,"(Mueller matrix is integrated over phi)\n"); - } - else if (phi_type==SG_VALUES) - fprintf(logfile,"phi: %lu given values\n",(unsigned long)angles.phi.N); - } - else if (angles.type==SG_PAIRS) - fprintf(logfile,"Total %lu given (theta,phi) pairs\n",(unsigned long)angles.N); - fprintf(logfile,"\n"); - } - D("ReadScatGridParms finished"); + FILE *input; + char buf[BUF_LINE],temp[BUF_LINE]; + int theta_type,phi_type,value; + size_t i; + + /* open file */ + input=FOpenErr(fname,"r",ALL_POS); + /* scan file */ + ScanString(input,fname,buf,BUF_LINE,"global_type=",temp); + if (strcmp(temp,"grid")==0) { + angles.type = SG_GRID; + ReadLineStart(input,fname,buf,BUF_LINE,"theta:"); + theta_type=ScanAngleSet(input,fname,&(angles.theta),buf,temp,BUF_LINE); + if (phi_integr) { + ReadLineStart(input,fname,buf,BUF_LINE,"phi_integr:"); + ScanIntegrParms(input,fname,&(angles.phi),&phi_sg,FALSE,buf,temp,BUF_LINE); + phi_type = SG_RANGE; + } + else { + ReadLineStart(input,fname,buf,BUF_LINE,"phi:"); + phi_type=ScanAngleSet(input,fname,&(angles.phi),buf,temp,BUF_LINE); + } + angles.N=MultOverflow(angles.theta.N,angles.phi.N,ONE_POS,"angles.N");; + } + else if (strcmp(temp,"pairs")==0) { + if (phi_integr) + LogError(EC_ERROR,ONE_POS,"Integration over phi can't be done with 'global_type=pairs'"); + angles.type = SG_PAIRS; + ScanInt(input,fname,buf,BUF_LINE,"N=",&value); + if (value<=0) LogError(EC_ERROR,ONE_POS, + "Number of angle pairs in file '%s' (after 'N=') must be positive",fname); + else angles.N=value; + angles.theta.N=angles.phi.N=angles.N; + /* malloc angle arrays */ + MALLOC_VECTOR(angles.theta.val,double,angles.N,ALL); + MALLOC_VECTOR(angles.phi.val,double,angles.N,ALL); + memory += 2*angles.N*sizeof(double); + + ReadLineStart(input,fname,buf,BUF_LINE,"pairs="); + for (i=0;i<angles.N;i++) { + fgets(buf,BUF_LINE,input); + if (strstr(buf,"\n")==NULL && !feof(input)) LogError(EC_ERROR,ONE_POS, + "Buffer overflow while scanning lines in file '%s' (line size > %d)",fname,BUF_LINE-1); + if (sscanf(buf,"%lf %lf\n",angles.theta.val+i,angles.phi.val+i)!=2) + LogError(EC_ERROR,ONE_POS,"Failed scanning values from line '%s' in file '%s'",buf,fname); + } + } + else LogError(EC_ERROR,ONE_POS,"Unknown global_type '%s' in file '%s'",temp,fname); + /* close file */ + FCloseErr(input,fname,ALL_POS); + /* print info */ + if (ringid==ROOT) { + fprintf(logfile,"\nScattered field is calculated for multiple directions\n"); + if (angles.type==SG_GRID) { + if (theta_type==SG_RANGE) + fprintf(logfile,"theta: from %g to %g in %u steps\n", + angles.theta.min,angles.theta.max,angles.theta.N); + else if (theta_type==SG_VALUES) + fprintf(logfile,"theta: %u given values\n",angles.theta.N); + if (phi_type==SG_RANGE) { + fprintf(logfile,"phi: from %g to %g in %u steps\n", + angles.phi.min,angles.phi.max,angles.phi.N); + if (phi_integr) fprintf(logfile,"(Mueller matrix is integrated over phi)\n"); + } + else if (phi_type==SG_VALUES) + fprintf(logfile,"phi: %u given values\n",angles.phi.N); + } + else if (angles.type==SG_PAIRS) + fprintf(logfile,"Total %u given (theta,phi) pairs\n",angles.N); + fprintf(logfile,"\n"); + } + D("ReadScatGridParms finished"); } -//=====================================================================*/ +/*=====================================================================*/ -void CalcField (doublecomplex *ebuff, // where to write calculated scattering amplitude - const double *n) // scattering direction -/* Near-optimal routine to compute the scattered fields at one specific angle (more exactly - - * scattering amplitude); Specific optimization are possible when e.g. n[0]=0 for scattering in - * yz-plane, however in this case it is very improbable that the routine will become a bottleneck. - * The latter happens mostly for cases, when grid of scattering angles is used with only small - * fraction of n, allowing simplifications. - */ +void CalcField (doublecomplex *ebuff, /* where to write calculated scattering amplitude */ + const double *n) /* scattering direction */ + /* Near-optimal routine to compute the scattered fields at one specific + angle (more exactly - scattering amplitude); + Specific optimization are possible when e.g. n[0]=0 for scattering in yz-plane, however in + this case it is very unprobable that the routine will become a bottleneck. The latter happens + mostly for cases, when grid of scattering angles is used with only small fraction of n, + allowing simplifications. */ { - double kkk; - doublecomplex a,m2,dpr; - doublecomplex sum[3],tbuff[3],tmp; - int i; - unsigned short ix,iy1,iy2,iz1,iz2; - size_t j,jjj; - double temp, na; - doublecomplex mult_mat[MAX_NMAT]; - const int scat_avg=TRUE; - - if (ScatRelation==SQ_SO) { - // !!! this should never happen - if (anisotropy) LogError(EC_ERROR,ONE_POS,"Incompatibility error in CalcField"); - // calculate correction coefficient - if (scat_avg) na=0; - else na=DotProd(n,prop); - temp=kd*kd/24; - for(i=0;i<Nmat;i++) { - cSquare(ref_index[i],m2); - // mult_mat=1-(kd^2/24)(m^2-2(n.a)m+1) - mult_mat[i][RE]=1-temp*(m2[RE]-2*na*ref_index[i][RE]+1); - mult_mat[i][IM]=temp*(2*na*ref_index[i][IM]-m2[IM]); - } - } - for(i=0;i<3;i++) sum[i][RE]=sum[i][IM]=0.0; - // prepare values of exponents, along each of the coordinates - imExp_arr(-kd*n[0],boxX,expsX); - imExp_arr(-kd*n[1],boxY,expsY); - imExp_arr(-kd*n[2],local_Nz_unif,expsZ); - /* not to double the code in the source we use two temporary defines,since the following 'if' - * cases differ only by one line of code; (taking 'if' inside the cycle will affect performance) - */ - /* this piece of code tries to use that usually only x position changes from dipole to dipole, - * saving a complex multiplication seems to be beneficial, even considering bookkeeping - * overhead; it may not be as good for very porous particles though, but for them this part of - * code is anyway fast relative to the FFT on a large grid; Further optimization is possible - * using some kind of plans, i.e. by preliminary analyzing the position of the real dipoles on - * the grid. - */ + double kkk; + doublecomplex a,m2,dpr; + doublecomplex sum[3],tbuff[3],tmp; + int i; + unsigned short ix,iy1,iy2,iz1,iz2; + size_t j,jjj; + double temp, na; + doublecomplex mult_mat[MAX_NMAT]; + const int scat_avg=TRUE; + + if (ScatRelation==SQ_SO) { + /* this should never happen !!! */ + if (anisotropy) LogError(EC_ERROR,ONE_POS,"Incompatibility error in CalcField"); + /* calculate correction coefficient */ + if (scat_avg) na=0; + else na=DotProd(n,prop); + temp=kd*kd/24; + for(i=0;i<Nmat;i++) { + cSquare(ref_index[i],m2); + /* mult_mat=1-(kd^2/24)(m^2-2(n.a)m+1) */ + mult_mat[i][RE]=1-temp*(m2[RE]-2*na*ref_index[i][RE]+1); + mult_mat[i][IM]=temp*(2*na*ref_index[i][IM]-m2[IM]); + } + } + for(i=0;i<3;i++) sum[i][RE]=sum[i][IM]=0.0; + /* prepare values of exponents, along each of the coordinates */ + imExp_arr(-kd*n[0],boxX,expsX); + imExp_arr(-kd*n[1],boxY,expsY); + imExp_arr(-kd*n[2],local_Nz_unif,expsZ); + /* not to double the code in the source we use two temporary defines, + since the following if cases differ only by one line of code; + (taking 'if' inside the cycle will affect performance) */ + /* this piece of code tries to use that usually only x position changes from dipole to dipole, + saving a complex multiplication seems to be beneficial, even considering bookkeeping overhead; + it may not be as good for very porous particles though, but for them this part of code is + anyway fast relative to the FFT on a large grid; Further optimization is possible using some + kind of plans, i.e. by preliminary analysing the position of the real dipoles on the grid. */ #define PART1\ - iy1=iz1=UNDEF;\ - for (j=0;j<local_nvoid_Ndip;++j) {\ - jjj=3*j;\ - /* a=exp(-ikr.n), but r is taken relative to the first dipole of the local box */\ - ix=position[jjj];\ - iy2=position[jjj+1];\ - iz2=position[jjj+2];\ - /* the second part is very improbable, but needed for robustness */\ - if (iy2!=iy1 || iz2!=iz1) {\ - iy1=iy2;\ - iz1=iz2;\ - cMult(expsY[iy2],expsZ[iz2],tmp);\ - }\ - cMult(tmp,expsX[ix],a); + iy1=iz1=UNDEF;\ + for (j=0;j<local_nvoid_Ndip;++j) {\ + jjj=3*j;\ + /* a=exp(-ikr.n), but r is taken relative to the first dipole of the local box */\ + ix=position[jjj];\ + iy2=position[jjj+1];\ + iz2=position[jjj+2];\ + if (iy2!=iy1 || iz2!=iz1) { /* the second part is very unprobable, but needed for robustness */\ + iy1=iy2;\ + iz1=iz2;\ + cMult(expsY[iy2],expsZ[iz2],tmp);\ + }\ + cMult(tmp,expsX[ix],a); #define PART2\ - /* sum(P*exp(-ik*r.n)) */\ - for(i=0;i<3;i++) {\ - sum[i][RE]+=pvec[jjj+i][RE]*a[RE]-pvec[jjj+i][IM]*a[IM];\ - sum[i][IM]+=pvec[jjj+i][RE]*a[IM]+pvec[jjj+i][IM]*a[RE];\ - }\ - } /* end for j */ - if (ScatRelation==SQ_SO) { - PART1 - cMultSelf(a,mult_mat[material[j]]); - PART2 - } - else { - PART1 - PART2 - } + /* sum(P*exp(-ik*r.n)) */\ + for(i=0;i<3;i++) {\ + sum[i][RE]+=pvec[jjj+i][RE]*a[RE]-pvec[jjj+i][IM]*a[IM];\ + sum[i][IM]+=pvec[jjj+i][RE]*a[IM]+pvec[jjj+i][IM]*a[RE];\ + }\ + } /* end for j */ + if (ScatRelation==SQ_SO) { + PART1 + cMultSelf(a,mult_mat[material[j]]); + PART2 + } + else { + PART1 + PART2 + } #undef PART1 #undef PART2 - // tbuff=(I-nxn).sum=sum-n*(n.sum) - crDotProd(sum,n,dpr); - cScalMultRVec(n,dpr,tbuff); - cvSubtr(sum,tbuff,tbuff); - // ebuff=(-i*k^3)*exp(-ikr0.n)*tbuff, where r0=box_origin_unif - imExp(-WaveNum*DotProd(box_origin_unif,n),a); // a=exp(-ikr0.n) - kkk=WaveNum*WaveNum*WaveNum; - tmp[RE]=a[IM]*kkk; // tmp=(-i*k^3)*exp(-ikr0.n) - tmp[IM]=-a[RE]*kkk; - cvMultScal_cmplx(tmp,tbuff,ebuff); + /* tbuff=(I-nxn).sum=sum-n*(n.sum) */ + crDotProd(sum,n,dpr); + cScalMultRVec(n,dpr,tbuff); + cvSubtr(sum,tbuff,tbuff); + /* ebuff=(-i*k^3)*exp(-ikr0.n)*tbuff, where r0=box_origin_unif */ + imExp(-WaveNum*DotProd(box_origin_unif,n),a); /* a=exp(-ikr0.n) */ + kkk=WaveNum*WaveNum*WaveNum; + tmp[RE]=a[IM]*kkk; /* tmp=(-i*k^3)*exp(-ikr0.n) */ + tmp[IM]=-a[RE]*kkk; + cvMultScal_cmplx(tmp,tbuff,ebuff); } -//===================================================================== +/*=====================================================================*/ double ExtCross(const double *incPol) -// Calculate the Extinction cross-section + /* Calculate the Extinction cross-section */ { - doublecomplex ebuff[3]; - double sum; - size_t i; - - if (beamtype==B_PLANE) { - CalcField (ebuff,prop); - sum=crDotProd_Re(ebuff,incPol); // incPol is real, so no conjugate is needed - MyInnerProduct(&sum,double_type,1,&Timing_ScatQuan_comm); - sum*=FOUR_PI/(WaveNum*WaveNum); - } - else { /* more general formula; normalization is done assuming the unity amplitude of the - * electric field in the focal point of the beam; Currently does not comply with - * ScatRelation=SO - */ - sum=0; - for (i=0;i<local_nvoid_Ndip;++i) sum+=cDotProd_Im(pvec+3*i,Einc+3*i); // sum{Im(P.E_inc*)} - MyInnerProduct(&sum,double_type,1,&Timing_ScatQuan_comm); - sum*=FOUR_PI*WaveNum; - } - return sum; + doublecomplex ebuff[3]; + double sum; + size_t i; + + if (beamtype==B_PLANE) { + CalcField (ebuff,prop); + sum=crDotProd_Re(ebuff,incPol); /* incPol is real, so no conjugate is needed */ + MyInnerProduct(&sum,double_type,1,&Timing_ScatQuan_comm); + sum*=FOUR_PI/(WaveNum*WaveNum); + } + else { /* more general formula; normalization is done assuming the unity amplitude of the + electric field in the focal point of the beam; + Currently does not comply with ScatRelation=SO */ + sum=0; + for (i=0;i<local_nvoid_Ndip;++i) sum+=cDotProd_Im(pvec+3*i,Einc+3*i); /* sum{Im(P.E_inc*)} */ + MyInnerProduct(&sum,double_type,1,&Timing_ScatQuan_comm); + sum*=FOUR_PI*WaveNum; + } + return sum; } -//===================================================================== +/*=====================================================================*/ double AbsCross(void) -// Calculate the Absorption cross-section for process 0 + /* Calculate the Absorption cross-section for process 0 */ { - size_t dip,index; - int i,j; - unsigned char mat; - double sum,dummy,temp1,temp2; - doublecomplex m2; - double *m; // not doublecomplex=double[2] to allow assignment to it - double cc_inv_im[MAX_NMAT][3]; // Im(1/cc)=-Im(cc)/|cc|^2 - double mult_mat[MAX_NMAT]; - - if (ScatRelation==SQ_DRAINE) { - // calculate constant and cc_inv_im - dummy = 2*WaveNum*WaveNum*WaveNum/3; - for (i=0;i<Nmat;i++) for (j=0;j<3;j++) cc_inv_im[i][j]=cInvIm(cc[i][j]); - // main cycle - for (dip=0,sum=0;dip<local_nvoid_Ndip;++dip) { - mat=material[dip]; - index=3*dip; - // Im(P.Eexc(*))-(2/3)k^3*|P|^2=|P|^2*(-Im(1/cc)-(2/3)k^3) - for(i=0;i<3;i++) sum-=(cc_inv_im[mat][i]+dummy)*cAbs2(pvec[index+i]); - } - } - else if (ScatRelation==SQ_SO) { - // !!! this should never happen - if (anisotropy) LogError(EC_ERROR,ONE_POS,"Incompatibility error in AbsCross"); - // calculate constants - temp1=kd*kd/6; - temp2=FOUR_PI/(gridspace*gridspace*gridspace); - for (i=0;i<Nmat;i++) { - m=ref_index[i]; - cSquare(m,m2); - m2[RE]-=1; - // mult_mat=-Im(1/chi)*(1+(kd*Im(m))^2)/d^3; chi=(m^2-1)/(4*PI) - mult_mat[i]=temp2*m2[IM]*(1+temp1*m[IM]*m[IM])/cAbs2(m2); - } - // main cycle - for (dip=0,sum=0;dip<local_nvoid_Ndip;++dip) - sum+=mult_mat[material[dip]]*cvNorm2(pvec+3*dip); - } - MyInnerProduct(&sum,double_type,1,&Timing_ScatQuan_comm); - return FOUR_PI*WaveNum*sum; + size_t dip,index; + int i,j; + unsigned char mat; + double sum, dummy, temp1,temp2; + doublecomplex m2; + double *m; /* not doublecomplex=double[2] to allow assignment to it */ + double cc_inv_im[MAX_NMAT][3]; /* Im(1/cc)=-Im(cc)/|cc|^2 */ + double mult_mat[MAX_NMAT]; + + if (ScatRelation==SQ_DRAINE) { + /* calculate constant and cc_inv_im */ + dummy = 2*WaveNum*WaveNum*WaveNum/3; + for (i=0;i<Nmat;i++) for (j=0;j<3;j++) cc_inv_im[i][j]=cInvIm(cc[i][j]); + /* main cycle */ + for (dip=0,sum=0;dip<local_nvoid_Ndip;++dip) { + mat=material[dip]; + index=3*dip; + /* Im(P.Eexc(*))-(2/3)k^3*|P|^2=|P|^2*(-Im(1/cc)-(2/3)k^3) */ + for(i=0;i<3;i++) sum-=(cc_inv_im[mat][i]+dummy)*cAbs2(pvec[index+i]); + } + } + else if (ScatRelation==SQ_SO) { + /* this should never happen !!! */ + if (anisotropy) LogError(EC_ERROR,ONE_POS,"Incompatibility error in AbsCross"); + /* calculate constants */ + temp1=kd*kd/6; + temp2=FOUR_PI/(gridspace*gridspace*gridspace); + for (i=0;i<Nmat;i++) { + m=ref_index[i]; + cSquare(m,m2); + m2[RE]-=1; + /* mult_mat=-Im(1/hi)*(1+(kd*Im(m))^2)/d^3; hi=(m^2-1)/(4*PI) */ + mult_mat[i]=temp2*m2[IM]*(1+temp1*m[IM]*m[IM])/cAbs2(m2); + } + /* main cycle */ + for (dip=0,sum=0;dip<local_nvoid_Ndip;++dip) + sum+=mult_mat[material[dip]]*cvNorm2(pvec+3*dip); + } + MyInnerProduct(&sum,double_type,1,&Timing_ScatQuan_comm); + return FOUR_PI*WaveNum*sum; } -//===================================================================== +/*=====================================================================*/ void CalcAlldir(void) -// calculate scattered field in many directions + /* calculate scattered field in many directions */ { - int index,npoints,point; - size_t i,j; - TIME_TYPE tstart; - double robserver[3],incPolpar[3],incPolper[3],cthet,sthet,cphi,sphi,th,ph; - doublecomplex ebuff[3]; - - // Calculate field - tstart = GET_TIME(); - npoints = theta_int.N*phi_int.N; - PRINTZ("Calculating scattered field for the whole solid angle:\n"); - for (i=0,point=0;i<theta_int.N;++i) { - th=Deg2Rad(theta_int.val[i]); - cthet=cos(th); - sthet=sin(th); - for (j=0;j<phi_int.N;++j) { - ph=Deg2Rad(phi_int.val[j]); - cphi=cos(ph); - sphi=sin(ph); - // robserver = cos(theta)*prop + sin(theta)*[cos(phi)*incPolX + sin(phi)*incPolY]; - LinComb(incPolX,incPolY,cphi,sphi,robserver); - LinComb(prop,robserver,cthet,sthet,robserver); - // calculate scattered field - main bottleneck - CalcField(ebuff,robserver); - /* set Epar and Eper - use E2_alldir array to store them this is done to decrease - * communications in 1.5 times - */ - // incPolper = sin(phi)*incPolX - cos(phi)*incPolY; - LinComb(incPolX,incPolY,sphi,-cphi,incPolper); - // incPolpar = -sin(theta)*prop + cos(theta)*[cos(phi)*incPolX + sin(phi)*incPolY]; - LinComb(incPolX,incPolY,cphi,sphi,incPolpar); - LinComb(prop,incPolpar,-sthet,cthet,incPolpar); - index=2*point; - crDotProd(ebuff,incPolper,((doublecomplex*)E2_alldir)[index]); - crDotProd(ebuff,incPolpar,((doublecomplex*)E2_alldir)[index+1]); - point++; - // show progress - if (((10*point)%npoints)<10) { - PRINTZ(" %d%%",100*point/npoints); - FFLUSHZ(stdout); - } - } - } - // accumulate fields - Accumulate(E2_alldir,4*npoints,E2_alldir_buffer,&Timing_comm_EField_ad); - // calculate square of the field - for (point=0;point<npoints;point++) - E2_alldir[point] = cAbs2(((doublecomplex*)E2_alldir)[2*point]) + - cAbs2(((doublecomplex*)E2_alldir)[2*point+1]); - PRINTZ(" done\n"); - FFLUSHZ(stdout); - // timing - Timing_EField_ad = GET_TIME() - tstart; - Timing_EField += Timing_EField_ad; + int index,npoints,point; + size_t i,j; + TIME_TYPE tstart; + double robserver[3],incPolpar[3],incPolper[3],cthet,sthet,cphi,sphi,th,ph; + doublecomplex ebuff[3]; + + /* Calculate field */ + tstart = GET_TIME(); + npoints = theta_int.N*phi_int.N; + PRINTZ("Calculating scattered field for the whole solid angle:\n"); + for (i=0,point=0;i<theta_int.N;++i) { + th=Deg2Rad(theta_int.val[i]); + cthet=cos(th); + sthet=sin(th); + for (j=0;j<phi_int.N;++j) { + ph=Deg2Rad(phi_int.val[j]); + cphi=cos(ph); + sphi=sin(ph); + /* robserver = cos(theta)*prop + sin(theta)*[cos(phi)*incPolX + sin(phi)*incPolY]; */ + LinComb(incPolX,incPolY,cphi,sphi,robserver); + LinComb(prop,robserver,cthet,sthet,robserver); + /* calculate scattered field - main bottleneck */ + CalcField(ebuff,robserver); + /* set Epar and Eper - use E2_alldir array to store them + this is done to decrease communications in 1.5 times */ + + /* incPolper = sin(phi)*incPolX - cos(phi)*incPolY; */ + LinComb(incPolX,incPolY,sphi,-cphi,incPolper); + /* incPolpar = -sin(theta)*prop + cos(theta)*[cos(phi)*incPolX + sin(phi)*incPolY]; */ + LinComb(incPolX,incPolY,cphi,sphi,incPolpar); + LinComb(prop,incPolpar,-sthet,cthet,incPolpar); + + index=2*point; + crDotProd(ebuff,incPolper,((doublecomplex*)E2_alldir)[index]); + crDotProd(ebuff,incPolpar,((doublecomplex*)E2_alldir)[index+1]); + + point++; + if (((10*point)%npoints)<10) { + PRINTZ(" %d%%",100*point/npoints); + FFLUSHZ(stdout); + } + } + } + /* accumulate fields */ + Accumulate(E2_alldir,4*npoints,E2_alldir_buffer,&Timing_comm_EField_ad); + /* calculate square of the field */ + for (point=0;point<npoints;point++) + E2_alldir[point] = cAbs2(((doublecomplex*)E2_alldir)[2*point]) + + cAbs2(((doublecomplex*)E2_alldir)[2*point+1]); + PRINTZ(" done\n"); + FFLUSHZ(stdout); + /* timing */ + Timing_EField_ad = GET_TIME() - tstart; + Timing_EField += Timing_EField_ad; } -//===================================================================== +/*=====================================================================*/ void CalcScatGrid(const char which) -// calculate scattered field in many directions + /* calculate scattered field in many directions */ { - size_t i,j,n,point,index; - TIME_TYPE tstart; - double robserver[3],incPolpar[3],incPolper[3],cthet,sthet,cphi,sphi,th,ph; - doublecomplex ebuff[3]; - doublecomplex *Egrid; // either EgridX or EgridY - - // Calculate field - tstart = GET_TIME(); - // choose which array to fill - if (which=='X') Egrid=EgridX; - else if (which=='Y') Egrid=EgridY; - // set type of cycling through angles - if (angles.type==SG_GRID) n=angles.phi.N; - else if (angles.type==SG_PAIRS) n=1; - PRINTZ("Calculating grid of scattered field:\n"); - // main cycle - for (i=0,point=0;i<angles.theta.N;++i) { - th=Deg2Rad(angles.theta.val[i]); - cthet=cos(th); - sthet=sin(th); - for (j=0;j<n;++j) { - if (angles.type==SG_GRID) ph=Deg2Rad(angles.phi.val[j]); - else if (angles.type==SG_PAIRS) ph=Deg2Rad(angles.phi.val[i]); - cphi=cos(ph); - sphi=sin(ph); - // robserver = cos(theta)*prop + sin(theta)*[cos(phi)*incPolX + sin(phi)*incPolY]; - LinComb(incPolX,incPolY,cphi,sphi,robserver); - LinComb(prop,robserver,cthet,sthet,robserver); - // calculate scattered field - main bottleneck - CalcField(ebuff,robserver); - /* set Epar and Eper - use Egrid array to store them this is done to decrease - * communications in 1.5 times - */ - // incPolper = sin(phi)*incPolX - cos(phi)*incPolY; - LinComb(incPolX,incPolY,sphi,-cphi,incPolper); - // incPolpar = -sin(theta)*prop + cos(theta)*[cos(phi)*incPolX + sin(phi)*incPolY]; - LinComb(incPolX,incPolY,cphi,sphi,incPolpar); - LinComb(prop,incPolpar,-sthet,cthet,incPolpar); - index=2*point; - crDotProd(ebuff,incPolper,Egrid[index]); - crDotProd(ebuff,incPolpar,Egrid[index+1]); - point++; - // show progress - if (((10*point)%angles.N)<10) { - // the value is always from 0 to 100, so conversion to int is absolutely safe - PRINTZ(" %d%%",(int)(100*point/angles.N)); - FFLUSHZ(stdout); - } - } - } - // accumulate fields; timing - Accumulate((double *)Egrid,4*angles.N,Egrid_buffer,&Timing_comm_EField_sg); - PRINTZ(" done\n"); - FFLUSHZ(stdout); - Timing_EField_sg = GET_TIME() - tstart; - Timing_EField += Timing_EField_sg; + size_t i,j,n,point,index; + TIME_TYPE tstart; + double robserver[3],incPolpar[3],incPolper[3],cthet,sthet,cphi,sphi,th,ph; + doublecomplex ebuff[3]; + doublecomplex *Egrid; /* either EgridX or EgridY */ + + /* Calculate field */ + tstart = GET_TIME(); + /* choose which array to fill */ + if (which=='X') Egrid=EgridX; + else if (which=='Y') Egrid=EgridY; + /* set type of cycling through angles */ + if (angles.type==SG_GRID) n=angles.phi.N; + else if (angles.type==SG_PAIRS) n=1; + PRINTZ("Calculating grid of scattered field:\n"); + /* main cycle */ + for (i=0,point=0;i<angles.theta.N;++i) { + th=Deg2Rad(angles.theta.val[i]); + cthet=cos(th); + sthet=sin(th); + for (j=0;j<n;++j) { + if (angles.type==SG_GRID) ph=Deg2Rad(angles.phi.val[j]); + else if (angles.type==SG_PAIRS) ph=Deg2Rad(angles.phi.val[i]); + cphi=cos(ph); + sphi=sin(ph); + /* robserver = cos(theta)*prop + sin(theta)*[cos(phi)*incPolX + sin(phi)*incPolY]; */ + LinComb(incPolX,incPolY,cphi,sphi,robserver); + LinComb(prop,robserver,cthet,sthet,robserver); + /* calculate scattered field - main bottleneck */ + CalcField(ebuff,robserver); + /* set Epar and Eper - use Egrid array to store them + this is done to decrease communications in 1.5 times */ + + /* incPolper = sin(phi)*incPolX - cos(phi)*incPolY; */ + LinComb(incPolX,incPolY,sphi,-cphi,incPolper); + /* incPolpar = -sin(theta)*prop + cos(theta)*[cos(phi)*incPolX + sin(phi)*incPolY]; */ + LinComb(incPolX,incPolY,cphi,sphi,incPolpar); + LinComb(prop,incPolpar,-sthet,cthet,incPolpar); + + index=2*point; + crDotProd(ebuff,incPolper,Egrid[index]); + crDotProd(ebuff,incPolpar,Egrid[index+1]); + + point++; + if (((10*point)%angles.N)<10) { + PRINTZ(" %u%%",100*point/angles.N); + FFLUSHZ(stdout); + } + } + } + /* accumulate fields; timing */ + Accumulate((double *)Egrid,4*angles.N,Egrid_buffer,&Timing_comm_EField_sg); + PRINTZ(" done\n"); + FFLUSHZ(stdout); + Timing_EField_sg = GET_TIME() - tstart; + Timing_EField += Timing_EField_sg; } -//===================================================================== +/*=====================================================================*/ static double CscaIntegrand(const int theta,const int phi,double *res) -// function that is transferred to integration module when calculating Csca + /* function that is transferred to integration module + when calculating Csca */ { - res[0]=E2_alldir[AlldirIndex(theta,phi)]; - return 0; + res[0]=E2_alldir[AlldirIndex(theta,phi)]; + return 0; } -//===================================================================== +/*=====================================================================*/ double ScaCross(char *f_suf) -// Calculate the scattering cross section from the integral + /* Calculate the scattering cross section + * from the integral */ { - TIME_TYPE tstart; - char fname[MAX_FNAME]; - double res; + TIME_TYPE tstart; + char fname[MAX_FNAME]; + double res; - sprintf(fname,"%s/" F_LOG_INT_CSCA "%s",directory,f_suf); + sprintf(fname,"%s/" F_LOG_INT_CSCA "%s",directory,f_suf); - tstart = GET_TIME(); - Romberg2D(parms,CscaIntegrand,1,&res,fname); - res*=FOUR_PI/(WaveNum*WaveNum); - Timing_Integration += GET_TIME() - tstart; - return res; + tstart = GET_TIME(); + Romberg2D(parms,CscaIntegrand,1,&res,fname); + res*=FOUR_PI/(WaveNum*WaveNum); + Timing_Integration += GET_TIME() - tstart; + return res; } -//===================================================================== +/*=====================================================================*/ static double gIntegrand(const int theta,const int phi,double *res) -// function that is transferred to integration module when calculating g + /* function that is transferred to integration module + when calculating g */ { - double E_square,th,ph; - th=Deg2Rad(theta_int.val[theta]); - ph=Deg2Rad(phi_int.val[phi]); - - E_square=E2_alldir[AlldirIndex(theta,phi)]; - res[0] = E_square*sin(th)*cos(ph); - res[1] = E_square*sin(th)*sin(ph); - res[2] = E_square*cos(th); - return 0; + double E_square,th,ph; + th=Deg2Rad(theta_int.val[theta]); + ph=Deg2Rad(phi_int.val[phi]); + + E_square=E2_alldir[AlldirIndex(theta,phi)]; + res[0] = E_square*sin(th)*cos(ph); + res[1] = E_square*sin(th)*sin(ph); + res[2] = E_square*cos(th); + return 0; } -//===================================================================== +/*=====================================================================*/ void AsymParm(double *vec,char *f_suf) -// Calculate the unnormalized asymmetry parameter, i.e. not yet normalized by Csca + /* Calculate the unnormalized asymmetry parameter, + * i.e. not yet normalized by Csca */ { - int comp; - TIME_TYPE tstart; - char log_int[MAX_FNAME]; + int comp; + TIME_TYPE tstart; + char log_int[MAX_FNAME]; - sprintf(log_int,"%s/" F_LOG_INT_ASYM "%s",directory,f_suf); + sprintf(log_int,"%s/" F_LOG_INT_ASYM "%s",directory,f_suf); - tstart = GET_TIME(); - Romberg2D(parms,gIntegrand,3,vec,log_int); - for (comp=0;comp<3;++comp) vec[comp]*=FOUR_PI/(WaveNum*WaveNum); - Timing_Integration += GET_TIME() - tstart; + tstart = GET_TIME(); + Romberg2D(parms,gIntegrand,3,vec,log_int); + for (comp=0;comp<3;++comp) vec[comp]*=FOUR_PI/(WaveNum*WaveNum); + Timing_Integration += GET_TIME() - tstart; } -//===================================================================== +/*=====================================================================*/ static double gxIntegrand(const int theta,const int phi,double *res) -// function that is transferred to integration module when calculating g_x + /* function that is transferred to integration module + when calculating g_x */ { - res[0]=E2_alldir[AlldirIndex(theta,phi)]*sin(Deg2Rad(theta_int.val[theta])) - *cos(Deg2Rad(phi_int.val[phi])); - return 0; + res[0]=E2_alldir[AlldirIndex(theta,phi)]*sin(Deg2Rad(theta_int.val[theta])) + *cos(Deg2Rad(phi_int.val[phi])); + return 0; } -//===================================================================== +/*=====================================================================*/ void AsymParm_x(double *vec,char *f_suf) -// Calculate the unnormalized asymmetry parameter, i.e. not yet normalized by Csca + /* Calculate the unnormalized asymmetry parameter, + * i.e. not yet normalized by Csca */ { - TIME_TYPE tstart; - char log_int[MAX_FNAME]; + TIME_TYPE tstart; + char log_int[MAX_FNAME]; - sprintf(log_int,"%s/" F_LOG_INT_ASYM F_LOG_X "%s",directory,f_suf); + sprintf(log_int,"%s/" F_LOG_INT_ASYM F_LOG_X "%s",directory,f_suf); - tstart = GET_TIME(); - Romberg2D(parms,gxIntegrand,1,vec,log_int); - vec[0] *= FOUR_PI/(WaveNum*WaveNum); - Timing_Integration += GET_TIME() - tstart; + tstart = GET_TIME(); + Romberg2D(parms,gxIntegrand,1,vec,log_int); + vec[0] *= FOUR_PI/(WaveNum*WaveNum); + Timing_Integration += GET_TIME() - tstart; } -//===================================================================== +/*=====================================================================*/ static double gyIntegrand(const int theta,const int phi,double *res) -// function that is transferred to integration module when calculating g_y + /* function that is transferred to integration module + when calculating g_y */ { - res[0]=E2_alldir[AlldirIndex(theta,phi)]*sin(Deg2Rad(theta_int.val[theta])) - *sin(Deg2Rad(phi_int.val[phi])); - return 0; + res[0]=E2_alldir[AlldirIndex(theta,phi)]*sin(Deg2Rad(theta_int.val[theta])) + *sin(Deg2Rad(phi_int.val[phi])); + return 0; } -//===================================================================== +/*=====================================================================*/ void AsymParm_y(double *vec,char *f_suf) -// Calculate the unnormalized asymmetry parameter, i.e. not yet normalized by Csca + /* Calculate the unnormalized asymmetry parameter, + * i.e. not yet normalized by Csca */ { - TIME_TYPE tstart; - char log_int[MAX_FNAME]; + TIME_TYPE tstart; + char log_int[MAX_FNAME]; - sprintf(log_int,"%s/" F_LOG_INT_ASYM F_LOG_Y "%s",directory,f_suf); + sprintf(log_int,"%s/" F_LOG_INT_ASYM F_LOG_Y "%s",directory,f_suf); - tstart = GET_TIME(); - Romberg2D(parms,gyIntegrand,1,vec,log_int); - vec[0] *= FOUR_PI/(WaveNum*WaveNum); - Timing_Integration += GET_TIME() - tstart; + tstart = GET_TIME(); + Romberg2D(parms,gyIntegrand,1,vec,log_int); + vec[0] *= FOUR_PI/(WaveNum*WaveNum); + Timing_Integration += GET_TIME() - tstart; } -//===================================================================== +/*=====================================================================*/ static double gzIntegrand(const int theta,const int phi,double *res) -// function that is transferred to integration module when calculating g_z + /* function that is transferred to integration module + when calculating g_z */ { - res[0]=E2_alldir[AlldirIndex(theta,phi)]*cos(Deg2Rad(theta_int.val[theta])); - return 0; + res[0]=E2_alldir[AlldirIndex(theta,phi)]*cos(Deg2Rad(theta_int.val[theta])); + return 0; } -//===================================================================== +/*=====================================================================*/ void AsymParm_z(double *vec,char *f_suf) -// Calculate the unnormalized asymmetry parameter, i.e. not yet normalized by Csca + /* Calculate the unnormalized asymmetry parameter, + * i.e. not yet normalized by Csca */ { - TIME_TYPE tstart; - char log_int[MAX_FNAME]; + TIME_TYPE tstart; + char log_int[MAX_FNAME]; - sprintf(log_int,"%s/" F_LOG_INT_ASYM F_LOG_Z "%s",directory,f_suf); + sprintf(log_int,"%s/" F_LOG_INT_ASYM F_LOG_Z "%s",directory,f_suf); - tstart = GET_TIME(); - Romberg2D(parms,gzIntegrand,1,vec,log_int); - vec[0] *= FOUR_PI/(WaveNum*WaveNum); - Timing_Integration += GET_TIME() - tstart; + tstart = GET_TIME(); + Romberg2D(parms,gzIntegrand,1,vec,log_int); + vec[0] *= FOUR_PI/(WaveNum*WaveNum); + Timing_Integration += GET_TIME() - tstart; } -//===================================================================== +/*=====================================================================*/ -void Frp_mat(double Fsca_tot[3],double *Fsca,double Finc_tot[3],double *Finc,double Frp_tot[3], - double *Frp) -/* Calculate the Radiation Pressure by direct calculation of the scattering force. Per dipole the - * force of the incoming photons, the scattering force and the radiation pressure are calculated - * as intermediate results - */ +void Frp_mat(double Fsca_tot[3],double *Fsca, + double Finc_tot[3],double *Finc, + double Frp_tot[3],double *Frp) + /* Calculate the Radiation Pressure by direct calculation + * of the scattering force. Per dipole the force of + * the incoming photons, the scattering force and the + * radiation pressure are calculated as intermediate results */ { - size_t j,l,lll,index,comp; - int i; - size_t local_d0; - size_t local_nvoid_d0, local_nvoid_d1; - double *nvoid_array; - unsigned char *materialT; - double *rdipT; - doublecomplex *pT; - doublecomplex temp; - doublecomplex dummy,_E_inc; - double r,r2; // (squared) absolute distance - doublecomplex - n[3], // unit vector in the direction of r_{jl}; complex part is always zero - a,ab1,ab2,c1[3],c2[3], // see chapter ... - x_cg[3], // complex conjugate P*_j - Pn_j, // n_jl.P_l - Pn_l, // P*_j.n_jl - inp; // P*_j.P_l - - // check if it can work at all - CheckOverflow(3*nvoid_Ndip,ONE_POS,"Frp_mat()"); - // initialize - local_d0=boxX*boxY*local_z0; - for (comp=0;comp<3;++comp) Fsca_tot[comp]=Finc_tot[comp]=Frp_tot[comp]=0.0; - // Convert internal fields to dipole moments; Calculate incoming force per dipole - for (j=0;j<local_nvoid_Ndip;++j) { - dummy[RE]=dummy[IM]=0.0; - for (comp=0;comp<3;++comp) { - index = 3*j+comp; - // Im(P.E*inc) - _E_inc[RE] = Einc[index][RE]; - _E_inc[IM] = -Einc[index][IM]; - cMult(pvec[index],_E_inc,temp); - cAdd(dummy,temp,dummy); - } - Finc[3*j+2] = WaveNum*dummy[IM]/2; - Finc_tot[2] += Finc[3*j+2]; - } - /* Because of the parallelization by row-block decomposition the distributed arrays involved - * need to be gathered on each node a) material -> materialT; b) DipoleCoord -> rdipT; - * c) pvec -> pT - */ - // initialize local_nvoid_d0 and local_nvoid_d1 - MALLOC_VECTOR(nvoid_array,double,nprocs,ALL); - nvoid_array[ringid]=local_nvoid_Ndip; - AllGather(nvoid_array+ringid,nvoid_array,double_type,nprocs); - local_nvoid_d0=0; - for (i=0;i<ringid;i++) local_nvoid_d0+=nvoid_array[i]; - local_nvoid_d1=local_nvoid_d0+local_nvoid_Ndip; - Free_general(nvoid_array); - // requires a lot of additional memory - MALLOC_VECTOR(materialT,uchar,nvoid_Ndip,ALL); - MALLOC_VECTOR(rdipT,double,3*nvoid_Ndip,ALL); - MALLOC_VECTOR(pT,complex,3*nvoid_Ndip,ALL); - - memcpy(materialT+local_nvoid_d0,material,local_nvoid_Ndip*sizeof(char)); - memcpy(pT+3*local_nvoid_d0,pvec,3*local_nvoid_Ndip*sizeof(doublecomplex)); - memcpy(rdipT+3*local_nvoid_d0,DipoleCoord,3*local_nvoid_Ndip*sizeof(double)); - - AllGather(materialT+local_nvoid_d0,materialT,char_type,local_nvoid_Ndip); - AllGather(pT+3*local_nvoid_d0,pT,cmplx_type,3*local_nvoid_Ndip); - AllGather(rdipT+3*local_nvoid_d0,rdipT,double_type,3*local_nvoid_Ndip); - // Calculate scattering force per dipole - for (j=local_nvoid_d0;j<local_nvoid_d1;++j) { - int jjj = 3*j; - - for (l=0;l<nvoid_Ndip;++l) if (j!=l) { - lll = 3*l; - r2 = 0; - Pn_j[RE]=Pn_j[IM]=Pn_l[RE]=Pn_l[IM]=inp[RE]=inp[IM]=0.0; - // Set distance related variables - for (comp=0;comp<3;++comp) { - n[comp][IM] = 0; - n[comp][RE] = rdipT[jjj+comp] - rdipT[lll+comp]; - r2 += n[comp][RE]*n[comp][RE]; - } - r = sqrt(r2); - n[0][RE]/=r; n[1][RE]/=r; n[2][RE]/=r; - // Set the scalar products a.b1 and a.b2 - a[RE] = cos(WaveNum*r); - a[IM] = sin(WaveNum*r); - ab1[RE] = 3/(r2*r2) - WaveNum*WaveNum/r2; - ab2[RE] = -WaveNum*WaveNum/r2; - ab1[IM] = -3*WaveNum/(r*r2); - ab2[IM] = WaveNum*WaveNum*WaveNum/r; - cMultSelf(ab1,a); - cMultSelf(ab2,a); - // Prepare c1 and c2 - for (comp=0;comp<3;++comp) { - x_cg[comp][RE] = pT[jjj+comp][RE]; - x_cg[comp][IM] = -pT[jjj+comp][IM]; - cMult(x_cg[comp],n[comp],temp); - cAdd(Pn_j,temp,Pn_j); - cMult(n[comp],pT[lll+comp],temp); - cAdd(Pn_l,temp,Pn_l); - cMult(x_cg[comp],pT[lll+comp],temp); - cAdd(inp,temp,inp); - } - for (comp=0;comp<3;++comp) { - // Set c1 - cMult(Pn_j,Pn_l,temp); - cMult(n[comp],temp,c1[comp]); - c1[comp][RE] *= -5; - c1[comp][IM] *= -5; - cMult(inp,n[comp],temp); - cAdd(c1[comp],temp,c1[comp]); - cMult(Pn_j,pT[lll+comp],temp); - cAdd(c1[comp],temp,c1[comp]); - cMult(x_cg[comp],Pn_l,temp); - cAdd(c1[comp],temp,c1[comp]); - // Set c2 - cMult(Pn_j,Pn_l,temp); - cMult(n[comp],temp,c2[comp]); - c2[comp][RE] *= -1; - c2[comp][IM] *= -1; - cMult(inp,n[comp],temp); - cAdd(c2[comp],temp,c2[comp]); - // Fsca_{jl} = ... - cMultSelf(c1[comp],ab1); - cMultSelf(c2[comp],ab2); - Fsca[jjj-3*local_d0+comp] += (c1[comp][RE] + c2[comp][RE])/2; - } - } // end l-loop - // Concluding - for (comp=0;comp<3;++comp) { - Fsca_tot[comp] += Fsca[jjj-3*local_d0+comp]; - Frp[jjj-3*local_d0+comp] = Finc[jjj-3*local_d0+comp] + Fsca[jjj-3*local_d0+comp]; - Frp_tot[comp] += Frp[jjj-3*local_d0+comp]; - } - } // end j-loop - - // Accumulate the total forces on all nodes - MyInnerProduct(Finc_tot+2,double_type,1,&Timing_ScatQuan_comm); - MyInnerProduct(Fsca_tot,double_type,3,&Timing_ScatQuan_comm); - MyInnerProduct(Frp_tot,double_type,3,&Timing_ScatQuan_comm); - - Free_general(materialT); - Free_general(rdipT); - Free_general(pT); + size_t j,l,lll,index,comp; + int i; + size_t local_d0; + size_t local_nvoid_d0, local_nvoid_d1; + double *nvoid_array; + unsigned char *materialT; + double *rdipT; + doublecomplex *pT; + doublecomplex temp; + doublecomplex dummy,_E_inc; + double r,r2; /* (squared) absolute distance */ + doublecomplex + n[3], /* unit vector in the direction of r_{jl} + * complex will always be zero */ + a,ab1,ab2, /* see chapter ... */ + c1[3],c2[3], /* idem */ + x_cg[3], /* complex conjungate P*_j */ + Pn_j, /* n_jl.P_l */ + Pn_l, /* P*_j.n_jl */ + inp; /* P*_j.P_l */ + + /* check if it can work at all */ + CheckOverflow(3*nvoid_Ndip,ONE_POS,"Frp_mat()"); + + /* initialize */ + local_d0=boxX*boxY*local_z0; + + for (comp=0;comp<3;++comp) Fsca_tot[comp]=Finc_tot[comp]=Frp_tot[comp]=0.0; + /* Convert internal fields to dipole moments; + Calculate incoming force per dipole */ + for (j=0;j<local_nvoid_Ndip;++j) { + dummy[RE]=dummy[IM]=0.0; + for (comp=0;comp<3;++comp) { + index = 3*j+comp; + /* Im(P.E*inc) */ + _E_inc[RE] = Einc[index][RE]; + _E_inc[IM] = -Einc[index][IM]; + cMult(pvec[index],_E_inc,temp); + cAdd(dummy,temp,dummy); + } + Finc[3*j+2] = WaveNum*dummy[IM]/2; + Finc_tot[2] += Finc[3*j+2]; + } + + /* Because of the parallelisation by row-block decomposition + the distributed arrays involved need to be gathered on each node + a) material -> materialT + b) DipoleCoord -> rdipT + c) pvec -> pT + */ + /* initialize local_nvoid_d0 and local_nvoid_d1 */ + MALLOC_VECTOR(nvoid_array,double,nprocs,ALL); + nvoid_array[ringid]=local_nvoid_Ndip; + AllGather(nvoid_array+ringid,nvoid_array,double_type,nprocs); + local_nvoid_d0=0; + for (i=0;i<ringid;i++) local_nvoid_d0+=nvoid_array[i]; + local_nvoid_d1=local_nvoid_d0+local_nvoid_Ndip; + Free_general(nvoid_array); + /* requires a lot of additional memory */ + MALLOC_VECTOR(materialT,uchar,nvoid_Ndip,ALL); + MALLOC_VECTOR(rdipT,double,3*nvoid_Ndip,ALL); + MALLOC_VECTOR(pT,complex,3*nvoid_Ndip,ALL); + + memcpy(materialT+local_nvoid_d0,material,local_nvoid_Ndip*sizeof(char)); + memcpy(pT+3*local_nvoid_d0,pvec,3*local_nvoid_Ndip*sizeof(doublecomplex)); + memcpy(rdipT+3*local_nvoid_d0,DipoleCoord,3*local_nvoid_Ndip*sizeof(double)); + + AllGather(materialT+local_nvoid_d0,materialT,char_type,local_nvoid_Ndip); + AllGather(pT+3*local_nvoid_d0,pT,cmplx_type,3*local_nvoid_Ndip); + AllGather(rdipT+3*local_nvoid_d0,rdipT,double_type,3*local_nvoid_Ndip); + + /* Calculate scattering force per dipole */ + for (j=local_nvoid_d0;j<local_nvoid_d1;++j) { + int jjj = 3*j; + + for (l=0;l<nvoid_Ndip;++l) if (j!=l) { + lll = 3*l; + r2 = 0; + Pn_j[RE]=Pn_j[IM]=Pn_l[RE]=Pn_l[IM]=inp[RE]=inp[IM]=0.0; + + /* Set distance related variables */ + for (comp=0;comp<3;++comp) { + n[comp][IM] = 0; + n[comp][RE] = rdipT[jjj+comp] - rdipT[lll+comp]; + r2 += n[comp][RE]*n[comp][RE]; + } + r = sqrt(r2); + n[0][RE]/=r; n[1][RE]/=r; n[2][RE]/=r; + + /* Set the scalar products a.b1 and a.b2 */ + a[RE] = cos(WaveNum*r); + a[IM] = sin(WaveNum*r); + ab1[RE] = 3/(r2*r2) - WaveNum*WaveNum/r2; + ab2[RE] = -WaveNum*WaveNum/r2; + ab1[IM] = -3*WaveNum/(r*r2); + ab2[IM] = WaveNum*WaveNum*WaveNum/r; + cMultSelf(ab1,a); + cMultSelf(ab2,a); + + /* Prepare c1 and c2 */ + for (comp=0;comp<3;++comp) { + x_cg[comp][RE] = pT[jjj+comp][RE]; + x_cg[comp][IM] = -pT[jjj+comp][IM]; + cMult(x_cg[comp],n[comp],temp); + cAdd(Pn_j,temp,Pn_j); + cMult(n[comp],pT[lll+comp],temp); + cAdd(Pn_l,temp,Pn_l); + cMult(x_cg[comp],pT[lll+comp],temp); + cAdd(inp,temp,inp); + } + + for (comp=0;comp<3;++comp) { + /* Set c1 */ + cMult(Pn_j,Pn_l,temp); + cMult(n[comp],temp,c1[comp]); + c1[comp][RE] *= -5; + c1[comp][IM] *= -5; + + cMult(inp,n[comp],temp); + cAdd(c1[comp],temp,c1[comp]); + cMult(Pn_j,pT[lll+comp],temp); + cAdd(c1[comp],temp,c1[comp]); + cMult(x_cg[comp],Pn_l,temp); + cAdd(c1[comp],temp,c1[comp]); + + /* Set c2 */ + cMult(Pn_j,Pn_l,temp); + cMult(n[comp],temp,c2[comp]); + c2[comp][RE] *= -1; + c2[comp][IM] *= -1; + + cMult(inp,n[comp],temp); + cAdd(c2[comp],temp,c2[comp]); + + /* Fsca_{jl} = ... */ + cMultSelf(c1[comp],ab1); + cMultSelf(c2[comp],ab2); + Fsca[jjj-3*local_d0+comp] += (c1[comp][RE] + c2[comp][RE])/2; + } + } /* end l-loop */ + + /* Concluding */ + for (comp=0;comp<3;++comp) { + Fsca_tot[comp] += Fsca[jjj-3*local_d0+comp]; + + Frp[jjj-3*local_d0+comp] = Finc[jjj-3*local_d0+comp] + Fsca[jjj-3*local_d0+comp]; + + Frp_tot[comp] += Frp[jjj-3*local_d0+comp]; + } + } /* end j-loop */ + + /* Accumulate the total forces on all nodes */ + MyInnerProduct(Finc_tot+2,double_type,1,&Timing_ScatQuan_comm); + MyInnerProduct(Fsca_tot,double_type,3,&Timing_ScatQuan_comm); + MyInnerProduct(Frp_tot,double_type,3,&Timing_ScatQuan_comm); + + Free_general(materialT); + Free_general(rdipT); + Free_general(pT); } diff --git a/src/crosssec.h b/src/crosssec.h index 2b53c9a2..9c38a2f6 100644 --- a/src/crosssec.h +++ b/src/crosssec.h @@ -3,13 +3,13 @@ * DESCR: definitions of functions for * calculation of different measured quantities * - * Copyright (C) 2006,2008 University of Amsterdam + * Copyright (C) 2006 University of Amsterdam * This code is covered by the GNU General Public License. */ #ifndef __crosssec_h #define __crosssec_h -#include "function.h" // for function attributes +#include "function.h" /* for function attributes */ void CalcField(doublecomplex *ebuff,const double *n); void InitRotation(void); @@ -25,7 +25,8 @@ void AsymParm(double *vec,char *f_suf); void AsymParm_x(double *vec,char *f_suf); void AsymParm_y(double *vec,char *f_suf); void AsymParm_z(double *vec,char *f_suf); -void Frp_mat(double Fsca_tot[3],double *Fsca,double Finc_tot[3], - double *Finc,double Frp_tot[3],double *Frp); +void Frp_mat(double Fsca_tot[3],double *Fsca, + double Finc_tot[3],double *Finc, + double Frp_tot[3],double *Frp); -#endif // __crosssec_h +#endif /*__crosssec_h*/ diff --git a/src/debug.c b/src/debug.c index 0232a34d..db9f9a03 100644 --- a/src/debug.c +++ b/src/debug.c @@ -16,39 +16,38 @@ #include "const.h" #ifdef DEBUG -//============================================================ +/*============================================================*/ void DebugPrintf(const char *fname,const int line,const char *fmt, ... ) { - va_list args; - char pr_line[MAX_PARAGRAPH]; - extern int ringid; + va_list args; + char pr_line[MAX_PARAGRAPH]; + extern int ringid; - va_start(args, fmt); - sprintf(pr_line,"DEBUG: %s:%d: ",fname,line); - vsprintf(pr_line+strlen(pr_line),fmt,args); - strcat(pr_line, "\n"); - printf("(ringID=%i) %s",ringid,pr_line); - fflush(stdout); - va_end(args); + va_start(args, fmt); + sprintf(pr_line,"DEBUG: %s:%d: ",fname,line); + vsprintf(pr_line+strlen(pr_line),fmt,args); + strcat(pr_line, "\n"); + printf("(ringID=%i) %s",ringid,pr_line); + fflush(stdout); + va_end(args); } -//======================================================= +/*=======================================================*/ void FieldPrint (doublecomplex *x) -/* print current field at certain dipole -- not used; left for deep debug; NOT ROBUST, since - * DipoleCoord is not always available - */ + /* print current field at certain dipole -- not used; left for deep debug + NOT ROBUST, since DipoleCoord is not always available */ { - extern FILE *logfile; - extern double *DipoleCoord; - int i=9810; + extern FILE *logfile; + extern double *DipoleCoord; + int i=9810; - i*=3; - fprintf(logfile,"Dipole coordinates = %.10E, %.10E, %.10E\n",DipoleCoord[i],DipoleCoord[i+1], - DipoleCoord[i+2]); - fprintf(logfile,"E = %.10E%+.10Ei, %.10E%+.10Ei, %.10E%+.10Ei\n",x[i][RE],x[i][IM],x[i+1][RE], - x[i+1][IM],x[i+2][RE],x[i+2][IM]); + i*=3; + fprintf(logfile,"Dipole coords = %.10E, %.10E, %.10E\n", + DipoleCoord[i],DipoleCoord[i+1],DipoleCoord[i+2]); + fprintf(logfile,"E = %.10E%+.10Ei, %.10E%+.10Ei, %.10E%+.10Ei\n", + x[i][RE],x[i][IM],x[i+1][RE],x[i+1][IM],x[i+2][RE],x[i+2][IM]); } -#endif // DEBUG +#endif /* DEBUG */ diff --git a/src/debug.h b/src/debug.h index 70e0c78e..404c45d4 100644 --- a/src/debug.h +++ b/src/debug.h @@ -4,30 +4,33 @@ * * Previous versions by "vesseur" * - * Copyright (C) 2006,2008 University of Amsterdam + * Copyright (C) 2006 University of Amsterdam * This code is covered by the GNU General Public License. */ #ifndef __debug_h #define __debug_h -/* Debugging implies turning on additional information messages during the code execution. A simple - * and convenient tool to generate such messages is used. - */ +/* Debugging implies turning on additional information messages + during the code execution. A simple and convenient tool to + generate such messages is used. */ -//#define DEBUG // uncomment to degug +/*#define DEBUG /* uncomment to degug */ #ifdef DEBUG -# include "function.h" // for function attributes -# define D(p) DebugPrintf(__FILE__,__LINE__,p) -# define D2(p,a) DebugPrintf(__FILE__,__LINE__,p,a) -# define D2z(p,a) if (ringid==ROOT) DebugPrintf(__FILE__,__LINE__,p,a) -void DebugPrintf(const char *fname,int line,const char *fmt,...) ATT_PRINTF(3,4); +#include "function.h" /* for function attributes */ + +# define D(p) DebugPrintf(__FILE__, __LINE__, p) +# define D2(p,a) DebugPrintf(__FILE__, __LINE__, p,a) +# define D3(p,a,b) DebugPrintf(__FILE__, __LINE__, p,a,b) +# define D4(p,a,b,c) DebugPrintf(__FILE__, __LINE__, p,a,b,c) +void DebugPrintf(const char *fname,int line,const char *fmt, ...) ATT_PRINTF(3,4); void FieldPrint(doublecomplex *x) ATT_UNUSED; #else -# define D(p) -# define D2(p,a) -# define D2z(p,a) +# define D(p) +# define D2(p,a) +# define D3(p,a,b) +# define D4(p,a,b,c) #endif -#endif // __debug_h +#endif /*__debug_h*/ diff --git a/src/fft.c b/src/fft.c index 397102d5..4be325ec 100644 --- a/src/fft.c +++ b/src/fft.c @@ -6,7 +6,7 @@ * * Previous versions by Michel Grimminck and Alfons Hoekstra * - * Copyright (C) 2006-2008 University of Amsterdam + * Copyright (C) 2006-2007 University of Amsterdam * This code is covered by the GNU General Public License. */ #include <stdlib.h> @@ -24,1140 +24,1137 @@ #include "function.h" #ifdef FFTW3 -# include <fftw3.h> -/* define level of planning for usual and Dmatrix (DM) FFT: FFTW_ESTIMATE (heuristics), - * FFTW_MEASURE (default), FTW_PATIENT, or FFTW_EXHAUSTIVE - */ -# define PLAN_FFTW FFTW_MEASURE -# define PLAN_FFTW_DM FFTW_ESTIMATE +# include <fftw3.h> +/* define level of planning for usual and Dmatrix (DM) FFT */ +/* FFTW_ESTIMATE (heuristics), FFTW_MEASURE (def), FTW_PATIENT, or FFTW_EXHAUSTIVE */ +# define PLAN_FFTW FFTW_MEASURE +# define PLAN_FFTW_DM FFTW_ESTIMATE #endif -// for transpose YZ +/* for transpose YZ */ #define TR_BLOCK 64 -// SEMI-GLOBAL VARIABLES +/* SEMI-GLOBAL VARIABLES */ -// defined ant initialized in calculator.c +/* defined ant initialized in calculator.c */ extern const double *tab1,*tab2,*tab3,*tab4,*tab5,*tab6,*tab7,*tab8,*tab9,*tab10; extern const int **tab_index; -// defined and initialized in timing.c +/* defined and initialized in timing.c */ extern TIME_TYPE Timing_FFT_Init,Timing_Dm_Init; -// used in matvec.c -doublecomplex *Dmatrix; // holds FFT of the interaction matrix -doublecomplex *Xmatrix; // holds input vector (on expanded grid) to matvec -doublecomplex *slices; // used in inner cycle of matvec - holds 3 components (for fixed x) -doublecomplex *slices_tr; // additional storage space for slices to accelerate transpose -size_t DsizeY,DsizeZ,DsizeYZ; // size of the 'matrix' D +/* used in matvec.c */ +doublecomplex *Dmatrix; /* holds FFT of the interaction matrix */ +doublecomplex *Xmatrix; /* holds input vector (on expanded grid) to matvec */ +doublecomplex *slices; /* used in inner cycle of matvec - holds 3 components (for fixed x) */ +doublecomplex *slices_tr; /* additional storage space for slices to accelerate transpose */ +size_t DsizeY,DsizeZ,DsizeYZ; /* size of the 'matrix' D */ -// used in comm.c -double *BT_buffer, *BT_rbuffer; // buffers for BlockTranspose +/* used in comm.c */ +double *BT_buffer, *BT_rbuffer; /* buffers for BlockTranspose */ -// LOCAL VARIABLES +/* LOCAL VARIABLES */ -// D2 matrix and its two slices; used only temporary for InitDmatrix +/* D2 matrix and its two slices; used only temporary for InitDmatrix */ static doublecomplex *slice,*slice_tr,*D2matrix; -static size_t D2sizeX,D2sizeY,D2sizeZ; // size of the 'matrix' D2 -static size_t blockTr=TR_BLOCK; // block size for TransposeYZ; see fft.h -static int weird_nprocs; // whether weird number of processors is used +static size_t D2sizeX,D2sizeY,D2sizeZ; /* size of the 'matrix' D2 */ +static size_t blockTr=TR_BLOCK; /* block size for TransposeYZ; see fft.h */ +static int weird_nprocs; /* whether weird number of processors is used */ #ifdef FFTW3 -// FFTW3 plans: f - FFT_FORWARD; b - FFT_BACKWARD -static fftw_plan planXf,planXb,planYf,planYb,planZf,planZb,planXf_Dm,planYf_Dm,planZf_Dm; + /* FFTW3 plans: f - FFT_FORWARD; b - FFT_BACKWARD */ + static fftw_plan planXf,planXb,planYf,planYb,planZf,planZb,planXf_Dm,planYf_Dm,planZf_Dm; #elif defined(FFT_TEMPERTON) # define IFAX_SIZE 20 -static double *trigsX,*trigsY,*trigsZ,*work; // arrays for Temperton FFT -static int ifaxX[IFAX_SIZE],ifaxY[IFAX_SIZE],ifaxZ[IFAX_SIZE]; -// Fortran routines from cfft99D.f -void cftfax_(const int *nn,int *ifax,double *trigs); -void cfft99_(double *data,double *_work,const double *trigs,const int *ifax,const int *inc, - const int *jump,const int *nn,const int *lot,const int *isign); + static double *trigsX,*trigsY,*trigsZ,*work; /* arrays for Temperton FFT */ + static int ifaxX[IFAX_SIZE],ifaxY[IFAX_SIZE],ifaxZ[IFAX_SIZE]; + /* Fortran routines from cfft99D.f */ + void cftfax_(const int *nn,int *ifax,double *trigs); + void cfft99_(double *data,double *_work,const double *trigs,const int *ifax,const int *inc, + const int *jump,const int *nn,const int *lot,const int *isign); #endif -// EXTERNAL FUNCTIONS +/* EXTERNAL FUNCTIONS */ -// sinint.c +/* sinint.c */ void cisi(double x,double *ci,double *si); -//============================================================ +/*============================================================*/ INLINE size_t IndexDmatrix(const size_t x,size_t y,size_t z) -// index D matrix to store final result + /* index D matrix to store final result */ { - if (y>=DsizeY) y=gridY-y; - if (z>=DsizeZ) z=gridZ-z; + if (y>=DsizeY) y=gridY-y; + if (z>=DsizeZ) z=gridZ-z; - return(NDCOMP*(x*DsizeYZ+z*DsizeY+y)); + return(NDCOMP*(x*DsizeYZ+z*DsizeY+y)); } -//============================================================ +/*============================================================*/ INLINE size_t IndexGarbledD(const size_t x,int y,int z,const size_t lengthN) -// index D2 matrix after BlockTranspose + /* index D2 matrix after BlockTranspose */ { - if (y<0) y+=D2sizeY; - if (z<0) z+=D2sizeZ; + if (y<0) y+=D2sizeY; + if (z<0) z+=D2sizeZ; #ifdef PARALLEL - return(((z%lengthN)*D2sizeY+y)*gridX+(z/lengthN)*local_Nx+x%local_Nx); + return(((z%lengthN)*D2sizeY+y)*gridX+(z/lengthN)*local_Nx+x%local_Nx); #else - return((z*D2sizeY+y)*gridX+x); + return((z*D2sizeY+y)*gridX+x); #endif } -//============================================================ +/*============================================================*/ INLINE size_t IndexD2matrix(int x,int y,int z,const int nnn) -// index D2 matrix to store calculated elements + /* index D2 matrix to store calculate elements */ { - if (x<0) x+=gridX; - if (y<0) y+=D2sizeY; - // if (z<0) z+=D2sizeZ; - return(((z-nnn*local_z0)*D2sizeY+y)*gridX+x); + if (x<0) x+=gridX; + if (y<0) y+=D2sizeY; +/* if (z<0) z+=D2sizeZ; */ + return(((z-nnn*local_z0)*D2sizeY+y)*gridX+x); } -//============================================================ +/*============================================================*/ INLINE size_t IndexSliceD2matrix(int y,int z) -// index slice of D2 matrix + /* index slice of D2 matrix */ { - if (y<0) y+=gridY; - if (z<0) z+=gridZ; + if (y<0) y+=gridY; + if (z<0) z+=gridZ; - return(y*gridZ+z); + return(y*gridZ+z); } -//============================================================ +/*============================================================*/ INLINE size_t IndexSlice_zyD2matrix(const size_t y,const size_t z) -// index transposed slice of D2 matrix + /* index transposed slice of D2 matrix */ { - return (z*gridY+y); + return (z*gridY+y); } -//============================================================ +/*============================================================*/ void TransposeYZ(const int direction) -// optimized routine to transpose y and z; forward: slices->slices_tr; backward: slices_tr->slices + /* optimised routine to transpose y and z + forward: slices -> slices_tr + backward: slices_tr -> slices */ { - size_t y,z,Y,Z,y1,y2,z1,z2,i,j,y0,z0,Xcomp; - doublecomplex *t0,*t1,*t2,*t3,*t4,*w0,*w1,*w2,*w3; - - if (direction==FFT_FORWARD) { - Y=gridY; - Z=gridZ; - w0=slices; - t0=slices_tr-Y; - } - else { // direction==FFT_BACKWARD - Y=gridZ; - Z=gridY; - w0=slices_tr; - t0=slices-Y; - } - - y1=Y/blockTr; - y2=Y%blockTr; - z1=Z/blockTr; - z2=Z%blockTr; - - for(Xcomp=0;Xcomp<3;Xcomp++) { - w1=w0+Xcomp*gridYZ; - t1=t0+Xcomp*gridYZ; - for(i=0;i<=y1;i++) { - if (i==y1) y0=y2; - else y0=blockTr; - w2=w1; - t2=t1; - for(j=0;j<=z1;j++) { - if (j==z1) z0=z2; - else z0=blockTr; - w3=w2; - t3=t2; - for (y=0;y<y0;y++) { - t4=t3+y; - for (z=0;z<z0;z++) { - cEqual(w3[z],*(t4+=Y)); - } - w3+=Z; - } - w2+=blockTr; - t2+=blockTr*Y; - } - w1+=blockTr*Z; - t1+=blockTr; - } - } + size_t y,z,Y,Z,y1,y2,z1,z2,i,j,y0,z0,Xcomp; + doublecomplex *t0,*t1,*t2,*t3,*t4,*w0,*w1,*w2,*w3; + + if (direction==FFT_FORWARD) { + Y=gridY; + Z=gridZ; + w0=slices; + t0=slices_tr-Y; + } + else { /* direction==FFT_BACKWARD */ + Y=gridZ; + Z=gridY; + w0=slices_tr; + t0=slices-Y; + } + + y1=Y/blockTr; + y2=Y%blockTr; + z1=Z/blockTr; + z2=Z%blockTr; + + for(Xcomp=0;Xcomp<3;Xcomp++) { + w1=w0+Xcomp*gridYZ; + t1=t0+Xcomp*gridYZ; + for(i=0;i<=y1;i++) { + if (i==y1) y0=y2; + else y0=blockTr; + w2=w1; + t2=t1; + for(j=0;j<=z1;j++) { + if (j==z1) z0=z2; + else z0=blockTr; + w3=w2; + t3=t2; + for (y=0;y<y0;y++) { + t4=t3+y; + for (z=0;z<z0;z++) { + cEqual(w3[z],*(t4+=Y)); + } + w3+=Z; + } + w2+=blockTr; + t2+=blockTr*Y; + } + w1+=blockTr*Z; + t1+=blockTr; + } + } } -//============================================================ +/*============================================================*/ static void transposeYZ_Dm(doublecomplex *data,doublecomplex *trans) -// optimized routine to transpose y and z for Dmatrix: data -> trans + /* optimised routine to transpose y and z for Dmatrix: data -> trans */ { - size_t y,z,Y,Z,y1,y2,z1,z2,i,j,y0,z0; - doublecomplex *t1,*t2,*t3,*t4,*w1,*w2,*w3; - - Y=gridY; - Z=gridZ; - - y1=Y/blockTr; - y2=Y%blockTr; - z1=Z/blockTr; - z2=Z%blockTr; - - w1=data; - t1=trans-Y; - - for(i=0;i<=y1;i++) { - if (i==y1) y0=y2; - else y0=blockTr; - w2=w1; - t2=t1; - for(j=0;j<=z1;j++) { - if (j==z1) z0=z2; - else z0=blockTr; - w3=w2; - t3=t2; - for (y=0;y<y0;y++) { - t4=t3+y; - for (z=0;z<z0;z++) { - cEqual(w3[z],*(t4+=Y)); - } - w3+=Z; - } - w2+=blockTr; - t2+=blockTr*Y; - } - w1+=blockTr*Z; - t1+=blockTr; - } + size_t y,z,Y,Z,y1,y2,z1,z2,i,j,y0,z0; + doublecomplex *t1,*t2,*t3,*t4,*w1,*w2,*w3; + + Y=gridY; + Z=gridZ; + + y1=Y/blockTr; + y2=Y%blockTr; + z1=Z/blockTr; + z2=Z%blockTr; + + w1=data; + t1=trans-Y; + + for(i=0;i<=y1;i++) { + if (i==y1) y0=y2; + else y0=blockTr; + w2=w1; + t2=t1; + for(j=0;j<=z1;j++) { + if (j==z1) z0=z2; + else z0=blockTr; + w3=w2; + t3=t2; + for (y=0;y<y0;y++) { + t4=t3+y; + for (z=0;z<z0;z++) { + cEqual(w3[z],*(t4+=Y)); + } + w3+=Z; + } + w2+=blockTr; + t2+=blockTr*Y; + } + w1+=blockTr*Z; + t1+=blockTr; + } } -//============================================================ +/*============================================================*/ void fftX(const int isign) -// FFT three components of Xmatrix(x) for all y,z; called from matvec + /* FFT three components of Xmatrix(x) for all y,z; called from matvec */ { + #ifdef FFTW3 - if (isign==FFT_FORWARD) fftw_execute(planXf); - else fftw_execute(planXb); + if (isign==FFT_FORWARD) fftw_execute(planXf); + else fftw_execute(planXb); #elif defined(FFT_TEMPERTON) - int nn=gridX,inc=1,jump=nn,lot=boxY; - size_t z; + int nn=gridX,inc=1,jump=nn,lot=boxY; + size_t z; - for (z=0;z<3*local_Nz;z++) // -f - cfft99_((double *)(Xmatrix+z*gridX*smallY),work,trigsX,ifaxX,&inc,&jump,&nn,&lot,&isign); + for (z=0;z<3*local_Nz;z++) /* -f */ + cfft99_((double *)(Xmatrix+z*gridX*smallY),work,trigsX,ifaxX,&inc,&jump,&nn,&lot,&isign); #endif } -//============================================================ +/*============================================================*/ void fftY(const int isign) -// FFT three components of slices_tr(y) for all z; called from matvec + /* FFT three components of slices_tr(y) for all z; called from matvec */ { #ifdef FFTW3 - if (isign==FFT_FORWARD) fftw_execute(planYf); - else fftw_execute(planYb); + if (isign==FFT_FORWARD) fftw_execute(planYf); + else fftw_execute(planYb); #elif defined(FFT_TEMPERTON) - int nn=gridY,inc=1,jump=nn,lot=smallZ,j; - // cfft99_ slows down rapidly when lot is big, hence a small loop - for(j=0;j<6;j++) - cfft99_((double *)(slices_tr+j*gridY*smallZ),work,trigsY,ifaxY,&inc,&jump,&nn,&lot,&isign); + int nn=gridY,inc=1,jump=nn,lot=smallZ,j; + for(j=0;j<6;j++) + cfft99_((double *)(slices_tr+j*gridY*smallZ),work,trigsY,ifaxY,&inc,&jump,&nn,&lot,&isign); + /* cfft99_ slows down rapidly when lot is big, hence a small loop */ #endif } -//============================================================ +/*============================================================*/ void fftZ(const int isign) -// FFT three components of slices(z) for all y; called from matvec + /* FFT three components of slices(z) for all y; called from matvec */ { #ifdef FFTW3 - if (isign==FFT_FORWARD) fftw_execute(planZf); - else fftw_execute(planZb); + if (isign==FFT_FORWARD) fftw_execute(planZf); + else fftw_execute(planZb); #elif defined(FFT_TEMPERTON) - int nn=gridZ,inc=1,jump=nn,lot=boxY,Xcomp; + int nn=gridZ,inc=1,jump=nn,lot=boxY,Xcomp; - for (Xcomp=0;Xcomp<3;Xcomp++) - cfft99_((double *)(slices+gridYZ*Xcomp),work,trigsZ,ifaxZ,&inc,&jump,&nn,&lot,&isign); + for (Xcomp=0;Xcomp<3;Xcomp++) + cfft99_((double *)(slices+gridYZ*Xcomp),work,trigsZ,ifaxZ,&inc,&jump,&nn,&lot,&isign); #endif } -//============================================================ +/*============================================================*/ static void fftX_Dm(const size_t lengthZ) -// FFT(forward) D2matrix(x) for all y,z; used for Dmatrix calculation + /* FFT(forward) D2matrix(x) for all y,z; used for Dmatrix calculation */ { #ifdef FFTW3 - fftw_execute(planXf_Dm); + fftw_execute(planXf_Dm); #elif defined(FFT_TEMPERTON) - int nn=gridX,inc=1,jump=nn,lot=D2sizeY,isign=FFT_FORWARD; - size_t z; + int nn=gridX,inc=1,jump=nn,lot=D2sizeY,isign=FFT_FORWARD; + size_t z; - for (z=0;z<lengthZ;z++) - cfft99_((double *)(D2matrix+z*gridX*D2sizeY),work,trigsX,ifaxX,&inc,&jump,&nn,&lot,&isign); + for (z=0;z<lengthZ;z++) + cfft99_((double *)(D2matrix+z*gridX*D2sizeY),work,trigsX,ifaxX,&inc,&jump,&nn,&lot,&isign); #endif } -//============================================================ +/*============================================================*/ static void fftY_Dm(void) -// FFT(forward) slice_tr(y) for all z; used for Dmatrix calculation + /* FFT(forward) slice_tr(y) for all z; used for Dmatrix calculation */ { #ifdef FFTW3 - fftw_execute(planYf_Dm); + fftw_execute(planYf_Dm); #elif defined(FFT_TEMPERTON) - int nn=gridY,inc=1,jump=nn,lot=gridZ,isign=FFT_FORWARD; + int nn=gridY,inc=1,jump=nn,lot=gridZ,isign=FFT_FORWARD; - cfft99_((double *)slice_tr,work,trigsY,ifaxY,&inc,&jump,&nn,&lot,&isign); + cfft99_((double *)slice_tr,work,trigsY,ifaxY,&inc,&jump,&nn,&lot,&isign); #endif } -//============================================================ +/*============================================================*/ static void fftZ_Dm(void) -// FFT(forward) slice(z) for all y; used for Dmatrix calculation + /* FFT(forward) slice(z) for all y; used for Dmatrix calculation */ { #ifdef FFTW3 - fftw_execute(planZf_Dm); + fftw_execute(planZf_Dm); #elif defined(FFT_TEMPERTON) - int nn=gridZ,inc=1,jump=nn,lot=gridY,isign=FFT_FORWARD; + int nn=gridZ,inc=1,jump=nn,lot=gridY,isign=FFT_FORWARD; - cfft99_((double *)slice,work,trigsZ,ifaxZ,&inc,&jump,&nn,&lot,&isign); + cfft99_((double *)slice,work,trigsZ,ifaxZ,&inc,&jump,&nn,&lot,&isign); #endif } -//============================================================ +/*============================================================*/ void CheckNprocs(void) -// checks for consistency the specified number of processors; called in the beginning from InitComm + /* checks for consistency the specified number of processors; + called in the beginning from InitComm */ { - int y=nprocs; - - // initialize weird_nprocs - weird_nprocs=FALSE; - // remove simple prime divisors of y - while (y%2==0) y/=2; - while (y%3==0) y/=3; - while (y%5==0) y/=5; + int y=nprocs; + + /* initialize weird_nprocs*/ + weird_nprocs=FALSE; + /* remove simple prime divisors of y */ + while (y%2==0) y/=2; + while (y%3==0) y/=3; + while (y%5==0) y/=5; #ifdef FFT_TEMPERTON - if (y!=1) PrintError("Specified number of processors (%d) is weird (has prime divisors larger " - "than 5). That is incompatible with Temperton FFT. Revise the number of processors " - "(recommended) or recompile with FFTW 3 support.",nprocs); + if (y!=1) PrintError("Specified number of processors (%d) is weird (has prime divisors larger "\ + "than 5). That is incompatible with Temperton FFT. Revise the number of processors "\ + "(recommended) or recompile with FFTW 3 support.",nprocs); #elif defined(FFTW3) - while (y%7==0) y/=7; - // one multiplier of either 11 or 13 is allowed - if (y%11==0) y/=11; - else if (y%13==0) y/=13; - if (y!=1) { - LogError(EC_WARN,ONE_POS,"Specified number of processors (%d) is weird (has prime divisors " - "larger than 13 or more than one divisor of either 11 or 13). FFTW3 will work less " - "efficiently. It is strongly recommended to revise the number of processors.",nprocs); - weird_nprocs=TRUE; - } + while (y%7==0) y/=7; + /* one multiplier of either 11 or 13 is allowed */ + if (y%11==0) y/=11; + else if (y%13==0) y/=13; + if (y!=1) { + LogError(EC_WARN,ONE_POS,"Specified number of processors (%d) is weird (has prime divisors "\ + "larger than 13 or more than one divisor of either 11 or 13). FFTW3 will work less "\ + "efficiently. It is strongly recommended to revise the number of processors.",nprocs); + weird_nprocs=TRUE; + } #endif } -//============================================================ +/*============================================================*/ int fftFit(int x,int divis) -/* find the first number >=x divisible by 2,3,5 only (if FFTW3 7 and one of 11 or 13 are allowed), - * and also divisible by 2 and divis. If weird_nprocs is used, only the latter condition is required - */ + /* find the first number >=x divisible by 2,3,5 only, + (if FFTW3 7 and one of 11 or 13 are allowed) + and divisible by 2 and divis + if weird_nprocs is used, only the latter condition is required */ { - int y; - - if (weird_nprocs) { - if (!IS_EVEN(divis)) divis*=2; - return (divis*((x+divis-1)/divis)); - } - else while (TRUE) { - y=x; - while (y%2==0) y/=2; - while (y%3==0) y/=3; - while (y%5==0) y/=5; + int y; + + if (weird_nprocs) { + if (divis%2!=0) divis*=2; + return (divis*((x+divis-1)/divis)); + } + else while (TRUE) { + y=x; + while (y%2==0) y/=2; + while (y%3==0) y/=3; + while (y%5==0) y/=5; #ifdef FFTW3 - while (y%7==0) y/=7; - // one multiplier of either 11 or 13 is allowed - if (y%11==0) y/=11; - else if (y%13==0) y/=13; + while (y%7==0) y/=7; + /* one multiplier of either 11 or 13 is allowed */ + if (y%11==0) y/=11; + else if (y%13==0) y/=13; #endif - if (y==1 && IS_EVEN(x) && x%divis==0) return(x); - x++; - } + if (y==1 && x%2==0 && x%divis==0) return(x); + x++; + } } -//============================================================= +/*=============================================================*/ static void fftInitBeforeD(const int lengthZ) -// initialize fft before initialization of Dmatrix + /* initialize fft before initialization of Dmatrix */ { #ifdef FFTW3 - int grXint=gridX,grYint=gridY,grZint=gridZ; // this is needed to provide 'int *' to grids - - planYf_Dm=fftw_plan_many_dft(1,&grYint,gridZ,slice_tr,NULL,1,gridY, - slice_tr,NULL,1,gridY,FFT_FORWARD,PLAN_FFTW_DM); - planZf_Dm=fftw_plan_many_dft(1,&grZint,gridY,slice,NULL,1,gridZ, - slice,NULL,1,gridZ,FFT_FORWARD,PLAN_FFTW_DM); - planXf_Dm=fftw_plan_many_dft(1,&grXint,lengthZ*D2sizeY,D2matrix,NULL,1,gridX, - D2matrix,NULL,1,gridX,FFT_FORWARD,PLAN_FFTW_DM); + planYf_Dm=fftw_plan_many_dft(1,(int *)&gridY,gridZ,slice_tr,NULL,1,gridY, + slice_tr,NULL,1,gridY,FFT_FORWARD,PLAN_FFTW_DM); + planZf_Dm=fftw_plan_many_dft(1,(int *)&gridZ,gridY,slice,NULL,1,gridZ, + slice,NULL,1,gridZ,FFT_FORWARD,PLAN_FFTW_DM); + planXf_Dm=fftw_plan_many_dft(1,(int *)&gridX,lengthZ*D2sizeY,D2matrix,NULL,1,gridX, + D2matrix,NULL,1,gridX,FFT_FORWARD,PLAN_FFTW_DM); #elif defined(FFT_TEMPERTON) - int size,nn; - - // allocate memory - MALLOC_VECTOR(trigsX,double,2*gridX,ALL); - MALLOC_VECTOR(trigsY,double,2*gridY,ALL); - MALLOC_VECTOR(trigsZ,double,2*gridZ,ALL); - size=MAX(gridX*D2sizeY,3*gridYZ); - MALLOC_VECTOR(work,double,2*size,ALL); - // initialize ifax and trigs - nn=gridX; - cftfax_ (&nn,ifaxX,trigsX); - nn=gridY; - cftfax_ (&nn,ifaxY,trigsY); - nn=gridZ; - cftfax_ (&nn,ifaxZ,trigsZ); + int size,nn; + + /* allocate memory */ + MALLOC_VECTOR(trigsX,double,2*gridX,ALL); + MALLOC_VECTOR(trigsY,double,2*gridY,ALL); + MALLOC_VECTOR(trigsZ,double,2*gridZ,ALL); + size=MAX(gridX*D2sizeY,3*gridYZ); + MALLOC_VECTOR(work,double,2*size,ALL); + /* initialize ifax and trigs */ + nn=gridX; + cftfax_ (&nn,ifaxX,trigsX); + nn=gridY; + cftfax_ (&nn,ifaxY,trigsY); + nn=gridZ; + cftfax_ (&nn,ifaxZ,trigsZ); #endif } -//============================================================ +/*============================================================*/ static void fftInitAfterD(void) -// second part of fft initialization + /* second part of fft initialization */ { #ifdef FFTW3 - int lot; - fftw_iodim dims,howmany_dims[2]; - int grYint=gridY; // this is needed to provide 'int *' to gridY -# ifdef PRECISE_TIMING - SYSTEM_TIME tvp[13]; -# endif - PRINTZ("Initializing FFTW3\n"); - FFLUSHZ(stdout); -# ifdef PRECISE_TIMING - GetTime(tvp); -# endif - lot=3*gridZ; - planYf=fftw_plan_many_dft(1,&grYint,lot,slices_tr,NULL,1,gridY, - slices_tr,NULL,1,gridY,FFT_FORWARD,PLAN_FFTW); -# ifdef PRECISE_TIMING - GetTime(tvp+1); -# endif - planYb=fftw_plan_many_dft(1,&grYint,lot,slices_tr,NULL,1,gridY, - slices_tr,NULL,1,gridY,FFT_BACKWARD,PLAN_FFTW); -# ifdef PRECISE_TIMING - GetTime(tvp+2); -# endif - dims.n=gridZ; - dims.is=dims.os=1; - howmany_dims[0].n=3; - howmany_dims[0].is=howmany_dims[0].os=gridZ*gridY; - howmany_dims[1].n=boxY; - howmany_dims[1].is=howmany_dims[1].os=gridZ; - planZf=fftw_plan_guru_dft(1,&dims,2,howmany_dims,slices,slices,FFT_FORWARD,PLAN_FFTW); -# ifdef PRECISE_TIMING - GetTime(tvp+3); -# endif - planZb=fftw_plan_guru_dft(1,&dims,2,howmany_dims,slices,slices,FFT_BACKWARD,PLAN_FFTW); -# ifdef PRECISE_TIMING - GetTime(tvp+4); -# endif - dims.n=gridX; - dims.is=dims.os=1; - howmany_dims[0].n=3*local_Nz; - howmany_dims[0].is=howmany_dims[0].os=smallY*gridX; - howmany_dims[1].n=boxY; - howmany_dims[1].is=howmany_dims[1].os=gridX; - planXf=fftw_plan_guru_dft(1,&dims,2,howmany_dims,Xmatrix,Xmatrix,FFT_FORWARD,PLAN_FFTW); -# ifdef PRECISE_TIMING - GetTime(tvp+5); -# endif - planXb=fftw_plan_guru_dft(1,&dims,2,howmany_dims,Xmatrix,Xmatrix,FFT_BACKWARD,PLAN_FFTW); -# ifdef PRECISE_TIMING - GetTime(tvp+6); - // print precise timing of FFT planning - SetTimerFreq(); - PRINTBOTHZ(logfile, - "~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n" - " FFTW3 planning \n" - "~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n" - "Yf = %4.4f Total = %4.4f\n" - "Yb = %4.4f\n" - "Zf = %4.4f\n" - "Zb = %4.4f\n" - "Xf = %4.4f\n" - "Xb = %4.4f\n\n", - DiffSec(tvp,tvp+1),DiffSec(tvp,tvp+6),DiffSec(tvp+1,tvp+2),DiffSec(tvp+2,tvp+3), - DiffSec(tvp+3,tvp+4),DiffSec(tvp+4,tvp+5),DiffSec(tvp+5,tvp+6)); -# endif - // destroy old plans - fftw_destroy_plan(planXf_Dm); - fftw_destroy_plan(planYf_Dm); - fftw_destroy_plan(planZf_Dm); + int lot; + fftw_iodim dims,howmany_dims[2]; +# ifdef PRECISE_TIMING + SYSTEM_TIME tvp[13]; +# endif + PRINTZ("Initializing FFTW3\n"); + FFLUSHZ(stdout); +# ifdef PRECISE_TIMING + GetTime(tvp); +# endif + lot=3*gridZ; + planYf=fftw_plan_many_dft(1,(int *)&gridY,lot,slices_tr,NULL,1,gridY, + slices_tr,NULL,1,gridY,FFT_FORWARD,PLAN_FFTW); +# ifdef PRECISE_TIMING + GetTime(tvp+1); +# endif + planYb=fftw_plan_many_dft(1,(int *)&gridY,lot,slices_tr,NULL,1,gridY, + slices_tr,NULL,1,gridY,FFT_BACKWARD,PLAN_FFTW); +# ifdef PRECISE_TIMING + GetTime(tvp+2); +# endif + dims.n=gridZ; + dims.is=dims.os=1; + howmany_dims[0].n=3; + howmany_dims[0].is=howmany_dims[0].os=gridZ*gridY; + howmany_dims[1].n=boxY; + howmany_dims[1].is=howmany_dims[1].os=gridZ; + planZf=fftw_plan_guru_dft(1,&dims,2,howmany_dims,slices,slices,FFT_FORWARD,PLAN_FFTW); +# ifdef PRECISE_TIMING + GetTime(tvp+3); +# endif + planZb=fftw_plan_guru_dft(1,&dims,2,howmany_dims,slices,slices,FFT_BACKWARD,PLAN_FFTW); +# ifdef PRECISE_TIMING + GetTime(tvp+4); +# endif + dims.n=gridX; + dims.is=dims.os=1; + howmany_dims[0].n=3*local_Nz; + howmany_dims[0].is=howmany_dims[0].os=smallY*gridX; + howmany_dims[1].n=boxY; + howmany_dims[1].is=howmany_dims[1].os=gridX; + planXf=fftw_plan_guru_dft(1,&dims,2,howmany_dims,Xmatrix,Xmatrix,FFT_FORWARD,PLAN_FFTW); +# ifdef PRECISE_TIMING + GetTime(tvp+5); +# endif + planXb=fftw_plan_guru_dft(1,&dims,2,howmany_dims,Xmatrix,Xmatrix,FFT_BACKWARD,PLAN_FFTW); +# ifdef PRECISE_TIMING + GetTime(tvp+6); + /* print precise timing of FFT planning */ + SetTimerFreq(); + PRINTBOTHZ(logfile, + "~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n"\ + " FFTW3 planning \n"\ + "~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n"\ + "Yf = %4.4f Total = %4.4f\n"\ + "Yb = %4.4f\n"\ + "Zf = %4.4f\n"\ + "Zb = %4.4f\n"\ + "Xf = %4.4f\n"\ + "Xb = %4.4f\n\n", + DiffSec(tvp,tvp+1),DiffSec(tvp,tvp+6),DiffSec(tvp+1,tvp+2),DiffSec(tvp+2,tvp+3), + DiffSec(tvp+3,tvp+4),DiffSec(tvp+4,tvp+5),DiffSec(tvp+5,tvp+6)); +# endif + /* destroy old plans */ + fftw_destroy_plan(planXf_Dm); + fftw_destroy_plan(planYf_Dm); + fftw_destroy_plan(planZf_Dm); #endif } -//============================================================ +/*============================================================*/ static void CalcInterTerm(int i,int j,int k,int mu,int nu,doublecomplex result) -/* calculates interaction term between two dipoles; given integer distance vector {i,j,k} - * (in units of d), and component indices mu,nu. - */ + /* calculates interaction term between two dipoles; given integer distance vector {i,j,k} + (in units of d), and component indices mu,nu */ { - double rr,rtemp[3],qvec[3],q2[3],invr,invr3,qavec[3],av[3]; - double rr2,kr,kr2,kr3,kd2,q4,rn; - double temp,qmunu,qa,qamunu,invrn,invrn2,invrn3,invrn4,dmunu; - double kfr,ci,si,ci1,si1,ci2,si2,brd,cov,siv,g0,g2; - doublecomplex expval,br,br1,m,m2,Gf1,Gm0,Gm1,Gc1,Gc2; - int ind0,ind1,ind2,ind2m,ind3,ind4,indmunu; - int sigV[3],ic,sig,ivec[3],ord[3],invord[3]; - double t3q,t3a,t4q,t4a,t5tr,t5aa,t6tr,t6aa; - //int pr; - const int inter_avg=TRUE; - - // self interaction; self term is computed in different subroutine - if (i==0 && j==0 && k==0) { - result[RE]=result[IM]=0.0; - return; - } - - // for debugging - //pr=(i==1 && j==1 && k==1); - //if (pr) PRINTZ("%d,%d: ",mu,nu); - - // initialize rtemp - rtemp[0]=i*gridspace; - rtemp[1]=j*gridspace; - rtemp[2]=k*gridspace; - //====== calculate some basic constants ====== - rr2 = DotProd(rtemp,rtemp); - rr = sqrt(rr2); - rn=rr/gridspace; // normalized r - invr = 1/rr; - invr3 = invr*invr*invr; - MultScal(invr,rtemp,qvec); - kr = WaveNum * rr; - kr2 = kr*kr; - kfr=PI*rn; // k_F*r, for FCD - qmunu=qvec[mu]*qvec[nu]; - // cov=cos(kr); siv=sin(kr); expval=Exp(ikr)/r^3 - imExp(kr,expval); - cov=expval[RE]; - siv=expval[IM]; - cMultReal(invr3,expval,expval); - //====== calculate Gp ======== - // br=delta[mu,nu]*(-1+ikr+kr^2)-qmunu*(-3+3ikr+kr^2) - br[RE]=(3-kr2)*qmunu; - br[IM]=-3*kr*qmunu; - if(mu==nu) { - br[RE]+=kr2-1; - br[IM]+=kr; - } - // result=Gp=expval*br - cMult(br,expval,result); - //====== FCD (static and full) ======== - /* !!! speed of FCD can be improved by using faster version of sici routine, using predefined - * tables, etc (e.g. as is done in GSL library). But currently this do not seem to consume a - * significant portion of the total simulation time. - */ - if (IntRelation==G_FCD_ST) { - /* FCD is based on Gay-Balmaz P., Martin O.J.F. "A library for computing the filtered and - * non-filtered 3D Green's tensor associated with infinite homogeneous space and surfaces", - * Comp. Phys. Comm. 144:111-120 (2002), and Piller N.B. "Increasing the performance of the - * coupled-dipole approximation: A spectral approach", IEEE Trans.Ant.Propag. 46(8): - * 1126-1137. Here it differs by a factor of 4*pi*k^2. - */ - // result = Gp*[3*Si(k_F*r)+k_F*r*cos(k_F*r)-4*sin(k_F*r)]*2/(3*pi) - cisi(kfr,&ci,&si); - brd=TWO_OVER_PI*ONE_THIRD*(3*si+kfr*cos(kfr)-4*sin(kfr)); - cMultReal(brd,result,result); - } - else if (IntRelation==G_FCD) { - // ci,si_1,2 = ci,si_+,- = Ci,Si((k_F +,- k)r) - cisi(kfr+kr,&ci1,&si1); - cisi(kfr-kr,&ci2,&si2); - // ci=ci1-ci2; si=pi-si1-si2 - ci=ci1-ci2; - si=PI-si1-si2; - g0=INV_PI*(siv*ci+cov*si); - g2=INV_PI*(kr*(cov*ci-siv*si)+2*ONE_THIRD*(kfr*cos(kfr)-4*sin(kfr)))-g0; - temp=g0*kr2; - // brd=(delta[mu,nu]*(-g0*kr^2-g2)+qmunu*(g0*kr^2+3g2))/r^3 - brd=qmunu*(temp+3*g2); - if (mu==nu) brd-=temp+g2; - brd*=invr3; - // result=Gp+brd - result[RE]+=brd; - } - //======= second order corrections ======== - else if (IntRelation==G_SO) { - // !!! this should never happen - if (anisotropy) LogError(EC_ERROR,ONE_POS,"Incompatibility error in CalcInterTerm"); - kd2=kd*kd; - kr3=kr2*kr; - // only one refractive index can be used for FFT-compatible algorithm !!! - cEqual(ref_index[0],m); - cSquare(m,m2); - if (!inter_avg) { - qa=DotProd(qvec,prop); - // qamunu=qvec[mu]*prop[nu] + qvec[nu]*prop[mu] - qamunu=qvec[mu]*prop[nu]; - if (mu==nu) qamunu*=2; - else qamunu+=qvec[nu]*prop[mu]; - } - if (kr*rn < G_BOUND_CLOSE) { - //====== G close ============= - // check if inside the table bounds; needed to recompute to make an integer comparison - if ((i*i+j*j+k*k) > TAB_RMAX*TAB_RMAX) LogError(EC_ERROR,ALL_POS, - "Not enough table size (available only up to R/d=%d)",TAB_RMAX); - - // av is copy of propagation vector - if (!inter_avg) memcpy(av,prop,3*sizeof(double)); - ivec[0]=i; - ivec[1]=j; - ivec[2]=k; - // transformation of negative coordinates - for (ic=0;ic<3;ic++) { - if (ivec[ic]<0) { - sigV[ic]=-1; - av[ic]*=-1; - qvec[ic]*=-1; - ivec[ic]*=-1; - } - else sigV[ic]=1; - } - i=ivec[0]; - j=ivec[1]; - k=ivec[2]; - sig=sigV[mu]*sigV[nu]; // sign of some terms below - // transformation to case i>=j>=k>=0 - // building of ord; ord[x] is x-th largest coordinate (0-th - the largest) - if (i>=j) { - if (i>=k) { - ord[0]=0; - if (j>=k) { - ord[1]=1; - ord[2]=2; - } - else { - ord[1]=2; - ord[2]=1; - } - } - else { - ord[0]=2; - ord[1]=0; - ord[2]=1; - } - } - else { - if (i>=k) { - ord[0]=1; - ord[1]=0; - ord[2]=2; - } - else { - ord[2]=0; - if (j>=k) { - ord[0]=1; - ord[1]=2; - } - else { - ord[0]=2; - ord[1]=1; - } - } - } - // change parameters according to coordinate transforms - Permutate(qvec,ord); - if (!inter_avg) Permutate(av,ord); - Permutate_i(ivec,ord); - i=ivec[0]; - j=ivec[1]; - k=ivec[2]; - // compute inverse permutation - memcpy(invord,ord,3*sizeof(int)); - Permutate_i(invord,ord); - if (invord[0]==0 && invord[1]==1 && invord[2]==2) memcpy(invord,ord,3*sizeof(int)); - // compute transformed indices mu and nu - mu=invord[mu]; - nu=invord[nu]; - // indexes for tables of different dimensions - // indmunu is a number of component[mu,nu] in symmetric matrix - indmunu=mu+nu; - if (mu==2 || nu==2) indmunu++; - - ind0=tab_index[i][j]+k; - ind1=3*ind0; - ind2m=6*ind0; - ind2=ind2m+indmunu; - ind3=3*ind2; - ind4=6*ind2; - // computing several quantities with table integrals - t3q=DotProd(qvec,tab3+ind1); - t4q=DotProd(qvec,tab4+ind3); - t5tr=TrSym(tab5+ind2m); - t6tr=TrSym(tab6+ind4); - if (inter_avg) { - // <a[mu]*a[nu]>=1/3*delta[mu,nu] - t5aa=ONE_THIRD*t5tr; - t6aa=ONE_THIRD*t6tr; - } - else { - t3a=DotProd(av,tab3+ind1); - t4a=DotProd(av,tab4+ind3); - t5aa=QuadForm(tab5+ind2m,av); - t6aa=QuadForm(tab6+ind4,av); - } - //====== computing Gc0 ===== - // temp = kr/24 - temp=kr/24; - /* br = delta[mu,nu]*(-I7-I9/2-kr*(i+kr)/24+2*t3q+t5tr) - * - (-3I8[mu,nu]-3I10[mu,nu]/2-qmunu*kr*(i+kr)/24+2*t4q+t6tr) - */ - br[RE]=sig*(3*(tab10[ind2]/2+tab8[ind2])-2*t4q-t6tr)+temp*qmunu*kr; - br[IM]=3*temp*qmunu; - if (mu==nu) { - br[RE]+=2*t3q+t5tr-temp*kr-tab9[ind0]/2-tab7[ind0]; - br[IM]-=temp; - } - // br*=kd^2 - cMultReal(kd2,br,br); - // br+=I1*delta[mu,nu]*(-1+ikr+kr^2)-sig*I2[mu,nu]*(-3+3ikr+kr^2) - br[RE]+=sig*tab2[ind2]*(3-kr2); - br[IM]-=sig*tab2[ind2]*3*kr; - if (mu==nu) { - br[RE]+=tab1[ind0]*(kr2-1); - br[IM]+=tab1[ind0]*kr; - } - // Gc0=expval*br - cMult(expval,br,result); - //==== computing Gc1 ====== - if (!inter_avg) { - // br=(kd*kr/24)*(qa*(delta[mu,nu]*(-2+ikr)-qmunu*(-6+ikr))-qamunu) - br[RE]=6*qmunu; - br[IM]=-kr*qmunu; - if (mu==nu) { - br[RE]-=2; - br[IM]+=kr; - } - cMultReal(qa,br,br); - br[RE]-=qamunu; - cMultReal(2*temp*kd,br,br); - // br1=(d/r)*(delta[mu,nu]*t3h*(-1+ikr)-sig*t4h*(-3+3ikr)) - br1[RE]=3*sig*t4a; - br1[IM]=-kr*br1[RE]; - if (mu==nu) { - br1[RE]-=t3a; - br1[IM]+=t3a*kr; - } - cMultReal(1/rn,br1,br1); - // Gc1=expval*i*m*kd*(br1+br) - cAdd(br,br1,Gc1); - cMultSelf(Gc1,m); - cMultReal(kd,Gc1,Gc1); - cMultSelf(Gc1,expval); - cMult_i(Gc1); - } - //==== computing Gc2 ====== - // br=delta[mu,nu]*t5aa-3*sig*t6aa-(kr/12)*(delta[mu,nu]*(i+kr)-qmunu*(3i+kr)) - br[RE]=-kr*qmunu; - br[IM]=-3*qmunu; - if (mu==nu) { - br[RE]+=kr; - br[IM]+=1; - } - cMultReal(-2*temp,br,br); - br[RE]-=3*sig*t6aa; - if (mu==nu) br[RE]+=t5aa; - // Gc2=expval*(kd^2/2)*m^2*br - cMult(m2,br,Gc2); - cMultReal(kd2/2,Gc2,Gc2); - cMultSelf(Gc2,expval); - // result = Gc0 + [ Gc1 ] + Gc2 - if (!inter_avg) cAdd(Gc2,Gc1,Gc2); - cAdd(Gc2,result,result); - } - else { - //====== Gfar (and part of Gmedian) ======= - // temp=kd^2/24 - temp=kd2/24; - // br=1-(1+m^2)*kd^2/24 - br[RE]=1-(1+m2[RE])*temp; - br[IM]=-m2[IM]*temp; - // Gf0 + Gf2 = Gp*br - cMultSelf(result,br); - //==== compute and add Gf1 === - if (!inter_avg) { - /* br = {delta[mu,nu]*(3-3ikr-2kr^2+ikr^3)-qmunu*(15-15ikr-6kr^2+ikr^3)}*qa - * + qamunu*(3-3ikr-kr^2) - */ - br[RE]=(6*kr2-15)*qmunu; - br[IM]=(15*kr-kr3)*qmunu; - if(mu==nu) { - br[RE]+=3-2*kr2; - br[IM]+=kr3-3*kr; - } - cMultReal(qa,br,br); - br[RE]+=(3-kr2)*qamunu; - br[IM]-=3*kr*qamunu; - // temp = kd^2/(12*kr) - temp*=2/kr; - // Gf1=expval*i*m*temp*br - cMult(m,br,Gf1); - cMultReal(temp,Gf1,Gf1); - cMultSelf(Gf1,expval); - cMult_i(Gf1); - // result = Gf - cAdd(Gf1,result,result); - } - if (kr < G_BOUND_MEDIAN) { - //===== G median ======== - vMult(qvec,qvec,q2); - q4=DotProd(q2,q2); - invrn=1/rn; - invrn2=invrn*invrn; - invrn3=invrn2*invrn; - invrn4=invrn2*invrn2; - // Gm0=expval*br*temp - temp=qmunu*(33*q4-7-12*(q2[mu]+q2[nu])); - if (mu == nu) temp+=(1-3*q4+4*q2[mu]); - temp*=7*invrn4/64; - br[RE]=-1; - br[IM]=kr; - cMultReal(temp,br,Gm0); - cMultSelf(Gm0,expval); - if (!inter_avg) { - // Gm1=expval*i*m*temp - vMult(qvec,prop,qavec); - if (mu == nu) dmunu=1; - else dmunu=0; - temp = 3*qa*(dmunu-7*qmunu) + 6*dmunu*qvec[mu]*prop[mu] - - 7*(dmunu-9*qmunu)*DotProd(qavec,q2) - + 3*(prop[mu]*qvec[nu]*(1-7*q2[mu])+prop[nu]*qvec[mu]*(1-7*q2[nu])); - temp*=kd*invrn3/48; - cMultReal(temp,m,Gm1); - cMult_i(Gm1); - cMultSelf(Gm1,expval); - // add Gm1 to Gm0 - cAdd(Gm0,Gm1,Gm0); - } - // result = Gf + Gm0 + [ Gm1 ] - cAdd(Gm0,result,result); - } - } - } - // if (pr) PRINTZ("%d,%d: %f+%fi\n",mu,nu,result[RE],result[IM]); + double rr,rtemp[3],qvec[3],q2[3],invr,invr3,qavec[3],av[3]; + double rr2,kr,kr2,kr3,kd2,q4,rn; + double temp,qmunu,qa,qamunu,invrn,invrn2,invrn3,invrn4,dmunu; + double kfr,ci,si,ci1,si1,ci2,si2,brd,cov,siv,g0,g2; + doublecomplex expval,br,br1,m,m2,Gf1,Gm0,Gm1,Gc1,Gc2; + int ind0,ind1,ind2,ind2m,ind3,ind4,indmunu; + int sigV[3],ic,sig,ivec[3],ord[3],invord[3]; + double t3q,t3a,t4q,t4a,t5tr,t5aa,t6tr,t6aa; +/* int pr; */ + const int inter_avg=TRUE; + + /* self interaction; self term is computed in different subroutine */ + if (i==0 && j==0 && k==0) { + result[RE]=result[IM]=0.0; + return; + } + +/* pr=(i==1 && j==1 && k==1); + if (pr) PRINTZ("%d,%d: ",mu,nu); /* for debugging */ + + /* initialize rtemp */ + rtemp[0]=i*gridspace; + rtemp[1]=j*gridspace; + rtemp[2]=k*gridspace; + /*====== calculate some basic constants ======*/ + rr2 = DotProd(rtemp,rtemp); + rr = sqrt(rr2); + rn=rr/gridspace; /* normalized r */ + invr = 1/rr; + invr3 = invr*invr*invr; + MultScal(invr,rtemp,qvec); + kr = WaveNum * rr; + kr2 = kr*kr; + kfr=PI*rn; /* k_F*r, for FCD */ + qmunu=qvec[mu]*qvec[nu]; + /* cov=cos(kr); siv=sin(kr); expval=Exp(ikr)/r^3 */ + imExp(kr,expval); + cov=expval[RE]; + siv=expval[IM]; + cMultReal(invr3,expval,expval); + /*====== calculate Gp ========*/ + /* br=delta[mu,nu]*(-1+ikr+kr^2)-qmunu*(-3+3ikr+kr^2) */ + br[RE]=(3-kr2)*qmunu; + br[IM]=-3*kr*qmunu; + if(mu==nu) { + br[RE]+=kr2-1; + br[IM]+=kr; + } + /* result=Gp=expval*br */ + cMult(br,expval,result); + /*====== FCD (static and full) ========*/ + /* !!! speed of FCD can be improved by using faster version of sici routine, using predefined + tables, etc (e.g. as is done in gsl library). But currently this do not seem to be significant + portion of the total simulation time */ + + if (IntRelation==G_FCD_ST) { + /* FCD is Based on + Gay-Balmaz P., Martin O.J.F. "A library for computing the filtered and non-filtered 3D + Green's tensor associated with infinite homogeneous space and surfaces", Comp. Phys. Comm. + 144:111-120 (2002), and + Piller N.B. "Increasing the performance of the coupled-dipole approximation: A spectral + approach", IEEE Trans.Ant.Propag. 46(8):1126-1137. + differing by a factor of 4*pi*k^2 */ + + /* result = Gp*[3*Si(k_F*r)+k_F*r*cos(k_F*r)-4*sin(k_F*r)]*2/(3*pi) */ + cisi(kfr,&ci,&si); + brd=TWO_OVER_PI*ONE_THIRD*(3*si+kfr*cos(kfr)-4*sin(kfr)); + cMultReal(brd,result,result); + } + else if (IntRelation==G_FCD) { + /* ci,si1,2=ci,si+-=Ci,Si((k_F+-k)r) */ + cisi(kfr+kr,&ci1,&si1); + cisi(kfr-kr,&ci2,&si2); + /* ci=ci1-ci2; si=pi-si1-si2 */ + ci=ci1-ci2; + si=PI-si1-si2; + g0=INV_PI*(siv*ci+cov*si); + g2=INV_PI*(kr*(cov*ci-siv*si)+2*ONE_THIRD*(kfr*cos(kfr)-4*sin(kfr)))-g0; + temp=g0*kr2; + /* brd=(delta[mu,nu]*(-g0*kr^2-g2)+qmunu*(g0*kr^2+3g2))/r^3 */ + brd=qmunu*(temp+3*g2); + if (mu==nu) brd-=temp+g2; + brd*=invr3; + /* result=Gp+brd */ + result[RE]+=brd; + } + /*======= second order corrections ========*/ + else if (IntRelation==G_SO) { + /* this should never happen !!! */ + if (anisotropy) LogError(EC_ERROR,ONE_POS,"Incompatibility error in CalcInterTerm"); + kd2=kd*kd; + kr3=kr2*kr; + /* only one refractive index can be used for FFT-compatible algorithm !!! */ + cEqual(ref_index[0],m); + cSquare(m,m2); + if (!inter_avg) { + qa=DotProd(qvec,prop); + /* qamunu=qvec[mu]*prop[nu] + qvec[nu]*prop[mu] */ + qamunu=qvec[mu]*prop[nu]; + if (mu==nu) qamunu*=2; + else qamunu+=qvec[nu]*prop[mu]; + } + if (kr*rn < G_BOUND_CLOSE) { + /*====== G close =============*/ + /* check if inside the table bounds; needed to recompute to make an integer comparison */ + if ((i*i+j*j+k*k) > TAB_RMAX*TAB_RMAX) LogError(EC_ERROR,ALL_POS, + "Not enough table size (available only up to R/d=%d)",TAB_RMAX); + + /* av is copy of propagation vector */ + if (!inter_avg) memcpy(av,prop,3*sizeof(double)); + ivec[0]=i; + ivec[1]=j; + ivec[2]=k; + /* transformation of negative coordinates */ + for (ic=0;ic<3;ic++) { + if (ivec[ic]<0) { + sigV[ic]=-1; + av[ic]*=-1; + qvec[ic]*=-1; + ivec[ic]*=-1; + } + else sigV[ic]=1; + } + i=ivec[0]; + j=ivec[1]; + k=ivec[2]; + sig=sigV[mu]*sigV[nu]; /* sign of some terms below */ + /* transformation to case i>=j>=k>=0 */ + /* building of ord; ord[x] is x-th largest coordinate (0-th - the largest) */ + if (i>=j) { + if (i>=k) { + ord[0]=0; + if (j>=k) { + ord[1]=1; + ord[2]=2; + } + else { + ord[1]=2; + ord[2]=1; + } + } + else { + ord[0]=2; + ord[1]=0; + ord[2]=1; + } + } + else { + if (i>=k) { + ord[0]=1; + ord[1]=0; + ord[2]=2; + } + else { + ord[2]=0; + if (j>=k) { + ord[0]=1; + ord[1]=2; + } + else { + ord[0]=2; + ord[1]=1; + } + } + } + /* change parameters according to coordinate transforms */ + Permutate(qvec,ord); + if (!inter_avg) Permutate(av,ord); + Permutate_i(ivec,ord); + i=ivec[0]; + j=ivec[1]; + k=ivec[2]; + /* compute inverse permutation */ + memcpy(invord,ord,3*sizeof(int)); + Permutate_i(invord,ord); + if (invord[0]==0 && invord[1]==1 && invord[2]==2) memcpy(invord,ord,3*sizeof(int)); + /* compute transformed indices mu and nu */ + mu=invord[mu]; + nu=invord[nu]; + /* indexes for tables of different dimensions */ + /* indmunu is a number of component[mu,nu] in symmetric matrix */ + indmunu=mu+nu; + if (mu==2 || nu==2) indmunu++; + + ind0=tab_index[i][j]+k; + ind1=3*ind0; + ind2m=6*ind0; + ind2=ind2m+indmunu; + ind3=3*ind2; + ind4=6*ind2; + /* computing several quantities with table integrals */ + t3q=DotProd(qvec,tab3+ind1); + t4q=DotProd(qvec,tab4+ind3); + t5tr=TrSym(tab5+ind2m); + t6tr=TrSym(tab6+ind4); + if (inter_avg) { + /* <a[mu]*a[nu]>=1/3*delta[mu,nu] */ + t5aa=ONE_THIRD*t5tr; + t6aa=ONE_THIRD*t6tr; + } + else { + t3a=DotProd(av,tab3+ind1); + t4a=DotProd(av,tab4+ind3); + t5aa=QuadForm(tab5+ind2m,av); + t6aa=QuadForm(tab6+ind4,av); + } + /*====== computing Gc0 =====*/ + /* temp = kr/24 */ + temp=kr/24; + /* br=delta[mu,nu]*(-I7-I9/2-kr*(i+kr)/24+2*t3q+t5tr)- + (-3I8[mu,nu]-3I10[mu,nu]/2-qmunu*kr*(i+kr)/24+2*t4q+t6tr) */ + br[RE]=sig*(3*(tab10[ind2]/2+tab8[ind2])-2*t4q-t6tr)+temp*qmunu*kr; + br[IM]=3*temp*qmunu; + if (mu==nu) { + br[RE]+=2*t3q+t5tr-temp*kr-tab9[ind0]/2-tab7[ind0]; + br[IM]-=temp; + } + /* br*=kd^2 */ + cMultReal(kd2,br,br); + /* br+=I1*delta[mu,nu]*(-1+ikr+kr^2)-sig*I2[mu,nu]*(-3+3ikr+kr^2) */ + br[RE]+=sig*tab2[ind2]*(3-kr2); + br[IM]-=sig*tab2[ind2]*3*kr; + if (mu==nu) { + br[RE]+=tab1[ind0]*(kr2-1); + br[IM]+=tab1[ind0]*kr; + } + /* Gc0=expval*br */ + cMult(expval,br,result); + /*==== computing Gc1 ======*/ + if (!inter_avg) { + /* br=(kd*kr/24)*(qa*(delta[mu,nu]*(-2+ikr)-qmunu*(-6+ikr))-qamunu)*/ + br[RE]=6*qmunu; + br[IM]=-kr*qmunu; + if (mu==nu) { + br[RE]-=2; + br[IM]+=kr; + } + cMultReal(qa,br,br); + br[RE]-=qamunu; + cMultReal(2*temp*kd,br,br); + /* br1=(d/r)*(delta[mu,nu]*t3h*(-1+ikr)-sig*t4h*(-3+3ikr)) */ + br1[RE]=3*sig*t4a; + br1[IM]=-kr*br1[RE]; + if (mu==nu) { + br1[RE]-=t3a; + br1[IM]+=t3a*kr; + } + cMultReal(1/rn,br1,br1); + /* Gc1=expval*i*m*kd*(br1+br) */ + cAdd(br,br1,Gc1); + cMultSelf(Gc1,m); + cMultReal(kd,Gc1,Gc1); + cMultSelf(Gc1,expval); + cMult_i(Gc1); + } + /*==== computing Gc2 ======*/ + /* br=delta[mu,nu]*t5aa-3*sig*t6aa-(kr/12)*(delta[mu,nu]*(i+kr)-qmunu*(3i+kr)) */ + br[RE]=-kr*qmunu; + br[IM]=-3*qmunu; + if (mu==nu) { + br[RE]+=kr; + br[IM]+=1; + } + cMultReal(-2*temp,br,br); + br[RE]-=3*sig*t6aa; + if (mu==nu) br[RE]+=t5aa; + /* Gc2=expval*(kd^2/2)*m^2*br */ + cMult(m2,br,Gc2); + cMultReal(kd2/2,Gc2,Gc2); + cMultSelf(Gc2,expval); + /* result = Gc0 + [ Gc1 ] + Gc2 */ + if (!inter_avg) cAdd(Gc2,Gc1,Gc2); + cAdd(Gc2,result,result); + } + else { + /*====== Gfar (and part of Gmedian) =======*/ + /* temp=kd^2/24 */ + temp=kd2/24; + /* br=1-(1+m^2)*kd^2/24 */ + br[RE]=1-(1+m2[RE])*temp; + br[IM]=-m2[IM]*temp; + /* Gf0 + Gf2 = Gp*br */ + cMultSelf(result,br); + /*==== compute and add Gf1 ===*/ + if (!inter_avg) { + /* br={delta[mu,nu]*(3-3ikr-2kr^2+ikr^3)-qmunu*(15-15ikr-6kr^2+ikr^3)}*qa + +qamunu*(3-3ikr-kr^2) */ + br[RE]=(6*kr2-15)*qmunu; + br[IM]=(15*kr-kr3)*qmunu; + if(mu==nu) { + br[RE]+=3-2*kr2; + br[IM]+=kr3-3*kr; + } + cMultReal(qa,br,br); + br[RE]+=(3-kr2)*qamunu; + br[IM]-=3*kr*qamunu; + /* temp = kd^2/(12*kr) */ + temp*=2/kr; + /* Gf1=expval*i*m*temp*br */ + cMult(m,br,Gf1); + cMultReal(temp,Gf1,Gf1); + cMultSelf(Gf1,expval); + cMult_i(Gf1); + /* result = Gf */ + cAdd(Gf1,result,result); + } + if (kr < G_BOUND_MEDIAN) { + /*===== G median ========*/ + vMult(qvec,qvec,q2); + q4=DotProd(q2,q2); + invrn=1/rn; + invrn2=invrn*invrn; + invrn3=invrn2*invrn; + invrn4=invrn2*invrn2; + /* Gm0=expval*br*temp */ + temp=qmunu*(33*q4-7-12*(q2[mu]+q2[nu])); + if (mu == nu) temp+=(1-3*q4+4*q2[mu]); + temp*=7*invrn4/64; + br[RE]=-1; + br[IM]=kr; + cMultReal(temp,br,Gm0); + cMultSelf(Gm0,expval); + if (!inter_avg) { + /* Gm1=expval*i*m*temp */ + vMult(qvec,prop,qavec); + if (mu == nu) dmunu=1; + else dmunu=0; + temp=3*qa*(dmunu-7*qmunu)+6*dmunu*qvec[mu]*prop[mu]-7*(dmunu-9*qmunu)*DotProd(qavec,q2)+ + 3*(prop[mu]*qvec[nu]*(1-7*q2[mu])+prop[nu]*qvec[mu]*(1-7*q2[nu])); + temp*=kd*invrn3/48; + cMultReal(temp,m,Gm1); + cMult_i(Gm1); + cMultSelf(Gm1,expval); + /* add Gm1 to Gm0 */ + cAdd(Gm0,Gm1,Gm0); + } + /* result = Gf + Gm0 + [ Gm1 ]*/ + cAdd(Gm0,result,result); + } + } + } + /* if (pr) PRINTZ("%d,%d: %f+%fi\n",mu,nu,result[RE],result[IM]); */ } -//============================================================ +/*============================================================*/ void InitDmatrix(void) -/* Initializes the matrix D. D[i][j][k]=A[i1-i2][j1-j2][k1-k2]. Actually D=-FFT(G)/Ngrid. - * Then -G.x=invFFT(D*FFT(x)) for practical implementation of FFT such that invFFT(FFT(x))=Ngrid*x. - * G is exactly Green's tensor. The routine is called only once, so needs not to be very fast, - * however we tried to optimize it. - */ + /* initialises the matrix D. D[i][j][k]=A[i1-i2][j1-j2][k1-k2] + Actually D=-FFT(G)/Ngrid. Then -G.x=invFFT(D*FFT(x)) for practical + implementation of FFT such that invFFT(FFT(x))=Ngrid*x. G is exactly Green's tensor + The routine is called only once, so needs not to be very fast, however we tried + to optimize it. */ { - int i,j,k,kcor,Dcomp; - size_t x,y,z,indexfrom,indexto,ind,index,D2sizeTot; - double invNgrid,mem; - int nnn; // multiplier used for reduced_FFT or not reduced; 1 or 2 - int jstart, kstart; - size_t lengthN; - int mu, nu; // indices for interaction term - TIME_TYPE start,time1; + int i,j,k,kcor,Dcomp; + size_t x,y,z,indexfrom,indexto,ind,index,D2sizeTot; + double invNgrid,mem; + int nnn; /* multiplier used for reduced_FFT or not reduced; 1 or 2 */ + int jstart, kstart; + size_t lengthN; + int mu, nu; /* indices for interaction term */ + TIME_TYPE start,time1; #ifdef PARALLEL - size_t bufsize; + size_t bufsize; #endif #ifdef PRECISE_TIMING - // precise timing of the Dmatrix computation - SYSTEM_TIME tvp[13]; - SYSTEM_TIME Timing_fftX,Timing_fftY,Timing_fftZ,Timing_ar1,Timing_ar2,Timing_ar3, - Timing_BT,Timing_TYZ,Timing_beg; - double t_fftX,t_fftY,t_fftZ,t_ar1,t_ar2,t_ar3, - t_TYZ,t_beg,t_Arithm,t_FFT,t_BT; - - InitTime(&Timing_fftX); - InitTime(&Timing_fftY); - InitTime(&Timing_fftZ); - InitTime(&Timing_ar1); - InitTime(&Timing_ar2); - InitTime(&Timing_ar3); - InitTime(&Timing_BT); - InitTime(&Timing_TYZ); - GetTime(tvp); + /* precise timing of the Dmatrix computation */ + SYSTEM_TIME tvp[13]; + SYSTEM_TIME Timing_fftX,Timing_fftY,Timing_fftZ,Timing_ar1,Timing_ar2,Timing_ar3, + Timing_BT,Timing_TYZ,Timing_beg; + double t_fftX,t_fftY,t_fftZ,t_ar1,t_ar2,t_ar3, + t_TYZ,t_beg,t_Arithm,t_FFT,t_BT; + + InitTime(&Timing_fftX); + InitTime(&Timing_fftY); + InitTime(&Timing_fftZ); + InitTime(&Timing_ar1); + InitTime(&Timing_ar2); + InitTime(&Timing_ar3); + InitTime(&Timing_BT); + InitTime(&Timing_TYZ); + GetTime(tvp); #endif - start=GET_TIME(); - // initialize sizes of D and D2 matrices - D2sizeX=gridX; - if (reduced_FFT) { - D2sizeY=gridY/2; - D2sizeZ=gridZ/2; - DsizeY=gridY/2+1; - DsizeZ=gridZ/2+1; - nnn=1; - jstart=0; - kstart=0; - } - else { - D2sizeY=DsizeY=gridY; - D2sizeZ=DsizeZ=gridZ; - nnn=2; - jstart=1-boxY; - kstart=1-boxZ; - } - // auxiliary parameters - lengthN=nnn*local_Nz; - DsizeYZ=DsizeY*DsizeZ; - invNgrid=1.0/(gridX*((double)gridYZ)); - local_Nsmall=(gridX/2)*(gridYZ/(2*nprocs)); // size of X vector (for 1 component) - /* calculate size of matvec matrices (X,D,slices,slices_tr) and BT buffers (if parallel); - * uses complex expression to avoid overflows and enable prognosis for large grids - */ - mem = sizeof(doublecomplex)*(3*(2+(gridX/(4.0*nprocs)))*((double)gridYZ) - + NDCOMP*local_Nx*((double)DsizeYZ)); + start=GET_TIME(); + /* initialize sizes of D and D2 matrices */ + D2sizeX=gridX; + if (reduced_FFT) { + D2sizeY=gridY/2; + D2sizeZ=gridZ/2; + DsizeY=gridY/2+1; + DsizeZ=gridZ/2+1; + nnn=1; + jstart=0; + kstart=0; + } + else { + D2sizeY=DsizeY=gridY; + D2sizeZ=DsizeZ=gridZ; + nnn=2; + jstart=1-boxY; + kstart=1-boxZ; + } + /* auxiliary parameters */ + lengthN=nnn*local_Nz; + DsizeYZ=DsizeY*DsizeZ; + invNgrid=1.0/(gridX*((double)gridYZ)); + local_Nsmall=(gridX/2)*(gridYZ/(2*nprocs)); /* size of X vector (for 1 component) */ + /* calculate size of matvec matrices (X,D,slices,slices_tr) and BT buffers (if parallel) + uses complex expression to avoid overflows and enable prognose for large grids */ + mem=sizeof(doublecomplex)*(3*(2+(gridX/(4.0*nprocs)))*((double)gridYZ) + +NDCOMP*local_Nx*((double)DsizeYZ)); #ifdef PARALLEL - mem+=12*smallY*((double)(local_Nz*local_Nx))*sizeof(double); + mem+=12*smallY*((double)(local_Nz*local_Nx))*sizeof(double); #endif - // printout some information - /* conversions to (unsigned long) are needed (to remove warnings) because %z printf argument is not - * yet supported by all target compiler environmets - */ - FPRINTZ(logfile,"The FFT grid is: %lux%lux%lu\n",(unsigned long)gridX,(unsigned long)gridY, - (unsigned long)gridZ); + /* printout some information */ + FPRINTZ(logfile,"The FFT grid is: %ux%ux%u\n",gridX,gridY,gridZ); #ifdef PARALLEL - PRINTBOTHZ(logfile,"Memory usage for MatVec matrices (per processor): %.1f Mb\n",mem/MBYTE); + PRINTBOTHZ(logfile,"Memory usage for MatVec matrices (per processor): %.1f Mb\n",mem/MBYTE); #else - PRINTBOTHZ(logfile,"Memory usage for MatVec matrices: %.1f Mb\n",mem/MBYTE); + PRINTBOTHZ(logfile,"Memory usage for MatVec matrices: %.1f Mb\n",mem/MBYTE); #endif - FFLUSHZ(logfile); - memory+=mem; - if (prognose) return; - // allocate memory for Dmatrix - MALLOC_VECTOR(Dmatrix,complex,MultOverflow(NDCOMP*local_Nx,DsizeYZ,ONE_POS,"Dmatrix"),ALL); - // allocate memory for D2matrix components - D2sizeTot=nnn*local_Nz*D2sizeY*D2sizeX; - MALLOC_VECTOR(D2matrix,complex,D2sizeTot,ALL); - MALLOC_VECTOR(slice,complex,gridYZ,ALL); - MALLOC_VECTOR(slice_tr,complex,gridYZ,ALL); - // actually allocation of Xmatrix, slices, slices_tr is below after freeing of Dmatrix and its slice + FFLUSHZ(logfile); + memory+=mem; + if (prognose) return; + /* allocate memory for Dmatrix */ + MALLOC_VECTOR(Dmatrix,complex,MultOverflow(NDCOMP*local_Nx,DsizeYZ,ONE_POS,"Dmatrix"),ALL); + /* allocate memory for D2matrix components */ + D2sizeTot=nnn*local_Nz*D2sizeY*D2sizeX; + MALLOC_VECTOR(D2matrix,complex,D2sizeTot,ALL); + MALLOC_VECTOR(slice,complex,gridYZ,ALL); + MALLOC_VECTOR(slice_tr,complex,gridYZ,ALL); + /* actually allocation of Xmatrix, slices, slices_tr is below; + after freeing of Dmatrix and its slice */ #ifdef PARALLEL - // allocate buffer for BlockTranspose_Dm - bufsize = 2*lengthN*D2sizeY*local_Nx; - MALLOC_VECTOR(BT_buffer,double,bufsize,ALL); - MALLOC_VECTOR(BT_rbuffer,double,bufsize,ALL); + /* allocate buffer for BlockTranspose_Dm */ + bufsize = 2*lengthN*D2sizeY*local_Nx; + MALLOC_VECTOR(BT_buffer,double,bufsize,ALL); + MALLOC_VECTOR(BT_rbuffer,double,bufsize,ALL); #endif - D("Initialize FFT (1st part)"); - fftInitBeforeD(lengthN); + D("init FFT (1st part)"); + fftInitBeforeD(lengthN); #ifdef PRECISE_TIMING - GetTime(tvp+1); - elapsed(tvp,tvp+1,&Timing_beg); + GetTime(tvp+1); + elapsed(tvp,tvp+1,&Timing_beg); #endif - PRINTZ("Calculating Dmatrix"); - FFLUSHZ(stdout); + PRINTZ("Calculating Dmatrix"); + FFLUSHZ(stdout); - for(Dcomp=0;Dcomp<NDCOMP;Dcomp++) { // main cycle over components of Dmatrix + for(Dcomp=0;Dcomp<NDCOMP;Dcomp++) { /* main cycle over components of Dmatrix */ #ifdef PRECISE_TIMING - GetTime(tvp+2); + GetTime(tvp+2); #endif - switch((char) Dcomp) { // determine mu,nu - case 0: { - mu=0; - nu=0; - break; - } - case 1: { - mu=0; - nu=1; - break; - } - case 2: { - mu=0; - nu=2; - break; - } - case 3: { - mu=1; - nu=1; - break; - } - case 4: { - mu=1; - nu=2; - break; - } - case 5: { - mu=2; - nu=2; - break; - } - } // end of switch - - // fill D2matrix with 0.0 - for (ind=0;ind<D2sizeTot;ind++) D2matrix[ind][RE]=D2matrix[ind][IM]=0.0; - - // fill D (F'i-j) - for(k=nnn*local_z0;k<nnn*local_z1;k++) { - if (k>(int)smallZ) kcor=k-gridZ; - else kcor=k; - for (j=jstart;j<boxY;j++) for (i=1-boxX;i<boxX;i++) { - index=IndexD2matrix(i,j,k,nnn); - CalcInterTerm(i,j,kcor,mu,nu,D2matrix[index]); // calculate F[mu][nu] - } - } // end of i,j,k loop + switch((char) Dcomp) { /* determine mu,nu */ + case 0: { + mu=0; + nu=0; + break; + } + case 1: { + mu=0; + nu=1; + break; + } + case 2: { + mu=0; + nu=2; + break; + } + case 3: { + mu=1; + nu=1; + break; + } + case 4: { + mu=1; + nu=2; + break; + } + case 5: { + mu=2; + nu=2; + break; + } + } /* end of switch */ + + /* fill D2matrix with 0.0 */ + for (ind=0;ind<D2sizeTot;ind++) D2matrix[ind][RE]=D2matrix[ind][IM]=0.0; + + /* fill D (F'i-j) */ + for(k=nnn*local_z0;k<nnn*local_z1;k++) { + if (k>(int)smallZ) kcor=k-gridZ; + else kcor=k; + for (j=jstart;j<boxY;j++) for (i=1-boxX;i<boxX;i++) { + index=IndexD2matrix(i,j,k,nnn); + CalcInterTerm(i,j,kcor,mu,nu,D2matrix[index]); /* calculate F[mu][nu] */ + } + } /* end of i,j,k loop */ #ifdef PRECISE_TIMING - GetTime(tvp+3); - ElapsedInc(tvp+2,tvp+3,&Timing_ar1); + GetTime(tvp+3); + ElapsedInc(tvp+2,tvp+3,&Timing_ar1); #endif - fftX_Dm(lengthN); // fftX D2matrix + fftX_Dm(lengthN); /* fftX D2matrix */ #ifdef PRECISE_TIMING - GetTime(tvp+4); - ElapsedInc(tvp+3,tvp+4,&Timing_fftX); + GetTime(tvp+4); + ElapsedInc(tvp+3,tvp+4,&Timing_fftX); #endif - BlockTranspose_Dm(D2matrix,D2sizeY,lengthN); + BlockTranspose_Dm(D2matrix,D2sizeY,lengthN); #ifdef PRECISE_TIMING - GetTime(tvp+5); - ElapsedInc(tvp+4,tvp+5,&Timing_BT); + GetTime(tvp+5); + ElapsedInc(tvp+4,tvp+5,&Timing_BT); #endif - for(x=local_x0;x<local_x1;x++) { + for(x=local_x0;x<local_x1;x++) { #ifdef PRECISE_TIMING - GetTime(tvp+6); + GetTime(tvp+6); #endif - for (ind=0;ind<gridYZ;ind++) slice[ind][RE]=slice[ind][IM]=0.0; // fill slice with 0.0 - for(j=jstart;j<boxY;j++) for(k=kstart;k<boxZ;k++) { - indexfrom=IndexGarbledD(x,j,k,lengthN); - indexto=IndexSliceD2matrix(j,k); - cEqual(D2matrix[indexfrom],slice[indexto]); - } - if (reduced_FFT) { - for(j=1;j<boxY;j++) for(k=0;k<boxZ;k++) { - // mirror along y - indexfrom=IndexSliceD2matrix(j,k); - indexto=IndexSliceD2matrix(-j,k); - if (Dcomp==1 || Dcomp==4) cInvSign2(slice[indexfrom],slice[indexto]); - else cEqual(slice[indexfrom],slice[indexto]); - } - for(j=1-boxY;j<boxY;j++) for(k=1;k<boxZ;k++) { - // mirror along z - indexfrom=IndexSliceD2matrix(j,k); - indexto=IndexSliceD2matrix(j,-k); - if (Dcomp==2 || Dcomp==4) cInvSign2(slice[indexfrom],slice[indexto]); - else cEqual(slice[indexfrom],slice[indexto]); - } - } + for (ind=0;ind<gridYZ;ind++) slice[ind][RE]=slice[ind][IM]=0.0; /* fill slice with 0.0 */ + + for(j=jstart;j<boxY;j++) for(k=kstart;k<boxZ;k++) { + indexfrom=IndexGarbledD(x,j,k,lengthN); + indexto=IndexSliceD2matrix(j,k); + cEqual(D2matrix[indexfrom],slice[indexto]); + } + + if (reduced_FFT) { + for(j=1;j<boxY;j++) for(k=0;k<boxZ;k++) { + /* mirror along y */ + indexfrom=IndexSliceD2matrix(j,k); + indexto=IndexSliceD2matrix(-j,k); + if (Dcomp==1 || Dcomp==4) cInvSign2(slice[indexfrom],slice[indexto]); + else cEqual(slice[indexfrom],slice[indexto]); + } + for(j=1-boxY;j<boxY;j++) for(k=1;k<boxZ;k++) { + /* mirror along z */ + indexfrom=IndexSliceD2matrix(j,k); + indexto=IndexSliceD2matrix(j,-k); + if (Dcomp==2 || Dcomp==4) cInvSign2(slice[indexfrom],slice[indexto]); + else cEqual(slice[indexfrom],slice[indexto]); + } + } #ifdef PRECISE_TIMING - GetTime(tvp+7); - ElapsedInc(tvp+6,tvp+7,&Timing_ar2); + GetTime(tvp+7); + ElapsedInc(tvp+6,tvp+7,&Timing_ar2); #endif - fftZ_Dm(); // fftZ slice + fftZ_Dm(); /* fftZ slice */ #ifdef PRECISE_TIMING - GetTime(tvp+8); - ElapsedInc(tvp+7,tvp+8,&Timing_fftZ); + GetTime(tvp+8); + ElapsedInc(tvp+7,tvp+8,&Timing_fftZ); #endif - transposeYZ_Dm(slice,slice_tr); + transposeYZ_Dm(slice,slice_tr); #ifdef PRECISE_TIMING - GetTime(tvp+9); - ElapsedInc(tvp+8,tvp+9,&Timing_TYZ); + GetTime(tvp+9); + ElapsedInc(tvp+8,tvp+9,&Timing_TYZ); #endif - fftY_Dm(); // fftY slice_tr + fftY_Dm(); /* fftY slice_tr */ #ifdef PRECISE_TIMING - GetTime(tvp+10); - ElapsedInc(tvp+9,tvp+10,&Timing_fftY); + GetTime(tvp+10); + ElapsedInc(tvp+9,tvp+10,&Timing_fftY); #endif - for(z=0;z<DsizeZ;z++) for(y=0;y<DsizeY;y++) { - indexto=IndexDmatrix(x-local_x0,y,z)+Dcomp; - indexfrom=IndexSlice_zyD2matrix(y,z); - cMultReal(-invNgrid,slice_tr[indexfrom],Dmatrix[indexto]); - } + for(z=0;z<DsizeZ;z++) for(y=0;y<DsizeY;y++) { + indexto=IndexDmatrix(x-local_x0,y,z)+Dcomp; + indexfrom=IndexSlice_zyD2matrix(y,z); + cMultReal(-invNgrid,slice_tr[indexfrom],Dmatrix[indexto]); + } #ifdef PRECISE_TIMING - GetTime(tvp+11); - ElapsedInc(tvp+10,tvp+11,&Timing_ar3); + GetTime(tvp+11); + ElapsedInc(tvp+10,tvp+11,&Timing_ar3); #endif - } // end slice X - PRINTZ("."); - FFLUSHZ(stdout); - } // end of Dcomp - // free vectors used for computation of Dmatrix - Free_cVector(D2matrix); - Free_cVector(slice); - Free_cVector(slice_tr); + } /* end slice X */ + PRINTZ("."); + FFLUSHZ(stdout); + } /* end of Dcomp */ + /* free vectors used for computation of Dmatrix */ + Free_cVector(D2matrix); + Free_cVector(slice); + Free_cVector(slice_tr); #ifdef PARALLEL - // deallocate buffers for BlockTranspose_Dm - Free_general(BT_buffer); - Free_general(BT_rbuffer); - // allocate buffers for BlockTranspose - bufsize = 6*smallY*local_Nz*local_Nx; // in doubles - MALLOC_VECTOR(BT_buffer,double,bufsize,ALL); - MALLOC_VECTOR(BT_rbuffer,double,bufsize,ALL); + /* deallocate buffers for BlockTranspose_Dm */ + Free_general(BT_buffer); + Free_general(BT_rbuffer); + /* allocate buffers for BlockTranspose */ + bufsize = 6*smallY*local_Nz*local_Nx; /* in doubles */ + MALLOC_VECTOR(BT_buffer,double,bufsize,ALL); + MALLOC_VECTOR(BT_rbuffer,double,bufsize,ALL); #endif - // allocate memory for Xmatrix, slices and slices_tr - used in matvec - MALLOC_VECTOR(Xmatrix,complex,3*local_Nsmall,ALL); - MALLOC_VECTOR(slices,complex,3*gridYZ,ALL); - MALLOC_VECTOR(slices_tr,complex,3*gridYZ,ALL); - - PRINTZ("\n"); - time1=GET_TIME(); - Timing_Dm_Init=time1-start; + /* allocate memory for Xmatrix, slices and slices_tr - used in matvec */ + MALLOC_VECTOR(Xmatrix,complex,3*local_Nsmall,ALL); + MALLOC_VECTOR(slices,complex,3*gridYZ,ALL); + MALLOC_VECTOR(slices_tr,complex,3*gridYZ,ALL); + + PRINTZ("\n"); + time1=GET_TIME(); + Timing_Dm_Init=time1-start; #ifdef PRECISE_TIMING - GetTime(tvp+12); - // analyze and print precise timing information - SetTimerFreq(); - t_beg=TimerToSec(&Timing_beg); - t_ar1=TimerToSec(&Timing_ar1); - t_ar2=TimerToSec(&Timing_ar2); - t_ar3=TimerToSec(&Timing_ar3); - t_fftX=TimerToSec(&Timing_fftX); - t_fftY=TimerToSec(&Timing_fftY); - t_fftZ=TimerToSec(&Timing_fftZ); - t_TYZ=TimerToSec(&Timing_TYZ); - t_BT=TimerToSec(&Timing_BT); - t_Arithm=t_beg+t_ar1+t_ar2+t_ar3+t_TYZ; - t_FFT=t_fftX+t_fftY+t_fftZ; - - PRINTBOTHZ(logfile, - "~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n" - " Init Dmatrix timing \n" - "~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n" - "Begin = %4.4f Arithmetics = %4.4f\n" - "Arith1 = %4.4f FFT = %4.4f\n" - "FFTX = %4.4f Comm = %4.4f\n" - "BT = %4.4f\n" - "Arith2 = %4.4f Total = %4.4f\n" - "FFTZ = %4.4f\n" - "TYZ = %4.4f\n" - "FFTY = %4.4f\n" - "Arith3 = %4.4f\n\n", - t_beg,t_Arithm,t_ar1,t_FFT,t_fftX,t_BT,t_BT, - t_ar2,DiffSec(tvp,tvp+12),t_fftZ,t_TYZ,t_fftY,t_ar3); + GetTime(tvp+12); + /* analyze and print precise timing information */ + SetTimerFreq(); + t_beg=TimerToSec(&Timing_beg); + t_ar1=TimerToSec(&Timing_ar1); + t_ar2=TimerToSec(&Timing_ar2); + t_ar3=TimerToSec(&Timing_ar3); + t_fftX=TimerToSec(&Timing_fftX); + t_fftY=TimerToSec(&Timing_fftY); + t_fftZ=TimerToSec(&Timing_fftZ); + t_TYZ=TimerToSec(&Timing_TYZ); + t_BT=TimerToSec(&Timing_BT); + t_Arithm=t_beg+t_ar1+t_ar2+t_ar3+t_TYZ; + t_FFT=t_fftX+t_fftY+t_fftZ; + + PRINTBOTHZ(logfile, + "~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n"\ + " Init Dmatrix timing \n"\ + "~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n"\ + "Begin = %4.4f Arithmetics = %4.4f\n"\ + "Arith1 = %4.4f FFT = %4.4f\n"\ + "FFTX = %4.4f Comm = %4.4f\n"\ + "BT = %4.4f\n"\ + "Arith2 = %4.4f Total = %4.4f\n"\ + "FFTZ = %4.4f\n"\ + "TYZ = %4.4f\n"\ + "FFTY = %4.4f\n"\ + "Arith3 = %4.4f\n\n", + t_beg,t_Arithm,t_ar1,t_FFT,t_fftX,t_BT,t_BT, + t_ar2,DiffSec(tvp,tvp+12),t_fftZ,t_TYZ,t_fftY,t_ar3); #endif - fftInitAfterD(); + fftInitAfterD(); - Timing_FFT_Init = GET_TIME()-time1; + Timing_FFT_Init = GET_TIME()-time1; } -//============================================================ +/*============================================================*/ void Free_FFT_Dmat(void) -// free all vectors that were allocated in fft.c (all used for FFT and MatVec) + /* free all vectors that were allocated in fft.c + (all used for FFT and MatVec) */ { - Free_cVector(Dmatrix); - Free_cVector(Xmatrix); - Free_cVector(slices); - Free_cVector(slices_tr); + Free_cVector(Dmatrix); + Free_cVector(Xmatrix); + Free_cVector(slices); + Free_cVector(slices_tr); #ifdef PARALLEL - Free_general(BT_buffer); - Free_general(BT_rbuffer); + Free_general(BT_buffer); + Free_general(BT_rbuffer); #endif #ifdef FFTW3 - fftw_destroy_plan(planXf); - fftw_destroy_plan(planXb); - fftw_destroy_plan(planYf); - fftw_destroy_plan(planYb); - fftw_destroy_plan(planZf); - fftw_destroy_plan(planZb); + fftw_destroy_plan(planXf); + fftw_destroy_plan(planXb); + fftw_destroy_plan(planYf); + fftw_destroy_plan(planYb); + fftw_destroy_plan(planZf); + fftw_destroy_plan(planZb); #elif defined(FFT_TEMPERTON) - Free_general(work); - Free_general(trigsX); - Free_general(trigsY); - Free_general(trigsZ); + Free_general(work); + Free_general(trigsX); + Free_general(trigsY); + Free_general(trigsZ); #endif } diff --git a/src/fft.h b/src/fft.h index d5e024f8..0169fc4e 100644 --- a/src/fft.h +++ b/src/fft.h @@ -1,28 +1,29 @@ /* FILE: fft.h * AUTH: Maxim Yurkin - * DESCR: definitions of FFT parameters and routines + * DESCR: definitions of fft parameters and routines * - * Copyright (C) 2006,2008 University of Amsterdam + * Copyright (C) 2006 University of Amsterdam * This code is covered by the GNU General Public License. */ #ifndef __fft_h #define __fft_h -/* Temperton FFT is a simple one, its source code is supplied together with ADDA. The only - * inconvenience is that it is in Fortran (not easily incorporated into a project under Windows - * (using any C/C++ developing tool) and should be compiled separately. - * FFTW3 requires separate installation of package from http://www.fftw.org, however it is highly - * optimized to the particular hardware and is generally significantly faster. Therefore, it is the - * default. - */ +/* Temperton FFT is a simple one, its source code is supplied together with ADDA. + The only inconvenience is that it is in fortran (not easily incorporated into a + project under Windows (using any C/C++ developing tool) and should be + compiled separately. + FFTW3 requires separate installation of package from http://www.fftw.org, however + it is hoghly optimized to the particular hardware and is generally significantly faster. + Therfore, it is the default. */ -//#define FFT_TEMPERTON // uncomment to use Temperton FFT +/*#define FFT_TEMPERTON /* uncomment to use Temperton FFT */ #ifndef FFT_TEMPERTON -# define FFTW3 // FFTW3 is default +# define FFTW3 /* FFTW3 is default */ #endif -// direction of FFT and transpose; complies with FFTW3 definition +/* direction of FFT and transpose */ +/* complies with FFTW3 definition */ #define FFT_FORWARD -1 #define FFT_BACKWARD 1 @@ -35,4 +36,4 @@ void Free_FFT_Dmat(void); int fftFit(int size, int _div); void CheckNprocs(void); -#endif // __fft_h +#endif /*__fft_h*/ diff --git a/src/function.h b/src/function.h index ffe32db0..5aa15171 100644 --- a/src/function.h +++ b/src/function.h @@ -2,28 +2,29 @@ * AUTH: Maxim Yurkin * DESCR: INLINE definition and function attributes * - * Copyright (C) 2006,2008 University of Amsterdam + * Copyright (C) 2006 University of Amsterdam * This code is covered by the GNU General Public License. */ #ifndef __function_h #define __function_h -// specify to inline some functions; if there are problems with compiler change to "static" +/* specify to inline some functions */ +/* if problems with compiler change to "static" */ #define INLINE static __inline -// some optimization shortcuts; can be easily removed if not supported by compiler +/* some optimization shortcuts; can be easily removed if not supported by compiler */ #ifdef __GNUC__ -# define ATT_NORETURN __attribute__ ((__noreturn__)) -# define ATT_PRINTF(a,b) __attribute__ ((__format__(__printf__,a,b))) -# define ATT_UNUSED __attribute__ ((__unused__)) -# define ATT_PURE __attribute__ ((__pure__)) -# define ATT_MALLOC __attribute__ ((__malloc__)) +# define ATT_NORETURN __attribute__ ((__noreturn__)) +# define ATT_PRINTF(a,b) __attribute__ ((__format__(__printf__,a,b))) +# define ATT_UNUSED __attribute__ ((__unused__)) +# define ATT_PURE __attribute__ ((__pure__)) +# define ATT_MALLOC __attribute__ ((__malloc__)) #else -# define ATT_NORETURN -# define ATT_PRINTF(a,b) -# define ATT_UNUSED -# define ATT_PURE -# define ATT_MALLOC +# define ATT_NORETURN +# define ATT_PRINTF(a,b) +# define ATT_UNUSED +# define ATT_PURE +# define ATT_MALLOC #endif -#endif // __function_h +#endif /*__function_h*/ diff --git a/src/io.c b/src/io.c index 3beb3bed..1f7c3f48 100644 --- a/src/io.c +++ b/src/io.c @@ -10,11 +10,11 @@ #include <string.h> #include <stdarg.h> #include <errno.h> -// the following is for MkDirErr +/* the following is for MkDirErr */ #include "os.h" #ifdef POSIX -# include <sys/stat.h> -# include <sys/types.h> +# include <sys/stat.h> +# include <sys/types.h> #endif #include "io.h" @@ -23,281 +23,286 @@ #include "vars.h" #include "memory.h" -// SEMI-GLOBAL VARIABLES +/* SEMI-GLOBAL VARIABLES */ -// defined and initialized in param.c +/* defined and initialized in param.c */ extern const char logname[]; -// LOCAL VARIABLES +/* LOCAL VARIABLES */ - // error buffer for warning message generated before logfile is opened -static char warn_buf[MAX_MESSAGE2]=""; -//============================================================ +static char warn_buf[MAX_MESSAGE2]=""; /* error buffer for warning message generated before + logfile is opened */ +/*============================================================*/ void WrapLines(char *str) -/* wraps long lines in a string without breaking words; it replaces a number of spaces in string by - * '\n' characters; line width is determined by variable term_width. - */ + /* wraps long lines in a string without breaking words; it replaces a number of spaces in string + by '\n' characters; line width is determined by variable term_width */ { - char *left,*right,*mid,*end; - int divided; - - end=str+strlen(str); - left=str; - while (left<end) { - // left and right define beginning and end of current working line - right=strchr(left,'\n'); - if (right==NULL) right=end; - while ((right-left)>term_width) { - divided=FALSE; - mid=left+term_width; - // search backward for space - while (mid>=left) { - if(mid[0]==' ') { - mid[0]='\n'; - left=mid+1; - divided=TRUE; - break; - } - mid--; - } - // if backward search failed (too long word), search forward for space - if (!divided) { - mid=left+term_width+1; - while (mid<right) { - if(mid[0]==' ') { - mid[0]='\n'; - left=mid+1; - divided=TRUE; - break; - } - mid++; - } - // if no spaces are found at all, leave long line and proceed further - if (!divided) break; - } - } - if (right==end) left=end; - else left=right+1; - } + char *left,*right,*mid,*end; + int divided; + + end=str+strlen(str); + left=str; + while (left<end) { + /* left and right define beginning and end of current working line */ + right=strchr(left,'\n'); + if (right==NULL) right=end; + while ((right-left)>term_width) { + divided=FALSE; + mid=left+term_width; + /* search backward for space */ + while (mid>=left) { + if(mid[0]==' ') { + mid[0]='\n'; + left=mid+1; + divided=TRUE; + break; + } + mid--; + } + /* if backward search failed (too long word), search forward for space */ + if (!divided) { + mid=left+term_width+1; + while (mid<right) { + if(mid[0]==' ') { + mid[0]='\n'; + left=mid+1; + divided=TRUE; + break; + } + mid++; + } + /* if no spaces are found at all, leave long line and proceed further */ + if (!divided) break; + } + } + if (right==end) left=end; + else left=right+1; + } } -//============================================================ +/*============================================================*/ char *WrapLinesCopy(const char *str) -/* same as WrapLines, but creates a copy of the string, leaving the original intact; it is designed - * to be run once during the program (or not many), since this memory is not freed afterwards - */ + /* same as WrapLines, but creates a copy of the string, leaving the original intact + it is designed to be run once during the program (or not many), since this memory is not + freed afterwards */ { - char *dup; + char *dup; - dup=charVector(strlen(str)+1,ONE_POS,"string duplicate"); - strcpy(dup,str); - WrapLines(dup); - return dup; + dup=charVector(strlen(str)+1,ONE_POS,"string duplicate"); + strcpy(dup,str); + WrapLines(dup); + return dup; } -//============================================================ +/*============================================================*/ -void LogError(const int code,const int who,const char *fname,const int lineN,const char *fmt, ... ) -/* performs output of error specified by 'code' at 'fname':'lineN'; 'fmt' & arguments ('...') - * specify error message, 'who' specifies whether 1 (ringid=ROOT) or all processors should produce - * output. If 'code' is EC_ERROR program aborts after output. We use sprintf a couple of times, - * because we want each node to generate an atomic message, not a couple of messages after each - * other, since other nodes may then interfere with our output. INFO is printed to stdout and - * without showing the position in the source file, ERROR and WARN - to stderr and logfile. - */ +void LogError(const int code,const int who,const char *fname, + const int lineN,const char *fmt, ... ) + /* performs output of error specified by code at fname:lineN + * fmt + args (...) specify error message + * who specifies whether 1 (ringid=ROOT) or all processors should produce output + * if code is EC_ERROR program aborts after output. + * We use sprintf a couple of times, because we want each node to + * generate an atomic message, not a couple of messages after + * each other, since other nodes may then interfere with our output + * INFO is printed to stdout and without showing the position in the source file + * ERROR and WARN - to stderr and logfile + */ { - va_list args; - char line[MAX_MESSAGE2]; - char *pos; - - if (who==ALL || ringid==ROOT) { // controls whether output should be produced - // first build output string - va_start(args,fmt); - if (code==EC_ERROR) strcpy(line,"ERROR: "); - else if (code==EC_WARN) strcpy(line,"WARNING: "); - else if (code==EC_INFO) strcpy(line,"INFO: "); - else sprintf(line,"Error code=%d: ",code); - pos=line+strlen(line); + va_list args; + char line[MAX_MESSAGE2]; + char *pos; + + if (who==ALL || ringid==ROOT) { /* controls whether output should be produced */ + /* first build output string */ + va_start(args,fmt); + if (code==EC_ERROR) strcpy(line,"ERROR: "); + else if (code==EC_WARN) strcpy(line,"WARNING: "); + else if (code==EC_INFO) strcpy(line,"INFO: "); + else sprintf(line,"Error code=%d: ",code); + pos=line+strlen(line); #ifdef PARALLEL - if (code!=EC_INFO) { // for EC_INFO position in source code is not saved - pos+=sprintf(pos,"(%s:%d) ",fname,lineN); - // rewrites last 2 chars - if (who==ALL) { - pos-=2; - pos+=sprintf(pos," - ringID=%d) ",ringid); - } - } - else if (who==ALL) pos+=sprintf(pos,"(ringID=%d) ",ringid); + if (code!=EC_INFO) { /* for EC_INFO position in source code is not saved */ + pos+=sprintf(pos,"(%s:%d) ",fname,lineN); + /* rewrites last 2 chars */ + if (who==ALL) { + pos-=2; + pos+=sprintf(pos," - ringID=%d) ",ringid); + } + } + else if (who==ALL) pos+=sprintf(pos,"(ringID=%d) ",ringid); #else - if (code!=EC_INFO) pos+=sprintf(pos,"(%s:%d) ",fname,lineN); + if (code!=EC_INFO) pos+=sprintf(pos,"(%s:%d) ",fname,lineN); #endif - pos+=vsprintf(pos,fmt,args); - strcpy(pos,"\n"); - va_end(args); - // print line - if (code==EC_INFO) { - // put message to stdout, wrapping lines - WrapLines(line); - printf("%s",line); - fflush(stdout); - } - else if (code==EC_ERROR || code==EC_WARN) { - // first put error message in logfile - if (logname[0]!=0) { // otherwise can't produce output at all - if (ringid==ROOT) { - /* logfile is initialized to NULL in the beginning of the program. Hence if - * logfile!=NULL, logfile is necessarily initialized (open or already closed). - * logfile==NULL (when logname!=0) means that error is in opening logfile - * itself - */ - if (logfile!=NULL) { - if (fprintf(logfile,"%s",line)==EOF) { - fclose(logfile); // in most cases this is redundant - // try to reopen logfile and save message - if ((logfile=fopen(logname,"a"))!=NULL) fprintf(logfile,"%s",line); - } - fflush(logfile); // needed for warnings to appear on time - } - } // other processors - else if ((logfile=fopen(logname,"a"))!=NULL) { - fprintf(logfile,"%s",line); - fclose(logfile); - } - } // save line to buffer to save into logfile afterwards - else if (code==EC_WARN) strcpy(warn_buf,line); - // duplicate message to stderr, wrapping lines - WrapLines(line); - fprintf(stderr,"%s",line); - fflush(stderr); - } - } - if (code==EC_ERROR) { - if (who==ONE && ringid!=ROOT) Synchronize(); - Stop(1); - } + pos+=vsprintf(pos,fmt,args); + strcpy(pos,"\n"); + va_end(args); + /* print line */ + if (code==EC_INFO) { + /* put message to stdout, wrapping lines */ + WrapLines(line); + printf("%s",line); + fflush(stdout); + } + else if (code==EC_ERROR || code==EC_WARN) { + /* first put error message in logfile */ + if (logname[0]!=0) { /* otherwise can't produce output at all */ + if (ringid==ROOT) { + /* logfile is initialized to NULL in the beginning of the program. Hence if logfile!=NULL + then logfile is necessarily initialized (open or allready closed) + logfile==NULL (when logname!=0) means that error is in opening logfile itself */ + if (logfile!=NULL) { + if (fprintf(logfile,"%s",line)==EOF) { + fclose(logfile); /* in most cases this is redundant */ + /* try to reopen logfile and save message */ + if ((logfile=fopen(logname,"a"))!=NULL) fprintf(logfile,"%s",line); + } + fflush(logfile); /* needed for warnings to appear on time */ + } + } /* other processors */ + else if ((logfile=fopen(logname,"a"))!=NULL) { + fprintf(logfile,"%s",line); + fclose(logfile); + } + } /* save line to buffer to save into logfile afterwards */ + else if (code==EC_WARN) strcpy(warn_buf,line); + /* duplicate message to stderr, wrapping lines */ + WrapLines(line); + fprintf(stderr,"%s",line); + fflush(stderr); + } + } + if (code==EC_ERROR) { + if (who==ONE && ringid!=ROOT) Synchronize(); + Stop(1); + } } -//============================================================ +/*============================================================*/ void PrintError(const char *fmt, ... ) -/* print anything to stderr (on root processor) and stop; much simpler than LogError (do not print - * location of the error, and does not duplicate errors to files; assumes that all processors call - * it. - */ + /* print anything to stderr (on root processor) and stop; + much simpler than LogError (do not print location of the error, + and does not duplicate errors to files; + assumes that all processors call it; */ { - va_list args; - char line[MAX_MESSAGE]; - char *pos; - - if (ringid==ROOT) { - va_start(args,fmt); - strcpy(line,"ERROR: "); - pos=line+strlen(line); - pos+=vsprintf(pos,fmt,args); - strcpy(pos,"\n"); - va_end(args); - WrapLines(line); - fprintf(stderr,"%s",line); - fflush(stderr); - } - // wait for root to generate an error message - Synchronize(); - Stop(1); + va_list args; + char line[MAX_MESSAGE]; + char *pos; + + if (ringid==ROOT) { + va_start(args,fmt); + strcpy(line,"ERROR: "); + pos=line+strlen(line); + pos+=vsprintf(pos,fmt,args); + strcpy(pos,"\n"); + va_end(args); + WrapLines(line); + fprintf(stderr,"%s",line); + fflush(stderr); + } + /* wait for root to generate an error message */ + Synchronize(); + Stop(1); } -//============================================================ +/*============================================================*/ void LogPending(void) -/* Logs pending warning messages (currently only one maximum). Should be called when logname is - * created and logfile is opened on root. - */ + /* Logs pending warning messages (currently only one maximum). + Should be called when logname is created and logfile is opened + on root */ { - if (warn_buf[0]!=0) { - if (ringid==ROOT) fprintf(logfile,"%s",warn_buf); - else if ((logfile=fopen(logname,"a"))!=NULL) { - fprintf(logfile,"%s",warn_buf); - fclose(logfile); - } - // empty buffer - warn_buf[0]=0; - } + if (warn_buf[0]!=0) { + if (ringid==ROOT) fprintf(logfile,"%s",warn_buf); + else if ((logfile=fopen(logname,"a"))!=NULL) { + fprintf(logfile,"%s",warn_buf); + fclose(logfile); + } + /* empty buffer */ + warn_buf[0]=0; + } } -//============================================================ +/*============================================================*/ void PrintBoth(FILE *file,const char *fmt, ... ) -/* print anything both to file and to stdout; it is assumed that size of the message is limited to - * MAX_PARAGRAPH (i.e. no filenames in the message) - */ + /* print anything both to file and to stdout; + assumed that size of the message is limited to MAX_PARAGRAPH + (i.e. no filenames in the message) */ { - va_list args; - char line[MAX_PARAGRAPH]; - - va_start(args,fmt); - vsprintf(line,fmt,args); - fprintf(file,"%s",line); - printf("%s",line); - va_end(args); + va_list args; + char line[MAX_PARAGRAPH]; + + va_start(args,fmt); + vsprintf(line,fmt,args); + fprintf(file,"%s",line); + printf("%s",line); + va_end(args); } -//============================================================ +/*============================================================*/ FILE *FOpenErr(const char *fname,const char *mode,const int who,const char *err_fname, - const int lineN) -// open file and check for error + const int lineN) + /* open file and check for error */ { - FILE *file; + FILE *file; - if ((file=fopen(fname,mode))==NULL) - LogError(EC_ERROR,who,err_fname,lineN,"Failed to open file '%s'",fname); - return file; + if ((file=fopen(fname,mode))==NULL) + LogError(EC_ERROR,who,err_fname,lineN,"Failed to open file '%s'",fname); + + return file; } -//============================================================ +/*============================================================*/ -void FCloseErr(FILE *file,const char *fname,const int who,const char *err_fname,const int lineN) -// close file and check for error +void FCloseErr(FILE *file,const char *fname,const int who,const char *err_fname, + const int lineN) + /* close file and check for error */ { - if (fclose(file)) LogError(EC_WARN,who,err_fname,lineN, - "Errors detected during work with file '%s'",fname); + if (fclose(file)) LogError(EC_WARN,who,err_fname,lineN, + "Errors detected during work with file '%s'",fname); } -//============================================================ +/*============================================================*/ void RemoveErr(const char *fname,const int who,const char *err_fname,const int lineN) -// remove file and check the result + /* remove file and check the result */ { - if(remove(fname) && errno!=ENOENT) LogError(EC_WARN,who,err_fname,lineN, - "Failed to remove temporary file '%s' (%s). Remove it manually, if needed", - fname,strerror(errno)); + + if(remove(fname) && errno!=ENOENT) LogError(EC_WARN,who,err_fname,lineN, + "Failed to remove temporary file '%s' (%s). Remove it manually, if needed", + fname,strerror(errno)); } -//============================================================ +/*============================================================*/ void MkDirErr(const char *dir,const int who,const char *err_fname,const int lineN) -// make directory and check for error. + /* make directory and check for error. */ { #ifdef WINDOWS - if (!CreateDirectory(dir,NULL)) { - if (GetLastError()==ERROR_ALREADY_EXISTS) - LogError(EC_WARN,who,err_fname,lineN,"Directory '%s' already exists",dir); - else LogError(EC_ERROR,who,err_fname,lineN,"Failed to make directory '%s'",dir); - } + if (!CreateDirectory(dir,NULL)) { + if (GetLastError()==ERROR_ALREADY_EXISTS) + LogError(EC_WARN,who,err_fname,lineN,"Directory '%s' allready exists",dir); + else LogError(EC_ERROR,who,err_fname,lineN,"Failed to make directory '%s'",dir); + } #elif defined(POSIX) - if (mkdir(dir,0755)==-1) { - if (errno==EEXIST) - LogError(EC_WARN,who,err_fname,lineN,"Directory '%s' already exists",dir); - else LogError(EC_ERROR,who,err_fname,lineN,"Failed to make directory '%s'",dir); - } + if (mkdir(dir,0755)==-1) { + if (errno==EEXIST) + LogError(EC_WARN,who,err_fname,lineN,"Directory '%s' allready exists",dir); + else LogError(EC_ERROR,who,err_fname,lineN,"Failed to make directory '%s'",dir); + } #else - /* this should work for many cases (unknown OS), but e.g. system calls may fail when used in - * combination with MPI - */ - char sbuffer[MAX_DIRSYS]; + /* this should work for many cases (unknown OS), but e.g. system calls may fail + when used in combination with MPI */ + char sbuffer[MAX_DIRSYS]; - sprintf(sbuffer,"mkdir \"%s\"",dir); - if (system(sbuffer)) LogError(EC_WARN,who,err_fname,lineN,"Failed to make directory '%s'",dir); + sprintf(sbuffer,"mkdir \"%s\"",dir); + if (system(sbuffer)) LogError(EC_WARN,who,err_fname,lineN, + "Failed to make directory '%s'",dir); #endif } + diff --git a/src/io.h b/src/io.h index 952c6f96..adc8a8d6 100644 --- a/src/io.h +++ b/src/io.h @@ -1,30 +1,30 @@ /* FILE: io.h * AUTH: Maxim Yurkin - * DESCR: i/o and error handling routines + * DESCR: io and error handling routines * - * Copyright (C) 2006,2008 University of Amsterdam + * Copyright (C) 2006 University of Amsterdam * This code is covered by the GNU General Public License. */ #ifndef __io_h #define __io_h -#include <stdio.h> // for file -#include "function.h" // for function attributes +#include <stdio.h> /* for file */ +#include "function.h" /* for function attributes */ -/* File locking is made quite robust, however it is a complex operation that can cause unexpected - * behavior (permanent locks) especially when program is terminated externally (e.g. because of MPI - * failure). Moreover, it is not ANSI C, hence may have problems on some particular systems. - * Currently file locking functions are only in param.c - */ +/* file locking is made quite robust, however it is a complex operation that + can cause unexpected behaviour (permanent locks) especially when + program is terminated externally (e.g. because of MPI failure). + Moreover, it is not ANSI C, hence may have problems on some particular systems. + Currently file locking functions are only in param.c */ -//#define NOT_USE_LOCK // uncomment to disable file locking -//#define ONLY_LOCKFILE // uncomment to use only lock file, without file locking over NFS +/*#define NOT_USE_LOCK /* uncomment to disable file locking */ +/*#define ONLY_LOCKFILE /* uncomment to use only lock file, without file locking over NFS */ #ifndef NOT_USE_LOCK -# define USE_LOCK -# ifndef ONLY_LOCKFILE -# define LOCK_FOR_NFS // currently this works only for POSIX -# endif +# define USE_LOCK +# ifndef ONLY_LOCKFILE +# define LOCK_FOR_NFS /* currently this works only for POSIX */. +# endif #endif void WrapLines(char *str); @@ -34,10 +34,14 @@ void PrintError(const char *fmt, ... ) ATT_PRINTF(1,2) ATT_NORETURN; void LogPending(void); void PrintBoth(FILE *file,const char *fmt, ... ) ATT_PRINTF(2,3); -FILE *FOpenErr(const char *fname,const char *mode,int who,const char *err_fname, - int lineN) ATT_MALLOC; +FILE *FOpenErr(const char *fname,const char *mode,int who, + const char *err_fname,int lineN) ATT_MALLOC; void FCloseErr(FILE *file,const char *fname,int who,const char *err_fname,int lineN); void RemoveErr(const char *fname,int who,const char *err_fname,int lineN); void MkDirErr(const char *dirname,int who,const char *err_fname,int lineN); -#endif // __io_h +#endif /* __io_h */ + + + + diff --git a/src/iterative.c b/src/iterative.c index 72854fb2..ab1c6da8 100644 --- a/src/iterative.c +++ b/src/iterative.c @@ -28,7 +28,7 @@ * This code is covered by the GNU General Public License. */ #include <stdlib.h> -#include <time.h> // for time_t & time +#include <time.h> /* for time_t & time */ #include <string.h> #include <math.h> #include "vars.h" @@ -39,700 +39,696 @@ #include "io.h" #include "timing.h" #include "function.h" -#include "debug.h" -// maximum allowed iterations without residual decrease +/* maximum allowed iterations without residual decrease */ #define MAXCOUNT_CGNR 10 #define MAXCOUNT_BICGSTAB 30000 #define MAXCOUNT_BICG_CS 50000 #define MAXCOUNT_QMR_CS 50000 -// zero value for checks -#define EPS_BICGSTAB1 1E-16 // for (r~.r)/(r.r) -#define EPS_BICGSTAB2 1E-10 // for 1/|beta_k| -#define EPS_BICG_CS1 1E-10 // for (rT.r)/(r.r) -#define EPS_BICG_CS2 1E-10 // for (pT.A.p)/(rT.r) -#define EPS_QMR_CS1 1E-10 // for (vT.v)/(r.r) -#define EPS_QMR_CS2 1E-40 // for overflow of exponent number +/* zero value for checks */ +#define EPS_BICGSTAB 1E-8 +#define EPS_BICG_CS 1E-8 +#define EPS_QMR_CS 1E-8 +#define EPS_QMR_CS_1 1E-40 /* problem can only occur if overflow of exponent number */ -// SEMI-GLOBAL VARIABLES +/* SEMI-GLOBAL VARIABLES */ -// defined and initialized in CalculateE.c +/* defined and initialized in CalculateE.c */ extern const TIME_TYPE tstart_CE; -// defined and initialized in calculator.c +/* defined and initialized in calculator.c */ extern doublecomplex *rvec,*vec1,*vec2,*vec3,*Avecbuffer; -// defined and initialized in param.c +/* defined and initialized in param.c */ extern const double eps; -// defined and initialized in timing.c +/* defined and initialized in timing.c */ extern TIME_TYPE Timing_OneIter,Timing_InitIter,Timing_InitIter_comm; extern unsigned long TotalIter; -// LOCAL VARIABLES - -static double inprodR; // used as r_0 (and main residual) in each method -static double epsB; // stopping criterion -static double resid_scale; // scale to get square of relative error -static double prev_err; /* previous relative error; used in ProgressReport, initialized in - * IterativeSolver - */ -static int method; // iteration method -static int count; // iteration count -static int counter; // number of successive iterations without residual decrease -static int max_count; // maximum allowed value of counter -static int chp_exit; // checkpoint occurred - exit -static int chp_skip; // skip checkpoint, even if it is time to do -typedef struct // data for checkpoints +/* LOCAL VARIABLES */ + +static double inprodR; /* uses as r_0 (and main residual) in each method */ +static double epsB; /* stopping criterion */ +static double resid_scale; /* scale to get square of relative error */ +static double prev_err; /* previous Rel.Error; used in ProgressReport, + initilized in IterativeSolver */ +static int method; /* iteration method */ +static int count; /* iteration count */ +static int counter; /* number of successive iterations without residual decrease */ +static int max_count; /* maximum allowed value of counter */ +static int chp_exit; /* checkpoint occured - exit */ +static int chp_skip; /* skip checkpoint, even if it is time to do */ +typedef struct /* data for checkpoints */ { - void *ptr; // pointer to the data - int size; // size of one element + void *ptr; /* pointer to the data */ + int size; /* size of one element */ } chp_data; -typedef struct // structure to hold information about different scalars and vectors +typedef struct /* structure to hold information about different scalars and vectors */ { - chp_data *sc; // array of scalar data - int sc_N; // number of scalars - chp_data *vec; // array of vector data - int vec_N; // number of vectors + chp_data *sc; /* array of scalar data */ + int sc_N; /* number of scalars */ + chp_data *vec; /* array of vector data */ + int vec_N; /* number of vectors */ } iter_data_type; -static iter_data_type iter_data; // actually the structure +static iter_data_type iter_data; /* actually the structure */ -// EXTERNAL FUNCTIONS +/* EXTERNAL FUNCTIONS */ -// matvec.c +/* matvec.c */ void MatVec(doublecomplex *in,doublecomplex *out,double *inprod,int her); -//============================================================ +/*============================================================*/ INLINE void SwapPointers(doublecomplex **a,doublecomplex **b) -/* swap two pointers of (doublecomplex *) type; should work for others but will give - * "Suspicious pointer conversion" warning. - */ + /* swap two pointers of (doublecomplex *) type; + should work for others but will give "Suspisious pointer conversion" warning */ { - doublecomplex *tmp; + doublecomplex *tmp; - tmp=*a; - *a=*b; - *b=tmp; + tmp=*a; + *a=*b; + *b=tmp; } -//============================================================ +/*============================================================*/ static void SaveIterChpoint(void) -/* save a binary checkpoint; only limitedly foolproof - user should take care to load checkpoints - * on the same machine (number of processors) and with the same command line. - */ + /* save a binary checkpoint; + only limitedly foolproof - user should take care to load checkpoints + on the same machine (number of processors) and with the same command line */ { - int i; - char fname[MAX_FNAME]; - FILE *chp_file; - TIME_TYPE tstart; - - tstart=GET_TIME(); - if (ringid==ROOT) { - // create directory "chp_dir" if needed and open info file - sprintf(fname,"%s/" F_CHP_LOG,chp_dir); - if ((chp_file=fopen(fname,"w"))==NULL) { - MkDirErr(chp_dir,ONE_POS); - chp_file=FOpenErr(fname,"w",ONE_POS); - } - // write info and close file - fprintf(chp_file, - "Info about the run, which produced the checkpoint, can be found in ../%s",directory); - FCloseErr(chp_file,fname,ONE_POS); - } - // wait to ensure that directory exists - Synchronize(); - // open output file; writing errors are checked only for vectors - sprintf(fname,"%s/" F_CHP,chp_dir,ringid); - chp_file=FOpenErr(fname,"wb",ALL_POS); - // write common scalars - fwrite(&method,sizeof(int),1,chp_file); - fwrite(&nlocalRows,sizeof(size_t),1,chp_file); - fwrite(&count,sizeof(int),1,chp_file); - fwrite(&counter,sizeof(int),1,chp_file); - fwrite(&inprodR,sizeof(double),1,chp_file); - fwrite(&prev_err,sizeof(double),1,chp_file); // written on ALL processors but used only on ROOT - fwrite(&resid_scale,sizeof(double),1,chp_file); - // write specific scalars - for (i=0;i<iter_data.sc_N;i++) - fwrite(iter_data.sc[i].ptr,iter_data.sc[i].size,1,chp_file); - // write common vectors - if (fwrite(xvec,sizeof(doublecomplex),nlocalRows,chp_file)!=nlocalRows) - LogError(EC_ERROR,ALL_POS,"Failed writing to file '%s'",fname); - if (fwrite(rvec,sizeof(doublecomplex),nlocalRows,chp_file)!=nlocalRows) - LogError(EC_ERROR,ALL_POS,"Failed writing to file '%s'",fname); - if (fwrite(pvec,sizeof(doublecomplex),nlocalRows,chp_file)!=nlocalRows) - LogError(EC_ERROR,ALL_POS,"Failed writing to file '%s'",fname); - if (fwrite(Avecbuffer,sizeof(doublecomplex),nlocalRows,chp_file)!=nlocalRows) - LogError(EC_ERROR,ALL_POS,"Failed writing to file '%s'",fname); - // write specific vectors - for (i=0;i<iter_data.vec_N;i++) - if (fwrite(iter_data.vec[i].ptr,iter_data.vec[i].size,nlocalRows,chp_file)!=nlocalRows) - LogError(EC_ERROR,ALL_POS,"Failed writing to file '%s'",fname); - // close file - FCloseErr(chp_file,fname,ALL_POS); - // write info to logfile after everyone is finished - Synchronize(); - PRINTBOTHZ(logfile,"Checkpoint (iteration) saved\n"); - Timing_FileIO+=GET_TIME()-tstart; - Synchronize(); // this is to ensure that message above appears if and only if OK + int i; + char fname[MAX_FNAME]; + FILE *chp_file; + TIME_TYPE tstart; + + tstart=GET_TIME(); + if (ringid==ROOT) { + /* create directory "chp_dir" if needed and open info file */ + sprintf(fname,"%s/" F_CHP_LOG,chp_dir); + if ((chp_file=fopen(fname,"w"))==NULL) { + MkDirErr(chp_dir,ONE_POS); + chp_file=FOpenErr(fname,"w",ONE_POS); + } + /* write info and close file */ + fprintf(chp_file, + "Info about the run, which produced the checkpoint, can be found in ../%s",directory); + FCloseErr(chp_file,fname,ONE_POS); + } + /* wait to ensure that directory exists */ + Synchronize(); + /* open output file; writing errors are checked only for vectors */ + sprintf(fname,"%s/" F_CHP,chp_dir,ringid); + chp_file=FOpenErr(fname,"wb",ALL_POS); + /* write commmon scalars */ + fwrite(&method,sizeof(int),1,chp_file); + fwrite(&nlocalRows,sizeof(size_t),1,chp_file); + fwrite(&count,sizeof(int),1,chp_file); + fwrite(&counter,sizeof(int),1,chp_file); + fwrite(&inprodR,sizeof(double),1,chp_file); + fwrite(&prev_err,sizeof(double),1,chp_file); /* written on ALL processors but used only on ROOT */ + fwrite(&resid_scale,sizeof(double),1,chp_file); + /* write specific scalars*/ + for (i=0;i<iter_data.sc_N;i++) + fwrite(iter_data.sc[i].ptr,iter_data.sc[i].size,1,chp_file); + /* write commmon vectors */ + if (fwrite(xvec,sizeof(doublecomplex),nlocalRows,chp_file)!=nlocalRows) + LogError(EC_ERROR,ALL_POS,"Failed writing to file '%s'",fname); + if (fwrite(rvec,sizeof(doublecomplex),nlocalRows,chp_file)!=nlocalRows) + LogError(EC_ERROR,ALL_POS,"Failed writing to file '%s'",fname); + if (fwrite(pvec,sizeof(doublecomplex),nlocalRows,chp_file)!=nlocalRows) + LogError(EC_ERROR,ALL_POS,"Failed writing to file '%s'",fname); + if (fwrite(Avecbuffer,sizeof(doublecomplex),nlocalRows,chp_file)!=nlocalRows) + LogError(EC_ERROR,ALL_POS,"Failed writing to file '%s'",fname); + /* write specific vectors*/ + for (i=0;i<iter_data.vec_N;i++) + if (fwrite(iter_data.vec[i].ptr,iter_data.vec[i].size,nlocalRows,chp_file)!=nlocalRows) + LogError(EC_ERROR,ALL_POS,"Failed writing to file '%s'",fname); + /* close file */ + FCloseErr(chp_file,fname,ALL_POS); + /* write info to logfile after everyone is finished */ + Synchronize(); + PRINTBOTHZ(logfile,"Checkpoint (iteration) saved\n"); + Timing_FileIO+=GET_TIME()-tstart; + Synchronize(); /* this is to ensure that message above appears if and only if OK */ } -//============================================================ +/*============================================================*/ static void LoadIterChpoint(void) -/* load a binary checkpoint; only limitedly foolproof - user should take care to load checkpoints - * on the same machine (number of processors) and with the same command line. - * */ + /* load a binary checkpoint; + only limitedly foolproof - user should take care to load checkpoints + on the same machine (number of processors) and with the same command line */ { - int i, method_new; - size_t nlocalRows_new; - char fname[MAX_FNAME],ch; - FILE *chp_file; - TIME_TYPE tstart; - - tstart=GET_TIME(); - // open input file; reading errors are checked only for vectors - sprintf(fname,"%s/" F_CHP,chp_dir,ringid); - chp_file=FOpenErr(fname,"rb",ALL_POS); - // check for consistency - fread(&method_new,sizeof(int),1,chp_file); - if (method_new!=method) - LogError(EC_ERROR,ALL_POS,"File '%s' is for different iterative method",fname); - fread(&nlocalRows_new,sizeof(size_t),1,chp_file); - if (nlocalRows_new!=nlocalRows) - LogError(EC_ERROR,ALL_POS,"File '%s' is for different vector size",fname); - // read common scalars - fread(&count,sizeof(int),1,chp_file); - fread(&counter,sizeof(int),1,chp_file); - fread(&inprodR,sizeof(double),1,chp_file); - fread(&prev_err,sizeof(double),1,chp_file); // read on ALL processors but used only on ROOT - fread(&resid_scale,sizeof(double),1,chp_file); - // read specific scalars - for (i=0;i<iter_data.sc_N;i++) - fread(iter_data.sc[i].ptr,iter_data.sc[i].size,1,chp_file); - // read common vectors - if (fread(xvec,sizeof(doublecomplex),nlocalRows,chp_file)!=nlocalRows) - LogError(EC_ERROR,ALL_POS,"Failed reading from file '%s'",fname); - if (fread(rvec,sizeof(doublecomplex),nlocalRows,chp_file)!=nlocalRows) - LogError(EC_ERROR,ALL_POS,"Failed reading from file '%s'",fname); - if (fread(pvec,sizeof(doublecomplex),nlocalRows,chp_file)!=nlocalRows) - LogError(EC_ERROR,ALL_POS,"Failed reading from file '%s'",fname); - if (fread(Avecbuffer,sizeof(doublecomplex),nlocalRows,chp_file)!=nlocalRows) - LogError(EC_ERROR,ALL_POS,"Failed reading from file '%s'",fname); - // read specific vectors - for (i=0;i<iter_data.vec_N;i++) - if (fread(iter_data.vec[i].ptr,iter_data.vec[i].size,nlocalRows,chp_file)!=nlocalRows) - LogError(EC_ERROR,ALL_POS,"Failed reading from file '%s'",fname); - // check if EOF reached and close file - if(fread(&ch,1,1,chp_file)!=0) LogError(EC_ERROR,ALL_POS,"File '%s' is too long",fname); - FCloseErr(chp_file,fname,ALL_POS); - // initialize auxiliary variables - epsB=eps*eps/resid_scale; - // print info - if (ringid==ROOT) { - PrintBoth(logfile,"Checkpoint (iteration) loaded\n"); - // if residual is stagnating print info about last minimum - if (counter!=0) fprintf(logfile, - "Residual has been stagnating already for %d iterations since:\n" - "RE_%03d = %.10E\n" - "...\n",counter,count-counter-1,sqrt(resid_scale*inprodR)); - } - Timing_FileIO+=GET_TIME()-tstart; + int i, method_new; + size_t nlocalRows_new; + char fname[MAX_FNAME],ch; + FILE *chp_file; + TIME_TYPE tstart; + + tstart=GET_TIME(); + /* open input file; reading errors are checked only for vectors */ + sprintf(fname,"%s/" F_CHP,chp_dir,ringid); + chp_file=FOpenErr(fname,"rb",ALL_POS); + /* check for consistency */ + fread(&method_new,sizeof(int),1,chp_file); + if (method_new!=method) + LogError(EC_ERROR,ALL_POS,"File '%s' is for different iterative method",fname); + fread(&nlocalRows_new,sizeof(size_t),1,chp_file); + if (nlocalRows_new!=nlocalRows) + LogError(EC_ERROR,ALL_POS,"File '%s' is for different vector size",fname); + /* read commmon scalars */ + fread(&count,sizeof(int),1,chp_file); + fread(&counter,sizeof(int),1,chp_file); + fread(&inprodR,sizeof(double),1,chp_file); + fread(&prev_err,sizeof(double),1,chp_file); /* read on ALL processors but used only on ROOT */ + fread(&resid_scale,sizeof(double),1,chp_file); + /* read specific scalars*/ + for (i=0;i<iter_data.sc_N;i++) + fread(iter_data.sc[i].ptr,iter_data.sc[i].size,1,chp_file); + /* read commmon vectors */ + if (fread(xvec,sizeof(doublecomplex),nlocalRows,chp_file)!=nlocalRows) + LogError(EC_ERROR,ALL_POS,"Failed reading from file '%s'",fname); + if (fread(rvec,sizeof(doublecomplex),nlocalRows,chp_file)!=nlocalRows) + LogError(EC_ERROR,ALL_POS,"Failed reading from file '%s'",fname); + if (fread(pvec,sizeof(doublecomplex),nlocalRows,chp_file)!=nlocalRows) + LogError(EC_ERROR,ALL_POS,"Failed reading from file '%s'",fname); + if (fread(Avecbuffer,sizeof(doublecomplex),nlocalRows,chp_file)!=nlocalRows) + LogError(EC_ERROR,ALL_POS,"Failed reading from file '%s'",fname); + /* read specific vectors*/ + for (i=0;i<iter_data.vec_N;i++) + if (fread(iter_data.vec[i].ptr,iter_data.vec[i].size,nlocalRows,chp_file)!=nlocalRows) + LogError(EC_ERROR,ALL_POS,"Failed reading from file '%s'",fname); + /* check if EOF reached and close file */ + if(fread(&ch,1,1,chp_file)!=0) LogError(EC_ERROR,ALL_POS,"File '%s' is too long",fname); + FCloseErr(chp_file,fname,ALL_POS); + /* initialize auxiliary variables */ + epsB=eps*eps/resid_scale; + /* print info */ + if (ringid==ROOT) { + PrintBoth(logfile,"Checkpoint (iteration) loaded\n"); + /* if residual is stagnating print info about last minimum */ + if (counter!=0) fprintf(logfile, + "Residual has been stagnating already for %d iterations since:\n"\ + "RE_%03d = %.10E\n"\ + "...\n",counter,count-counter-1,sqrt(resid_scale*inprodR)); + } + Timing_FileIO+=GET_TIME()-tstart; } -//============================================================ +/*============================================================*/ static void ProgressReport(const double inprod) -// Do common procedures; show progress in logfile and stdout; also check for checkpoint condition + /* Do common procedures; show progress in logfile and stdout + also check for checkpoint condition */ { - double err,progr,elapsed; - char progr_string[MAX_LINE]; - char temp[5]; - time_t wt; - - if (inprod<=inprodR) { - inprodR=inprod; - counter=0; - } - else counter++; - if (ringid==ROOT) { - err=sqrt(resid_scale*inprod); - progr=1-err/prev_err; - if (counter==0) strcpy(temp,"+ "); - else if (progr>0) strcpy(temp,"-+"); - else strcpy(temp,"- "); - sprintf(progr_string,"RE_%03d = %.10E %s",count,err,temp); - if (!orient_avg) { - fprintf(logfile,"%s progress = %.6f\n",progr_string,progr); - fflush(logfile); - } - printf("%s\n",progr_string); - fflush(stdout); - - prev_err=err; - } - count++; - TotalIter++; - // check condition for checkpoint; checkpoint is saved at first time - if (chp_type!=CHP_NONE && chp_time!=UNDEF && !chp_skip) { - time(&wt); - elapsed=difftime(wt,last_chp_wt); - if (chp_time<elapsed) { - SaveIterChpoint(); - time(&last_chp_wt); - if (chp_type!=CHP_REGULAR) chp_exit=TRUE; - } - } + double err,progr,elapsed; + char progr_string[MAX_LINE]; + char temp[5]; + time_t wt; + + if (inprod<=inprodR) { + inprodR=inprod; + counter=0; + } + else counter++; + + if (ringid==ROOT) { + err=sqrt(resid_scale*inprod); + progr=1-err/prev_err; + if (counter==0) strcpy(temp,"+ "); + else if (progr>0) strcpy(temp,"-+"); + else strcpy(temp,"- "); + sprintf(progr_string,"RE_%03d = %.10E %s",count,err,temp); + if (!orient_avg) { + fprintf(logfile,"%s progress = %.6f\n",progr_string,progr); + fflush(logfile); + } + printf("%s\n",progr_string); + fflush(stdout); + + prev_err=err; + } + count++; + TotalIter++; + + /* check condition for checkpoint; checkpoint is saved at first time */ + if (chp_type!=CHP_NONE && chp_time!=UNDEF && !chp_skip) { + time(&wt); + elapsed=difftime(wt,last_chp_wt); + if (chp_time<elapsed) { + SaveIterChpoint(); + time(&last_chp_wt); + if (chp_type!=CHP_REGULAR) chp_exit=TRUE; + } + } } -//============================================================ +/*============================================================*/ static void AfterIterFinished(void) -// Do common procedures after the iterations has finished + /* Do common procedures after the iterations has finished */ { - if (chp_type==CHP_ALWAYS && !chp_exit) SaveIterChpoint(); + if (chp_type==CHP_ALWAYS && !chp_exit) SaveIterChpoint(); } -//============================================================ +/*============================================================*/ static void CGNR(const int mc) -// Conjugate Gradient applied to Normalized Equations with minimization of Residual Norm + /* Conjugate Gradient applied to Normalized Equations with minimization of Residual Norm */ { - double inprodRplus1; // inner product of r_k+1 - double alpha, denumeratorAlpha; - double beta,ro_new,ro_old=0; // initialization to remove compiler warning - TIME_TYPE tstart; - chp_data scalars[1]; - - max_count=mc; - // initialize data structure for checkpoints - scalars[0].ptr=&ro_old; - scalars[0].size=sizeof(double); - iter_data.sc=scalars; - iter_data.sc_N=1; - iter_data.vec=NULL; - iter_data.vec_N=0; - // initialization of constants and vectors - if (load_chpoint) LoadIterChpoint(); - Timing_InitIter = GET_TIME() - tstart_CE; // initialization complete - // main iteration cycle - while (inprodR>=epsB && count<=maxiter && counter<=max_count && !chp_exit) { - Timing_OneIterComm=0; // initialize time - tstart=GET_TIME(); - // p_1=Ah.r_0 and ro_new=ro_0=|Ah.r_0|^2 - if (count==1) MatVec(rvec,pvec,&ro_new,TRUE); - else { - // Avecbuffer=AH.r_k-1, ro_new=ro_k-1=|AH.r_k-1|^2 - MatVec(rvec,Avecbuffer,&ro_new,TRUE); - // beta_k-1=ro_k-1/ro_k-2 - beta=ro_new/ro_old; - // p_k=beta_k-1*p_k-1+AH.r_k-1 - nIncrem10(pvec,Avecbuffer,beta,NULL,&Timing_OneIterComm); - } - // alpha_k=ro_k-1/|A.p_k|^2 - // Avecbuffer=A.p_k - MatVec(pvec,Avecbuffer,&denumeratorAlpha,FALSE); - alpha=ro_new/denumeratorAlpha; - // x_k=x_k-1+alpha_k*p_k - nIncrem01(xvec,pvec,alpha,NULL,&Timing_OneIterComm); - // r_k=r_k-1-alpha_k*A.p_k and |r_k|^2 - nIncrem01(rvec,Avecbuffer,-alpha,&inprodRplus1,&Timing_OneIterComm); - // initialize ro_old -> ro_k-2 for next iteration - ro_old=ro_new; - - Timing_OneIter=GET_TIME()-tstart; - // check progress - ProgressReport(inprodRplus1); - } // end of the big while loop - AfterIterFinished(); + double inprodRplus1; /* inner product of rk+1 */ + double alpha, denumeratorAlpha; + double beta,ro_new,ro_old=0; /* initialization to remove compiler warning */ + TIME_TYPE tstart; + chp_data scalars[1]; + + max_count=mc; + /* initialize data structure for checkpoints */ + scalars[0].ptr=&ro_old; + scalars[0].size=sizeof(double); + iter_data.sc=scalars; + iter_data.sc_N=1; + iter_data.vec=NULL; + iter_data.vec_N=0; + /* initialization of constants and vectors */ + if (load_chpoint) LoadIterChpoint(); + Timing_InitIter = GET_TIME() - tstart_CE; /* initialization complete */ + /* main iteration cycle */ + while (inprodR>=epsB && count<=maxiter && counter<=max_count && !chp_exit) { + Timing_OneIterComm=0; /* initialize time */ + tstart=GET_TIME(); + /* p_1=Ah.r_0 and ro_new=ro_0=|Ah.r_0|^2 */ + if (count==1) MatVec(rvec,pvec,&ro_new,TRUE); + else { + /* Avecbuffer=AH.r_k-1, ro_new=ro_k-1=|AH.r_k-1|^2 */ + MatVec(rvec,Avecbuffer,&ro_new,TRUE); + /* beta_k-1=ro_k-1/ro_k-2 */ + beta=ro_new/ro_old; + /* p_k=beta_k-1*p_k-1+AH.r_k-1 */ + nIncrem10(pvec,Avecbuffer,beta,NULL,&Timing_OneIterComm); + } + /* alpha_k=ro_k-1/|A.p_k|^2 */ + /* Avecbuffer=A.p_k */ + MatVec(pvec,Avecbuffer,&denumeratorAlpha,FALSE); + alpha=ro_new/denumeratorAlpha; + /* x_k=x_k-1+alpha_k*p_k */ + nIncrem01(xvec,pvec,alpha,NULL,&Timing_OneIterComm); + /* r_k=r_k-1-alpha_k*A.p_k and |r_k|^2 */ + nIncrem01(rvec,Avecbuffer,-alpha,&inprodRplus1,&Timing_OneIterComm); + /* initialize ro_old -> ro_k-2 for next iteration */ + ro_old=ro_new; + + Timing_OneIter=GET_TIME()-tstart; + /* check progress */ + ProgressReport(inprodRplus1); + } /* end of the big while loop */ + AfterIterFinished(); } -//============================================================ +/*============================================================*/ static void BiCGStab(const int mc) -// Bi-Conjugate Gradient Stabilized + /* Bi-Conjugate Gradient Stabilized */ { - double inprodRplus1; // inner product of r_k+1 - double denumOmega,dtmp; - doublecomplex beta,ro_new,ro_old,omega,alpha,temp1,temp2; - doublecomplex *v,*s,*rtilda; - TIME_TYPE tstart; - chp_data scalars[3],vectors[3]; - - max_count=mc; - // rename some vectors - v=vec1; - s=vec2; - rtilda=vec3; - // initialize data structure for checkpoints - scalars[0].ptr=&ro_old; - scalars[1].ptr=ω - scalars[2].ptr=α - scalars[0].size=scalars[1].size=scalars[2].size=sizeof(doublecomplex); - vectors[0].ptr=v; - vectors[1].ptr=s; - vectors[2].ptr=rtilda; - vectors[0].size=vectors[1].size=vectors[2].size=sizeof(doublecomplex); - iter_data.sc=scalars; - iter_data.sc_N=3; - iter_data.vec=vectors; - iter_data.vec_N=3; - // initialization of constants and vectors - if (load_chpoint) LoadIterChpoint(); - else nCopy(rtilda,rvec); // r~=r_0 - Timing_InitIter=GET_TIME()-tstart_CE; // initialization complete - // main iteration cycle - while (inprodR>=epsB && count<=maxiter && counter<=max_count && !chp_exit) { - Timing_OneIterComm = 0; // initialize time - tstart = GET_TIME(); - // ro_k-1=r_k-1.r~ ; check for ro_k-1!=0 - nDotProd(rvec,rtilda,ro_new,&Timing_OneIterComm); - dtmp=cAbs(ro_new)/inprodR; - D2z("(r~.r)/(r.r)=%.2g",dtmp); - if (dtmp<EPS_BICGSTAB1) - LogError(EC_ERROR,ONE_POS,"BiCGStab fails: (r~.r)/(r.r) is too small (%.2g).",dtmp); - if (count==1) nCopy(pvec,rvec); // p_1=r_0 - else { - // beta_k-1=(ro_k-1/ro_k-2)*(alpha_k-1/omega_k-1) - cMult(ro_new,alpha,temp1); - cMult(ro_old,omega,temp2); - // check that omega_k-1!=0 - dtmp=cAbs(temp2)/cAbs(temp1); - D2z("1/|beta_k|=%.2g",dtmp); - if (dtmp<EPS_BICGSTAB2) - LogError(EC_ERROR,ONE_POS,"Bi-CGStab fails: 1/|beta_k| is too small (%.2g).",dtmp); - cDiv(temp1,temp2,beta); - // p_k=beta_k-1*(p_k-1-omega_k-1*v_k-1)+r_k-1 - cMult(beta,omega,temp1); - cInvSign(temp1); - nIncrem110_cmplx(pvec,v,rvec,beta,temp1); - } - // calculate v_k=A.p_k - MatVec(pvec,v,NULL,FALSE); - // alpha_k=ro_new/(v_k.r~) - nDotProd(v,rtilda,temp1,&Timing_OneIterComm); - cDiv(ro_new,temp1,alpha); - // s=r_k-1-alpha*v_k-1 - cInvSign2(alpha,temp1); - nLinComb1_cmplx(s,v,rvec,temp1,&inprodRplus1,&Timing_OneIterComm); - // check convergence at this step; if yes, checkpoint should not be saved afterwards - if (inprodRplus1<epsB && chp_type!=CHP_ALWAYS) { - inprodR=inprodRplus1; - // x_k=x_k-1+alpha_k*p_k - nIncrem01_cmplx(xvec,pvec,alpha,NULL,&Timing_OneIterComm); - chp_skip=TRUE; - } - else { - // t=Avecbuffer=A.s - MatVec(s,Avecbuffer,&denumOmega,FALSE); - // omega_k=s.t/|t|^2 - nDotProd(s,Avecbuffer,temp1,&Timing_OneIterComm); - cMultReal(1/denumOmega,temp1,omega); - // x_k=x_k-1+alpha_k*p_k+omega_k*s - nIncrem011_cmplx(xvec,pvec,s,alpha,omega); - // r_k=s-omega_k*t and |r_k|^2 - cInvSign2(omega,temp1); - nLinComb1_cmplx(rvec,Avecbuffer,s,temp1,&inprodRplus1,&Timing_OneIterComm); - // initialize ro_old -> ro_k-2 for next iteration - cEqual(ro_new,ro_old); - /* take time stamp here, not to measure time of incomplete iteration - * (interrupted at the check above). - */ - Timing_OneIter=GET_TIME()-tstart; - } - // check progress - ProgressReport(inprodRplus1); - } // end of the big while loop - AfterIterFinished(); + double inprodRplus1; /* inner product of rk+1 */ + double denumOmega,dtmp; + doublecomplex beta,ro_new,ro_old,omega,alpha,temp1,temp2; + doublecomplex *v,*s,*rtilda; + TIME_TYPE tstart; + chp_data scalars[3],vectors[3]; + + max_count=mc; + /* rename some vectors */ + v=vec1; + s=vec2; + rtilda=vec3; + /* initialize data structure for checkpoints */ + scalars[0].ptr=&ro_old; + scalars[1].ptr=ω + scalars[2].ptr=α + scalars[0].size=scalars[1].size=scalars[2].size=sizeof(doublecomplex); + vectors[0].ptr=v; + vectors[1].ptr=s; + vectors[2].ptr=rtilda; + vectors[0].size=vectors[1].size=vectors[2].size=sizeof(doublecomplex); + iter_data.sc=scalars; + iter_data.sc_N=3; + iter_data.vec=vectors; + iter_data.vec_N=3; + /* initialization of constants and vectors */ + if (load_chpoint) LoadIterChpoint(); + else nCopy(rtilda,rvec); /* r~=r_0 */ + Timing_InitIter=GET_TIME()-tstart_CE; /* initialization complete */ + /* main iteration cycle */ + while (inprodR>=epsB && count<=maxiter && counter<=max_count && !chp_exit) { + Timing_OneIterComm = 0; /* initialize time */ + tstart = GET_TIME(); + /* ro_k-1=r_k-1.r~ ; check for ro_k-1!=0 */ + nDotProd(rvec,rtilda,ro_new,&Timing_OneIterComm); + dtmp=cAbs(ro_new)/inprodR; + if (dtmp<EPS_BICGSTAB) + LogError(EC_ERROR,ONE_POS,"BiCGStab fails: (r~.r)/(r.r) is too small (%.2g).",dtmp); + if (count==1) nCopy(pvec,rvec); /* p_1=r_0 */ + else { + /* beta_k-1=(ro_k-1/ro_k-2)*(alpha_k-1/omega_k-1) */ + cMult(ro_new,alpha,temp1); + cMult(ro_old,omega,temp2); + /* check that omega_k-1!=0 */ + dtmp=cAbs(temp2)/cAbs(temp1); + if (dtmp<EPS_BICGSTAB) + LogError(EC_ERROR,ONE_POS,"Bi-CGStab fails: 1/|beta_k| is too small (%.2g).",dtmp); + cDiv(temp1,temp2,beta); + /* p_k=beta_k-1*(p_k-1-omega_k-1*v_k-1)+r_k-1 */ + cMult(beta,omega,temp1); + cInvSign(temp1); + nIncrem110_cmplx(pvec,v,rvec,beta,temp1); + } + /* calculate v_k=A.p_k */ + MatVec(pvec,v,NULL,FALSE); + /* alpha_k=ro_new/(v_k.r~) */ + nDotProd(v,rtilda,temp1,&Timing_OneIterComm); + cDiv(ro_new,temp1,alpha); + /* s=r_k-1-alpha*v_k-1 */ + cInvSign2(alpha,temp1); + nLinComb1_cmplx(s,v,rvec,temp1,&inprodRplus1,&Timing_OneIterComm); + /* check convergence at this step; if yes, checkpoint should not be saved afterwards */ + if (inprodRplus1<epsB && chp_type!=CHP_ALWAYS) { + inprodR=inprodRplus1; + /* x_k=x_k-1+alpha_k*p_k */ + nIncrem01_cmplx(xvec,pvec,alpha,NULL,&Timing_OneIterComm); + chp_skip=TRUE; + } + else { + /* t=Avecbuffer=A.s */ + MatVec(s,Avecbuffer,&denumOmega,FALSE); + /* omega_k=s.t/|t|^2 */ + nDotProd(s,Avecbuffer,temp1,&Timing_OneIterComm); + cMultReal(1/denumOmega,temp1,omega); + /* x_k=x_k-1+alpha_k*p_k+omega_k*s */ + nIncrem011_cmplx(xvec,pvec,s,alpha,omega); + /* r_k=s-omega_k*t and |r_k|^2 */ + cInvSign2(omega,temp1); + nLinComb1_cmplx(rvec,Avecbuffer,s,temp1,&inprodRplus1,&Timing_OneIterComm); + /* initialize ro_old -> ro_k-2 for next iteration */ + cEqual(ro_new,ro_old); + /* take time stamp here, not to measure time of incomplete iteration + (interrupted at the check above */ + Timing_OneIter=GET_TIME()-tstart; + } + /* check progress */ + ProgressReport(inprodRplus1); + } /* end of the big while loop */ + AfterIterFinished(); } -//============================================================ +/*============================================================*/ static void BiCG_CS(const int mc) -// Bi-Conjugate Gradient for Complex Symmetric systems + /* Bi-Conjugate Gradient for Complex Symmetric systems */ { - double inprodRplus1; // inner product of r_k+1 - doublecomplex alpha, mu; - doublecomplex beta,ro_new,ro_old,temp; - double dtmp,abs_ro_new; - TIME_TYPE tstart; - chp_data scalars[1]; - - max_count=mc; - // initialize data structure for checkpoints - scalars[0].ptr=&ro_old; - scalars[0].size=sizeof(doublecomplex); - iter_data.sc=scalars; - iter_data.sc_N=1; - iter_data.vec=NULL; - iter_data.vec_N=0; - // initialization of constants and vectors - if (load_chpoint) LoadIterChpoint(); - Timing_InitIter = GET_TIME() - tstart_CE; // initialization complete - // main iteration cycle - while (inprodR>=epsB && count<=maxiter && counter<=max_count && !chp_exit) { - Timing_OneIterComm=0; // initialize time - tstart=GET_TIME(); - // ro_k-1=r_k-1(*).r_k-1; check for ro_k-1!=0 - nDotProdSelf_conj(rvec,ro_new,&Timing_OneIterComm); - abs_ro_new=cAbs(ro_new); - dtmp=abs_ro_new/inprodR; - D2z("(rT.r)/(r.r)=%.2g",dtmp); - if (dtmp<EPS_BICG_CS1) - LogError(EC_ERROR,ONE_POS,"BiCG_CS fails: (rT.r)/(r.r) is too small (%.2g).",dtmp); - if (count==1) nCopy(pvec,rvec); // p_1=r_0 - else { - // beta_k-1=ro_k-1/ro_k-2 - cDiv(ro_new,ro_old,beta); - // p_k=beta_k-1*p_k-1+r_k-1 - nIncrem10_cmplx(pvec,rvec,beta,NULL,&Timing_OneIterComm); - } - // q_k=Avecbuffer=A.p_k - MatVec(pvec,Avecbuffer,NULL,FALSE); - // mu_k=p_k.q_k; check for mu_k!=0 - nDotProd_conj(pvec,Avecbuffer,mu,&Timing_OneIterComm); - dtmp=cAbs(mu)/abs_ro_new; - D2z("(pT.A.p)/(rT.r)=%.2g",dtmp); - if (dtmp<EPS_BICG_CS2) - LogError(EC_ERROR,ONE_POS,"BiCG_CS fails: (pT.A.p)/(rT.r) is too small (%.2g).",dtmp); - // alpha_k=ro_k/mu_k - cDiv(ro_new,mu,alpha); - // x_k=x_k-1+alpha_k*p_k - nIncrem01_cmplx(xvec,pvec,alpha,NULL,&Timing_OneIterComm); - // r_k=r_k-1-alpha_k*A.p_k and |r_k|^2 - cInvSign2(alpha,temp); - nIncrem01_cmplx(rvec,Avecbuffer,temp,&inprodRplus1,&Timing_OneIterComm); - // initialize ro_old -> ro_k-2 for next iteration - cEqual(ro_new,ro_old); - Timing_OneIter=GET_TIME()-tstart; - // check progress - ProgressReport(inprodRplus1); - } // end of the big while loop - AfterIterFinished(); + double inprodRplus1; /* inner product of rk+1 */ + doublecomplex alpha, mu; + doublecomplex beta,ro_new,ro_old,temp; + double dtmp,abs_ro_new; + TIME_TYPE tstart; + chp_data scalars[1]; + + max_count=mc; + /* initialize data structure for checkpoints */ + scalars[0].ptr=&ro_old; + scalars[0].size=sizeof(doublecomplex); + iter_data.sc=scalars; + iter_data.sc_N=1; + iter_data.vec=NULL; + iter_data.vec_N=0; + /* initialization of constants and vectors */ + if (load_chpoint) LoadIterChpoint(); + Timing_InitIter = GET_TIME() - tstart_CE; /* initialization complete */ + /* main iteration cycle */ + while (inprodR>=epsB && count<=maxiter && counter<=max_count && !chp_exit) { + Timing_OneIterComm=0; /* initialize time */ + tstart=GET_TIME(); + /* ro_k-1=r_k-1(*).r_k-1; check for ro_k-1!=0 */ + nDotProdSelf_conj(rvec,ro_new,&Timing_OneIterComm); + abs_ro_new=cAbs(ro_new); + dtmp=abs_ro_new/inprodR; + if (dtmp<EPS_BICG_CS) + LogError(EC_ERROR,ONE_POS,"BiCG_CS fails: (rT.r)/(r.r) is too small (%.2g).",dtmp); + if (count==1) nCopy(pvec,rvec); /* p_1=r_0 */ + else { + /* beta_k-1=ro_k-1/ro_k-2 */ + cDiv(ro_new,ro_old,beta); + /* p_k=beta_k-1*p_k-1+r_k-1 */ + nIncrem10_cmplx(pvec,rvec,beta,NULL,&Timing_OneIterComm); + } + /* q_k=Avecbuffer=A.p_k */ + MatVec(pvec,Avecbuffer,NULL,FALSE); + /* mu_k=p_k.q_k; check for mu_k!=0 */ + nDotProd_conj(pvec,Avecbuffer,mu,&Timing_OneIterComm); + dtmp=cAbs(mu)/abs_ro_new; + if (dtmp<EPS_BICG_CS) + LogError(EC_ERROR,ONE_POS,"BiCG_CS fails: (pT.A.p)/(rT.r) is too small (%.2g).",dtmp); + /* alpha_k=ro_k/mu_k */ + cDiv(ro_new,mu,alpha); + /* x_k=x_k-1+alpha_k*p_k */ + nIncrem01_cmplx(xvec,pvec,alpha,NULL,&Timing_OneIterComm); + /* r_k=r_k-1-alpha_k*A.p_k and |r_k|^2 */ + cInvSign2(alpha,temp); + nIncrem01_cmplx(rvec,Avecbuffer,temp,&inprodRplus1,&Timing_OneIterComm); + /* initialize ro_old -> ro_k-2 for next iteration */ + cEqual(ro_new,ro_old); + + Timing_OneIter=GET_TIME()-tstart; + /* check progress */ + ProgressReport(inprodRplus1); + } /* end of the big while loop */ + AfterIterFinished(); } -//============================================================ +/*============================================================*/ static void QMR_CS(const int mc) -// Quasi Minimum Residual for Complex Symmetric systems + /* Quasi Minimum Residual for Complex Symmetric systems */ { - double inprodRplus1; // inner product of r_k+1 - double c_old,c_new,omega_old,omega_new,zetaabs,dtmp1,dtmp2; - doublecomplex alpha,beta,theta,eta,zeta,zetatilda,tau,tautilda; - doublecomplex s_new,s_old,temp1,temp2,temp3,temp4; - doublecomplex *v,*vtilda,*p_new,*p_old; - TIME_TYPE tstart; - chp_data scalars[8],vectors[3]; - - max_count=mc; - // rename some vectors - v=vec1; // v_k - vtilda=vec2; // also v_k-1 - p_new=pvec; // p_k - p_old=vec3; // p_k-1 - // initialize data structure for checkpoints - scalars[0].ptr=&omega_old; - scalars[1].ptr=&omega_new; - scalars[2].ptr=&c_old; - scalars[3].ptr=&c_new; - scalars[4].ptr=β - scalars[5].ptr=&tautilda; - scalars[6].ptr=&s_old; - scalars[7].ptr=&s_new; - scalars[0].size=scalars[1].size=scalars[2].size=scalars[3].size=sizeof(double); - scalars[4].size=scalars[5].size=scalars[6].size=scalars[7].size=sizeof(doublecomplex); - vectors[0].ptr=v; - vectors[1].ptr=vtilda; - vectors[2].ptr=p_old; - vectors[0].size=vectors[1].size=vectors[2].size=sizeof(doublecomplex); - iter_data.sc=scalars; - iter_data.sc_N=8; - iter_data.vec=vectors; - iter_data.vec_N=3; - // initialization of constants and vectors - if (load_chpoint) { - LoadIterChpoint(); - // change pointers names according to count parity - if (IS_EVEN(count)) SwapPointers(&v,&vtilda); - else SwapPointers(&p_old,&p_new); - } - else { - // omega_0=||v_0||=0 - omega_old=0.0; - // beta_1=sqrt(v~_1(*).v~_1); omega_1=||v~_1||/|beta_1|; (v~_1=r_0) - nDotProdSelf_conj(rvec,temp1,&Timing_InitIter_comm); - cSqrt(temp1,beta); - omega_new=sqrt(inprodR)/cAbs(beta); // inprodR=nNorm2(r_0) - // v_1=v~_1/beta_1 - cInv(beta,temp1); - nMult_cmplx(v,rvec,temp1); - // tau~_1=omega_1*beta_1 - cMultReal(omega_new,beta,tautilda); - // c_0=c_-1=1; s_0=s_-1=0 - c_new=c_old=1.0; - s_new[RE]=s_new[IM]=s_old[RE]=s_old[IM]=0.0; - } - Timing_InitIter = GET_TIME() - tstart_CE; // initialization complete - // main iteration cycle - while (inprodR>=epsB && count<=maxiter && counter<=max_count && !chp_exit) { - Timing_OneIterComm=0; // initialize time - tstart=GET_TIME(); - // check for zero beta - dtmp1=cAbs2(beta)/inprodR; - D2z("(vT.v)/(r.r)=%.2g",dtmp1); - if (dtmp1<EPS_QMR_CS1) - LogError(EC_ERROR,ONE_POS,"QMR_CS fails: (vT.v)/(r.r) is too small (%.2g).",dtmp1); - // A.v_k; alpha_k=v_k(*).(A.v_k) - MatVec(v,Avecbuffer,NULL,FALSE); - nDotProd_conj(v,Avecbuffer,alpha,&Timing_OneIterComm); - // v~_k+1=-beta_k*v_k-1-alpha_k*v_k+A.v_k - cInvSign2(alpha,temp2); - // use explicitly that v_0=0 - if (count==1) nLinComb1_cmplx(vtilda,v,Avecbuffer,temp2,NULL,&Timing_OneIterComm); - else { - cInvSign2(beta,temp1); - nIncrem110_cmplx(vtilda,v,Avecbuffer,temp1,temp2); - } - // theta_k=s_k-2(*)*omega_k-1*beta_k - cMultReal(omega_old,beta,temp3); // temp3=omega_k-1*beta_k - s_old[IM]=-s_old[IM]; // s_old is only used here, hence can be changed - cMult(s_old,temp3,theta); - // eta_k=c_k-1*c_k-2*omega_k-1*beta_k+s_k-1(*)*omega_k*alpha_k - cMultReal(omega_new,alpha,temp4); // temp4=omega_k*alpha_k - cMultReal(c_old*c_new,temp3,eta); - cConj(s_new,temp1); - cMult(temp1,temp4,temp2); - cAdd(eta,temp2,eta); - // zeta~_k=c_k-1*omega_k*alpha_k-s_k-1*c_k-2*omega_k-1*beta_k - cMult(s_new,temp3,temp1); - cLinComb(temp4,temp1,c_new,-c_old,zetatilda); - // beta_k+1=sqrt(v~_k+1(*).v~_k+1); omega_k+1=||v~_k+1||/|beta_k+1| - omega_old=omega_new; - nDotProdSelf_conj_Norm2(vtilda,temp1,&dtmp1,&Timing_OneIterComm); // dtmp1=||v~||^2 - cSqrt(temp1,beta); - omega_new=sqrt(dtmp1)/cAbs(beta); - // |zeta_k|=sqrt(|zeta~_k|^2+omega_k+1^2*|beta_k+1|^2) - dtmp2=cAbs2(zetatilda); // dtmp2=|zeta~_k|^2 - zetaabs=sqrt(dtmp2+dtmp1); - dtmp1=sqrt(dtmp2); // dtmp1=|zeta~_k| - // if (|zeta~_k|==0) zeta_k=|zeta_k|; else zeta=|zeta_k|*zeta~_k/|zeta~_k| - if (dtmp1<EPS_QMR_CS2) { - zeta[RE]=zetaabs; - zeta[IM]=0.0; - } - else cMultReal(zetaabs/dtmp1,zetatilda,zeta); - // c_k=zeta~_k/zeta_k = |zeta~_k|/|zeta_k| - c_old=c_new; - c_new=dtmp1/zetaabs; - // s_k+1=omega_k+1*beta_k+1/zeta_k - cEqual(s_new,s_old); - cInv(zeta,temp4); // temp4=1/zeta - cMult(beta,temp4,temp1); - cMultReal(omega_new,temp1,s_new); - // p_k=(-theta_k*p_k-2-eta_k*p_k-1+v_k)/zeta_k - if (count==1) nMult_cmplx(p_new,v,temp4); // use explicitly that p_0=p_-1=0 - else { - cMult(eta,temp4,temp2); - cInvSign(temp2); // temp2=-eta_k/zeta_k - if (count==2) nLinComb_cmplx(p_old,p_new,v,temp2,temp4,NULL,&Timing_OneIterComm); - else { - cMult(theta,temp4,temp1); - cInvSign(temp1); // temp1=-theta_k/zeta_k - nIncrem111_cmplx(p_old,p_new,v,temp1,temp2,temp4); - } - SwapPointers(&p_old,&p_new); - } - // tau_k=c_k*tau~_k - cMultReal(c_new,tautilda,tau); - // tau~_k+1=-s_k*tau~_k - cMult(s_new,tautilda,temp1); - cInvSign2(temp1,tautilda); - // x_k=x_k-1+tau_k*p_k - nIncrem01_cmplx(xvec,p_new,tau,NULL,&Timing_OneIterComm); - // v_k+1=v~_k+1/beta_k+1 - cInv(beta,temp1); - nMultSelf_cmplx(vtilda,temp1); - SwapPointers(&v,&vtilda); // v~ is as v_k-1 at next iteration - // r_k=r_k-1+(c_k*tau~_k+1/omega_k+1)*v_k+1 - cMultReal(c_new/omega_new,tautilda,temp1); - nIncrem11_d_c(rvec,v,cAbs2(s_new),temp1,&inprodRplus1,&Timing_OneIterComm); - Timing_OneIter=GET_TIME()-tstart; - // check progress - ProgressReport(inprodRplus1); - } // end of the big while loop - AfterIterFinished(); + double inprodRplus1; /* inner product of rk+1 */ + double c_old,c_new,omega_old,omega_new,zetaabs,dtmp1,dtmp2; + doublecomplex alpha,beta,theta,eta,zeta,zetatilda,tau,tautilda; + doublecomplex s_new,s_old,temp1,temp2,temp3,temp4; + doublecomplex *v,*vtilda,*p_new,*p_old; + TIME_TYPE tstart; + chp_data scalars[8],vectors[3]; + + max_count=mc; + /* rename some vectors */ + v=vec1; /* v_k */ + vtilda=vec2; /* also v_k-1 */ + p_new=pvec; /* p_k */ + p_old=vec3; /* p_k-1 */ + /* initialize data structure for checkpoints */ + scalars[0].ptr=&omega_old; + scalars[1].ptr=&omega_new; + scalars[2].ptr=&c_old; + scalars[3].ptr=&c_new; + scalars[4].ptr=β + scalars[5].ptr=&tautilda; + scalars[6].ptr=&s_old; + scalars[7].ptr=&s_new; + scalars[0].size=scalars[1].size=scalars[2].size=scalars[3].size=sizeof(double); + scalars[4].size=scalars[5].size=scalars[6].size=scalars[7].size=sizeof(doublecomplex); + vectors[0].ptr=v; + vectors[1].ptr=vtilda; + vectors[2].ptr=p_old; + vectors[0].size=vectors[1].size=vectors[2].size=sizeof(doublecomplex); + iter_data.sc=scalars; + iter_data.sc_N=8; + iter_data.vec=vectors; + iter_data.vec_N=3; + /* initialization of constants and vectors */ + if (load_chpoint) { + LoadIterChpoint(); + /* change pointers names according to count parity */ + if ((count%2)==0) SwapPointers(&v,&vtilda); + else SwapPointers(&p_old,&p_new); + } + else { + /* omega_0=||v_0||=0 */ + omega_old=0.0; + /* beta_1=sqrt(v~_1(*).v~_1); omega_1=||v~_1||/|beta_1|; (v~_1=r_0) */ + nDotProdSelf_conj(rvec,temp1,&Timing_InitIter_comm); + cSqrt(temp1,beta); + omega_new=sqrt(inprodR)/cAbs(beta); /* inprodR=nNorm2(r_0) */ + /* v_1=v~_1/beta_1 */ + cInv(beta,temp1); + nMult_cmplx(v,rvec,temp1); + /* tau~_1=omega_1*beta_1 */ + cMultReal(omega_new,beta,tautilda); + /* c_0=c_-1=1; s_0=s_-1=0 */ + c_new=c_old=1.0; + s_new[RE]=s_new[IM]=s_old[RE]=s_old[IM]=0.0; + } + Timing_InitIter = GET_TIME() - tstart_CE; /* initialization complete */ + /* main iteration cycle */ + while (inprodR>=epsB && count<=maxiter && counter<=max_count && !chp_exit) { + Timing_OneIterComm=0; /* initialize time */ + tstart=GET_TIME(); + /* check for zero beta */ + dtmp1=cAbs2(beta)/inprodR; + if (dtmp1<EPS_QMR_CS) + LogError(EC_ERROR,ONE_POS,"QMR_CS fails: (vT.v)/(r.r) is too small (%.2g).",dtmp1); + /* A.v_k; alpha_k=v_k(*).(A.v_k) */ + MatVec(v,Avecbuffer,NULL,FALSE); + nDotProd_conj(v,Avecbuffer,alpha,&Timing_OneIterComm); + /* v~_k+1=-beta_k*v_k-1-alpha_k*v_k+A.v_k */ + cInvSign2(alpha,temp2); + /* use explicitly that v_0=0 */ + if (count==1) nLinComb1_cmplx(vtilda,v,Avecbuffer,temp2,NULL,&Timing_OneIterComm); + else { + cInvSign2(beta,temp1); + nIncrem110_cmplx(vtilda,v,Avecbuffer,temp1,temp2); + } + /* theta_k=s_k-2(*)*omega_k-1*beta_k */ + cMultReal(omega_old,beta,temp3); /* temp3=omega_k-1*beta_k */ + s_old[IM]=-s_old[IM]; /* s_old is only used here, hence can be changed */ + cMult(s_old,temp3,theta); + /* eta_k=c_k-1*c_k-2*omega_k-1*beta_k+s_k-1(*)*omega_k*alpha_k */ + cMultReal(omega_new,alpha,temp4); /* temp4=omega_k*alpha_k */ + cMultReal(c_old*c_new,temp3,eta); + cConj(s_new,temp1); + cMult(temp1,temp4,temp2); + cAdd(eta,temp2,eta); + /* zeta~_k=c_k-1*omega_k*alpha_k-s_k-1*c_k-2*omega_k-1*beta_k */ + cMult(s_new,temp3,temp1); + cLinComb(temp4,temp1,c_new,-c_old,zetatilda); + /* beta_k+1=sqrt(v~_k+1(*).v~_k+1); omega_k+1=||v~_k+1||/|beta_k+1| */ + omega_old=omega_new; + nDotProdSelf_conj_Norm2(vtilda,temp1,&dtmp1,&Timing_OneIterComm); /* dtmp1=||v~||^2 */ + cSqrt(temp1,beta); + omega_new=sqrt(dtmp1)/cAbs(beta); + /* |zeta_k|=sqrt(|zeta~_k|^2+omega_k+1^2*|beta_k+1|^2) */ + dtmp2=cAbs2(zetatilda); /* dtmp2=|zeta~_k|^2 */ + zetaabs=sqrt(dtmp2+dtmp1); + dtmp1=sqrt(dtmp2); /* dtmp1=|zeta~_k| */ + /* if (|zeta~_k|==0) zeta_k=|zeta_k|; else zeta=|zeta_k|*zeta~_k/|zeta~_k| */ + if (dtmp1<EPS_QMR_CS_1) { + zeta[RE]=zetaabs; + zeta[IM]=0.0; + } + else cMultReal(zetaabs/dtmp1,zetatilda,zeta); + /* c_k=zeta~_k/zeta_k = |zeta~_k|/|zeta_k| */ + c_old=c_new; + c_new=dtmp1/zetaabs; + /* s_k+1=omega_k+1*beta_k+1/zeta_k */ + cEqual(s_new,s_old); + cInv(zeta,temp4); /* temp4=1/zeta */ + cMult(beta,temp4,temp1); + cMultReal(omega_new,temp1,s_new); + /* p_k=(-theta_k*p_k-2-eta_k*p_k-1+v_k)/zeta_k */ + /* use explicitly that p_0=p_-1=0 */ + if (count==1) nMult_cmplx(p_new,v,temp4); + else { + cMult(eta,temp4,temp2); + cInvSign(temp2); /* temp2=-eta_k/zeta_k */ + if (count==2) nLinComb_cmplx(p_old,p_new,v,temp2,temp4,NULL,&Timing_OneIterComm); + else { + cMult(theta,temp4,temp1); + cInvSign(temp1); /* temp1=-theta_k/zeta_k */ + nIncrem111_cmplx(p_old,p_new,v,temp1,temp2,temp4); + } + SwapPointers(&p_old,&p_new); + } + /* tau_k=c_k*tau~_k */ + cMultReal(c_new,tautilda,tau); + /* tau~_k+1=-s_k*tau~_k */ + cMult(s_new,tautilda,temp1); + cInvSign2(temp1,tautilda); + /* x_k=x_k-1+tau_k*p_k */ + nIncrem01_cmplx(xvec,p_new,tau,NULL,&Timing_OneIterComm); + /* v_k+1=v~_k+1/beta_k+1 */ + cInv(beta,temp1); + nMultSelf_cmplx(vtilda,temp1); + SwapPointers(&v,&vtilda); /* vtilda is as v_k-1 at next iteration */ + /* r_k=r_k-1+(c_k*tau~_k+1/omega_k+1)*v_k+1 */ + cMultReal(c_new/omega_new,tautilda,temp1); + nIncrem11_d_c(rvec,v,cAbs2(s_new),temp1,&inprodRplus1,&Timing_OneIterComm); + + Timing_OneIter=GET_TIME()-tstart; + /* check progress */ + ProgressReport(inprodRplus1); + } /* end of the big while loop */ + AfterIterFinished(); } -//============================================================ +/*============================================================*/ int IterativeSolver(const int method_in) -/* choose required iterative method; do common initialization part */ + /* choose required iterative method + do common initialization part */ { - double temp; - char tmp_str[MAX_LINE]; - - method=method_in; - chp_exit=FALSE; - chp_skip=FALSE; - /* Instead of solving system (I+D.C).x=b , C - diagonal matrix with couple constants - * D - symmetric interaction matrix of Green's tensor - * we solve system (I+S.D.S).(S.x)=(S.b), S=sqrt(C), then total interaction matrix is symmetric - * and Jacobi-preconditioned for any distribution of refractive index. - */ - /* p=b=(S.Einc) is right part of the linear system; used only here. In iteration methods - * themselves p is completely different vector. - */ - if (!load_chpoint) { - nMult_mat(pvec,Einc,cc_sqrt); - temp=nNorm2(pvec,&Timing_InitIter_comm); // |r_0|^2 when x_0=0 - resid_scale=1/temp; - // calculate A.(x_0=b), r_0=b-A.(x_0=b) and |r_0|^2 - MatVec(pvec,Avecbuffer,NULL,FALSE); - nSubtr(rvec,pvec,Avecbuffer,&inprodR,&Timing_InitIter_comm); - // check which x_0 is better - if (temp<inprodR) { // use x_0=0 - nInit(xvec); - // r=p, but faster than copy, p is not used afterwards - SwapPointers(&rvec,&pvec); - inprodR=temp; - strcpy(tmp_str,"x_0 = 0\n"); - } - else { // use x_0=Einc - // x=p, but faster than copy, p is not used afterwards - SwapPointers(&xvec,&pvec); - strcpy(tmp_str,"x_0 = E_inc\n"); - } - epsB=eps*eps/resid_scale; - // print start values - if (ringid==ROOT) { - prev_err=sqrt(resid_scale*inprodR); - sprintf(tmp_str+strlen(tmp_str),"RE_000 = %.10E\n",prev_err); - if (!orient_avg) { - fprintf(logfile,"%s",tmp_str); - fflush(logfile); - } - printf("%s",tmp_str); - fflush(stdout); - } - // initialize counters - count=1; - counter=0; - } - // call appropriate iterative method - if (method==IT_CGNR) CGNR(MAXCOUNT_CGNR); - else if (method==IT_BICGSTAB) BiCGStab(MAXCOUNT_BICGSTAB); - else if (method==IT_BICG_CS) BiCG_CS(MAXCOUNT_BICG_CS); - else if (method==IT_QMR_CS) QMR_CS(MAXCOUNT_QMR_CS); - // error output - if (count>maxiter) LogError(EC_ERROR,ONE_POS, - "Iterations haven't converged in maximum allowed number of iterations (%d)",maxiter); - else if (counter>max_count) LogError(EC_ERROR,ONE_POS, - "Residual norm haven't decreased for maximum allowed number of iterations (%d)",max_count); - // post-processing - /* x is a solution of a modified system, not exactly internal field; should not be used further - * except for adaptive technique (as starting vector for next system) - */ - nMult_mat(pvec,xvec,cc_sqrt); /* p is now vector of polarizations. Can be used to calculate - * e.g. scattered field faster. - */ - if (chp_exit) return CHP_EXIT; // check if exiting after checkpoint - return count; + double temp; + char tmp_str[MAX_LINE]; + + method=method_in; + chp_exit=FALSE; + chp_skip=FALSE; + /* instead of solving system (I+D.C).x=b , C - diagonal matrix with couple constants + * D - symmetric interaction matrix of Green's tensor + * we solve system (I+S.D.S).(S.x)=(S.b), S=sqrt(C), them + * total interaction matrix is symmetric and Jacobi-preconditioned for any discribution of m */ + + /* p=b=(S.Einc) is right part of the linear system; used only here, + in iteration methods themselves p is completely different vector */ + if (!load_chpoint) { + nMult_mat(pvec,Einc,cc_sqrt); + + temp=nNorm2(pvec,&Timing_InitIter_comm); /* |r_0|^2 when x_0=0 */ + resid_scale=1/temp; + /* calculate A.(x_0=b), r_0=b-A.(x_0=b) and |r_0|^2 */ + MatVec(pvec,Avecbuffer,NULL,FALSE); + nSubtr(rvec,pvec,Avecbuffer,&inprodR,&Timing_InitIter_comm); + /* check which x_0 is better */ + if (temp<inprodR) { /* use x_0=0 */ + nInit(xvec); + /* r=p, but faster than copy, p is not used afterwards */ + SwapPointers(&rvec,&pvec); + inprodR=temp; + strcpy(tmp_str,"x_0 = 0\n"); + } + else { /* use x_0=Einc */ + /* x=p, but faster than copy, p is not used afterwards */ + SwapPointers(&xvec,&pvec); + strcpy(tmp_str,"x_0 = E_inc\n"); + } + epsB=eps*eps/resid_scale; + /* print start values */ + if (ringid==ROOT) { + prev_err=sqrt(resid_scale*inprodR); + sprintf(tmp_str+strlen(tmp_str),"RE_000 = %.10E\n",prev_err); + if (!orient_avg) { + fprintf(logfile,"%s",tmp_str); + fflush(logfile); + } + printf("%s",tmp_str); + fflush(stdout); + } + /* initialize counters */ + count=1; + counter=0; + } + /* call appropriate iterative method */ + if (method==IT_CGNR) CGNR(MAXCOUNT_CGNR); + else if (method==IT_BICGSTAB) BiCGStab(MAXCOUNT_BICGSTAB); + else if (method==IT_BICG_CS) BiCG_CS(MAXCOUNT_BICG_CS); + else if (method==IT_QMR_CS) QMR_CS(MAXCOUNT_QMR_CS); + /* error output */ + if (count>maxiter) LogError(EC_ERROR,ONE_POS, + "Iterations haven't converged in maximum allowed number of iterations (%d)",maxiter); + else if (counter>max_count) LogError(EC_ERROR,ONE_POS, + "Residual norm haven't decreased for maximum allowed number of iterations (%d)",max_count); + /* postprocessing */ + /* x is a solution of a modified system, not exactly internal field + should not be used further except fot adaptive technique + (as starting vector for next system) */ + nMult_mat(pvec,xvec,cc_sqrt); /* p is now vector of polarizations - */ + /* faster to calculate ,e.g. scattered field */ + /* check if exiting after checkpoint */ + if (chp_exit) return CHP_EXIT; + return count; } diff --git a/src/linalg.c b/src/linalg.c index 04a51eeb..946b4df1 100644 --- a/src/linalg.c +++ b/src/linalg.c @@ -8,7 +8,7 @@ * to be a principal limitation of C standard (some compilers may work, some produce * warnings) * - * Copyright (C) 2006-2008 University of Amsterdam + * Copyright (C) 2006-2007 University of Amsterdam * This code is covered by the GNU General Public License. */ #include <string.h> @@ -17,492 +17,494 @@ #include "comm.h" #include "linalg.h" -//============================================================ +/*============================================================*/ void nInit(doublecomplex *a) -// initialize vector a with null values + /* initialize vector a with null values */ { - size_t i; + size_t i; #pragma loop count (100000) #pragma ivdep - for (i=0;i<nlocalRows;i++) a[i][RE]=a[i][IM]=0.0; + for (i=0;i<nlocalRows;i++) a[i][RE]=a[i][IM]=0.0; } -//============================================================ +/*============================================================*/ void nCopy(doublecomplex *a,doublecomplex *b) -// copy vector b to a + /* copy vector b to a */ { - memcpy(a,b,nlocalRows*sizeof(doublecomplex)); + memcpy(a,b,nlocalRows*sizeof(doublecomplex)); } -//============================================================ +/*============================================================*/ double nNorm2(doublecomplex *a,TIME_TYPE *timing) -// squared norm of a large vector a + /* squared norm of a large vector a */ { - size_t i; - double inprod=0.0; + size_t i; + double inprod=0.0; #pragma loop count (100000) #pragma ivdep - for (i=0;i<nlocalRows;++i) inprod += a[i][RE]*a[i][RE] + a[i][IM]*a[i][IM]; - // this function is not called inside the main iteration loop - MyInnerProduct(&inprod,double_type,1,timing); - return inprod; + for (i=0;i<nlocalRows;++i) + inprod += a[i][RE]*a[i][RE] + a[i][IM]*a[i][IM]; + /* this function is not called inside the main iteration loop */ + MyInnerProduct(&inprod,double_type,1,timing); + return inprod; } -//============================================================ +/*============================================================*/ void nDotProd(doublecomplex *a,doublecomplex *b,doublecomplex c,TIME_TYPE *timing) -// dot product of two large vectors; c=a.b + /* dot product of two large vectors; c=a.b */ { - size_t i; + size_t i; - c[RE]=c[IM]=0.0; + c[RE]=c[IM]=0.0; #pragma loop count (100000) #pragma ivdep - for (i=0;i<nlocalRows;++i) { - c[RE] += a[i][RE]*b[i][RE] + a[i][IM]*b[i][IM]; - c[IM] += a[i][IM]*b[i][RE] - a[i][RE]*b[i][IM]; - } - MyInnerProduct(c,cmplx_type,1,timing); + for (i=0;i<nlocalRows;++i) { + c[RE] += a[i][RE]*b[i][RE] + a[i][IM]*b[i][IM]; + c[IM] += a[i][IM]*b[i][RE] - a[i][RE]*b[i][IM]; + } + MyInnerProduct(c,cmplx_type,1,timing); } -//============================================================ +/*============================================================*/ void nDotProd_conj(doublecomplex *a,doublecomplex *b,doublecomplex c,TIME_TYPE *timing) -// conjugate dot product of two large vectors; c=a.b*=b.a* + /* conjugate dot product of two large vectors; c=a.b*=b.a* */ { - size_t i; + size_t i; - c[RE]=c[IM]=0.0; + c[RE]=c[IM]=0.0; #pragma loop count (100000) #pragma ivdep - for (i=0;i<nlocalRows;++i) { - c[RE] += a[i][RE]*b[i][RE] - a[i][IM]*b[i][IM]; - c[IM] += a[i][IM]*b[i][RE] + a[i][RE]*b[i][IM]; - } - MyInnerProduct(c,cmplx_type,1,timing); + for (i=0;i<nlocalRows;++i) { + c[RE] += a[i][RE]*b[i][RE] - a[i][IM]*b[i][IM]; + c[IM] += a[i][IM]*b[i][RE] + a[i][RE]*b[i][IM]; + } + MyInnerProduct(c,cmplx_type,1,timing); } -//============================================================ +/*============================================================*/ void nDotProdSelf_conj(doublecomplex *a,doublecomplex c,TIME_TYPE *timing) -// conjugate dot product of vector on itself; c=a.a* + /* conjugate dot product of vector on itself; c=a.a* */ { - size_t i; + size_t i; - c[RE]=c[IM]=0.0; + c[RE]=c[IM]=0.0; #pragma loop count (100000) #pragma ivdep - for (i=0;i<nlocalRows;++i) { - c[RE]+=a[i][RE]*a[i][RE]-a[i][IM]*a[i][IM]; - c[IM]+=a[i][RE]*a[i][IM]; - } - MyInnerProduct(c,cmplx_type,1,timing); - c[IM]*=2; + for (i=0;i<nlocalRows;++i) { + c[RE]+=a[i][RE]*a[i][RE]-a[i][IM]*a[i][IM]; + c[IM]+=a[i][RE]*a[i][IM]; + } + MyInnerProduct(c,cmplx_type,1,timing); + c[IM]*=2; } -//============================================================ +/*============================================================*/ void nDotProdSelf_conj_Norm2(doublecomplex *a,doublecomplex c,double *norm,TIME_TYPE *timing) -/* Computes both conjugate dot product of vector on itself (c=a.a*) - * and its Hermitian squared norm=||a||^2 - */ + /* Computes both conjugate dot product of vector on itself (c=a.a*) + and its hermitian Norm squared norm=||a||^2 */ { - size_t i; - double buf[3]; + size_t i; + double buf[3]; - buf[0]=buf[1]=buf[2]=0.0; + buf[0]=buf[1]=buf[2]=0.0; #pragma loop count (100000) #pragma ivdep - for (i=0;i<nlocalRows;++i) { - buf[0]+=a[i][RE]*a[i][RE]; - buf[1]+=a[i][IM]*a[i][IM]; - buf[2]+=a[i][RE]*a[i][IM]; - } - MyInnerProduct(buf,double_type,3,timing); - *norm=buf[0]+buf[1]; - c[RE]=buf[0]-buf[1]; - c[IM]=2*buf[2]; + for (i=0;i<nlocalRows;++i) { + buf[0]+=a[i][RE]*a[i][RE]; + buf[1]+=a[i][IM]*a[i][IM]; + buf[2]+=a[i][RE]*a[i][IM]; + } + MyInnerProduct(buf,double_type,3,timing); + *norm=buf[0]+buf[1]; + c[RE]=buf[0]-buf[1]; + c[IM]=2*buf[2]; } -//============================================================ +/*============================================================*/ -void nIncrem110_cmplx(doublecomplex *a,doublecomplex *b,doublecomplex *c,const doublecomplex c1, - const doublecomplex c2) -// a=c1*a+c2*b+c +void nIncrem110_cmplx(doublecomplex *a,doublecomplex *b,doublecomplex *c, + const doublecomplex c1,const doublecomplex c2) + /* a=c1*a+c2*b+c */ { - size_t i; - double tmp; + size_t i; + double tmp; #pragma loop count (100000) #pragma ivdep - for (i=0;i<nlocalRows;++i) { - tmp=a[i][RE]; - a[i][RE] = c1[RE]*a[i][RE] - c1[IM]*a[i][IM] + c2[RE]*b[i][RE] - c2[IM]*b[i][IM] + c[i][RE]; - a[i][IM] = c1[RE]*a[i][IM] + c1[IM]*tmp + c2[RE]*b[i][IM] + c2[IM]*b[i][RE] + c[i][IM]; - } + for (i=0;i<nlocalRows;++i) { + tmp=a[i][RE]; + a[i][RE] = c1[RE]*a[i][RE] - c1[IM]*a[i][IM] + c2[RE]*b[i][RE] - c2[IM]*b[i][IM] + c[i][RE]; + a[i][IM] = c1[RE]*a[i][IM] + c1[IM]*tmp + c2[RE]*b[i][IM] + c2[IM]*b[i][RE] + c[i][IM]; + } } -//============================================================ +/*============================================================*/ -void nIncrem011_cmplx(doublecomplex *a,doublecomplex *b,doublecomplex *c,const doublecomplex c1, - const doublecomplex c2) -// a+=c1*b+c2*c +void nIncrem011_cmplx(doublecomplex *a,doublecomplex *b,doublecomplex *c, + const doublecomplex c1,const doublecomplex c2) + /* a+=c1*b+c2*c */ { - size_t i; + size_t i; #pragma loop count (100000) #pragma ivdep - for (i=0;i<nlocalRows;++i) { - a[i][RE] += c1[RE]*b[i][RE] - c1[IM]*b[i][IM] + c2[RE]*c[i][RE] - c2[IM]*c[i][IM]; - a[i][IM] += c1[RE]*b[i][IM] + c1[IM]*b[i][RE] + c2[RE]*c[i][IM] + c2[IM]*c[i][RE]; - } + for (i=0;i<nlocalRows;++i) { + a[i][RE] += c1[RE]*b[i][RE] - c1[IM]*b[i][IM] + c2[RE]*c[i][RE] - c2[IM]*c[i][IM]; + a[i][IM] += c1[RE]*b[i][IM] + c1[IM]*b[i][RE] + c2[RE]*c[i][IM] + c2[IM]*c[i][RE]; + } } -//============================================================ +/*============================================================*/ -void nIncrem111_cmplx(doublecomplex *a,doublecomplex *b,doublecomplex *c,const doublecomplex c1, - const doublecomplex c2,const doublecomplex c3) -// a=c1*a+c2*b+c3*c +void nIncrem111_cmplx(doublecomplex *a,doublecomplex *b,doublecomplex *c, + const doublecomplex c1,const doublecomplex c2,const doublecomplex c3) + /* a=c1*a+c2*b+c3*c */ { - size_t i; - double tmp; + size_t i; + double tmp; #pragma loop count (100000) #pragma ivdep - for (i=0;i<nlocalRows;++i) { - tmp=a[i][RE]; - a[i][RE] = c1[RE]*a[i][RE] - c1[IM]*a[i][IM] + c2[RE]*b[i][RE] - c2[IM]*b[i][IM] - + c3[RE]*c[i][RE] - c3[IM]*c[i][IM]; - a[i][IM] = c1[RE]*a[i][IM] + c1[IM]*tmp + c2[RE]*b[i][IM] + c2[IM]*b[i][RE] - + c3[RE]*c[i][IM] + c3[IM]*c[i][RE]; - } + for (i=0;i<nlocalRows;++i) { + tmp=a[i][RE]; + a[i][RE] = c1[RE]*a[i][RE] - c1[IM]*a[i][IM] + c2[RE]*b[i][RE] - c2[IM]*b[i][IM] + + c3[RE]*c[i][RE] - c3[IM]*c[i][IM]; + a[i][IM] = c1[RE]*a[i][IM] + c1[IM]*tmp + c2[RE]*b[i][IM] + c2[IM]*b[i][RE] + + c3[RE]*c[i][IM] + c3[IM]*c[i][RE]; + } } -//============================================================ +/*============================================================*/ void nIncrem01(doublecomplex *a,doublecomplex *b,const double c,double *inprod,TIME_TYPE *timing) -// a=a+c*b, inprod=|a|^2 + /* a=a+c*b, inprod=|a|^2 */ { - size_t i; + size_t i; - if (inprod==NULL) { + if (inprod==NULL) { #pragma loop count (100000) #pragma ivdep - for (i=0;i<nlocalRows;++i) { - a[i][RE] += c*b[i][RE]; // a+=c*b - a[i][IM] += c*b[i][IM]; - } - } - else { - *inprod=0.0; + for (i=0;i<nlocalRows;++i) { + a[i][RE] += c*b[i][RE]; /* a+=c*b */ + a[i][IM] += c*b[i][IM]; + } + } + else { + *inprod=0.0; #pragma loop count (100000) #pragma ivdep - for (i=0;i<nlocalRows;++i) { - a[i][RE] += c*b[i][RE]; // a+=c*b - a[i][IM] += c*b[i][IM]; - (*inprod) += a[i][RE]*a[i][RE] + a[i][IM]*a[i][IM]; // *inprod=|a|^2 - } - MyInnerProduct(inprod,double_type,1,timing); - } + for (i=0;i<nlocalRows;++i) { + a[i][RE] += c*b[i][RE]; /* a+=c*b */ + a[i][IM] += c*b[i][IM]; + (*inprod) += a[i][RE]*a[i][RE] + a[i][IM]*a[i][IM]; /* *inprod=|a|^2 */ + } + MyInnerProduct(inprod,double_type,1,timing); + } } -//============================================================ +/*============================================================*/ void nIncrem10(doublecomplex *a,doublecomplex *b,const double c,double *inprod,TIME_TYPE *timing) -// a=c*a+b, inprod=|a|^2 + /* a=c*a+b, inprod=|a|^2 */ { - size_t i; + size_t i; - if (inprod==NULL) { + if (inprod==NULL) { #pragma loop count (100000) #pragma ivdep - for (i=0;i<nlocalRows;++i) { - a[i][RE] = c*a[i][RE] + b[i][RE]; // a=c*a+b - a[i][IM] = c*a[i][IM] + b[i][IM]; - } - } - else { - *inprod=0.0; + for (i=0;i<nlocalRows;++i) { + a[i][RE] = c*a[i][RE] + b[i][RE]; /* a=c*a+b */ + a[i][IM] = c*a[i][IM] + b[i][IM]; + } + } + else { + *inprod=0.0; #pragma loop count (100000) #pragma ivdep - for (i=0;i<nlocalRows;++i) { - a[i][RE] = c*a[i][RE] + b[i][RE]; // a=c*a+b - a[i][IM] = c*a[i][IM] + b[i][IM]; - (*inprod) += a[i][RE]*a[i][RE] + a[i][IM]*a[i][IM]; // *inprod=|a|^2 - } - MyInnerProduct(inprod,double_type,1,timing); - } + for (i=0;i<nlocalRows;++i) { + a[i][RE] = c*a[i][RE] + b[i][RE]; /* a=c*a+b */ + a[i][IM] = c*a[i][IM] + b[i][IM]; + (*inprod) += a[i][RE]*a[i][RE] + a[i][IM]*a[i][IM]; /* *inprod=|a|^2 */ + } + MyInnerProduct(inprod,double_type,1,timing); + } } -//============================================================ +/*============================================================*/ -void nIncrem11_d_c(doublecomplex *a,doublecomplex *b,const double c1,const doublecomplex c2, - double *inprod,TIME_TYPE *timing) -// a=c1*a+c2*b, inprod=|a|^2 , one constant is double, another - complex +void nIncrem11_d_c(doublecomplex *a,doublecomplex *b,const double c1, + const doublecomplex c2,double *inprod,TIME_TYPE *timing) + /* a=c1*a+c2*b, inprod=|a|^2 , one constant is double, another - complex */ { - size_t i; + size_t i; - if (inprod==NULL) { + if (inprod==NULL) { #pragma loop count (100000) #pragma ivdep - for (i=0;i<nlocalRows;++i) { - a[i][RE] = c1*a[i][RE] + c2[RE]*b[i][RE] - c2[IM]*b[i][IM]; // a=c1*a+c2*b - a[i][IM] = c1*a[i][IM] + c2[RE]*b[i][IM] + c2[IM]*b[i][RE]; - } - } - else { - *inprod=0.0; + for (i=0;i<nlocalRows;++i) { + a[i][RE] = c1*a[i][RE] + c2[RE]*b[i][RE] - c2[IM]*b[i][IM]; /* a=c1*a+c2*b */ + a[i][IM] = c1*a[i][IM] + c2[RE]*b[i][IM] + c2[IM]*b[i][RE]; + } + } + else { + *inprod=0.0; #pragma loop count (100000) #pragma ivdep - for (i=0;i<nlocalRows;++i) { - a[i][RE] = c1*a[i][RE] + c2[RE]*b[i][RE] - c2[IM]*b[i][IM]; // a=c1*a+c2*b - a[i][IM] = c1*a[i][IM] + c2[RE]*b[i][IM] + c2[IM]*b[i][RE]; - (*inprod) += a[i][RE]*a[i][RE] + a[i][IM]*a[i][IM]; // *inprod=|a|^2 - } - MyInnerProduct(inprod,double_type,1,timing); - } + for (i=0;i<nlocalRows;++i) { + a[i][RE] = c1*a[i][RE] + c2[RE]*b[i][RE] - c2[IM]*b[i][IM]; /* a=c1*a+c2*b */ + a[i][IM] = c1*a[i][IM] + c2[RE]*b[i][IM] + c2[IM]*b[i][RE]; + (*inprod) += a[i][RE]*a[i][RE] + a[i][IM]*a[i][IM]; /* *inprod=|a|^2 */ + } + MyInnerProduct(inprod,double_type,1,timing); + } } -//============================================================ +/*============================================================*/ -void nIncrem01_cmplx(doublecomplex *a,doublecomplex *b,const doublecomplex c,double *inprod, - TIME_TYPE *timing) -// a=a+c*b, inprod=|a|^2 +void nIncrem01_cmplx(doublecomplex *a,doublecomplex *b,const doublecomplex c, + double *inprod,TIME_TYPE *timing) + /* a=a+c*b, inprod=|a|^2 */ { - size_t i; + size_t i; - if (inprod==NULL) { + if (inprod==NULL) { #pragma loop count (100000) #pragma ivdep - for (i=0;i<nlocalRows;++i) { - a[i][RE] += c[RE]*b[i][RE] - c[IM]*b[i][IM]; // a+=c*b - a[i][IM] += c[RE]*b[i][IM] + c[IM]*b[i][RE]; - } - } - else { - *inprod=0.0; + for (i=0;i<nlocalRows;++i) { + a[i][RE] += c[RE]*b[i][RE] - c[IM]*b[i][IM]; /* a+=c*b */ + a[i][IM] += c[RE]*b[i][IM] + c[IM]*b[i][RE]; + } + } + else { + *inprod=0.0; #pragma loop count (100000) #pragma ivdep - for (i=0;i<nlocalRows;++i) { - a[i][RE] += c[RE]*b[i][RE] - c[IM]*b[i][IM]; // a+=c*b - a[i][IM] += c[RE]*b[i][IM] + c[IM]*b[i][RE]; - (*inprod) += a[i][RE]*a[i][RE] + a[i][IM]*a[i][IM]; // *inprod=|a|^2 - } - MyInnerProduct(inprod,double_type,1,timing); - } + for (i=0;i<nlocalRows;++i) { + a[i][RE] += c[RE]*b[i][RE] - c[IM]*b[i][IM]; /* a+=c*b */ + a[i][IM] += c[RE]*b[i][IM] + c[IM]*b[i][RE]; + (*inprod) += a[i][RE]*a[i][RE] + a[i][IM]*a[i][IM]; /* *inprod=|a|^2 */ + } + MyInnerProduct(inprod,double_type,1,timing); + } } -//============================================================ +/*============================================================*/ -void nIncrem10_cmplx(doublecomplex *a,doublecomplex *b,const doublecomplex c,double *inprod, - TIME_TYPE *timing) -// a=c*a+b, inprod=|a|^2 +void nIncrem10_cmplx(doublecomplex *a,doublecomplex *b,const doublecomplex c, + double *inprod,TIME_TYPE *timing) + /* a=c*a+b, inprod=|a|^2 */ { - size_t i; - double tmp; + size_t i; + double tmp; - if (inprod==NULL) { + if (inprod==NULL) { #pragma loop count (100000) #pragma ivdep - for (i=0;i<nlocalRows;++i) { - tmp=a[i][RE]; // a=c*a+b - a[i][RE] = c[RE]*a[i][RE] - c[IM]*a[i][IM] + b[i][RE]; - a[i][IM] = c[RE]*a[i][IM] + c[IM]*tmp + b[i][IM]; - } - } - else { - *inprod=0.0; + for (i=0;i<nlocalRows;++i) { + tmp=a[i][RE]; /* a=c*a+b */ + a[i][RE] = c[RE]*a[i][RE] - c[IM]*a[i][IM] + b[i][RE]; + a[i][IM] = c[RE]*a[i][IM] + c[IM]*tmp + b[i][IM]; + } + } + else { + *inprod=0.0; #pragma loop count (100000) #pragma ivdep - for (i=0;i<nlocalRows;++i) { - tmp=a[i][RE]; // a=c*a+b - a[i][RE] = c[RE]*a[i][RE] - c[IM]*a[i][IM] + b[i][RE]; - a[i][IM] = c[RE]*a[i][IM] + c[IM]*tmp + b[i][IM]; - (*inprod) += a[i][RE]*a[i][RE] + a[i][IM]*a[i][IM]; // *inprod=|a|^2 - } - MyInnerProduct(inprod,double_type,1,timing); - } + for (i=0;i<nlocalRows;++i) { + tmp=a[i][RE]; /* a=c*a+b */ + a[i][RE] = c[RE]*a[i][RE] - c[IM]*a[i][IM] + b[i][RE]; + a[i][IM] = c[RE]*a[i][IM] + c[IM]*tmp + b[i][IM]; + (*inprod) += a[i][RE]*a[i][RE] + a[i][IM]*a[i][IM]; /* *inprod=|a|^2 */ + } + MyInnerProduct(inprod,double_type,1,timing); + } } -//============================================================ +/*============================================================*/ -void nLinComb_cmplx(doublecomplex *a,doublecomplex *b,doublecomplex *c,const doublecomplex c1, - const doublecomplex c2,double *inprod,TIME_TYPE *timing) -// a=c1*b+c2*c, inprod=|a|^2 +void nLinComb_cmplx(doublecomplex *a,doublecomplex *b,doublecomplex *c, + const doublecomplex c1,const doublecomplex c2,double *inprod,TIME_TYPE *timing) + /* a=c1*b+c2*c, inprod=|a|^2 */ { - size_t i; + size_t i; - if (inprod==NULL) { + if (inprod==NULL) { #pragma loop count (100000) #pragma ivdep - for (i=0;i<nlocalRows;++i) { - // a=c1*b+c2*c - a[i][RE] = c1[RE]*b[i][RE] - c1[IM]*b[i][IM] + c2[RE]*c[i][RE] - c2[IM]*c[i][IM]; - a[i][IM] = c1[RE]*b[i][IM] + c1[IM]*b[i][RE] + c2[RE]*c[i][IM] + c2[IM]*c[i][RE]; - } - } - else { - *inprod=0.0; + for (i=0;i<nlocalRows;++i) { + /* a=c1*b+c2*c */ + a[i][RE] = c1[RE]*b[i][RE] - c1[IM]*b[i][IM] + c2[RE]*c[i][RE] - c2[IM]*c[i][IM]; + a[i][IM] = c1[RE]*b[i][IM] + c1[IM]*b[i][RE] + c2[RE]*c[i][IM] + c2[IM]*c[i][RE]; + } + } + else { + *inprod=0.0; #pragma loop count (100000) #pragma ivdep - for (i=0;i<nlocalRows;++i) { - // a=c1*b+c2*c - a[i][RE] = c1[RE]*b[i][RE] - c1[IM]*b[i][IM] + c2[RE]*c[i][RE] - c2[IM]*c[i][IM]; - a[i][IM] = c1[RE]*b[i][IM] + c1[IM]*b[i][RE] + c2[RE]*c[i][IM] + c2[IM]*c[i][RE]; - (*inprod) += a[i][RE]*a[i][RE] + a[i][IM]*a[i][IM]; // *inprod=|a|^2 - } - MyInnerProduct(inprod,double_type,1,timing); - } + for (i=0;i<nlocalRows;++i) { + /* a=c1*b+c2*c */ + a[i][RE] = c1[RE]*b[i][RE] - c1[IM]*b[i][IM] + c2[RE]*c[i][RE] - c2[IM]*c[i][IM]; + a[i][IM] = c1[RE]*b[i][IM] + c1[IM]*b[i][RE] + c2[RE]*c[i][IM] + c2[IM]*c[i][RE]; + (*inprod) += a[i][RE]*a[i][RE] + a[i][IM]*a[i][IM]; /* *inprod=|a|^2 */ + } + MyInnerProduct(inprod,double_type,1,timing); + } } -//============================================================ +/*============================================================*/ -void nLinComb1_cmplx(doublecomplex *a,doublecomplex *b,doublecomplex *c,const doublecomplex c1, - double *inprod,TIME_TYPE *timing) -// a=c1*b+c, inprod=|a|^2 +void nLinComb1_cmplx(doublecomplex *a,doublecomplex *b,doublecomplex *c, + const doublecomplex c1,double *inprod,TIME_TYPE *timing) + /* a=c1*b+c, inprod=|a|^2 */ { - size_t i; + size_t i; - if (inprod==NULL) { + if (inprod==NULL) { #pragma loop count (100000) #pragma ivdep - for (i=0;i<nlocalRows;++i) { - // a=c1*b+c - a[i][RE] = c1[RE]*b[i][RE] - c1[IM]*b[i][IM] + c[i][RE]; - a[i][IM] = c1[RE]*b[i][IM] + c1[IM]*b[i][RE] + c[i][IM]; - } - } - else { - *inprod=0.0; + for (i=0;i<nlocalRows;++i) { + /* a=c1*b+c */ + a[i][RE] = c1[RE]*b[i][RE] - c1[IM]*b[i][IM] + c[i][RE]; + a[i][IM] = c1[RE]*b[i][IM] + c1[IM]*b[i][RE] + c[i][IM]; + } + } + else { + *inprod=0.0; #pragma loop count (100000) #pragma ivdep - for (i=0;i<nlocalRows;++i) { - // a=c1*b+c - a[i][RE] = c1[RE]*b[i][RE] - c1[IM]*b[i][IM] + c[i][RE]; - a[i][IM] = c1[RE]*b[i][IM] + c1[IM]*b[i][RE] + c[i][IM]; - (*inprod) += a[i][RE]*a[i][RE] + a[i][IM]*a[i][IM]; // *inprod=|a|^2 - } - MyInnerProduct(inprod,double_type,1,timing); - } + for (i=0;i<nlocalRows;++i) { + /* a=c1*b+c */ + a[i][RE] = c1[RE]*b[i][RE] - c1[IM]*b[i][IM] + c[i][RE]; + a[i][IM] = c1[RE]*b[i][IM] + c1[IM]*b[i][RE] + c[i][IM]; + (*inprod) += a[i][RE]*a[i][RE] + a[i][IM]*a[i][IM]; /* *inprod=|a|^2 */ + } + MyInnerProduct(inprod,double_type,1,timing); + } } -//============================================================ +/*============================================================*/ void nSubtr(doublecomplex *a,doublecomplex *b,doublecomplex *c,double *inprod,TIME_TYPE *timing) -// a=b-c, inprod=|a|^2 + /* a=b-c, inprod=|a|^2 */ { - size_t i; + size_t i; - if (inprod==NULL) { + if (inprod==NULL) { #pragma loop count (100000) #pragma ivdep - for (i=0;i<nlocalRows;++i) { - a[i][RE] = b[i][RE] - c[i][RE]; // a=b-c - a[i][IM] = b[i][IM] - c[i][IM]; - } - } - else { - *inprod=0.0; + for (i=0;i<nlocalRows;++i) { + a[i][RE] = b[i][RE] - c[i][RE]; /* a=b-c */ + a[i][IM] = b[i][IM] - c[i][IM]; + } + } + else { + *inprod=0.0; #pragma loop count (100000) #pragma ivdep - for (i=0;i<nlocalRows;++i) { - a[i][RE] = b[i][RE] - c[i][RE]; // a=b-c - a[i][IM] = b[i][IM] - c[i][IM]; - (*inprod) += a[i][RE]*a[i][RE] + a[i][IM]*a[i][IM]; // *inprod=|a|^2 - } - MyInnerProduct(inprod,double_type,1,timing); - } + for (i=0;i<nlocalRows;++i) { + a[i][RE] = b[i][RE] - c[i][RE]; /* a=b-c */ + a[i][IM] = b[i][IM] - c[i][IM]; + (*inprod) += a[i][RE]*a[i][RE] + a[i][IM]*a[i][IM]; /* *inprod=|a|^2 */ + } + MyInnerProduct(inprod,double_type,1,timing); + } } -//============================================================ +/*============================================================*/ void nMult_cmplx(doublecomplex *a,doublecomplex *b,const doublecomplex c) -// multiply vector by a complex constant; a=c*b + /* multiply vector by a complex constant; a=c*b */ { - size_t i; + size_t i; #pragma loop count (100000) #pragma ivdep - for (i=0;i<nlocalRows;++i) { - a[i][RE] = c[RE]*b[i][RE] - c[IM]*b[i][IM]; // a[i]=c*b[i] - a[i][IM] = c[RE]*b[i][IM] + c[IM]*b[i][RE]; - } + for (i=0;i<nlocalRows;++i) { + a[i][RE] = c[RE]*b[i][RE] - c[IM]*b[i][IM]; /* a[i]=c*b[i] */ + a[i][IM] = c[RE]*b[i][IM] + c[IM]*b[i][RE]; + } } -//============================================================ +/*============================================================*/ void nMultSelf_cmplx(doublecomplex *a,const doublecomplex c) -// multiply vector by a complex constant; a*=c + /* multiply vector by a complex constant; a*=c */ { - size_t i; - double tmp; + size_t i; + double tmp; #pragma loop count (100000) #pragma ivdep - for (i=0;i<nlocalRows;++i) { - tmp=a[i][RE]; - a[i][RE] = c[RE]*a[i][RE] - c[IM]*a[i][IM]; // a[i]*=c - a[i][IM] = c[RE]*a[i][IM] + c[IM]*tmp; - } + for (i=0;i<nlocalRows;++i) { + tmp=a[i][RE]; + a[i][RE] = c[RE]*a[i][RE] - c[IM]*a[i][IM]; /* a[i]*=c */ + a[i][IM] = c[RE]*a[i][IM] + c[IM]*tmp; + } } -//============================================================ +/*============================================================*/ void nMult_mat(doublecomplex *a,doublecomplex *b,doublecomplex c[][3]) -// multiply by a function of material of a dipole and component; a[3*i+j]=c[mat[i]][j]*b[3*i+j] + /* multiply by a function of material of a dipole and component; + a[3*i+j]=c[mat[i]][j]*b[3*i+j] */ { - size_t i,k; - int j; - doublecomplex *val; + size_t i,k; + int j; + doublecomplex *val; - k=0; + k=0; #pragma loop count (100000) #pragma ivdep - for (i=0;i<local_nvoid_Ndip;++i) { - val=c[material[i]]; - for (j=0;j<3;j++) { - a[k][RE] = val[j][RE]*b[k][RE] - val[j][IM]*b[k][IM]; - a[k][IM] = val[j][RE]*b[k][IM] + val[j][IM]*b[k][RE]; - k++; - } - } + for (i=0;i<local_nvoid_Ndip;++i) { + val=c[material[i]]; + for (j=0;j<3;j++) { + a[k][RE] = val[j][RE]*b[k][RE] - val[j][IM]*b[k][IM]; + a[k][IM] = val[j][RE]*b[k][IM] + val[j][IM]*b[k][RE]; + k++; + } + } } -//============================================================ +/*============================================================*/ void nMultSelf_mat(doublecomplex *a,doublecomplex c[][3]) -// multiply by a function of material of a dipole and component; a[3*i+j]*=c[mat[i]][j] + /* multiply by a function of material of a dipole and component; + a[3*i+j]*=c[mat[i]][j] */ { - size_t i,k; - int j; - double tmp; - doublecomplex *val; + size_t i,k; + int j; + double tmp; + doublecomplex *val; - k=0; + k=0; #pragma loop count (100000) #pragma ivdep - for (i=0;i<local_nvoid_Ndip;++i) { - val=c[material[i]]; - for (j=0;j<3;j++) { - tmp=a[k][RE]; - a[k][RE] = val[j][RE]*a[k][RE] - val[j][IM]*a[k][IM]; - a[k][IM] = val[j][RE]*a[k][IM] + val[j][IM]*tmp; - k++; - } - } + for (i=0;i<local_nvoid_Ndip;++i) { + val=c[material[i]]; + for (j=0;j<3;j++) { + tmp=a[k][RE]; + a[k][RE] = val[j][RE]*a[k][RE] - val[j][IM]*a[k][IM]; + a[k][IM] = val[j][RE]*a[k][IM] + val[j][IM]*tmp; + k++; + } + } } -//============================================================ +/*============================================================*/ void nConj(doublecomplex *a) -// complex conjugate of the vector + /* complex conjugate of the vector */ { - size_t i; + size_t i; #pragma loop count (100000) #pragma ivdep - for (i=0;i<nlocalRows;++i) a[i][IM]=-a[i][IM]; + for (i=0;i<nlocalRows;++i) a[i][IM]=-a[i][IM]; } diff --git a/src/linalg.h b/src/linalg.h index 391bc1ae..b1f418d1 100644 --- a/src/linalg.h +++ b/src/linalg.h @@ -3,15 +3,15 @@ * DESCR: Definitions for linear algebra operations on large vectors * see source (linalg.c) for description * - * Copyright (C) 2006,2008 University of Amsterdam + * Copyright (C) 2006 University of Amsterdam * This code is covered by the GNU General Public License. */ #ifndef __linalg_h #define __linalg_h -#include "types.h" // for doublecomplex -#include "function.h" // for function attributes -#include "timing.h" // for TIME_TYPE +#include "types.h" /* for doublecomplex */ +#include "function.h" /* for function attributes */ +#include "timing.h" /* for TIME_TYPE */ void nInit(doublecomplex *a); void nCopy(doublecomplex *a,doublecomplex *b); @@ -45,4 +45,4 @@ void nMult_mat(doublecomplex *a,doublecomplex *b,doublecomplex c[][3]); void nMultSelf_mat(doublecomplex *a,doublecomplex c[][3]); void nConj(doublecomplex *a); -#endif // __linalg_h +#endif /* __linalg_h */ diff --git a/src/make_mpi b/src/make_mpi index c0e8b98e..bcf7424b 100644 --- a/src/make_mpi +++ b/src/make_mpi @@ -2,22 +2,38 @@ # Most options are defined in Makefile # AUTH: Maxim Yurkin -#=============================================================================== +#========================================================== # !!! Start of control section. Flags and options here are designed to be # modified by user to choose particular options for compilation. However, the # default set of options may work out-of-box on some systems. -#=============================================================================== +#========================================================== # Either use compiler directly or use MPI wrapper #MPICC = $(CC) MPICC = mpicc # Compiler dependent options -# These are options for a particular Alpha system +ifeq ($(COMPILER),gnu) +endif +ifeq ($(COMPILER),intel9.x) + # this may help to eliminate some problems with static linking using + # Intel MPICH, but this library may be not present in some installations + LDLIBS += -lifcore +endif +ifeq ($(COMPILER),intel9.x_ns) +endif +ifeq ($(COMPILER),intel8.1) + # this may help to eliminate some problems with static linking using + # Intel MPICH, but this library may be not present in some installations + LDLIBS += -lifcore +endif +# This are options for a particular Alpha system ifeq ($(COMPILER),compaq) MPICC = cc LDLIBS += -lmpi -lelan endif +ifeq ($(COMPILER),other) +endif # If the compiler is used directly, few additional options are needed ifeq ($(MPICC),$(CC)) @@ -66,7 +82,7 @@ $(LASTMPI): # we assume that each Fortran file is completely independent $(CDEPEND): %.d: %.c $(MFILES) - $(MPICC) $(DEPFLAG) $(CFLAGS) $< $(DFFLAG) $@.$$$$; \ + $(MPICC) $(DEPFLAG) $(CFLAGS) $< > $@.$$$$; \ sed 's,\($*\)\.o[ :]*,\1.o $@ : ,g' < $@.$$$$ > $@; \ rm -f $@.$$$$ diff --git a/src/make_particle.c b/src/make_particle.c index e49c387a..62304aa7 100644 --- a/src/make_particle.c +++ b/src/make_particle.c @@ -16,8 +16,6 @@ * (not used now) * ----------------------------------------------------------- * Shapes 'capsule' and 'egg' are implemented by Daniel Hahn and Richard Joseph. - * ----------------------------------------------------------- - * Shape 'axisymmetric' is based on the code by Konstantin Gilev * * Currently is developed by Maxim Yurkin * @@ -27,9 +25,8 @@ #include <stdlib.h> #include <math.h> #include <string.h> -#include <time.h> // for time and clock (used for random seed) +#include <time.h> /* for time and clock (used for random seed) */ #include <limits.h> -#include <stdbool.h> #include "vars.h" #include "const.h" #include "cmplx.h" @@ -42,16 +39,16 @@ #include "timing.h" #include "mt19937ar.h" -// SEMI-GLOBAL VARIABLES +/* SEMI-GLOBAL VARIABLES */ -// defined and initialized in param.c +/* defined and initialized in param.c */ extern const int shape,sh_Npars; extern const double sh_pars[]; extern const int sym_type; extern const double lambda; extern double sizeX,dpl,a_eq; extern const int jagged; -extern const char shape_fname[]; +extern const char aggregate_file[]; extern char shapename[]; extern char save_geom_fname[]; extern const int volcor,save_geom; @@ -60,1119 +57,825 @@ extern const double gr_vf; extern double gr_d; extern const int gr_mat; extern int sg_format; -extern int store_grans; -// defined and initialized in timing.c +/* defined and initialized in timing.c */ extern TIME_TYPE Timing_Particle,Timing_Granul,Timing_Granul_comm; -// used in param.c -int volcor_used; // volume correction was actually employed -char sh_form_str[MAX_PARAGRAPH]; // string for log file with shape parameters -size_t gr_N; // number of granules -double gr_vf_real; // actual granules volume fraction -double mat_count[MAX_NMAT+1]; // number of dipoles in each domain - -// LOCAL VARIABLES - -static const char geom_format[]="%d %d %d\n"; // format of the geom file -static const char geom_format_ext[]="%d %d %d %d\n"; // extended format of the geom file -/* C99 allows use of %zu for size_t variables, but this is not supported by MinGW due to dependence - * on Microsoft libraries - */ -static const char ddscat_format[]="%ld %d %d %d %d %d %d\n";// DDSCAT shape format (FRMFIL) -// ratio of scatterer volume to enclosing cube; used for dpl correction and initialization by a_eq -static double volume_ratio; -static double Ndip; // total number of dipoles (in a circumscribing cube) -static double dpl_def; // default value of dpl -static int minX,minY,minZ; // minimum values of dipole positions in dipole file -static FILE *dipfile; // handle of dipole file -static int read_format; // format of dipole file, which is read -static char linebuf[BUF_LINE]; // buffer for reading lines from dipole file -double cX,cY,cZ; // center for DipoleCoord, it is sometimes used in PlaceGranules -// shape parameters +/* used in param.c */ +int volcor_used; /* volume correction was actually employed */ +char sh_form_str[MAX_PARAGRAPH]; /* string for log file with shape parameters */ +size_t gr_N; /* number of granules */ +double gr_vf_real; /* actual granules volume fraction */ +double mat_count[MAX_NMAT+1]; /* number of dipoles in each domain */ + +/* LOCAL VARIABLES */ + +static const char geom_format[]="%d %d %d\n"; /* format of the geom file */ +static const char geom_format_ext[]="%d %d %d %d\n"; /* extended format of the geom file */ +static const char ddscat_format[]="%d %d %d %d %d %d %d\n";/* ddscat shape format (FRMFIL) */ +static double volume_ratio; /* ratio of scatterer volume to enclosing cube; + used for dpl correction and initialization by a_eq */ +static double Ndip; /* total number of dipoles (in a circumscribing cube) */ +static double dpl_def; /* default value of dpl */ +static int minX,minY,minZ; /* minimum values of dipole positions in dipole file */ +static FILE *dipfile; /* handle of dipole file */ +static int read_format; /* format of dipole file, which is read */ +static char linebuf[BUF_LINE]; /* buffer for reading lines from dipole file */ +/* shape parameters */ static double coat_ratio,coat_x,coat_y,coat_z,coat_r2; -static double ad2,egnu,egeps,egz0; // for egg +static double ad2,egnu,egeps,egz0; /* for egg */ static double hdratio,invsqY,invsqZ,haspY,haspZ; -static double P,Q,R,S; // for RBC -// for axisymmetric; all coordinates defined here are relative -static double *contSegRoMin,*contSegRoMax,*contRo,*contZ; -static double contCurRo, contCurZ, contRoSqMin; -static int contNseg; -struct segment { - bool single; // whether segment consists of a single joint - int first; // index of the first point in the segment - int last; // index of the last point in the segment - double zmin; // minimum z-coordinate of the segment points - double zmax; // maximum z-coordinate of the segment points - double romid; // ro-coordinate of the point in the middle - struct segment *left; // pointer to left subsegment - struct segment *right; // pointer to right subsegment - double slope; // only for single; (z[i+1]-z[i])/(ro[i+1]-ro[i]) - double add; // only for single; ro[i](1-slope); -}; -struct segment *contSeg; +static double P,Q,R,S; /* for RBC */ /* TO ADD NEW SHAPE - * Add here all internal variables (aspect ratios, etc.), which you initialize in InitShape() - * and use in MakeParticle() afterwards. If you need local, intermediate variables, put them into - * the beginning of the corresponding function. - * Add descriptive comments, use 'static'. - */ + Add here all internal variables (aspect ratios, etc.), which you initialize in InitShape() + and use in MakeParticle() afterwards. If you need local, intermediate variables, put them into + the beginning of the corresponding function. + Add descriptive comments, use 'static'. */ -// temporary arrays before their real counterparts are allocated +/* temporary arrays before their real counterparts are allocated */ static unsigned char *material_tmp; static double *DipoleCoord_tmp; static unsigned short *position_tmp; -//============================================================ +/*============================================================*/ static void SaveGeometry(void) -// saves dipole configuration to .geom file + /* saves dipole configuration to .geom file */ { - char fname[MAX_FNAME]; - FILE *geom; - size_t i,j; - int mat; - - // create save_geom_fname if not specified - if (save_geom_fname[0]==0) sprintf(save_geom_fname,"%s.geom",shapename); - // automatically change format if needed - if (sg_format==SF_TEXT && Nmat>1) sg_format=SF_TEXT_EXT; - // choose filename + char fname[MAX_FNAME]; + FILE *geom; + size_t i,j; + int mat; + + /* create save_geom_fname if not specified */ + if (save_geom_fname[0]==0) + sprintf(save_geom_fname,"%s.geom",shapename); + /* automatically change format if needed */ + if (sg_format==SF_TEXT && Nmat>1) sg_format=SF_TEXT_EXT; + /* choose filename */ #ifdef PARALLEL - sprintf(fname,"%s/" F_GEOM_TMP,directory,ringid); + sprintf(fname,"%s/" F_GEOM_TMP,directory,ringid); #else - sprintf(fname,"%s/%s",directory,save_geom_fname); + sprintf(fname,"%s/%s",directory,save_geom_fname); #endif - geom=FOpenErr(fname,"w",ALL_POS); - // print head of file + geom=FOpenErr(fname,"w",ALL_POS); + /* print head of file */ #ifdef PARALLEL - if (ringid==0) { // this condition can be different from being ROOT + if (ringid==0) { /* this condition can be different from being ROOT */ #endif - if (sg_format==SF_TEXT || sg_format==SF_TEXT_EXT) { - fprintf(geom,"#generated by ADDA v." ADDA_VERSION "\n" - "#shape: '%s'\n" - "#box size: %dx%dx%d\n",shapename,boxX,boxY,boxZ); - if (sg_format==SF_TEXT_EXT) fprintf(geom,"Nmat=%d\n",Nmat); - } - else if (sg_format==SF_DDSCAT) - fprintf(geom,"shape: '%s'; box size: %dx%dx%d; generated by ADDA v." ADDA_VERSION "\n" - "%0.f = NAT\n" - "1 0 0 = A_1 vector\n" - "0 1 0 = A_2 vector\n" - "1 1 1 = lattice spacings (d_x,d_y,d_z)/d\n" - "JA IX IY IZ ICOMP(x,y,z)\n",shapename,boxX,boxY,boxZ,nvoid_Ndip); + if (sg_format==SF_TEXT || sg_format==SF_TEXT_EXT) { + fprintf(geom,"#generated by ADDA v." ADDA_VERSION "\n"\ + "#shape: '%s'\n"\ + "#box size: %dx%dx%d\n",shapename,boxX,boxY,boxZ); + if (sg_format==SF_TEXT_EXT) fprintf(geom,"Nmat=%d\n",Nmat); + } + else if (sg_format==SF_DDSCAT) + fprintf(geom,"shape: '%s'; box size: %dx%dx%d; generated by ADDA v." ADDA_VERSION "\n"\ + "%0.f = NAT\n"\ + "1 0 0 = A_1 vector\n"\ + "0 1 0 = A_2 vector\n"\ + "1 1 1 = lattice spacings (d_x,d_y,d_z)/d\n"\ + "JA IX IY IZ ICOMP(x,y,z)\n",shapename,boxX,boxY,boxZ,nvoid_Ndip); #ifdef PARALLEL - } // end of if + } /* end of if */ #endif - // save geometry - if (sg_format==SF_TEXT) for(i=0;i<local_nvoid_Ndip;i++) { - j=3*i; - fprintf(geom,geom_format,position[j],position[j+1],position[j+2]); - } - else if (sg_format==SF_TEXT_EXT) for(i=0;i<local_nvoid_Ndip;i++) { - j=3*i; - fprintf(geom,geom_format_ext,position[j],position[j+1],position[j+2],material[i]+1); - } - else if (sg_format==SF_DDSCAT) for(i=0;i<local_nvoid_Ndip;i++) { - j=3*i; - mat=material[i]+1; - fprintf(geom,ddscat_format,(long)(i+1),position[j],position[j+1],position[j+2],mat,mat,mat); - /* conversion to long is needed (to remove warnings) because %z printf - * argument is not yet supported by all target compiler environments - */ - } - FCloseErr(geom,fname,ALL_POS); + /* save geometry */ + if (sg_format==SF_TEXT) for(i=0;i<local_nvoid_Ndip;i++) { + j=3*i; + fprintf(geom,geom_format,position[j],position[j+1],position[j+2]); + } + else if (sg_format==SF_TEXT_EXT) for(i=0;i<local_nvoid_Ndip;i++) { + j=3*i; + fprintf(geom,geom_format_ext,position[j],position[j+1],position[j+2],material[i]+1); + } + else if (sg_format==SF_DDSCAT) for(i=0;i<local_nvoid_Ndip;i++) { + j=3*i; + mat=material[i]+1; + fprintf(geom,ddscat_format,i+1,position[j],position[j+1],position[j+2],mat,mat,mat); + } + FCloseErr(geom,fname,ALL_POS); #ifdef PARALLEL - // wait for all processes to save their part of geometry - Synchronize(); - // combine all files into one and clean - if (ringid==ROOT) CatNFiles(directory,F_GEOM_TMP,save_geom_fname); + /* wait for all processes to save their part of geometry */ + Synchronize(); + /* combine all files into one and clean */ + if (ringid==ROOT) CatNFiles(directory,F_GEOM_TMP,save_geom_fname); #endif - PRINTZ("Geometry saved to file\n"); + PRINTZ("Geometry saved to file\n"); } -//=========================================================== +/*===========================================================*/ INLINE void SkipFullLine(FILE* file) -// skips full line in the file, starting from current position; it uses predefined buffer 'linebuf' + /* skips full line in the file, starting from current position; + it uses predefined buffer 'linebuf' */ { - do fgets(linebuf,BUF_LINE,file); while (strchr(linebuf,'\n')==NULL && !feof(file)); + do fgets(linebuf,BUF_LINE,file); while (strchr(linebuf,'\n')==NULL && !feof(file)); } -//=========================================================== +/*===========================================================*/ INLINE char *FgetsError(FILE* file,const char *fname,int *line,const char *s_fname,const int s_line) -/* calls fgets, checks for errors and increments line number; s_fname and s_line are source fname - * and line number to be shown in error message; result is stored in predefined buffer 'linebuf'. - */ + /* calls fgets, checks for errors and increments line number; s_fname and s_line are + source fname and line number to be shown in error message; + result is stored in predefined buffer 'linebuf' */ { - char *res; - - res=fgets(linebuf,BUF_LINE,file); - if (res!=NULL) { - (*line)++; - if (strchr(linebuf,'\n')==NULL && !feof(file)) LogError(EC_ERROR,ONE,s_fname,s_line, - "Buffer overflow while scanning lines in file '%s' (size of line %d > %d)", - fname,*line,BUF_LINE-1); - } - return res; + char *res; + + res=fgets(linebuf,BUF_LINE,file); + if (res!=NULL) { + (*line)++; + if (strchr(linebuf,'\n')==NULL && !feof(file)) LogError(EC_ERROR,ONE,s_fname,s_line, + "Buffer overflow while scanning lines in file '%s' (size of line %d > %d)", + fname,*line,BUF_LINE-1); + } + return res; } -//=========================================================== +/*===========================================================*/ INLINE void SkipNLines(FILE *file,int n) -// skips n lines from the file starting from current position in a file + /* skips n lines from the file starting from current position in a file */ { - while (n>0) { - SkipFullLine(file); - n--; - } + while (n>0) { + SkipFullLine(file); + n--; + } } -//=========================================================== +/*===========================================================*/ static int SkipComments(FILE *file) -/* skips comments (#...), all lines, starting from current position in a file. - * returns number of lines skipped - */ + /* skips comments (#...), all lines, starting from current position in a file + returns number of lines skipped */ { - int lines=0,ch; + int lines=0,ch; - while ((ch=fgetc(file))=='#') { - SkipFullLine(file); - lines++; - } - if (ch!=EOF) ungetc(ch,file); + while ((ch=fgetc(file))=='#') { + SkipFullLine(file); + lines++; + } + if (ch!=EOF) ungetc(ch,file); - return lines; + return lines; } -//=========================================================== -#define DDSCAT_HL 6 // number of header lines in DDSCAT format +/*===========================================================*/ +#define DDSCAT_HL 6 /* number of header lines in DDSCAT format */ -static void InitDipFile(const char *fname,int *bX,int *bY,int *bZ,int *Nm) -/* read dipole file first to determine box sizes and Nmat; input is not checked for very large - * numbers (integer overflows) to increase speed; this function opens file for reading, the file is - * closed in ReadDipFile. - */ +static void InitDipFile(const char *fname,int * bX,int *bY,int *bZ,int *Nm) + /* read dipole file first to determine box sizes and Nmat; + input is not checked for very large numbers (integer overflows) to increase speed + this funstion opens file for reading, the file is closed in ReadDipFile */ { - int x,y,z,mat,line,scanned,mustbe,skiplines,anis_warned; - long tl; // dumb variable - int t2,t3; // dumb variables - int maxX,maxY,maxZ,maxN; - char formtext[MAX_LINE]; - - dipfile=FOpenErr(fname,"r",ALL_POS); - read_format=UNDEF; - /* test for DDSCAT format; in not-DDSCAT format, the line scanned below may be a long comment; - * therefore we first skip all comments - */ - line=SkipComments(dipfile); - if (line<=DDSCAT_HL) { - SkipNLines(dipfile,DDSCAT_HL-line); - if (FgetsError(dipfile,fname,&line,POSIT)!=NULL - && sscanf(linebuf,ddscat_format,&tl,&x,&y,&z,&mat,&t2,&t3)==7) { - read_format=SF_DDSCAT; - strcpy(formtext,"DDSCAT format (FRMFIL)"); - mustbe=7; - line=DDSCAT_HL; - fseek(dipfile,0,SEEK_SET); - SkipNLines(dipfile,line); - } - } - // if format is not yet determined, test for ADDA text formats - if (read_format==UNDEF) { - fseek(dipfile,0,SEEK_SET); - line=SkipComments(dipfile); - /* scanf and analyze Nmat; if there is blank line between comments and Nmat, it fails later; - * the value of Nmat obtained here is not actually relevant, the main factor is maximum - * domain number among all dipoles. - */ - scanned=fscanf(dipfile,"Nmat=%d\n",Nm); - if (scanned==EOF) LogError(EC_ERROR,ONE_POS,"No dipole positions are found in %s",fname); - else if (scanned==0) { // no "Nmat=..." - read_format=SF_TEXT; - strcpy(formtext,"ADDA text format (single domain)"); - *Nm=1; - mustbe=3; - } - else { // "Nmat=..." present - read_format=SF_TEXT_EXT; - strcpy(formtext,"ADDA text format (multi-domain)"); - mustbe=4; - line++; - } - } - // scan main part of the file - skiplines=line; - maxX=maxY=maxZ=INT_MIN; - minX=minY=minZ=INT_MAX; - maxN=1; - anis_warned=FALSE; - // reading is performed in lines - while(FgetsError(dipfile,fname,&line,POSIT)!=NULL) { - // scan numbers in a line - if (read_format==SF_TEXT) scanned=sscanf(linebuf,geom_format,&x,&y,&z); - else if (read_format==SF_TEXT_EXT) scanned=sscanf(linebuf,geom_format_ext,&x,&y,&z,&mat); - // for ddscat format, only first material is used, other two are ignored - else if (read_format==SF_DDSCAT) { - scanned=sscanf(linebuf,ddscat_format,&tl,&x,&y,&z,&mat,&t2,&t3); - if (!anis_warned && (t2!=mat || t3!=mat)) { - LogError(EC_WARN,ONE_POS,"Anisotropic dipoles are detected in file %s (first on " - "line %d). ADDA ignores this anisotropy, using only the identifier of " - "x-component of refractive index as domain number",fname,line); - anis_warned=TRUE; - } - } - // if sscanf returns EOF, that is a blank line -> just skip - if (scanned!=EOF) { - if (scanned!=mustbe) // this in most cases indicates wrong format - LogError(EC_ERROR,ONE_POS,"%s was detected, but error occurred during scanning " - "of line %d from dipole file %s",formtext,line,fname); - if (read_format!=SF_TEXT) { - if (mat<=0) LogError(EC_ERROR,ONE_POS,"%s was detected, but nonpositive material " - "number (%d) encountered during scanning of line %d from dipole file %s", - formtext,mat,line,fname); - else if (mat>maxN) maxN=mat; - } - // update maxima and minima - if (x>maxX) maxX=x; - if (x<minX) minX=x; - if (y>maxY) maxY=y; - if (y<minY) minY=y; - if (z>maxZ) maxZ=z; - if (z<minZ) minZ=z; - } - } - if (read_format==SF_TEXT_EXT) { - if (*Nm!=maxN) LogError(EC_WARN,ONE_POS,"Nmat (%d), as given in %s, is not equal to the " - "maximum domain number (%d) among all specified dipoles; hence the former is " - "ignored",*Nm,fname,maxN); - } - *Nm=maxN; - // set grid (box) sizes - *bX=jagged*(maxX-minX+1); - *bY=jagged*(maxY-minY+1); - *bZ=jagged*(maxZ-minZ+1); - // not optimal way, but works more robustly when non-system EOL is used in data file - fseek(dipfile,0,SEEK_SET); - SkipNLines(dipfile,skiplines); + int x,y,z,mat,line,scanned,mustbe,skiplines,anis_warned; + int t1,t2,t3; /* dumb variables */ + int maxX,maxY,maxZ,maxN; + char formtext[MAX_LINE]; + + dipfile=FOpenErr(fname,"r",ALL_POS); + read_format=UNDEF; + /* test for ddscat format; in not-ddscat format, the line scanned below may be a long comment; + therefore we first skip all comments */ + line=SkipComments(dipfile); + if (line<=DDSCAT_HL) { + SkipNLines(dipfile,DDSCAT_HL-line); + if (FgetsError(dipfile,fname,&line,POSIT)!=NULL + && sscanf(linebuf,ddscat_format,&t1,&x,&y,&z,&mat,&t2,&t3)==7) { + read_format=SF_DDSCAT; + strcpy(formtext,"DDSCAT format (FRMFIL)"); + mustbe=7; + line=DDSCAT_HL; + fseek(dipfile,0,SEEK_SET); + SkipNLines(dipfile,line); + } + } + /* if format is not yet determined, test for ADDA text formats */ + if (read_format==UNDEF) { + fseek(dipfile,0,SEEK_SET); + line=SkipComments(dipfile); + /* scanf and analyze Nmat; if there is blank line between comments and Nmat, it fails later; + the value of Nmat obtained here is not actually relevant, the main factor in maximum domain + number among all dipoles */ + scanned=fscanf(dipfile,"Nmat=%d\n",Nm); + if (scanned==EOF) LogError(EC_ERROR,ONE_POS,"No dipole positions are found in %s",fname); + else if (scanned==0) { /* no "Nmat=..." */ + read_format=SF_TEXT; + strcpy(formtext,"ADDA text format (single domain)"); + *Nm=1; + mustbe=3; + } + else { /* "Nmat=..." present */ + read_format=SF_TEXT_EXT; + strcpy(formtext,"ADDA text format (multi-domain)"); + mustbe=4; + line++; + } + } + /* scan main part of the file */ + skiplines=line; + maxX=maxY=maxZ=INT_MIN; + minX=minY=minZ=INT_MAX; + maxN=1; + anis_warned=FALSE; + /* reading is performed in lines */ + while(FgetsError(dipfile,fname,&line,POSIT)!=NULL) { + /* scan numbers in a line */ + if (read_format==SF_TEXT) scanned=sscanf(linebuf,geom_format,&x,&y,&z); + else if (read_format==SF_TEXT_EXT) scanned=sscanf(linebuf,geom_format_ext,&x,&y,&z,&mat); + /* for ddscat format, only first material is used, other two are ignored */ + else if (read_format==SF_DDSCAT) { + scanned=sscanf(linebuf,ddscat_format,&t1,&x,&y,&z,&mat,&t2,&t3); + if (!anis_warned && (t2!=t1 || t3!=t1)) { + LogError(EC_WARN,ONE_POS,"Anisotropic dipoles are detected in file %s (first on line %d). "\ + "ADDA ignores this anisotropy, using only the identifier of x-component of refractive "\ + "index as domain number",fname,line); + anis_warned=TRUE; + } + } + /* if sscanf returns EOF, that is a blank line -> just skip */ + if (scanned!=EOF) { + if (scanned!=mustbe) /* this in most cases indicates wrong format */ + LogError(EC_ERROR,ONE_POS,"%s was detected, but error occured during scaning of line %d "\ + "from dipole file %s",formtext,line,fname); + if (read_format!=SF_TEXT) { + if (mat<=0) LogError(EC_ERROR,ONE_POS,"%s was detected, but nonpositive material number "\ + "(%d) encountered during scaning of line %d from dipole file %s",formtext,mat,line,fname); + else if (mat>maxN) maxN=mat; + } + /* update maximums and minimums */ + if (x>maxX) maxX=x; + if (x<minX) minX=x; + if (y>maxY) maxY=y; + if (y<minY) minY=y; + if (z>maxZ) maxZ=z; + if (z<minZ) minZ=z; + } + } + if (read_format==SF_TEXT_EXT) { + if (*Nm!=maxN) LogError(EC_WARN,ONE_POS,"Nmat (%d), as given in %s, is not equal to the "\ + "maximum domain number (%d) among all specified dipoles; hence the former is ignored", + *Nm,fname,maxN); + } + *Nm=maxN; + /* set grid (box) sizes */ + *bX=jagged*(maxX-minX+1); + *bY=jagged*(maxY-minY+1); + *bZ=jagged*(maxZ-minZ+1); + /* not optimal way, but works more robusty when non-system EOL is used in data file */ + fseek(dipfile,0,SEEK_SET); + SkipNLines(dipfile,skiplines); } -#undef DDSCAT_HL -//=========================================================== +#undef DDCCAT_HL +/*===========================================================*/ static void ReadDipFile(const char *fname) -/* read dipole file; no consistency checks are made since they are made in InitDipFile. - * the file is opened in InitDipFile; this function only closes the file. - */ + /* read dipole file; + no consistency checks are made since they are made in InitDipFile + the file is opened in InitDipFile; this funstion only closes the file */ { - int x,y,z,x0,y0,z0,mat,scanned; - long tl; // dumb variable - int t2,t3; // dumb variables - int index; - size_t boxXY,boxX_l; - - // to remove possible overflows - boxX_l=(size_t)boxX; - boxXY=boxX_l*boxY; - - mat=1; - while(fgets(linebuf,BUF_LINE,dipfile)!=NULL) { - // scan numbers in a line - if (read_format==SF_TEXT) scanned=sscanf(linebuf,geom_format,&x0,&y0,&z0); - else if (read_format==SF_TEXT_EXT) scanned=sscanf(linebuf,geom_format_ext,&x0,&y0,&z0,&mat); - else if (read_format==SF_DDSCAT) - scanned=sscanf(linebuf,ddscat_format,&tl,&x0,&y0,&z0,&mat,&t2,&t3); - // if sscanf returns EOF, that is a blank line -> just skip - if (scanned!=EOF) { - // shift dipole position to be nonnegative - x0-=minX; - y0-=minY; - z0-=minZ; - // initialize box jagged*jagged*jagged instead of one dipole - for (z=jagged*z0;z<jagged*(z0+1);z++) if (z>=local_z0 && z<local_z1_coer) - for (x=jagged*x0;x<jagged*(x0+1);x++) for (y=jagged*y0;y<jagged*(y0+1);y++) { - index=(z-local_z0)*boxXY+y*boxX_l+x; - material_tmp[index]=(unsigned char)(mat-1); - } - } - } - FCloseErr(dipfile,fname,ALL_POS); + int x,y,z,x0,y0,z0,mat,scanned; + int t1,t2,t3; /* dumb variables */ + int index; + size_t boxXY,boxX_l; + + /* to remove possible overflows */ + boxX_l=(size_t)boxX; + boxXY=boxX_l*boxY; + + mat=1; + while(fgets(linebuf,BUF_LINE,dipfile)!=NULL) { + /* scan numbers in a line */ + if (read_format==SF_TEXT) scanned=sscanf(linebuf,geom_format,&x0,&y0,&z0); + else if (read_format==SF_TEXT_EXT) scanned=sscanf(linebuf,geom_format_ext,&x0,&y0,&z0,&mat); + else if (read_format==SF_DDSCAT) + scanned=sscanf(linebuf,ddscat_format,&t1,&x0,&y0,&z0,&mat,&t2,&t3); + /* if sscanf returns EOF, that is a blank line -> just skip */ + if (scanned!=EOF) { + /* shift dipole position to be nonnegative */ + x0-=minX; + y0-=minY; + z0-=minZ; + /* initialize box jagged*jagged*jagged instead of one dipole */ + for (z=jagged*z0;z<jagged*(z0+1);z++) if (z>=local_z0 && z<local_z1_coer) + for (x=jagged*x0;x<jagged*(x0+1);x++) for (y=jagged*y0;y<jagged*(y0+1);y++) { + index=(z-local_z0)*boxXY+y*boxX_l+x; + material_tmp[index]=(unsigned char)(mat-1); + } + } + } + FCloseErr(dipfile,fname,ALL_POS); } -//========================================================== -#define ALLOCATE_SEGMENTS(N) (struct segment *)voidVector((N)*sizeof(struct segment),ALL_POS,\ - "contour segment"); - -void InitContourSegment(struct segment *seg,const bool increasing) -/* recursively initialize a segment of a contour: allocates memory and calculates all elements - * some elements are calculated during the forward sweep (from long segments to short), others - - * during the backward sweep. - * Recursive function calls incurs certain overhead, however here it is not critical. - */ -{ - int i; - struct segment *s1,*s2; +/*==========================================================*/ - /* Remove constant parts in the beginning and end of segment, if present. - * After this procedure first is guaranteed to be less than last by definition of the segment - */ - while (contRo[seg->first]==contRo[seg->first+1]) (seg->first)++; - while (contRo[seg->last-1]==contRo[seg->last]) (seg->last)--; - if (seg->first+1 == seg->last) { // segment with a single fragment - seg->single=true; - seg->zmin=MIN(contZ[seg->first],contZ[seg->last]); - seg->zmax=MAX(contZ[seg->first],contZ[seg->last]); - seg->slope=(contZ[seg->last]-contZ[seg->first])/(contRo[seg->last]-contRo[seg->first]); - seg->add=contZ[seg->first]-contRo[seg->first]*seg->slope; - } - else { // divide segment into two, and initialize each of them - seg->single=false; - i=(seg->first+seg->last)/2; - seg->romid=contRo[i]; - // construct subsegments - s1=ALLOCATE_SEGMENTS(1); - s2=ALLOCATE_SEGMENTS(1); - s1->first=seg->first; - s1->last=s2->first=i; - s2->last=seg->last; - // initialize subsegments - InitContourSegment(s1,increasing); - InitContourSegment(s2,increasing); - // calculate zmax and zmin - seg->zmax=MAX(s1->zmax,s2->zmax); - seg->zmin=MIN(s1->zmin,s2->zmin); - // assign new segments to left and right based on 'increasing' - if (increasing) { - seg->left=s1; - seg->right=s2; - } - else { - seg->left=s2; - seg->right=s1; - } - } -} - -//=========================================================== -#define CHUNK_SIZE 128 // how many numbers are allocated at once for adjustable arrays - -static void InitContour(const char *fname,double *ratio,double *shSize) -/* Reads a contour from the file, rotates it so that it starts from a local minimum in ro, - * then divides it into monotonic (over ro) segments. It produces data, which are later used to - * test each dipole for being inside the contour. Segments are either increasing or non-decreasing. - */ -{ - int line; // current line number - int nr; // number of contour points read from the file - int size; // current size of the allocated memory for contour - int i,j,scanned; - double *bufRo,*bufZ; // temporary buffers - int *index; - double ro,z,romin,romax,zmin,zmax,mult,zmid; - FILE* file; - bool increasing; - - D("InitContour has started"); - // Read contour from file - file=FOpenErr(fname,"r",ALL_POS); - line=SkipComments(file); - size=CHUNK_SIZE; - MALLOC_VECTOR(bufRo,double,size,ALL); - MALLOC_VECTOR(bufZ,double,size,ALL); - nr=0; - // reading is performed in lines - while(FgetsError(file,fname,&line,POSIT)!=NULL) { - // scan numbers in a line - scanned=sscanf(linebuf,"%lf %lf",&ro,&z); - // if sscanf returns EOF, that is a blank line -> just skip - if (scanned!=EOF) { - if (scanned!=2) // this in most cases indicates wrong format - LogError(EC_ERROR,ONE_POS,"Error occurred during scanning of line %d from contour " - "file %s",line,fname); - // check for consistency of input - if (ro<0) LogError(EC_ERROR,ONE_POS,"Negative ro-coordinate is found on line %d in " - "contour file %s",line,fname); - // update extreme values - if (nr==0) { - zmax=zmin=z; - romax=romin=ro; - } - else { - if (z>zmax) zmax=z; - if (z<zmin) zmin=z; - if (ro>romax) romax=ro; - if (ro<romin) romin=ro; - } - // add allocated memory to buf, if needed - if (nr >= size) { - size+=CHUNK_SIZE; - REALLOC_DVECTOR(bufRo,size,ALL); - REALLOC_DVECTOR(bufZ,size,ALL); - } - bufRo[nr]=ro; - bufZ[nr]=z; - nr++; - } - } - FCloseErr(file,fname,ALL_POS); - // Check number of points read - if (nr<3) LogError(EC_ERROR,ONE_POS, - "Contour from file %s contains less than three points",fname); - - // Determine initial point with local minimum ro[i-1]>=ro[i]<ro[i+1] - i=0; - while (i<nr-1 && bufRo[i]>=bufRo[i+1]) i++; - if (i==0) { // first point is a minimum candidate - if (bufRo[0]>bufRo[nr-1]) { // if required, search backwards; guaranteed to converge - i=nr-1; - while (bufRo[i]>bufRo[i-1]) i--; - } - } - // if the whole contour is non-decreasing, check for constancy - else if (i==nr-1 && bufRo[nr-1]==bufRo[0]) LogError(EC_ERROR,ONE_POS, - "Contour from file %s has zero area. Hence the scatterer is void",fname); - /* Construct working contour so that its first point = last and is a local minimum. It is done - * by rotating buf and adding one extra point. Then free the buffer. - */ - MALLOC_VECTOR(contRo,double,nr+1,ALL); - memcpy(contRo,bufRo+i,(nr-i)*sizeof(double)); - memcpy(contRo+nr-i,bufRo,i*sizeof(double)); - contRo[nr]=contRo[0]; - Free_general(bufRo); - // same for Z vectors - MALLOC_VECTOR(contZ,double,nr+1,ALL); - memcpy(contZ,bufZ+i,(nr-i)*sizeof(double)); - memcpy(contZ+nr-i,bufZ,i*sizeof(double)); - contZ[nr]=contZ[0]; - Free_general(bufZ); - // scale coordinates to be relative to total diameter, and centered (by z) around 0 - mult=1/(2*romax); - zmid=(zmax+zmin)/2; - *ratio=(zmax-zmin)*mult; - *shSize=2*romax; - contRoSqMin=romin*romin*mult*mult; - for (i=0;i<=nr;i++) { - contRo[i]*=mult; - contZ[i]=(contZ[i]-zmid)*mult; - } - - /* divide the contour into the segments; actually only the index is constructed marking end - * points of the segments - */ - MALLOC_VECTOR(index,int,nr+1,ALL); // this is enough, even if all segments are of one joint - index[0]=0; - i=j=1; - increasing=true; - while (i<nr) { - while (i<nr && (increasing == (contRo[i]<contRo[i+1]))) i++; - index[j]=i; - j++; - increasing=!increasing; - } - contNseg=j-1; - /* Calculate maximum and minimum ro for segments; - * We implicitly use that first segment is increasing, second - decreasing, and so on. - */ - MALLOC_VECTOR(contSegRoMin,double,contNseg,ALL); - MALLOC_VECTOR(contSegRoMax,double,contNseg,ALL); - for (j=0;j<contNseg;j++) { - if (IS_EVEN(j)) { - contSegRoMin[j]=contRo[index[j]]; - contSegRoMax[j]=contRo[index[j+1]]; - } - else { - contSegRoMin[j]=contRo[index[j+1]]; - contSegRoMax[j]=contRo[index[j]]; - } - } - - // Construct a tree of segments - contSeg=ALLOCATE_SEGMENTS(contNseg); - for (j=0;j<contNseg;j++) { - contSeg[j].first=index[j]; - contSeg[j].last=index[j+1]; - InitContourSegment(contSeg+j,IS_EVEN(j)); - } - - Free_general(index); - Free_general(contRo); - Free_general(contZ); - D("InitContour has finished"); - D2("Nseg=%d",contNseg); - D2("minroSq=%g",contRoSqMin); -} -#undef CHUNK_SIZE -#undef ALLOCATE_SEGMENTS - -//========================================================== - -bool CheckContourSegment(struct segment *seg) -/* Checks, whether point is under or above the segment, by traversing the tree of segments. - * It returns true, if intersecting z value is larger than given z, and false otherwise. Point is - * defined by local variables contCurRo and contCurZ. - */ -{ - while (true) { - if (contCurZ < seg->zmin) return true; - else if (contCurZ > seg->zmax) return false; - else if (seg->single) return (contCurZ < seg->add + contCurRo*seg->slope); - else seg=(contCurRo<seg->romid ? seg->left : seg->right); - } -} - -//========================================================== - -void FreeContourSegment(struct segment *seg) -/* recursively frees memory allocated for contour segments - * Recursive function calls incurs certain overhead, however here it is not critical. - */ -{ - if (!(seg->single)) { - FreeContourSegment(seg->left); - FreeContourSegment(seg->right); - } -} - -//========================================================== - -#define KEY_LENGTH 2 // length of key for initialization of random generator -#define MAX_ZERO_FITS 1E4 // maximum number of zero fits in a row (each - many granules) -#define MAX_FALSE_SKIP 10 // number of false skips in granule placement to complete the set -#define MAX_FALSE_SKIP_SMALL 10 // the same for small granules -#define MAX_GR_SET USHRT_MAX // maximum size of granule set -#define MIN_CELL_SIZE 4.0 // minimum cell size for small granules +#define KEY_LENGTH 2 /* length of key for initialization of random generator */ +#define MAX_ZERO_FITS 1E4 /* maximum number of zero fits in a row (each - many granules) */ +#define MAX_FALSE_SKIP 10 /* number of false skips in granule placement to complete the set */ +#define MAX_FALSE_SKIP_SMALL 10 /* the same for small granules */ +#define MAX_GR_SET USHRT_MAX /* maximum size of granule set */ +#define MIN_CELL_SIZE 4.0 /* minimum cell size for small granules */ INLINE int CheckCell(const double *gr,const double *vgran,const unsigned short *tree_index, const double Di2,const int start,int *fits) -// function that checks whether granule intersects anything in the cell + /* function that checks whether granule intersects anything in the cell */ { - int index,last,index1; - double t1,t2,t3; - - last=index=start; - while (index!=MAX_GR_SET && (*fits)) { - last=index; - index1=3*index; - t1=gr[0]-vgran[index1]; - t2=gr[1]-vgran[index1+1]; - t3=gr[2]-vgran[index1+2]; - if ((t1*t1+t2*t2+t3*t3)<Di2) *fits=FALSE; - index=tree_index[index]; - } - return last; + int index,last,index1; + double t1,t2,t3; + + last=index=start; + while (index!=MAX_GR_SET && (*fits)) { + last=index; + index1=3*index; + t1=gr[0]-vgran[index1]; + t2=gr[1]-vgran[index1+1]; + t3=gr[2]-vgran[index1+2]; + if ((t1*t1+t2*t2+t3*t3)<Di2) *fits=FALSE; + index=tree_index[index]; + } + return last; } -//========================================================== +/*==========================================================*/ static double PlaceGranules(void) -/* Randomly places granules inside the specified domain; - * Mersenne Twister is used for generating random numbers - * - * A simplest algorithm is used: to place randomly a sphere, and see if it overlaps with any - * dipoles (more exactly: centers of dipoles) of not correct domain; if not, accept it and - * fill all this dipoles with granules' domain. Optimized to perform in two steps: - * First it places of set of not-intersecting granules and do only a quick check against the - * "domain pattern" - coarse representation of the domain. On the second step granules of the - * whole set are thoroughly checked against the whole domain on each processor. When small - * granules are used, no domain pattern is used - makes it simpler. - * Intersection of two granules between the sets is checked only through dipoles, which is not - * exact, however it allows considering arbitrary complex domains, which is described only by - * a set of occupied dipoles. - * This algorithm is unsuitable for high volume fractions, it becomes very slow and for some - * volume fractions may fail at all (Metropolis algorithm should be more suitable, however it - * is hard to code for arbitrary domains). Moreover, statistical properties of the obtained - * granules distribution may be not perfect, however it seems good enough for our applications. - * - * Currently it is not working with jagged. That should be improved, by rewriting the jagged - * calculation throughout the program - */ + /* rangomly places granules inside the specified domain; + Mersenne Twister is used for generating random numbers + + A simplest algorithm is used: to place randomly a sphere, and see if it overlaps with any + dipoles (more exactly: centers of dipoles) of not correct domain; if not, accept it and + fill all this dipoles with granules' domain. Optimized to perform in two steps: + Firtst it places of set of not-intersecting granules and do only a quick ckeck against the + "domain pattern" - coarse representation of the domain. On the second step granules of the + whole set are thorougly checked against the whole domain on each processor. When small + granules are used, no domain pattern is used - makes it simpler. + Intersection of two granules between the sets is checked only through dipoles, which is not + exact, however it allows considering arbitrary complex domains, which is described only by + a set of occupied dipoles. + This algorithm is unsuitable for high volume fractions, it becomes very slow and for some + volume fractions may fail at all (Metropolis algorithm should be more suitable, however it + is hard to code for arbitrary domains). Moreover, statistical properties of the obtained + granules distribution may be not perfect, however it seems good enough for our applications. + + Currently it is not working with jagged. That should be improved, by rewriting the jagged + calculation throughout the program */ { - int i,j,k,zerofit,last; - size_t n,count,count_gr,false_count,ui; - size_t boxXY; - double nd; // number of dipoles occupied by granules - int index,index1,index2; // indices for dipole grid - int dom_index,dom_index1,dom_index2; // indices for auxiliary grid - int gX,gY,gZ; // auxiliary grid dimensions - size_t gXY,gr_gN; // ... and their products - size_t avail; // number of available (free) domain cells - int gX2,gY2,gZ2,locgZ2; - int i0,i1,j0,j1,k0,k1; - int fits; - int cur_Ngr,ig,max_Ngr; // number of granules in a current set, index, and maximum set size - double gdX,gdY,gdZ,gdXh,gdYh,gdZh; // auxiliary grid cell sizes and their halfs (h) - int locz0,locz1,locgZ,gr_locgN; - double R,R2,Di,Di2; // radius and diameter of granule, and their squares - double x0,x1,y0,y1,z0,z1; // where to put random number (inner box) - int id0,id1,jd0,jd1,kd0,kd1; // dipoles limit that fall inside inner box - int Nfit; // number of successfully placed granules in a current set - double overhead; // estimate of the overhead needed to have exactly needed N of granules - double tmp1,tmp2,tmp3,t1,t2,t3; - int sx,sy,sz; /* maximum shifts for checks of neighboring cells in auxiliary grid - * for 'small' it is the shift in index - */ - unsigned long key[KEY_LENGTH]; // key to initialize random number generator - unsigned char *dom; // information about the domain on a granule grid - unsigned short *occup; // information about the occupied cells - int sm_gr; // whether granules are small (then simpler algorithm is used) - unsigned short *tree_index; // index for traversing granules inside one cell (for small) - double *vgran; // coordinates of a set of granules - char *vfit; // results of granule fitting on the grid (boolean) - int *ginX,*ginY,*ginZ; // indices to find dipoles inside auxiliary grid - int indX,indY,indZ; // indices for doubled auxiliary grid - int bit; // bit position in char of 'dom' - double gr[3]; // coordinates of a single granule - FILE *file; // file for saving granule positions - char fname[MAX_FNAME]; // filename of file - - // prepare granule file for saving if needed - if (store_grans && ringid==ROOT) { - sprintf(fname,"%s/" F_GRANS,directory); - file=FOpenErr(fname,"w",ONE_POS); - fprintf(file,"#generated by ADDA v." ADDA_VERSION "\n" - "#granule diameter = %.10g\n",gr_d); - } - // set variables; consider jagged - Di=gr_d/(gridspace*jagged); - if (Di<1) LogError(EC_WARN,ONE_POS,"Granule diameter is smaller than dipole size. It is " - "recommended to increase resolution"); - R=Di/2; - R2=R*R; - Di2=4*R2; - boxXY=boxX*(size_t)boxY; - // inner box - if (Di>MIN(boxX,MIN(boxY,boxZ))) LogError(EC_WARN,ONE_POS, - "Granule size is larger than minimum particle dimension"); - x0=R-0.5; - x1=boxX-R-0.5; - y0=R-0.5; - y1=boxY-R-0.5; - z0=R-0.5; - z1=boxZ-R-0.5; - // initialize auxiliary grid - CheckOverflow(MAX(boxX,MAX(boxY,boxZ))*10/Di,ONE_POS,"PlaceGranules()"); - tmp1=sqrt(3)/Di; - gX=(int)ceil((x1-x0)*tmp1); - gdX=(x1-x0)/gX; - gY=(int)ceil((y1-y0)*tmp1); - gdY=(y1-y0)/gY; - gZ=(int)ceil((z1-z0)*tmp1); - gdZ=(z1-z0)/gZ; - sm_gr=(gdX<2 || gdY<2 || gdZ<2); // sets the discrimination for small or large granules - if (sm_gr) { - PRINTZ("Using algorithm for small granules\n"); - // redefine auxiliary grid - tmp1=1/MAX(2*Di,MIN_CELL_SIZE); - gX=(int)floor((x1-x0)*tmp1); - gdX=(x1-x0)/gX; - gY=(int)floor((y1-y0)*tmp1); - gdY=(y1-y0)/gY; - gZ=(int)floor((z1-z0)*tmp1); - gdZ=(z1-z0)/gZ; - } - else { - PRINTZ("Using algorithm for large granules\n"); - gX2=2*gX; - gdXh=gdX/2; - gY2=2*gY; - gdYh=gdY/2; - gZ2=2*gZ; - gdZh=gdZ/2; - /* this sets maximum distance of neighboring cells to check; condition gdX<R can only occur - * if gX<=7, which is quite rare, so no optimization is performed. sx>3 can only occur if - * gX<=2 and then it doesn't make sense to take bigger sx. Absolutely analogous for y, z. - */ - if (gdX<R) sx=3; - else sx=2; - if (gdY<R) sy=3; - else sy=2; - if (gdZ<R) sz=3; - else sz=2; - } - gXY=MultOverflow(gX,gY,ONE_POS,"PlaceGranules()"); - gr_gN=MultOverflow(gXY,gZ,ONE_POS,"PlaceGranules()"); - // calculate maximum number of granules in a grid; crude estimate - tmp2=(ceil((x1-x0)/Di)+1)*(ceil((y1-y0)/Di)+1)*(ceil((z1-z0)/Di)+1); - max_Ngr=MIN(MAX_GR_SET,tmp2); - // local z grid + initialize communications - SetGranulComm(z0,z1,gdZ,gZ,gXY,max_Ngr,&locz0,&locz1,sm_gr); - if (!sm_gr) { - locgZ=locz1-locz0; - locgZ2=2*locgZ; - gr_locgN=gXY*locgZ; - } - if (ringid==ROOT) { - // initialize random generator - key[0]=(unsigned long)time(NULL); - key[1]=(unsigned long)(clock()-wt_start); - init_by_array(key,KEY_LENGTH); - // allocate memory - MALLOC_VECTOR(occup,ushort,gr_gN,ONE); - if (sm_gr) MALLOC_VECTOR(tree_index,ushort,max_Ngr,ONE); - else MALLOC_VECTOR(dom,uchar,gr_gN,ALL); - } - else if (!sm_gr && locgZ!=0) MALLOC_VECTOR(dom,uchar,gr_locgN,ALL); - MALLOC_VECTOR(vgran,double,3*max_Ngr,ALL); - MALLOC_VECTOR(vfit,char,max_Ngr,ALL); - if (!sm_gr && locgZ!=0) { - // build some more indices - MALLOC_VECTOR(ginX,int,gX2+1,ALL); - MALLOC_VECTOR(ginY,int,gY2+1,ALL); - MALLOC_VECTOR(ginZ,int,locgZ2+1,ALL); - for (i=0;i<=gX2;i++) ginX[i]=(int)ceil(x0+i*gdXh); - id0=ginX[0]; - id1=ginX[gX2]; - for (i=0;i<=gY2;i++) ginY[i]=(int)ceil(y0+i*gdYh); - jd0=ginY[0]; - jd1=ginY[gZ2]; - for (i=0;i<=locgZ2;i++) ginZ[i]=(int)ceil(z0+(i+2*locz0)*gdZh); - kd0=MAX(ginZ[0],local_z0); - indZ=1; - if (kd0>=ginZ[1]) indZ++; - kd1=MIN(ginZ[locgZ2],local_z1_coer); - } - n=count=count_gr=false_count=0; - nd=0; - // crude estimate of the probability to place a small granule into domain - if (sm_gr) overhead=Ndip/mat_count[gr_mat]; - else overhead=1; - // main cycle - while (n<gr_N) { - if (sm_gr) { // small granules - // just generate granules - if (ringid==ROOT) { - cur_Ngr=MIN(ceil((gr_N-n)*overhead),max_Ngr); - // generate points and quick check - ig=false_count=0; - for (ui=0;ui<gr_gN;ui++) occup[ui]=MAX_GR_SET; // used as undefined - while (ig<cur_Ngr) { - count++; - false_count++; - fits=TRUE; - // random position in a grid - gr[0]=genrand(0,gX); - gr[1]=genrand(0,gY); - gr[2]=genrand(0,gZ); - // coordinates in a grid - t1=floor(gr[0]); - t2=floor(gr[1]); - t3=floor(gr[2]); - indX=(int)t1; - indY=(int)t2; - indZ=(int)t3; - t1=gr[0]-t1; // t_i are distances to the edges - t2=gr[1]-t2; - t3=gr[2]-t3; - // convert to usual coordinates (in dipole grid) - gr[0]=gr[0]*gdX+x0; - gr[1]=gr[1]*gdY+y0; - gr[2]=gr[2]*gdZ+z0; - index=indZ*gXY+indY*gX+indX; - last=CheckCell(gr,vgran,tree_index,Di2,occup[index],&fits); - // weird construction (7 inclosed 'if' structures) but should be fast - if (fits) { - // possible x-neighbor - t1*=gdX; // transform shifts to usual coordinates; done only when needed - sx=0; - if (t1<Di) { - if (indX!=0) sx=-1; - } - else if ((t1=gdX-t1)<Di && indX!=gX-1) sx=1; - if (sx!=0) CheckCell(gr,vgran,tree_index,Di2,occup[index+sx],&fits); - if (fits) { - // possible y-neighbor - t2*=gdY; - sy=0; - if (t2<Di) { - if (indY!=0) sy=-gX; - } - else if ((t2=gdY-t2)<Di && indY!=gY-1) sy=gX; - if (sy!=0) CheckCell(gr,vgran,tree_index,Di2,occup[index+sy],&fits); - if (fits) { - // possible z-neighbor - t3*=gdZ; - sz=0; - if (t3<Di) { - if (indZ!=0) sz=-(int)gXY; - } - else if ((t3=gdZ-t3)<Di && indZ!=gZ-1) sz=gXY; - if (sz!=0) CheckCell(gr,vgran,tree_index,Di2,occup[index+sz],&fits); - if (fits) { - // possible xy-neighbor - if (sx!=0 && sy!=0 && ((tmp1=t1*t1)+(tmp2=t2*t2)<Di2)) - CheckCell(gr,vgran,tree_index,Di2,occup[index+sx+sy],&fits); - if (fits) { - // possible xz-neighbor - if (sx!=0 && sz!=0 && ((tmp1+(tmp3=t3*t3))<Di2)) - CheckCell(gr,vgran,tree_index,Di2, - occup[index+sx+sz],&fits); - if (fits) { - // possible yz-neighbor & xyz-neighbor - if (sy!=0 && sz!=0 && (tmp2+tmp3<Di2)) { - CheckCell(gr,vgran,tree_index,Di2, - occup[index+sy+sz],&fits); - if (fits && sx!=0 && (tmp1+tmp2+tmp3<Di2)) - CheckCell(gr,vgran,tree_index,Di2, - occup[index+sx+sy+sz],&fits); - } - } - } - } - } - } - } - if (fits) { - memcpy(vgran+3*ig,gr,3*sizeof(double)); - tree_index[ig]=MAX_GR_SET; - if (last==MAX_GR_SET) occup[index]=(unsigned short)ig; - else tree_index[last]=(unsigned short)ig; - ig++; - false_count=0; - } - if (false_count>MAX_FALSE_SKIP_SMALL) break; - } - // real number of placed granules for this set - cur_Ngr=ig; - } - } - else { // large granules - // generate domain pattern - if (locgZ!=0) { - for (i=0;i<gr_locgN;i++) dom[i]=0; - dom_index2=0; - index2=(kd0-local_z0)*boxXY; - bit=((indZ&1)^1)<<2; - for (k=kd0;k<kd1;k++,index2+=boxXY) { - index1=index2+jd0*boxX; - dom_index1=dom_index2; - indY=1; - bit&=~2; - for (j=jd0;j<jd1;j++,index1+=boxX) { - index=index1+id0; - dom_index=dom_index1; - indX=1; - bit&=~1; - for (i=id0;i<id1;i++,index++) { - if (material_tmp[index]!=gr_mat) - dom[dom_index]|=(unsigned char)(1<<bit); - if (i+1==ginX[indX]) { - indX++; - bit^=1; - if (indX&1) dom_index++; - } - } - if (j+1==ginY[indY]) { - indY++; - bit^=2; - if (indY&1) dom_index1+=gX; - } - } - if (k+1==ginZ[indZ]) { - indZ++; - bit^=4; - if (indZ&1) dom_index2+=gXY; - } - } - } - // send/collect domain pattern - CollectDomainGranul(dom,gXY,locz0,locgZ,&Timing_Granul_comm); - if (ringid==ROOT) { - // analyze domain pattern - avail=0; - for (ui=0;ui<gr_gN;ui++) if (dom[ui]!=0xFF) avail++; - cur_Ngr=MIN(avail,(size_t)max_Ngr); - tmp1=(gr_N-n)*overhead; - if (cur_Ngr>tmp1) cur_Ngr=(int)ceil(tmp1); - // generate points and quick check - ig=false_count=0; - for (ui=0;ui<gr_gN;ui++) occup[ui]=MAX_GR_SET; // used as undefined - while (ig<cur_Ngr) { - count++; - // random position in a double grid - gr[0]=genrand(0,gX2); - gr[1]=genrand(0,gY2); - gr[2]=genrand(0,gZ2); - // coordinates in doubled grid - indX=(int)floor(gr[0]); - indY=(int)floor(gr[1]); - indZ=(int)floor(gr[2]); // position bit inside one cell - bit=1<<((indX&1)+((indY&1)<<1)+((indZ&1)<<2)); - // coordinates in usual grid - indX/=2; - indY/=2; - indZ/=2; - index=indZ*gXY+indY*gX+indX; - // two simple checks - if (!(dom[index]&bit) && occup[index]==MAX_GR_SET) { - // convert to usual coordinates (in dipole grid) - gr[0]=gr[0]*gdXh+x0; - gr[1]=gr[1]*gdYh+y0; - gr[2]=gr[2]*gdZh+z0; - fits=TRUE; - false_count++; - if ((i0=indX-sx)<0) i0=0; - if ((i1=indX+sx+1)>gZ) i1=gX; - if ((j0=indY-sy)<0) j0=0; - if ((j1=indY+sy+1)>gY) j1=gY; - if ((k0=indZ-sz)<0) k0=0; - if ((k1=indZ+sz+1)>gZ) k1=gZ; - dom_index2=k0*gXY; - for (k=k0;k<k1;k++,dom_index2+=gXY) { - dom_index1=dom_index2+j0*gX; - for (j=j0;j<j1;j++,dom_index1+=gX) { - dom_index=dom_index1+i0; - for (i=i0;i<i1;i++,dom_index++) if (occup[dom_index]!=MAX_GR_SET) { - index1=3*occup[dom_index]; - t1=gr[0]-vgran[index1]; - t2=gr[1]-vgran[index1+1]; - t3=gr[2]-vgran[index1+2]; - if ((t1*t1+t2*t2+t3*t3)<Di2) { - fits=FALSE; - break; - } - } - if (!fits) break; - } - if (!fits) break; - } - if (fits) { - memcpy(vgran+3*ig,gr,3*sizeof(double)); - occup[index]=(unsigned short)ig; - ig++; - false_count=0; - /* Here it is possible to correct the domain pattern because of the - * presence of a new granule. However it probably will be useful only - * for large volume fractions - */ - } - if (false_count>MAX_FALSE_SKIP) break; - } - } - // real number of placed granules for this set - cur_Ngr=ig; - } - } // end of large granules - // cast to all processors - MyBcast(&cur_Ngr,int_type,1,&Timing_Granul_comm); - MyBcast(vgran,double_type,3*cur_Ngr,&Timing_Granul_comm); - count_gr+=cur_Ngr; - // final check if granules belong to the domain - for (ig=0;ig<cur_Ngr;ig++) { - memcpy(gr,vgran+3*ig,3*sizeof(double)); - k0=MAX((int)ceil(gr[2]-R),local_z0); - k1=MIN((int)floor(gr[2]+R),local_z1_coer-1); - fits=TRUE; - index2=(k0-local_z0)*boxXY; - for (k=k0;k<=k1;k++,index2+=boxXY) { - tmp1=R2-(gr[2]-k)*(gr[2]-k); - tmp2=sqrt(tmp1); - j0=(int)ceil(gr[1]-tmp2); - j1=(int)floor(gr[1]+tmp2); - index1=index2+j0*boxX; - for (j=j0;j<=j1;j++,index1+=boxX) { - tmp2=sqrt(tmp1-(gr[1]-j)*(gr[1]-j)); - i0=(int)ceil(gr[0]-tmp2); - i1=(int)floor(gr[0]+tmp2); - index=index1+i0; - for (i=i0;i<=i1;i++,index++) { - if (material_tmp[index]!=gr_mat) { - fits=FALSE; - break; - } - } - if (!fits) break; - } - if (!fits) break; - } - vfit[ig]=(char)fits; - } - // collect fits - ExchangeFits(vfit,cur_Ngr,&Timing_Granul_comm); - // fit dipole grid with successive granules - Nfit=n; - for (ig=0;ig<cur_Ngr;ig++) { - if (vfit[ig]) { // a successful granule - n++; - // fill dipoles in the sphere with granule material - memcpy(gr,vgran+3*ig,3*sizeof(double)); - k0=MAX((int)ceil(gr[2]-R),local_z0); - k1=MIN((int)floor(gr[2]+R),local_z1_coer-1); - index2=(k0-local_z0)*boxXY; - for (k=k0;k<=k1;k++,index2+=boxXY) { - tmp1=R2-(gr[2]-k)*(gr[2]-k); - tmp2=sqrt(tmp1); - j0=(int)ceil(gr[1]-tmp2); - j1=(int)floor(gr[1]+tmp2); - index1=index2+j0*boxX; - for (j=j0;j<=j1;j++,index1+=boxX) { - tmp2=sqrt(tmp1-(gr[1]-j)*(gr[1]-j)); - i0=(int)ceil(gr[0]-tmp2); - i1=(int)floor(gr[0]+tmp2); - index=index1+i0; - for (i=i0;i<=i1;i++,index++) { - material_tmp[index]=(unsigned char)(Nmat-1); - nd++; - } - } - } - // if the allocation was too optimistic - if (n>=gr_N) break; - } - } - // save correct granule positions to file - if (store_grans && ringid==ROOT) for (ig=0;ig<cur_Ngr;ig++) if (vfit[ig]) - fprintf(file,"%.10g %.10g %.10g\n",gridspace*(vgran[3*ig]-cX), - gridspace*(vgran[3*ig+1]-cY),gridspace*(vgran[3*ig+2]-cZ)); - Nfit=n-Nfit; - /* overhead is estimated based on the estimation of mean value - 1*standard deviation - * for the probability of fitting one granule. It is estimated from the Bernoulli statistics - * k out of n successful hits. M(p)=(k+1)/(n+2); s^2(p)=(k+1)(n-k+1)/(n+3)(n+2)^2 - * M(p)-s(p)=[(k+1)/(n+2)]*[1-sqrt((n-k+1)/(k+1)(n+3))]; - * overhead=1/latter. - */ - overhead=(cur_Ngr+2)/((1-sqrt((cur_Ngr-Nfit+1)/(double)((Nfit+1)*(cur_Ngr+3))))*(Nfit+1)); - if (Nfit!=0) zerofit=0; - else { - zerofit++; - // check if taking too long - if (zerofit>MAX_ZERO_FITS) { - MyInnerProduct(&nd,double_type,1,&Timing_Granul_comm); - LogError(EC_ERROR,ONE_POS,"The granule generator failed to reach required volume " - "fraction (%g) of granules. %zu granules were successfully placed up to a " - "volume fraction of %g.",gr_vf,n,nd/mat_count[gr_mat]); - } - } - } - /* conversions to (unsigned long) are needed (to remove warnings) because %z printf argument is - * not yet supported by all target compiler environments - */ - PRINTZ("Granule generator: total random placements= %lu (efficiency 1 = %g)\n" - " possible granules= %lu (efficiency 2 = %g)\n", - (unsigned long)count,count_gr/(double)count,(unsigned long)count_gr, - gr_N/(double)count_gr); - MyInnerProduct(&nd,double_type,1,&Timing_Granul_comm); - // free everything - if (ringid==ROOT) { - Free_general(occup); - if (sm_gr) Free_general(tree_index); - else Free_general(dom); - } - else if (!sm_gr && locgZ!=0) Free_general(dom); - FreeGranulComm(sm_gr); - Free_general(vgran); - Free_general(vfit); - if (!sm_gr && locgZ!=0) { - Free_general(ginX); - Free_general(ginY); - Free_general(ginZ); - } - // close granule file if needed and print info - if (store_grans && ringid==ROOT) { - FCloseErr(file,fname,ONE_POS); - printf("Granule coordinates saved to file\n"); - } - return nd; + int i,j,k,zerofit,last; + size_t n,count,count_gr,false_count,ui; + size_t boxXY; + double nd; /* number of dipoles occupied by granules */ + int index,index1,index2; /* indices for dipole grid */ + int dom_index,dom_index1,dom_index2; /* indices for auxilliary grid */ + int gX,gY,gZ; /* auxilliary grid dimensions */ + size_t gXY,gr_gN; /* ... and their products */ + size_t avail; /* number of available (free) domain cells */ + int gX2,gY2,gZ2,locgZ2; + int i0,i1,j0,j1,k0,k1; + int fits; + int cur_Ngr,ig,max_Ngr; /* number of granules in a current set, index, and maximum set size */ + double gdX,gdY,gdZ,gdXh,gdYh,gdZh; /* auxilliary grid cell sizes and their halfs (h) */ + int locz0,locz1,locgZ,gr_locgN; + double R,R2,Di,Di2; /* radius and dimater of granule, and their squares */ + double x0,x1,y0,y1,z0,z1; /* where to put random number (inner box) */ + int id0,id1,jd0,jd1,kd0,kd1; /* dipoles limit that fall inside inner box */ + int Nfit; /* number of succesfully placed granules in a current set */ + double overhead; /* estimate of the overhead needed to have exactly needed N of granules */ + double tmp1,tmp2,tmp3,t1,t2,t3; + int sx,sy,sz; /* maximum shifts for checks of neighboring cells in auxilliary grid + for 'small' it is the shift in index */ + unsigned long key[KEY_LENGTH]; /* key to initialize random number generator */ + unsigned char *dom; /* information about the domain on a granule grid */ + unsigned short *occup; /* information about the occupied cells */ + int sm_gr; /* whether granules are small (then simpler algorithm is used) */ + unsigned short *tree_index; /* index for traversing granules inside one cell (for small) */ + double *vgran; /* coordinates of a set of granules */ + char *vfit; /* results of granule fitting on the grid (boolean) */ + int *ginX,*ginY,*ginZ; /* indices to find dipoles inside auxilliary grid */ + int indX,indY,indZ; /* indices for doubled auxilliary grid */ + int bit; /* bit position in char of 'dom' */ + double gr[3]; /* coordinates of a single granule */ + + /* set variables; consider jagged */ + Di=gr_d/(gridspace*jagged); + if (Di<1) LogError(EC_WARN,ONE_POS,"Granule diameter is smaller than dipole size. "\ + "It is recommended to increase resolution"); + R=Di/2; + R2=R*R; + Di2=4*R2; + boxXY=boxX*(size_t)boxY; + /* inner box */ + if (Di>MIN(boxX,MIN(boxY,boxZ))) LogError(EC_WARN,ONE_POS, + "Granule size is larger than minimum particle dimension"); + x0=R-0.5; + x1=boxX-R-0.5; + y0=R-0.5; + y1=boxY-R-0.5; + z0=R-0.5; + z1=boxZ-R-0.5; + /* initialize auxilliary grid */ + CheckOverflow(MAX(boxX,MAX(boxY,boxZ))*10/Di,ONE_POS,"PlaceGranules()"); + tmp1=sqrt(3)/Di; + gX=(int)ceil((x1-x0)*tmp1); + gdX=(x1-x0)/gX; + gY=(int)ceil((y1-y0)*tmp1); + gdY=(y1-y0)/gY; + gZ=(int)ceil((z1-z0)*tmp1); + gdZ=(z1-z0)/gZ; + sm_gr=(gdX<2 || gdY<2 || gdZ<2); /* sets the discrimination for small or large granules */ + if (sm_gr) { + PRINTZ("Using algorithm for small granules\n"); + /* redefine auxilliary grid */ + tmp1=1/MAX(2*Di,MIN_CELL_SIZE); + gX=(int)floor((x1-x0)*tmp1); + gdX=(x1-x0)/gX; + gY=(int)floor((y1-y0)*tmp1); + gdY=(y1-y0)/gY; + gZ=(int)floor((z1-z0)*tmp1); + gdZ=(z1-z0)/gZ; + } + else { + PRINTZ("Using algorithm for large granules\n"); + gX2=2*gX; + gdXh=gdX/2; + gY2=2*gY; + gdYh=gdY/2; + gZ2=2*gZ; + gdZh=gdZ/2; + /* this sets maximum distance of neighboring cells to check; condition gdX<R can only occur + if gX<=7, which is quite rare, so no optimization is performed. sx>3 can only occur if + gX<=2 and then it doesn't make sense to take bigger sx. Absolutely analogous for y, z. */ + if (gdX<R) sx=3; + else sx=2; + if (gdY<R) sy=3; + else sy=2; + if (gdZ<R) sz=3; + else sz=2; + } + gXY=MultOverflow(gX,gY,ONE_POS,"PlaceGranules()"); + gr_gN=MultOverflow(gXY,gZ,ONE_POS,"PlaceGranules()"); + /* calculate maximum number of granules in a grid; crude estimate */ + tmp2=(ceil((x1-x0)/Di)+1)*(ceil((y1-y0)/Di)+1)*(ceil((z1-z0)/Di)+1); + max_Ngr=MIN(MAX_GR_SET,tmp2); + /* local z grid + initialize communications */ + SetGranulComm(z0,z1,gdZ,gZ,gXY,max_Ngr,&locz0,&locz1,sm_gr); + if (!sm_gr) { + locgZ=locz1-locz0; + locgZ2=2*locgZ; + gr_locgN=gXY*locgZ; + } + if (ringid==ROOT) { + /* init random generator */ + key[0]=(unsigned long)time(NULL); + key[1]=(unsigned long)(clock()-wt_start); + init_by_array(key,KEY_LENGTH); + /* allocate memory */ + MALLOC_VECTOR(occup,ushort,gr_gN,ONE); + if (sm_gr) MALLOC_VECTOR(tree_index,ushort,max_Ngr,ONE); + else MALLOC_VECTOR(dom,uchar,gr_gN,ALL); + } + else if (!sm_gr && locgZ!=0) MALLOC_VECTOR(dom,uchar,gr_locgN,ALL); + MALLOC_VECTOR(vgran,double,3*max_Ngr,ALL); + MALLOC_VECTOR(vfit,char,max_Ngr,ALL); + if (!sm_gr && locgZ!=0) { + /* build some more indices */ + MALLOC_VECTOR(ginX,int,gX2+1,ALL); + MALLOC_VECTOR(ginY,int,gY2+1,ALL); + MALLOC_VECTOR(ginZ,int,locgZ2+1,ALL); + for (i=0;i<=gX2;i++) ginX[i]=(int)ceil(x0+i*gdXh); + id0=ginX[0]; + id1=ginX[gX2]; + for (i=0;i<=gY2;i++) ginY[i]=(int)ceil(y0+i*gdYh); + jd0=ginY[0]; + jd1=ginY[gZ2]; + for (i=0;i<=locgZ2;i++) ginZ[i]=(int)ceil(z0+(i+2*locz0)*gdZh); + kd0=MAX(ginZ[0],local_z0); + indZ=1; + if (kd0>=ginZ[1]) indZ++; + kd1=MIN(ginZ[locgZ2],local_z1_coer); + } + n=count=count_gr=false_count=0; + nd=0; + /* crude estimate of the probability to place a small granule into domain */ + if (sm_gr) overhead=Ndip/mat_count[gr_mat]; + else overhead=1; + /* main cycle */ + while (n<gr_N) { + if (sm_gr) { /* small granules */ + /* just generate granules */ + if (ringid==ROOT) { + cur_Ngr=MIN(ceil((gr_N-n)*overhead),max_Ngr); + /* generate points and quick check */ + ig=false_count=0; + for (ui=0;ui<gr_gN;ui++) occup[ui]=MAX_GR_SET; /* used as undef */ + while (ig<cur_Ngr) { + count++; + false_count++; + fits=TRUE; + /* random position in a grid */ + gr[0]=genrand(0,gX); + gr[1]=genrand(0,gY); + gr[2]=genrand(0,gZ); + /* coordinates in a grid */ + t1=floor(gr[0]); + t2=floor(gr[1]); + t3=floor(gr[2]); + indX=(int)t1; + indY=(int)t2; + indZ=(int)t3; + t1=gr[0]-t1; /* t_i are distances to the edges */ + t2=gr[1]-t2; + t3=gr[2]-t3; + /* convert to usual coords (in dipole grid) */ + gr[0]=gr[0]*gdX+x0; + gr[1]=gr[1]*gdY+y0; + gr[2]=gr[2]*gdZ+z0; + index=indZ*gXY+indY*gX+indX; + last=CheckCell(gr,vgran,tree_index,Di2,occup[index],&fits); + /* weird construction (7 inclosed ifs) but should be fast */ + if (fits) { + /* possible x-neighbour */ + t1*=gdX; /* transform shifts to usual coords; done only when needed */ + sx=0; + if (t1<Di) { + if (indX!=0) sx=-1; + } + else if ((t1=gdX-t1)<Di && indX!=gX-1) sx=1; + if (sx!=0) CheckCell(gr,vgran,tree_index,Di2,occup[index+sx],&fits); + if (fits) { + /* possible y-neighbour */ + t2*=gdY; + sy=0; + if (t2<Di) { + if (indY!=0) sy=-gX; + } + else if ((t2=gdY-t2)<Di && indY!=gY-1) sy=gX; + if (sy!=0) CheckCell(gr,vgran,tree_index,Di2,occup[index+sy],&fits); + if (fits) { + /* possible z-neighbour */ + t3*=gdZ; + sz=0; + if (t3<Di) { + if (indZ!=0) sz=-(int)gXY; + } + else if ((t3=gdZ-t3)<Di && indZ!=gZ-1) sz=gXY; + if (sz!=0) CheckCell(gr,vgran,tree_index,Di2,occup[index+sz],&fits); + if (fits) { + /* possible xy-neighbour */ + if (sx!=0 && sy!=0 && ((tmp1=t1*t1)+(tmp2=t2*t2)<Di2)) + CheckCell(gr,vgran,tree_index,Di2,occup[index+sx+sy],&fits); + if (fits) { + /* possible xz-neighbour */ + if (sx!=0 && sz!=0 && ((tmp1+(tmp3=t3*t3))<Di2)) + CheckCell(gr,vgran,tree_index,Di2,occup[index+sx+sz],&fits); + if (fits) { + /* possible yz-neighbour & xyz-neighbour */ + if (sy!=0 && sz!=0 && (tmp2+tmp3<Di2)) { + CheckCell(gr,vgran,tree_index,Di2,occup[index+sy+sz],&fits); + if (fits && sx!=0 && (tmp1+tmp2+tmp3<Di2)) + CheckCell(gr,vgran,tree_index,Di2,occup[index+sx+sy+sz],&fits); + } + } + } + } + } + } + } + if (fits) { + memcpy(vgran+3*ig,gr,3*sizeof(double)); + tree_index[ig]=MAX_GR_SET; + if (last==MAX_GR_SET) occup[index]=(unsigned short)ig; + else tree_index[last]=(unsigned short)ig; + ig++; + false_count=0; + } + if (false_count>MAX_FALSE_SKIP_SMALL) break; + } + /* real number of placed granules for this set */ + cur_Ngr=ig; + } + } + else { /* large granules */ + /* generate domain pattern */ + if (locgZ!=0) { + for (i=0;i<gr_locgN;i++) dom[i]=0; + dom_index2=0; + index2=(kd0-local_z0)*boxXY; + bit=((indZ&1)^1)<<2; + for (k=kd0;k<kd1;k++,index2+=boxXY) { + index1=index2+jd0*boxX; + dom_index1=dom_index2; + indY=1; + bit&=~2; + for (j=jd0;j<jd1;j++,index1+=boxX) { + index=index1+id0; + dom_index=dom_index1; + indX=1; + bit&=~1; + for (i=id0;i<id1;i++,index++) { + if (material_tmp[index]!=gr_mat) dom[dom_index]|=(unsigned char)(1<<bit); + if (i+1==ginX[indX]) { + indX++; + bit^=1; + if (indX&1) dom_index++; + } + } + if (j+1==ginY[indY]) { + indY++; + bit^=2; + if (indY&1) dom_index1+=gX; + } + } + if (k+1==ginZ[indZ]) { + indZ++; + bit^=4; + if (indZ&1) dom_index2+=gXY; + } + } + } + /* send/collect domain pattern */ + CollectDomainGranul(dom,gXY,locz0,locgZ,&Timing_Granul_comm); + if (ringid==ROOT) { + /* analyze domain pattern */ + avail=0; + for (ui=0;ui<gr_gN;ui++) if (dom[ui]!=0xFF) avail++; + cur_Ngr=MIN(avail,(size_t)max_Ngr); + tmp1=(gr_N-n)*overhead; + if (cur_Ngr>tmp1) cur_Ngr=(int)ceil(tmp1); + /* generate points and quick check */ + ig=false_count=0; + for (ui=0;ui<gr_gN;ui++) occup[ui]=MAX_GR_SET; /* used as undef */ + while (ig<cur_Ngr) { + count++; + /* random position in a double grid */ + gr[0]=genrand(0,gX2); + gr[1]=genrand(0,gY2); + gr[2]=genrand(0,gZ2); + /* coordinates in doubled grid */ + indX=(int)floor(gr[0]); + indY=(int)floor(gr[1]); + indZ=(int)floor(gr[2]); /* position bit inside one cell */ + bit=1<<((indX&1)+((indY&1)<<1)+((indZ&1)<<2)); + /* coordinates in usual grid */ + indX/=2; + indY/=2; + indZ/=2; + index=indZ*gXY+indY*gX+indX; + /* two simple checks */ + if (!(dom[index]&bit) && occup[index]==MAX_GR_SET) { + /* convert to usual coords (in dipole grid) */ + gr[0]=gr[0]*gdXh+x0; + gr[1]=gr[1]*gdYh+y0; + gr[2]=gr[2]*gdZh+z0; + fits=TRUE; + false_count++; + if ((i0=indX-sx)<0) i0=0; + if ((i1=indX+sx+1)>gZ) i1=gX; + if ((j0=indY-sy)<0) j0=0; + if ((j1=indY+sy+1)>gY) j1=gY; + if ((k0=indZ-sz)<0) k0=0; + if ((k1=indZ+sz+1)>gZ) k1=gZ; + dom_index2=k0*gXY; + for (k=k0;k<k1;k++,dom_index2+=gXY) { + dom_index1=dom_index2+j0*gX; + for (j=j0;j<j1;j++,dom_index1+=gX) { + dom_index=dom_index1+i0; + for (i=i0;i<i1;i++,dom_index++) if (occup[dom_index]!=MAX_GR_SET) { + index1=3*occup[dom_index]; + t1=gr[0]-vgran[index1]; + t2=gr[1]-vgran[index1+1]; + t3=gr[2]-vgran[index1+2]; + if ((t1*t1+t2*t2+t3*t3)<Di2) { + fits=FALSE; + break; + } + } + if (!fits) break; + } + if (!fits) break; + } + if (fits) { + memcpy(vgran+3*ig,gr,3*sizeof(double)); + occup[index]=(unsigned short)ig; + ig++; + false_count=0; + /* Here it is possible to correct the domain pattern because of the presence + of a new granule. However it probably will be useful only for large volume + fractions */ + } + if (false_count>MAX_FALSE_SKIP) break; + } + } + /* real number of placed granules for this set */ + cur_Ngr=ig; + } + } /* end of large granules */ + /* cast to all processors */ + MyBcast(&cur_Ngr,int_type,1,&Timing_Granul_comm); + MyBcast(vgran,double_type,3*cur_Ngr,&Timing_Granul_comm); + count_gr+=cur_Ngr; + /* final check if granules belong to the domain */ + for (ig=0;ig<cur_Ngr;ig++) { + memcpy(gr,vgran+3*ig,3*sizeof(double)); + k0=MAX((int)ceil(gr[2]-R),local_z0); + k1=MIN((int)floor(gr[2]+R),local_z1_coer-1); + fits=TRUE; + index2=(k0-local_z0)*boxXY; + for (k=k0;k<=k1;k++,index2+=boxXY) { + tmp1=R2-(gr[2]-k)*(gr[2]-k); + tmp2=sqrt(tmp1); + j0=(int)ceil(gr[1]-tmp2); + j1=(int)floor(gr[1]+tmp2); + index1=index2+j0*boxX; + for (j=j0;j<=j1;j++,index1+=boxX) { + tmp2=sqrt(tmp1-(gr[1]-j)*(gr[1]-j)); + i0=(int)ceil(gr[0]-tmp2); + i1=(int)floor(gr[0]+tmp2); + index=index1+i0; + for (i=i0;i<=i1;i++,index++) { + if (material_tmp[index]!=gr_mat) { + fits=FALSE; + break; + } + } + if (!fits) break; + } + if (!fits) break; + } + vfit[ig]=(char)fits; + } + /* collect fits */ + ExchangeFits(vfit,cur_Ngr,&Timing_Granul_comm); + /* fit dipole grid with successive granules */ + Nfit=n; + for (ig=0;ig<cur_Ngr;ig++) { + if (vfit[ig]) { + /* a succesful granule */ + n++; + /* fill dipoles in the sphere with granule material */ + memcpy(gr,vgran+3*ig,3*sizeof(double)); + k0=MAX((int)ceil(gr[2]-R),local_z0); + k1=MIN((int)floor(gr[2]+R),local_z1_coer-1); + index2=(k0-local_z0)*boxXY; + for (k=k0;k<=k1;k++,index2+=boxXY) { + tmp1=R2-(gr[2]-k)*(gr[2]-k); + tmp2=sqrt(tmp1); + j0=(int)ceil(gr[1]-tmp2); + j1=(int)floor(gr[1]+tmp2); + index1=index2+j0*boxX; + for (j=j0;j<=j1;j++,index1+=boxX) { + tmp2=sqrt(tmp1-(gr[1]-j)*(gr[1]-j)); + i0=(int)ceil(gr[0]-tmp2); + i1=(int)floor(gr[0]+tmp2); + index=index1+i0; + for (i=i0;i<=i1;i++,index++) { + material_tmp[index]=(unsigned char)(Nmat-1); + nd++; + } + } + } + /* if the allocation was too oprimistic */ + if (n>=gr_N) break; + } + } + Nfit=n-Nfit; + /* overhead is estimated based on the estimation of mean value - 1*standard deviation + for the probability of fiting one granule. It is estimated from the Bernulli statistics + k out of n successful hits. M(p)=(k+1)/(n+2); s^2(p)=(k+1)(n-k+1)/(n+3)(n+2)^2 + M(p)-s(p)=[(k+1)/(n+2)]*[1-sqrt((n-k+1)/(k+1)(n+3))]; + overhead=1/latter */ + overhead=(cur_Ngr+2)/((1-sqrt((cur_Ngr-Nfit+1)/(double)((Nfit+1)*(cur_Ngr+3))))*(Nfit+1)); + if (Nfit!=0) zerofit=0; + else { + zerofit++; + /* check if taking too long */ + if (zerofit>MAX_ZERO_FITS) { + MyInnerProduct(&nd,double_type,1,&Timing_Granul_comm); + LogError(EC_ERROR,ONE_POS, + "The granule generator failed to reach required volume fraction (%g) of granules. "\ + "%u granules were successfully placed up to a volume fraction of %g", + gr_vf,n,nd/mat_count[gr_mat]); + } + } + } + PRINTZ("Granule generator: total random placements= %u (efficiency 1 = %g)\n"\ + " possible granules= %u (efficiency 2 = %g)\n", + count,count_gr/(double)count,count_gr,gr_N/(double)count_gr); + MyInnerProduct(&nd,double_type,1,&Timing_Granul_comm); + /* free everything */ + if (ringid==ROOT) { + Free_general(occup); + if (sm_gr) Free_general(tree_index); + else Free_general(dom); + } + else if (!sm_gr && locgZ!=0) Free_general(dom); + FreeGranulComm(sm_gr); + Free_general(vgran); + Free_general(vfit); + if (!sm_gr && locgZ!=0) { + Free_general(ginX); + Free_general(ginY); + Free_general(ginZ); + } + return nd; } #undef KEY_LENGTH #undef MAX_ZERO_FITS @@ -1180,690 +883,588 @@ static double PlaceGranules(void) #undef MAX_FALSE_SKIP_SMALL #undef MAX_GR_SET #undef MIN_CELL_SIZE -//========================================================== +/*==========================================================*/ static int FitBox(const int box) -/* finds the smallest value for which program would work (should be even and divide jagged); - * the limit is also checked - */ + /* finds the smallest value for which program would work + (should be even and divide jagged); the limit is also checked */ { - int res; + int res; - if (IS_EVEN(jagged)) res=jagged*((box+jagged-1)/jagged); - else res=2*jagged*((box+2*jagged-1)/(2*jagged)); - if (res>BOX_MAX) LogError(EC_ERROR,ONE_POS, - "Derived grid size (%d) is too large (>%d)",res,BOX_MAX); - return res; + if (jagged%2==0) res=jagged*((box+jagged-1)/jagged); + else res=2*jagged*((box+2*jagged-1)/(2*jagged)); + if (res>BOX_MAX) LogError(EC_ERROR,ONE_POS, + "Derived grid size (%d) is too large (>%d)",res,BOX_MAX); + return res; } -//========================================================== +/*==========================================================*/ void InitShape(void) -/* perform of initialization of symmetries and boxY, boxZ. Estimate the volume of the particle, when - * not discretized. Check whether enough refractive indices are specified. - */ + /* perform of initialization of symmetries and boxY, boxZ + * Estimate the volume of the particle, when not discretisized. + * Check whether enough refractive indices are specified + */ { - int n_boxX,n_boxY,n_boxZ; // new values for dimensions - double n_sizeX; // new value for size - double h_d,b_d,c_d,h2,b2,c2; - double yx_ratio,zx_ratio,tmp1,tmp2,tmp3; - double diskratio,aspectY,aspectZ; - double ad,ct,ct2; // cos(theta0) and its square - TIME_TYPE tstart; - int Nmat_need,i,temp; - int dpl_def_used; // if default dpl is used for grid initialization - bool box_det_sh; // if boxX is determined by shape itself - bool size_det_sh; // if size is determined by shape itself - bool size_given_cmd; // if size is given in the command line - char sizename[MAX_LINE]; // type of input size, used in diagnostic messages - /* TO ADD NEW SHAPE - * Add here all intermediate variables, which are used only inside this function. You may as - * well use 'tmp1'-'tmp3' variables defined above. - */ - - tstart=GET_TIME(); - - box_det_sh=(shape==SH_READ); - size_det_sh=(shape==SH_AXISYMMETRIC); - /* TO ADD NEW SHAPE - * If new shape defines dimension of the computational grid or absolute size of the particle, - * change corresponding definition in one of two lines above. In many cases this is not - * relevant. - */ - - size_given_cmd=(sizeX!=UNDEF || a_eq!=UNDEF); - if (sizeX!=UNDEF) strcpy(sizename,"size"); - else if (a_eq!=UNDEF) strcpy(sizename,"eq_rad"); - // check for redundancy of input data - if (dpl!=UNDEF) { - if (size_given_cmd) { - if (boxX!=UNDEF) PrintError("Extra information is given by setting '-dpl', '-grid', " - "and '-%s'",sizename); - else if (box_det_sh) PrintError("Extra information is given by setting both '-dpl' and " - "'-%s', while shape '%s' sets the size of the grid",sizename,shapename); - } - else if (size_det_sh) { - if (boxX!=UNDEF) PrintError("Extra information is given by setting '-dpl' and '-grid', " - "while shape '%s' sets the particle size",shapename); - // currently this can't happen, but may become relevant in the future - else if (box_det_sh) PrintError("Extra information is given by setting '-dpl', while " - "shape '%s' sets both the particle size and the size of the grid",shapename); - } - } - /* calculate default dpl - 10*sqrt(max(|m|)); - * for anisotropic each component is considered separately - */ - tmp2=0; - for (i=0;i<Ncomp*Nmat;i++) { - tmp1=cAbs2(ref_index[i]); - if (tmp2<tmp1) tmp2=tmp1; - } - dpl_def=10*sqrt(tmp2); - dpl_def_used=FALSE; - // initialization of global option index for error messages - opt=opt_sh; - // shape initialization - if (shape==SH_AXISYMMETRIC) { - /* Axisymmetric homogeneous shape, defined by its contour in ro-z plane of the cylindrical - * coordinate system. Its symmetry axis coincides with the z-axis, and the contour is read - * from file. Each line defines ro and z coordinates of a point, the first and the last - * points are connected automatically. Linear interpolation is used between the points. - */ - SPRINTZ(sh_form_str,"axisymmetric, defined by a contour in ro-z plane from file %s;" - " diameter:%%.10g",shape_fname); - InitContour(shape_fname,&zx_ratio,&n_sizeX); - yx_ratio=1; - symZ=FALSE; // input contour is assumed asymmetric over ro-axis - /* TODO: this can be determined from the contour. However, it is not trivial, especially - * when the contour intersects itself. - */ - volume_ratio=UNDEF; - Nmat_need=1; - } - else if (shape==SH_BOX) { - if (sh_Npars==0) { - STRCPYZ(sh_form_str,"cube; size of edge along x-axis:%.10g"); - aspectY=aspectZ=1; - } - else { // 2 parameters are given - aspectY=sh_pars[0]; - TestPositive(aspectY,"aspect ratio y/x"); - aspectZ=sh_pars[1]; - TestPositive(aspectZ,"aspect ratio z/x"); - SPRINTZ(sh_form_str,"rectangular parallelepiped; size along x-axis:%%.10g, aspect " - "ratios y/x=%.10g, z/x=%.10g",aspectY,aspectZ); - } - if (aspectY!=1) symR=FALSE; - // set half-aspect ratios - haspY=aspectY/2; - haspZ=aspectZ/2; - volume_ratio=aspectY*aspectZ; - yx_ratio=aspectY; - zx_ratio=aspectZ; - Nmat_need=1; - } - else if(shape==SH_CAPSULE) { - diskratio=sh_pars[0]; - TestNonNegative(diskratio,"height to diameter ratio"); - SPRINTZ(sh_form_str,"capsule; diameter(d):%%.10g, cylinder height h/d=%.10g",diskratio); - hdratio=diskratio/2; - volume_ratio = PI_OVER_FOUR*diskratio + PI_OVER_SIX; - yx_ratio=1; - zx_ratio=diskratio+1; - Nmat_need=1; - } - else if (shape==SH_COATED) { - coat_ratio=sh_pars[0]; - TestRangeII(coat_ratio,"inner/outer diameter ratio",0,1); - SPRINTZ(sh_form_str,"coated sphere; diameter(d):%%.10g, inner diameter d_in/d=%.10g", - coat_ratio); - if (sh_Npars==4) { - coat_x=sh_pars[1]; - coat_y=sh_pars[2]; - coat_z=sh_pars[3]; - if (coat_x*coat_x+coat_y*coat_y+coat_z*coat_z>0.25*(1-coat_ratio)*(1-coat_ratio)) - PrintErrorHelp("Inner sphere is not fully inside the outer"); - SPRINTZ(sh_form_str+strlen(sh_form_str), - "\n position of inner sphere center r/d= {%.10g,%.10g,%.10g}", - coat_x,coat_y,coat_z); - } - else coat_x=coat_y=coat_z=0; // initialize default values - coat_r2=0.25*coat_ratio*coat_ratio; - volume_ratio=PI_OVER_SIX; - if (coat_x!=0) symX=symR=FALSE; - if (coat_y!=0) symY=symR=FALSE; - if (coat_z!=0) symZ=FALSE; - yx_ratio=zx_ratio=1; - Nmat_need=2; - } - else if(shape==SH_CYLINDER) { - diskratio=sh_pars[0]; - TestPositive(diskratio,"height to diameter ratio"); - SPRINTZ(sh_form_str,"cylinder; diameter(d):%%.10g, height h/d=%.10g",diskratio); - hdratio=diskratio/2; - volume_ratio=PI_OVER_FOUR*diskratio; - yx_ratio=1; - zx_ratio=diskratio; - Nmat_need=1; - } - else if (shape==SH_EGG) { - /* determined by equation: (a/r)^2=1+nu*cos(theta)-(1-eps)cos^2(theta) - * or equivalently: a^2=r^2+nu*r*z-(1-eps)z^2. Parameters must be 0<eps<=1, 0<=nu<eps. - * This shape is proposed in: Hahn D.V., Limsui D., Joseph R.I., Baldwin K.C., Boggs N.T., - * Carr A.K., Carter C.C., Han T.S., and Thomas M.E. "Shape characteristics of biological - * spores", paper 6954-31 to be presented at "SPIE Defence + Security", March 2008 - */ - egeps=sh_pars[0]; - TestRangeNI(egeps,"egg parameter epsilon",0,1); - egnu=sh_pars[1]; - TestRangeIN(egnu,"egg parameter nu",0,egeps); - // egg shape is symmetric about z-axis (xz and yz planes, but generally NOT xy plane) - if (egnu!=0) symZ=FALSE; - /* cos(theta0): ct=-nu/[eps+sqrt(eps^2-nu^2)]; this expression for root of the quadratic - * equation is used for numerical stability (i.e. when nu=0); at this theta0 the diameter - * (maximum width perpendicular to z-axis) d=Dx is located - */ - ct=-egnu/(egeps+sqrt(egeps*egeps-egnu*egnu)); - ct2=ct*ct; - // Determine ad=(a/d) and its square - ad2=(1+egnu*ct-(1-egeps)*ct2)/(4*(1-ct2)); - ad=sqrt(ad2); - tmp1=1/sqrt(egeps+egnu); - tmp2=1/sqrt(egeps-egnu); - tmp3=2*(1-egeps); - /* Center of the computational box (z coordinate): - * z0=(a/d)*[1/sqrt(eps+nu)+1/sqrt(eps-nu)]/2; but more numerically stable expression is - * used (for nu->0). Although it may overflow faster for nu->eps, volume_ratio (below) will - * overflow even faster. It is used to shift coordinates from the computational reference - * frame (centered at z0) to the natural one - */ - egz0=-ad*egnu*(tmp1*tmp1*tmp2*tmp2)/(tmp1+tmp2); - /* (V/d^3)=(4*pi/3)*(a/d)^3*{[2(1-eps)-nu]/sqrt(eps+nu)+[2(1-eps)+nu]/sqrt(eps-nu)}/ - * /[nu^2+4(1-eps)] - */ - volume_ratio=FOUR_PI_OVER_THREE*ad2*ad*((tmp3-egnu)*tmp1+(tmp3+egnu)*tmp2) - /(egnu*egnu+2*tmp3); - SPRINTZ(sh_form_str,"egg; diameter(d):%%.10g, epsilon=%.10g, nu=%.10g, a/d=%.10g", - egeps,egnu,ad); - Nmat_need=1; - yx_ratio=1; - zx_ratio=ad*(tmp1+tmp2); // (a/d)*[1/sqrt(eps+nu)+1/sqrt(eps-nu)] - } - else if (shape==SH_ELLIPSOID) { - aspectY=sh_pars[0]; - TestPositive(aspectY,"aspect ratio y/x"); - aspectZ=sh_pars[1]; - TestPositive(aspectZ,"aspect ratio z/x"); - SPRINTZ(sh_form_str,"ellipsoid; size along x-axis:%%.10g, aspect ratios y/x=%.10g, " - "z/x=%.10g",aspectY,aspectZ); - if (aspectY!=1) symR=FALSE; - // set inverse squares of aspect ratios - invsqY=1/(aspectY*aspectY); - invsqZ=1/(aspectZ*aspectZ); - volume_ratio=PI_OVER_SIX*aspectY*aspectZ; - yx_ratio=aspectY; - zx_ratio=aspectZ; - Nmat_need=1; - } - else if (shape==SH_LINE) { - STRCPYZ(sh_form_str,"line; length:%g"); - symY=symZ=symR=FALSE; - n_boxY=n_boxZ=jagged; - yx_ratio=zx_ratio=UNDEF; - volume_ratio=UNDEF; - Nmat_need=1; - } - else if(shape==SH_RBC) { - /* three-parameter shape; developed by K.A.Semyanov,P.A.Tarasov,P.A.Avrorov - * based on work by P.W.Kuchel and E.D.Fackerell, "Parametric-equation representation - * of biconcave erythrocytes," Bulletin of Mathematical Biology 61, 209-220 (1999). - * ro^4+2S*ro^2*z^2+z^4+P*ro^2+Q*z^2+R=0, ro^2=x^2+y^2, P,Q,R,S are determined by d,h,b,c - * given in the command line. - */ - h_d=sh_pars[0]; - TestPositive(h_d,"ratio of maximum width to diameter"); - b_d=sh_pars[1]; - TestNonNegative(b_d,"ratio of minimum width to diameter"); - if (h_d<=b_d) PrintErrorHelp("given RBC is not biconcave; maximum width is in the center"); - c_d=sh_pars[2]; - TestRangeII(c_d,"relative diameter of maximum width",0,1); - SPRINTZ(sh_form_str, - "red blood cell; diameter(d):%%.10g, maximum and minimum width h/d=%.10g, b/d=%.10g\n" - " diameter of maximum width c/d=%.10g",h_d,b_d,c_d); - // calculate shape parameters - h2=h_d*h_d; - b2=b_d*b_d; - c2=c_d*c_d; - /* P={(b/d)^2*[c^4/(h^2-b^2)-h^2]-d^2}/4; Q=(d/b)^2*(P+d^2/4)-b^2/4; R=-d^2*(P+d^2/4)/4; - * S=-(2P+c^2)/h^2; here P,Q,R,S are made dimensionless dividing by respective powers of d - * Calculation is performed so that Q is well defined even for b=0. - */ - tmp1=((c2*c2/(h2-b2))-h2)/4; - P=b2*tmp1-0.25; - Q=tmp1-(b2/4); - R=-b2*tmp1/4; - S=-(2*P+c2)/h2; - yx_ratio=1; - zx_ratio=h_d; - volume_ratio=UNDEF; - Nmat_need=1; - } - else if (shape==SH_READ) { - SPRINTZ(sh_form_str,"specified by file %s; size along x-axis:%%.10g",shape_fname); - symX=symY=symZ=symR=FALSE; // input file is assumed asymmetric - InitDipFile(shape_fname,&n_boxX,&n_boxY,&n_boxZ,&Nmat_need); - yx_ratio=zx_ratio=UNDEF; - volume_ratio=UNDEF; - } - else if (shape==SH_SPHERE) { - STRCPYZ(sh_form_str,"sphere; diameter:%.10g"); - volume_ratio=PI_OVER_SIX; - yx_ratio=zx_ratio=1; - Nmat_need=1; - } - else if (shape==SH_SPHEREBOX) { - coat_ratio=sh_pars[0]; - TestRangeII(coat_ratio,"sphere diameter/cube edge ratio",0,1); - SPRINTZ(sh_form_str, - "sphere in cube; size of cube edge(a):%%.10g, diameter of sphere d/a=%.10g",coat_ratio); - coat_r2=0.25*coat_ratio*coat_ratio; - yx_ratio=zx_ratio=1; - volume_ratio=1; - Nmat_need=2; - } - /* TO ADD NEW SHAPE - * add an option here (in the end of 'else if' sequence). Identifier ('SH_...') should be - * defined in const.h. The option should - * 1) save all the input parameters from array 'sh_pars' to local variables - * (defined in the beginning of this source files) - * 2) test all input parameters (for that you're encouraged to use functions from param.h since - * they would automatically produce informative output in case of error). If the shape can - * accept different number of parameters (UNDEF was set in array shape_opt) then also test - * the number of parameters. - * 3) if shape breaks any symmetry, corresponding variable should be set to FALSE. Do not set - * any of them to TRUE, as they can be set to FALSE by some other factors. - * symX, symY, symZ - symmetries of reflection over planes YZ, XZ, XY respectively. - * symR - symmetry of rotation for 90 degrees over the Z axis - * 4) initialize the following: - * sh_form_str - descriptive string, should contain %g - it would be replaced by box size along - * x-axis afterwards (in param.c). - * Either yx_ratio (preferably) or n_boxY. The former is a ratio of particle sizes along y and x - * axes. Initialize n_boxY directly only if it is not proportional to boxX, like in - * shape LINE above, since boxX is not initialized at this moment. If yx_ratio is not - * initialized, set it explicitly to UNDEF. - * Analogously either zx_ratio (preferably) or n_boxZ. - * Nmat_need - number of different domains in this shape (void is not included) - * volume_ratio - ratio of particle volume to (boxX)^3. Initialize it if it can be calculated - * analytically or set to UNDEF otherwise. This parameter is crucial if one wants - * to initialize computational grid from '-eq_rad' and '-dpl'. - * n_sizeX - absolute size of the particle, defined by shape; initialize only when relevant, - * e.g. for shapes such as 'axisymmetric'. - * all other auxiliary variables, which are used in shape generation (MakeParticle(), see - * below), should be defined in the beginning of this file. If you need temporary local - * variables (which are used only in this part of the code), either use 'tmp1'-'tmp3' or - * define your own (with more informative names) in the beginning of this function. - * Also (rarely) if the shape defines dimension of the computational grid or absolute size of - * the particle, correct values of box_det_sh and size_det_sh in the beginning of this function. - */ - - // initialize domain granulation - if (sh_granul) { - symX=symY=symZ=symR=FALSE; // no symmetry with granules - if (gr_mat+1>Nmat_need) - PrintError("Specified domain number to be granulated (%d) is larger than total number " - "of domains (%d) for the given shape (%s)",gr_mat+1,Nmat_need,shapename); - else Nmat_need++; - strcat(shapename,"_gran"); - } - // check if enough refractive indices or extra - if (Nmat<Nmat_need) { - if (prognose) { - if (dpl_def_used) PrintError("Given number of refractive indices (%d) is less " - "than number of domains (%d). Since computational grid is initialized based on " - "the default dpl, it may change depending on the actual refractive indices.", - Nmat,Nmat_need); - } - else PrintError("Only %d refractive indices are given. %d are required",Nmat,Nmat_need); - } - else if (Nmat>Nmat_need) LogError(EC_INFO,ONE_POS, - "More refractive indices are given (%d) than actually used (%d)",Nmat,Nmat_need); - Nmat=Nmat_need; - - // check anisotropic refractive indices for symmetries - if (anisotropy) for (i=0;i<Nmat;i++) symR=symR && ref_index[3*i][RE]==ref_index[3*i+1][RE] - && ref_index[3*i][IM]==ref_index[3*i+1][IM]; - - if (sym_type==SYM_NO) symX=symY=symZ=symR=FALSE; - else if (sym_type==SYM_ENF) symX=symY=symZ=symR=TRUE; - - // determine which size to use - if (size_det_sh) { - if (size_given_cmd) LogError(EC_INFO,ONE_POS,"Particle size specified by command line " - "option '-%s' overrides the internal specification of the shape '%s'. The particle " - "will be scaled accordingly.",sizename,shapename); - else sizeX=n_sizeX; - } - // use analytic connection between sizeX and a_eq if available - if (a_eq!=UNDEF && volume_ratio!=UNDEF) - sizeX=pow(FOUR_PI_OVER_THREE/volume_ratio,ONE_THIRD)*a_eq; - /* Initialization of boxX; - * if boxX is not defined by command line, it is either set by shape itself or - * if sizeX is set, boxX is initialized to default - * else dpl is initialized to default (if undefined) and boxX is calculated from sizeX and dpl - * else adjust boxX if needed. - */ - if (boxX==UNDEF && !box_det_sh) { - if (sizeX==UNDEF) { - // if a_eq is set, but sizeX was not initialized before - error - if (a_eq!=UNDEF) PrintError("Grid size can not be automatically determined from " - "equivalent radius and dpl for shape '%s', because its volume is not known " - "analytically. Either use '-size' instead of '-eq_rad' or specify grid size " - "manually by '-grid'.",shapename); - // default value for boxX; FitBox is redundant but safer for future changes - boxX=FitBox(DEF_GRID); - } - else { - if (dpl==UNDEF) { - /* use default dpl, but make sure that it does not produce too small grid - * (e.g. for nanoparticles). - */ - temp=(int)ceil(sizeX*dpl_def/lambda); - boxX=FitBox(MAX(temp,MIN_AUTO_GRID)); - dpl_def_used=TRUE; - } - else { // if dpl is given in the command line; then believe it - boxX=FitBox((int)ceil(sizeX*dpl/lambda)); - dpl=UNDEF; // dpl is given correct value in make_particle() - } - } - } - else { - /* warnings are issued if specified boxX need to be adjusted, - * especially when '-size' is used - */ - if (boxX!=UNDEF) temp=boxX; - else temp=n_boxX; - if ((boxX=FitBox(temp))!=temp) { - if (sizeX==UNDEF) LogError(EC_WARN,ONE_POS,"boxX has been adjusted from %i to %i. " - "Size along X-axis in the shape description is the size of new (adjusted) " - "computational grid.",temp,boxX); - else LogError(EC_WARN,ONE_POS, - "boxX has been adjusted from %i to %i. Size specified by the command line option " - "'-size' is used for the new (adjusted) computational grid.",temp,boxX); - } - if (box_det_sh && n_boxX>boxX) - PrintError("Particle (boxX=%d) does not fit into specified boxX=%d",n_boxX,boxX); - } - // if shape is determined by ratios, calculate proposed grid sizes along y and z axes - if (yx_ratio!=UNDEF) n_boxY=(int)ceil(yx_ratio*boxX); - if (zx_ratio!=UNDEF) n_boxZ=(int)ceil(zx_ratio*boxX); - // set boxY and boxZ - if (boxY==UNDEF) { // assumed that boxY and boxZ are either both defined or both not defined - boxY=FitBox(n_boxY); - boxZ=FitBox(n_boxZ); - } - else { - temp=boxY; - if ((boxY=FitBox(boxY))!=temp) - LogError(EC_WARN,ONE_POS,"boxY has been adjusted from %i to %i",temp,boxY); - temp=boxZ; - if ((boxZ=FitBox(boxZ))!=temp) - LogError(EC_WARN,ONE_POS,"boxZ has been adjusted from %i to %i",temp,boxZ); - // this error is not duplicated in the log file since it does not yet exist - if (n_boxY>boxY || n_boxZ>boxZ) - PrintError("Particle (boxY,Z={%d,%d}) does not fit into specified boxY,Z={%d,%d}", - n_boxY,n_boxZ,boxY,boxZ); - } - // initialize number of dipoles - Ndip=boxX*((double)boxY)*boxZ; - // initialize maxiter; not very realistic - if (maxiter==UNDEF) maxiter=MIN(INT_MAX,3*Ndip); - // some old, not really logical heuristics for Ntheta, but better than constant value - if (nTheta==UNDEF) { - if (Ndip<1000) nTheta=91; - else if (Ndip<10000) nTheta=181; - else if (Ndip<100000) nTheta=361; - else nTheta=721; - } - // this limitation should be removed in the future - if (chp_type!=CHP_NONE && (!symR || scat_grid)) LogError(EC_ERROR,ONE_POS, - "Currently checkpoints can be used when internal fields are calculated only once," - "i.e. for a single incident polarization."); - Timing_Particle = GET_TIME() - tstart; + int n_boxX,n_boxY,n_boxZ,temp; /* new values for dimensions */ + double h_d,b_d,c_d,h2,b2,c2; + double yx_ratio,zx_ratio,tmp1,tmp2,tmp3; + double diskratio,aspectY,aspectZ; + double ad,ct,ct2; /* cos(theta0) and its square */ + TIME_TYPE tstart; + int Nmat_need,i; + int dpl_def_used; /* if default dpl is used for grid initialization */ + int box_det_sh; /* if boxX is determined by shape itself */ + /* TO ADD NEW SHAPE + Add here all intermediate variables, which are used only inside this function. You may as well + use 'tmp1'-'tmp3' variables defined above. */ + + tstart=GET_TIME(); + /* trivial now, but may be more cases in the future */ + box_det_sh=(shape==SH_READ); + /* check for redundancy of input data */ + if (dpl!=UNDEF && (sizeX!=UNDEF || a_eq!=UNDEF)) { + if (boxX!=UNDEF) PrintError("Extra information is given by setting '-dpl, '-grid', and "\ + "either '-size' or '-eq_rad'"); + if (box_det_sh) PrintError("Extra information is given by setting both '-dpl' and either "\ + "'-size' or '-eq_rad', while shape '%s' sets the size of the grid",shapename); + } + /* calculate default dpl - 10*sqrt(max(|m|)); + for anisotropic each component is considered separately */ + tmp2=0; + for (i=0;i<Ncomp*Nmat;i++) { + tmp1=cAbs2(ref_index[i]); + if (tmp2<tmp1) tmp2=tmp1; + } + dpl_def=10*sqrt(tmp2); + dpl_def_used=FALSE; + /* initialization of global option index for error messages */ + opt=opt_sh; + /* shape initialization */ + if (shape==SH_BOX) { + if (sh_Npars==0) { + STRCPYZ(sh_form_str,"cube; size of edge along x-axis:%.10g"); + aspectY=aspectZ=1; + } + else { /* 2 parameters are given */ + aspectY=sh_pars[0]; + TestPositive(aspectY,"aspect ratio y/x"); + aspectZ=sh_pars[1]; + TestPositive(aspectZ,"aspect ratio z/x"); + SPRINTZ(sh_form_str, + "rectangular parallelepiped; size along x-axis:%%.10g, aspect ratios y/x=%.10g, z/x=%.10g", + aspectY,aspectZ); + } + if (aspectY!=1) symR=FALSE; + /* set half-aspect raios */ + haspY=aspectY/2; + haspZ=aspectZ/2; + volume_ratio=aspectY*aspectZ; + yx_ratio=aspectY; + zx_ratio=aspectZ; + Nmat_need=1; + } + else if(shape==SH_CAPSULE) { + diskratio=sh_pars[0]; + TestNonNegative(diskratio,"height to diameter ratio"); + SPRINTZ(sh_form_str,"capsule; diameter(d):%%.10g, cylinder height h/d=%.10g",diskratio); + hdratio=diskratio/2; + volume_ratio = PI_OVER_FOUR*diskratio + PI_OVER_SIX; + yx_ratio=1; + zx_ratio=diskratio+1; + Nmat_need=1; + } + else if (shape==SH_COATED) { + coat_ratio=sh_pars[0]; + TestRangeII(coat_ratio,"innner/outer diameter ratio",0,1); + SPRINTZ(sh_form_str,"coated sphere; diameter(d):%%.10g, inner diameter d_in/d=%.10g", + coat_ratio); + if (sh_Npars==4) { + coat_x=sh_pars[1]; + coat_y=sh_pars[2]; + coat_z=sh_pars[3]; + if (coat_x*coat_x+coat_y*coat_y+coat_z*coat_z>0.25*(1-coat_ratio)*(1-coat_ratio)) + PrintErrorHelp("Inner sphere is not fully inside the outer"); + SPRINTZ(sh_form_str+strlen(sh_form_str), + "\n position of inner sphere center r/d= {%.10g,%.10g,%.10g}", + coat_x,coat_y,coat_z); + } + else coat_x=coat_y=coat_z=0; /* initialize default values */ + coat_r2=0.25*coat_ratio*coat_ratio; + volume_ratio=PI_OVER_SIX; + if (coat_x!=0) symX=symR=FALSE; + if (coat_y!=0) symY=symR=FALSE; + if (coat_z!=0) symZ=FALSE; + yx_ratio=zx_ratio=1; + Nmat_need=2; + } + else if(shape==SH_CYLINDER) { + diskratio=sh_pars[0]; + TestPositive(diskratio,"height to diameter ratio"); + SPRINTZ(sh_form_str,"cylinder; diameter(d):%%.10g, height h/d=%.10g",diskratio); + hdratio=diskratio/2; + volume_ratio=PI_OVER_FOUR*diskratio; + yx_ratio=1; + zx_ratio=diskratio; + Nmat_need=1; + } + else if (shape==SH_EGG) { + /* determined by equation: (a/r)^2=1+nu*cos(theta)-(1-eps)cos^2(theta) + or equivalently: a^2=r^2+nu*r*z-(1-eps)z^2. Parameters must be 0<eps<=1, 0<=nu<eps. + This shape is proposed in: Hahn D.V., Limsui D., Joseph R.I., Baldwin K.C., Boggs N.T., + Carr A.K., Carter C.C., Han T.S., and Thomas M.E. "Shape characteristics of biological + spores", paper 6954-31 to be presented at "SPIE Defence + Security", March 2008 */ + egeps=sh_pars[0]; + TestRangeNI(egeps,"egg parameter epsilon",0,1); + egnu=sh_pars[1]; + TestRangeIN(egnu,"egg parameter nu",0,egeps); + /* egg shape is symmetric about z-axis (xz and yz planes, but generally NOT xy plane) */ + if (egnu!=0) symZ=FALSE; + /* cos(theta0): ct=-nu/[eps+sqrt(eps^2-nu^2)]; this expression for root of the quadratic + equation is used for numerical stability (i.e. when nu=0); at this theta0 the diameter + (maximum width perpendicular to z-axis) d=Dx is located */ + ct=-egnu/(egeps+sqrt(egeps*egeps-egnu*egnu)); + ct2=ct*ct; + /* Determine ad=(a/d) and its square */ + ad2=(1+egnu*ct-(1-egeps)*ct2)/(4*(1-ct2)); + ad=sqrt(ad2); + tmp1=1/sqrt(egeps+egnu); + tmp2=1/sqrt(egeps-egnu); + tmp3=2*(1-egeps); + /* Center of the computational box (z coordinate): z0=(a/d)*[1/sqrt(eps+nu)+1/sqrt(eps-nu)]/2; + but more numerically stable expression is used (for nu->0). Although it may overflow faster + for nu->eps, volume_ratio (below) will overflow even faster. It is used to shift coordinates + from the computational reference frame (centered at z0) to the natural one */ + egz0=-ad*egnu*(tmp1*tmp1*tmp2*tmp2)/(tmp1+tmp2); + /* (V/d^3)=(4*pi/3)*(a/d)^3*{[2(1-eps)-nu]/sqrt(eps+nu)+[2(1-eps)+nu]/sqrt(eps-nu)}/ + /[nu^2+4(1-eps)] */ + volume_ratio=FOUR_PI_OVER_THREE*ad2*ad*((tmp3-egnu)*tmp1+(tmp3+egnu)*tmp2)/(egnu*egnu+2*tmp3); + SPRINTZ(sh_form_str,"egg; diameter(d):%%.10g, epsilon=%.10g, nu=%.10g, a/d=%.10g", + egeps,egnu,ad); + Nmat_need=1; + yx_ratio=1; + zx_ratio=ad*(tmp1+tmp2); /* (a/d)*[1/sqrt(eps+nu)+1/sqrt(eps-nu)] */ + } + else if (shape==SH_ELLIPSOID) { + aspectY=sh_pars[0]; + TestPositive(aspectY,"aspect ratio y/x"); + aspectZ=sh_pars[1]; + TestPositive(aspectZ,"aspect ratio z/x"); + SPRINTZ(sh_form_str,"ellipsoid; size along x-axis:%%.10g, aspect ratios y/x=%.10g, z/x=%.10g", + aspectY,aspectZ); + if (aspectY!=1) symR=FALSE; + /* set inverse squares of ascpect ratios */ + invsqY=1/(aspectY*aspectY); + invsqZ=1/(aspectZ*aspectZ); + volume_ratio=PI_OVER_SIX*aspectY*aspectZ; + yx_ratio=aspectY; + zx_ratio=aspectZ; + Nmat_need=1; + } + else if (shape==SH_LINE) { + STRCPYZ(sh_form_str,"line; length:%g"); + symY=symZ=symR=FALSE; + n_boxY=n_boxZ=jagged; + yx_ratio=zx_ratio=UNDEF; + volume_ratio=UNDEF; + Nmat_need=1; + } + else if(shape==SH_RBC) { + /* three-parameter shape; developed by K.A.Semyanov,P.A.Tarasov,P.A.Avrorov + based on work by P.W.Kuchel and E.D.Fackerell, "Parametric-equation representation + of biconcave erythrocytes," Bulletin of Mathematical Biology 61, 209-220 (1999). + ro^4+2S*ro^2*z^2+z^4+P*ro^2+Q*z^2+R=0, ro^2=x^2+y^2, P,Q,R,S are determined by d,h,b,c given + in the command line */ + h_d=sh_pars[0]; + TestPositive(h_d,"ratio of maximum width to diameter"); + b_d=sh_pars[1]; + TestNonNegative(b_d,"ratio of minimum width to diameter"); + if (h_d<=b_d) PrintErrorHelp("given RBC is not biconcave; maximum width is in the center"); + c_d=sh_pars[2]; + TestRangeII(c_d,"relative diameter of maximum width",0,1); + SPRINTZ(sh_form_str, + "red blood cell; diameter(d):%%.10g, maximum and minimum width h/d=%.10g, b/d=%.10g\n"\ + " diameter of maximum width c/d=%.10g",h_d,b_d,c_d); + /* calculate shape parameters */ + h2=h_d*h_d; + b2=b_d*b_d; + c2=c_d*c_d; + /* P={(b/d)^2*[c^4/(h^2-b^2)-h^2]-d^2}/4; Q=(d/b)^2*(P+d^2/4)-b^2/4; R=-d^2*(P+d^2/4)/4; + S=-(2P+c^2)/h^2; here P,Q,R,S are made dimensionless dividing by respective powers of d + Calculation is performed so that Q is well defined even for b=0 */ + tmp1=((c2*c2/(h2-b2))-h2)/4; + P=b2*tmp1-0.25; + Q=tmp1-(b2/4); + R=-b2*tmp1/4; + S=-(2*P+c2)/h2; + yx_ratio=1; + zx_ratio=h_d; + volume_ratio=UNDEF; + Nmat_need=1; + } + else if (shape==SH_READ) { + SPRINTZ(sh_form_str,"specified by file %s; size along x-axis:%%.10g",aggregate_file); + symX=symY=symZ=symR=FALSE; /* input file is assumed assymetric */ + InitDipFile(aggregate_file,&n_boxX,&n_boxY,&n_boxZ,&Nmat_need); + yx_ratio=zx_ratio=UNDEF; + volume_ratio=UNDEF; + } + else if (shape==SH_SPHERE) { + STRCPYZ(sh_form_str,"sphere; diameter:%.10g"); + volume_ratio=PI_OVER_SIX; + yx_ratio=zx_ratio=1; + Nmat_need=1; + } + else if (shape==SH_SPHEREBOX) { + coat_ratio=sh_pars[0]; + TestRangeII(coat_ratio,"sphere diameter/cube edge ratio",0,1); + SPRINTZ(sh_form_str, + "sphere in cube; size of cube edge(a):%%.10g, diameter of sphere d/a=%.10g",coat_ratio); + coat_r2=0.25*coat_ratio*coat_ratio; + yx_ratio=zx_ratio=1; + volume_ratio=1; + Nmat_need=2; + } + /* TO ADD NEW SHAPE + add an option here (in the end of 'else if' sequence). Identifier ('SH_...') should be + defined in const.h. The option should + 1) save all the input parameters from array 'sh_pars' to local variables + (defined in the beginning of this source files) + 2) test all input parameters (for that you're encouraged to use functions from param.h since + they would automatically produce informative output in case of error). If the shape can + accept different number of parameters (UNDEF was set in array shape_opt) then also test the + number of parameters. + 3) if shape breaks any symmetry, corresponding variable should be set to FALSE. Do not set + any of them to TRUE, as they can be set to FALSE by some other factors. + symX, symY, symZ - symmetries of reflection over planes YZ, XZ, XY respectively. + symR - symmetry of rotation for 90 degrees over the Z axis + 4) initialize the following: + sh_form_str - descriptive string, should contain %g - it would be replaced by box size along + x-axis afterwards (in param.c). + Either yx_ratio (preferably) or n_boxY. The former is a ratio of particle sizes along y and x + axes. Initialize n_boxY directly only if it is not proportional to boxX, like in + shape LINE above, since boxX is not initialized at this moment. If yx_ratio is not + initialized, set it explicitely to UNDEF. + Analogously either zx_ratio (preferably) or n_boxZ. + Nmat_need - number of different domains in this shape (void is not included) + volume_ratio - ratio of particle volume to (boxX)^3. Initialize it if it can be calculated + analytically or set to UNDEF otherwise. This parameter is crucial if one wants + to initialize computational grid from '-eq_rad' and '-dpl'. + all other auxiliary variables, which are used in shape generation (MakeParticle(), see below), + should be defined in the beginning of this file. If you need temporary local variables + (which are used only in this part of the code), either use 'tmp1'-'tmp3' or define your + own (with more informative names) in the beginning of this function. */ + + /* initialize domain granulation */ + if (sh_granul) { + symX=symY=symZ=symR=FALSE; /* no symmetry with granules */ + if (gr_mat+1>Nmat_need) PrintError("Specified domain number to be granulated (%d) is larger "\ + "than total number of domains (%d) for the given shape (%s)",gr_mat+1,Nmat_need,shapename); + else Nmat_need++; + strcat(shapename,"_gran"); + } + /* check if enough refr. indices or extra */ + if (Nmat<Nmat_need) { + if (prognose) { + if (dpl_def_used) PrintError("Given number of refractive indices (%d) is less "\ + "than number of domains (%d). Since computational grid is initialized based on the "\ + "default dpl, it may change depending on the actual refractive indices.",Nmat,Nmat_need); + } + else PrintError("Only %d refractive indices are given. %d are required",Nmat,Nmat_need); + } + else if (Nmat>Nmat_need) LogError(EC_INFO,ONE_POS, + "More refractive indices are given (%d) than actually used (%d)",Nmat,Nmat_need); + Nmat=Nmat_need; + + /* ckeck anisotropic refractive indices for symmetries */ + if (anisotropy) for (i=0;i<Nmat;i++) symR=symR && ref_index[3*i][RE]==ref_index[3*i+1][RE] + && ref_index[3*i][IM]==ref_index[3*i+1][IM]; + + if (sym_type==SYM_NO) symX=symY=symZ=symR=FALSE; + else if (sym_type==SYM_ENF) symX=symY=symZ=symR=TRUE; + + /* use analytic connection between sizeX and a_eq if available */ + if (a_eq!=UNDEF && volume_ratio!=UNDEF) + sizeX=pow(FOUR_PI_OVER_THREE/volume_ratio,ONE_THIRD)*a_eq; + /* Initializitation of boxX; + if boxX is not defined by command line, it is either set by shape itself or + if sizeX is set, boxX is initialized to default + else dpl is initialized to default (if undefined) and boxX is calculated from sizeX and dpl + else adjust boxX if needed */ + if (boxX==UNDEF) { + if (box_det_sh) boxX=FitBox(n_boxX); + else { + if (sizeX==UNDEF) { + /* if a_eq is set, but sizeX was not initialized before - error */ + if (a_eq!=UNDEF) PrintError("Grid size can not be automatically determined from "\ + "equivalent radius and dpl for shape '%s', because its volume is not known "\ + "analytically. Either use '-size' instead of '-eq_rad' or specify grid size manually "\ + "by '-grid'.",shapename); + /* default value for boxX; FitBox is redundant but safer for future changes */ + boxX=FitBox(DEF_GRID); + } + else { + if (dpl==UNDEF) { + /* use default dpl, but make sure that it does not produce too small grid + (e.g. for nanoparticles) */ + temp=(int)ceil(sizeX*dpl_def/lambda); + boxX=FitBox(MAX(temp,MIN_AUTO_GRID)); + dpl_def_used=TRUE; + } + else { /* if dpl is given in the command line; then believe it */ + boxX=FitBox((int)ceil(sizeX*dpl/lambda)); + dpl=UNDEF; /* dpl is given correct value in make_particle() */ + } + } + } + } + else { + temp=boxX; + if ((boxX=FitBox(boxX))!=temp) + LogError(EC_WARN,ONE_POS,"boxX has been adjusted from %i to %i",temp,boxX); + if (box_det_sh && n_boxX>boxX) + PrintError("Particle (boxX=%d) does not fit into specified boxX=%d",n_boxX,boxX); + } + /* if shape is determined by ratios, calculate proposed grid sizes along y and z axes */ + if (yx_ratio!=UNDEF) n_boxY=(int)ceil(yx_ratio*boxX); + if (zx_ratio!=UNDEF) n_boxZ=(int)ceil(zx_ratio*boxX); + /* set boxY and boxZ */ + if (boxY==UNDEF) { /* assumed that boxY and boxZ are either both defined or both not defined */ + boxY=FitBox(n_boxY); + boxZ=FitBox(n_boxZ); + } + else { + temp=boxY; + if ((boxY=FitBox(boxY))!=temp) + LogError(EC_WARN,ONE_POS,"boxY has been adjusted from %i to %i",temp,boxY); + temp=boxZ; + if ((boxZ=FitBox(boxZ))!=temp) + LogError(EC_WARN,ONE_POS,"boxZ has been adjusted from %i to %i",temp,boxZ); + /* this error is not duplicated in the logfile since it does not yet exist */ + if (n_boxY>boxY || n_boxZ>boxZ) + PrintError("Particle (boxY,Z={%d,%d}) does not fit into specified boxY,Z={%d,%d}", + n_boxY,n_boxZ,boxY,boxZ); + } + /* initialize number of dipoles */ + Ndip=boxX*((double)boxY)*boxZ; + /* initialize maxiter; not very realistic */ + if (maxiter==UNDEF) maxiter=MIN(INT_MAX,3*Ndip); + /* some old, not really logical heuristics for Ntheta, but better than constant value */ + if (nTheta==UNDEF) { + if (Ndip<1000) nTheta=91; + else if (Ndip<10000) nTheta=181; + else if (Ndip<100000) nTheta=361; + else nTheta=721; + } + /* this limitation should be removed in the future */ + if (chp_type!=CHP_NONE && (!symR || scat_grid)) LogError(EC_ERROR,ONE_POS, + "Currently checkpoints can be used when internal fields are calculated only once,"\ + "i.e. for a single incident polarization."); + Timing_Particle = GET_TIME() - tstart; } -//========================================================== +/*==========================================================*/ void MakeParticle(void) -// creates a particle; initializes all dipoles counts, dpl, gridspace + /* creates a particle; initializes all dipoles counts, dpl, gridspace */ { - int i,j,k,ns; - size_t index,dip,nlocalRows_tmp; - double tmp1,tmp2,tmp3; - double xr,yr,zr,xcoat,ycoat,zcoat,r2,z2,zshift; - double jcX,jcY,jcZ; // center for jagged - int local_z0_unif; // should be global or semi-global - int largerZ,smallerZ; // number of larger and smaller z in intersections with contours - int xj,yj,zj; - int mat; - unsigned short us_tmp; - TIME_TYPE tstart,tgran; - /* TO ADD NEW SHAPE - * Add here all intermediate variables, which are used only inside this function. You may as - * well use 'tmp1'-'tmp3' variables defined above. - */ - - tstart=GET_TIME(); - - index=0; - // assumed that box's are even - jcX=jcY=jcZ=jagged/2.0; - cX=(boxX-1)/2.0; - cY=(boxY-1)/2.0; - cZ=(boxZ-1)/2.0; - nlocalRows_tmp=MultOverflow(3,local_Ndip,ALL_POS,"nlocalRows_tmp"); - /* allocate temporary memory; even if prognosis, since they are needed for exact estimation - * they will be reallocated afterwards (when nlocalRows is known). - */ - MALLOC_VECTOR(material_tmp,uchar,local_Ndip,ALL); - MALLOC_VECTOR(DipoleCoord_tmp,double,nlocalRows_tmp,ALL); - MALLOC_VECTOR(position_tmp,ushort,nlocalRows_tmp,ALL); - - for(k=local_z0;k<local_z1_coer;k++) - for(j=0;j<boxY;j++) - for(i=0;i<boxX;i++) { - xj=jagged*(i/jagged)-boxX/2; - yj=jagged*(j/jagged)-boxY/2; - zj=jagged*(k/jagged)-boxZ/2; - - xr=(xj+jcX)/(boxX); - yr=(yj+jcY)/(boxX); - zr=(zj+jcZ)/(boxX); - - mat=Nmat; // corresponds to void - - if (shape==SH_AXISYMMETRIC) { - r2=xr*xr+yr*yr; - if (r2>=contRoSqMin && r2<=0.25) { - largerZ=smallerZ=0; - contCurRo=sqrt(r2); - contCurZ=zr; - for (ns=0;ns<contNseg;ns++) - if (contCurRo>=contSegRoMin[ns] && contCurRo<=contSegRoMax[ns]) - CheckContourSegment(contSeg+ns) ? largerZ++ : smallerZ++; - // check for consistency; if the code is perfect, this is not needed - if (!IS_EVEN(largerZ+smallerZ)) LogError(EC_ERROR,ALL_POS, - "Point (ro,z)=(%g,%g) produced weird result when checking whether it lies " - "inside the contour. Larger than z %d intersections, smaller - %d.", - contCurRo,contCurZ,largerZ,smallerZ); - if (!IS_EVEN(largerZ)) mat=0; - } - } - else if (shape==SH_BOX) { - if (fabs(yr)<=haspY && fabs(zr)<=haspZ) mat=0; - } - else if (shape==SH_CAPSULE) { - r2=xr*xr+yr*yr; - if (r2<=0.25) { - tmp1=fabs(zr)-hdratio; - if (tmp1<=0 || tmp1*tmp1+r2<=0.25) mat=0; - } - } - else if (shape==SH_COATED) { - if (xr*xr+yr*yr+zr*zr<=0.25) { // first test to skip some dipoles immediately) - xcoat=xr-coat_x; - ycoat=yr-coat_y; - zcoat=zr-coat_z; - if (xcoat*xcoat+ycoat*ycoat+zcoat*zcoat<=coat_r2) mat=1; - else mat=0; - } - } - else if (shape==SH_CYLINDER) { - if(xr*xr+yr*yr<=0.25 && fabs(zr)<=hdratio) mat=0; - } - else if (shape==SH_EGG) { - r2=xr*xr+yr*yr; - zshift=zr+egz0; - z2=zshift*zshift; - if (r2+egeps*z2+egnu*zshift*sqrt(r2+z2)<=ad2) mat=0; - } - else if (shape==SH_ELLIPSOID) { - if (xr*xr+yr*yr*invsqY+zr*zr*invsqZ<=0.25) mat=0; - } - else if (shape==SH_LINE) { - if (yj==0 && zj==0) mat=0; - } - else if (shape==SH_RBC) { - r2=xr*xr+yr*yr; - z2=zr*zr; - if (r2*r2+2*S*r2*z2+z2*z2+P*r2+Q*z2+R<=0) mat=0; - } - else if (shape==SH_SPHERE) { - if (xr*xr+yr*yr+zr*zr<=0.25) mat=0; - } - else if (shape==SH_SPHEREBOX) { - if (xr*xr+yr*yr+zr*zr<=coat_r2) mat=1; - else if (fabs(yr)<=0.5 && fabs(zr)<=0.5) mat=0; - } - /* TO ADD NEW SHAPE - * add an option here (in the end of 'else if' sequence). Identifier ('SH_...') - * should be defined in const.h. This option should set 'mat' - index of domain for - * a point, specified by {xr,yr,zr} - coordinates divided by grid size along X (xr - * from -0.5 to 0.5, others - depending on aspect ratios). C array indexing used: - * mat=0 - first domain, etc. If point corresponds to void, do not set 'mat'. If you - * need temporary local variables (which are used only in this part of the code), - * either use 'tmp1'-'tmp3' or define your own (with more informative names) in the - * beginning of this function. - */ - - position_tmp[3*index]=(unsigned short)i; - position_tmp[3*index+1]=(unsigned short)j; - position_tmp[3*index+2]=(unsigned short)k; - // afterwards multiplied by gridspace - DipoleCoord_tmp[3*index]=i-cX; - DipoleCoord_tmp[3*index+1]=j-cY; - DipoleCoord_tmp[3*index+2]=k-cZ; - material_tmp[index]=(unsigned char)mat; - index++; - } // End box loop - if (shape==SH_READ) ReadDipFile(shape_fname); - // initialization of mat_count and dipoles counts - for(i=0;i<=Nmat;i++) mat_count[i]=0; - for(dip=0;dip<local_Ndip;dip++) mat_count[material_tmp[dip]]++; - local_nvoid_Ndip=local_Ndip-mat_count[Nmat]; - MyInnerProduct(mat_count,double_type,Nmat+1,NULL); - if ((nvoid_Ndip=Ndip-mat_count[Nmat])==0) - LogError(EC_ERROR,ONE_POS,"All dipoles of the scatterer are void"); - nlocalRows=3*local_nvoid_Ndip; - // initialize dpl and gridspace - volcor_used=(volcor && (volume_ratio!=UNDEF)); - if (sizeX==UNDEF) { - if (a_eq!=UNDEF) dpl=lambda*pow(nvoid_Ndip*THREE_OVER_FOUR_PI,ONE_THIRD)/a_eq; - else if (dpl==UNDEF) dpl=dpl_def; // default value of dpl - // sizeX is determined to give correct volume - if (volcor_used) sizeX=lambda*pow(nvoid_Ndip/volume_ratio,ONE_THIRD)/dpl; - else sizeX=lambda*boxX/dpl; - } - else { - // dpl is determined to give correct volume - if (volcor_used) dpl=lambda*pow(nvoid_Ndip/volume_ratio,ONE_THIRD)/sizeX; - else dpl=lambda*boxX/sizeX; - } - // Check consistency for FCD - if ((IntRelation==G_FCD || PolRelation==POL_FCD) && dpl<=2) - LogError(EC_ERROR,ONE_POS,"Too small dpl for FCD formulation, should be at least 2"); - gridspace=lambda/dpl; - // initialize equivalent size parameter and cross section - kd = TWO_PI/dpl; - /* from this moment on a_eq and all derived quantities are based on the real a_eq, which can - * in several cases be slightly different from the one given by '-eq_rad' option. - */ - a_eq = pow(THREE_OVER_FOUR_PI*nvoid_Ndip,ONE_THIRD)*gridspace; - ka_eq = WaveNum*a_eq; - inv_G = 1/(PI*a_eq*a_eq); - // granulate one domain, if needed - if (sh_granul) { - tgran=GET_TIME(); - Timing_Granul_comm=0; - // calculate number of granules - if (mat_count[gr_mat]==0) LogError(EC_ERROR,ONE_POS, - "Domain to be granulated does not contain any dipoles"); - tmp1=gridspace/gr_d; - tmp2=mat_count[gr_mat]*gr_vf*SIX_OVER_PI; - tmp3=tmp2*tmp1*tmp1*tmp1; - CheckOverflow(tmp3,ONE_POS,"Make_Particle()"); - gr_N=(size_t)ceil(tmp3); - // correct granules diameter to get exact volume fraction (if volume correction is used) - if (volcor_used) gr_d=gridspace*pow(tmp2/gr_N,ONE_THIRD); - // actually place granules - mat_count[Nmat-1]=PlaceGranules(); - // calculate exact volume fraction - gr_vf_real=mat_count[Nmat-1]/mat_count[gr_mat]; - mat_count[gr_mat]-=mat_count[Nmat-1]; - Timing_Granul=GET_TIME()-tgran; - } - /* allocate main particle arrays, using precise nlocalRows even when prognosis is used to enable - * save_geom afterwards. - */ - MALLOC_VECTOR(material,uchar,local_nvoid_Ndip,ALL); - MALLOC_VECTOR(DipoleCoord,double,nlocalRows,ALL); - MALLOC_VECTOR(position,ushort,nlocalRows,ALL); - - memory+=(3*(sizeof(short int)+sizeof(double))+sizeof(char))*local_nvoid_Ndip; - // copy nontrivial part of arrays - index=0; - for (dip=0;dip<local_Ndip;dip++) if (material_tmp[dip]<Nmat) { - material[index]=material_tmp[dip]; - // DipoleCoord=gridspace*DipoleCoord_tmp - MultScal(gridspace,DipoleCoord_tmp+3*dip,DipoleCoord+3*index); - memcpy(position+3*index,position_tmp+3*dip,3*sizeof(short int)); - index++; - } - // free temporary memory - Free_general(material_tmp); - Free_general(DipoleCoord_tmp); - Free_general(position_tmp); - if (shape==SH_AXISYMMETRIC) { - for (ns=0;ns<contNseg;ns++) FreeContourSegment(contSeg+ns); - Free_general(contSegRoMin); - Free_general(contSegRoMax); - } - - // save geometry - if (save_geom) SaveGeometry(); - - /* adjust z-axis of position vector, to speed-up matrix-vector multiplication a little bit; - * after this point 'position(z)' is taken relative to the local_z0. - */ - if (local_z0!=0) { - us_tmp=(unsigned short)local_z0; - for (dip=2;dip<3*local_nvoid_Ndip;dip+=3) position[dip]-=us_tmp; - } - local_Nz_unif=position[3*local_nvoid_Ndip-1]+1; - local_z0_unif=local_z0; // TODO: should be changed afterwards - box_origin_unif[0]=-gridspace*cX; - box_origin_unif[1]=-gridspace*cY; - box_origin_unif[2]=gridspace*(local_z0_unif-cZ); - - Timing_Particle += GET_TIME() - tstart; + int i, j, k; + size_t index,dip,nlocalRows_tmp; + double tmp1,tmp2,tmp3; + double xr,yr,zr,xcoat,ycoat,zcoat,r2,z2,zshift; + double cX,cY,cZ,jcX,jcY,jcZ; /* centers for DipoleCoord and jagged */ + int local_z0_unif; /* should be global or semi-global */ + int xj,yj,zj; + int mat; + unsigned short us_tmp; + TIME_TYPE tstart,tgran; + /* TO ADD NEW SHAPE + Add here all intermediate variables, which are used only inside this function. You may as well + use 'tmp1'-'tmp3' variables defined above. */ + + tstart=GET_TIME(); + + index=0; + /* assumed that box's are even */ + jcX=jcY=jcZ=jagged/2.0; + cX=(boxX-1)/2.0; + cY=(boxY-1)/2.0; + cZ=(boxZ-1)/2.0; + nlocalRows_tmp=MultOverflow(3,local_Ndip,ALL_POS,"nlocalRows_tmp"); + /* allocate temporary memory; even if prognose, since they are needed for exact estimation + they will be reallocated afterwards (when nlocalRows is known) */ + MALLOC_VECTOR(material_tmp,uchar,local_Ndip,ALL); + MALLOC_VECTOR(DipoleCoord_tmp,double,nlocalRows_tmp,ALL); + MALLOC_VECTOR(position_tmp,ushort,nlocalRows_tmp,ALL); + + for(k=local_z0;k<local_z1_coer;k++) + for(j=0;j<boxY;j++) + for(i=0;i<boxX;i++) { + xj=jagged*(i/jagged)-boxX/2; + yj=jagged*(j/jagged)-boxY/2; + zj=jagged*(k/jagged)-boxZ/2; + + xr=(xj+jcX)/(boxX); + yr=(yj+jcY)/(boxX); + zr=(zj+jcZ)/(boxX); + + mat=Nmat; /* corresponds to void */ + + if (shape==SH_BOX) { + if (fabs(yr)<=haspY && fabs(zr)<=haspZ) mat=0; + } + else if (shape==SH_CAPSULE) { + r2=xr*xr+yr*yr; + if (r2<=0.25) { + tmp1=fabs(zr)-hdratio; + if (tmp1<=0 || tmp1*tmp1+r2<=0.25) mat=0; + } + } + else if (shape==SH_COATED) { + if (xr*xr+yr*yr+zr*zr<=0.25) { /* first test to skip some dipoles immediately) */ + xcoat=xr-coat_x; + ycoat=yr-coat_y; + zcoat=zr-coat_z; + if (xcoat*xcoat+ycoat*ycoat+zcoat*zcoat<=coat_r2) mat=1; + else mat=0; + } + } + else if (shape==SH_CYLINDER) { + if(xr*xr+yr*yr<=0.25 && fabs(zr)<=hdratio) mat=0; + } + else if (shape==SH_EGG) { + r2=xr*xr+yr*yr; + zshift=zr+egz0; + z2=zshift*zshift; + if (r2+egeps*z2+egnu*zshift*sqrt(r2+z2)<=ad2) mat=0; + } + else if (shape==SH_ELLIPSOID) { + if (xr*xr+yr*yr*invsqY+zr*zr*invsqZ<=0.25) mat=0; + } + else if (shape==SH_LINE) { + if (yj==0 && zj==0) mat=0; + } + else if (shape==SH_RBC) { + r2=xr*xr+yr*yr; + z2=zr*zr; + if (r2*r2+2*S*r2*z2+z2*z2+P*r2+Q*z2+R<=0) mat=0; + } + else if (shape==SH_SPHERE) { + if (xr*xr+yr*yr+zr*zr<=0.25) mat=0; + } + else if (shape==SH_SPHEREBOX) { + if (xr*xr+yr*yr+zr*zr<=coat_r2) mat=1; + else if (fabs(yr)<=0.5 && fabs(zr)<=0.5) mat=0; + } + /* TO ADD NEW SHAPE + add an option here (in the end of 'else if' sequence). Identifier ('SH_...') should be + defined in const.h. This option should set 'mat' - index of domain for a point, + specified by {xr,yz,zr} - coordinates divided by grid size along X (xr from -0.5 to 0.5, + others - depending on aspect ratios). C array indexing used: mat=0 - first domain, etc. + If point corresponds to void, do not set 'mat'. If you need temporary local variables + (which are used only in this part of the code), either use 'tmp1'-'tmp3' or define your + own (with more informative names) in the beginning of this function.*/ + + position_tmp[3*index]=(unsigned short)i; + position_tmp[3*index+1]=(unsigned short)j; + position_tmp[3*index+2]=(unsigned short)k; + /* afterwards multiplied by gridspace */ + DipoleCoord_tmp[3*index]=i-cX; + DipoleCoord_tmp[3*index+1]=j-cY; + DipoleCoord_tmp[3*index+2]=k-cZ; + material_tmp[index]=(unsigned char)mat; + index++; + } /* End box loop */ + if (shape==SH_READ) ReadDipFile(aggregate_file); + /* initialization of mat_count and dipoles counts */ + for(i=0;i<=Nmat;i++) mat_count[i]=0; + for(dip=0;dip<local_Ndip;dip++) mat_count[material_tmp[dip]]++; + local_nvoid_Ndip=local_Ndip-mat_count[Nmat]; + MyInnerProduct(mat_count,double_type,Nmat+1,NULL); + if ((nvoid_Ndip=Ndip-mat_count[Nmat])==0) + LogError(EC_ERROR,ONE_POS,"All dipoles of the scatterer are void"); + nlocalRows=3*local_nvoid_Ndip; + /* initialize dpl and gridspace */ + volcor_used=(volcor && (volume_ratio!=UNDEF)); + if (sizeX==UNDEF) { + if (a_eq!=UNDEF) dpl=lambda*pow(nvoid_Ndip*THREE_OVER_FOUR_PI,ONE_THIRD)/a_eq; + else if (dpl==UNDEF) dpl=dpl_def; /* default value of dpl */ + /* sizeX is determined to give correct volume */ + if (volcor_used) sizeX=lambda*pow(nvoid_Ndip/volume_ratio,ONE_THIRD)/dpl; + else sizeX=lambda*boxX/dpl; + } + else { + /* dpl is determined to give correct volume */ + if (volcor_used) dpl=lambda*pow(nvoid_Ndip/volume_ratio,ONE_THIRD)/sizeX; + else dpl=lambda*boxX/sizeX; + } + /* Check consistency for FCD */ + if ((IntRelation==G_FCD || PolRelation==POL_FCD) && dpl<=2) + LogError(EC_ERROR,ONE_POS,"Too small dpl for FCD formulation, should be at least 2"); + gridspace=lambda/dpl; + /* initialize equivalent size parameter and cross section */ + kd = TWO_PI/dpl; + /* from this moment on a_eq and all derived quantities are based on the real a_eq, which can + in several cases be slightly different from the one given by '-eq_rad' option */ + a_eq = pow(THREE_OVER_FOUR_PI*nvoid_Ndip,ONE_THIRD)*gridspace; + ka_eq = WaveNum*a_eq; + inv_G = 1/(PI*a_eq*a_eq); + /* granulate one domain, if needed */ + if (sh_granul) { + tgran=GET_TIME(); + Timing_Granul_comm=0; + /* calculate number of granules */ + if (mat_count[gr_mat]==0) LogError(EC_ERROR,ONE_POS, + "Domain to be granulated does not contain any dipoles"); + tmp1=gridspace/gr_d; + tmp2=mat_count[gr_mat]*gr_vf*SIX_OVER_PI; + tmp3=tmp2*tmp1*tmp1*tmp1; + CheckOverflow(tmp3,ONE_POS,"Make_Particle()"); + gr_N=(size_t)ceil(tmp3); + /* correct granules diameter to get exact volume fraction (if volume correction is used) */ + if (volcor_used) gr_d=gridspace*pow(tmp2/gr_N,ONE_THIRD); + /* actually place granules */ + mat_count[Nmat-1]=PlaceGranules(); + /* calculate exact volume fraction */ + gr_vf_real=mat_count[Nmat-1]/mat_count[gr_mat]; + mat_count[gr_mat]-=mat_count[Nmat-1]; + Timing_Granul=GET_TIME()-tgran; + } + /* allocate main particle arrays, using precise nlocalRows + even when '-prognose' is used to enable save_geom afterwards */ + MALLOC_VECTOR(material,uchar,local_nvoid_Ndip,ALL); + MALLOC_VECTOR(DipoleCoord,double,nlocalRows,ALL); + MALLOC_VECTOR(position,ushort,nlocalRows,ALL); + + memory+=(3*(sizeof(short int)+sizeof(double))+sizeof(char))*local_nvoid_Ndip; + /* copy nontrivial part of arrays */ + index=0; + for (dip=0;dip<local_Ndip;dip++) if (material_tmp[dip]<Nmat) { + material[index]=material_tmp[dip]; + /* DipoleCoord=gridspace*DipoleCoord_tmp */ + MultScal(gridspace,DipoleCoord_tmp+3*dip,DipoleCoord+3*index); + memcpy(position+3*index,position_tmp+3*dip,3*sizeof(short int)); + index++; + } + /* free temporary memory */ + Free_general(material_tmp); + Free_general(DipoleCoord_tmp); + Free_general(position_tmp); + + /* save geometry */ + if (save_geom) SaveGeometry(); + + /* adjust z-axis of position vector, to speed-up matvec a little bit + after this point 'position(z)' is taken relative to the local_z0 */ + if (local_z0!=0) { + us_tmp=(unsigned short)local_z0; + for (dip=2;dip<3*local_nvoid_Ndip;dip+=3) position[dip]-=us_tmp; + } + local_Nz_unif=position[3*local_nvoid_Ndip-1]+1; + local_z0_unif=local_z0; /* should be changed afterwards */ + box_origin_unif[0]=-gridspace*cX; + box_origin_unif[1]=-gridspace*cY; + box_origin_unif[2]=gridspace*(local_z0_unif-cZ); + + Timing_Particle += GET_TIME() - tstart; } diff --git a/src/make_seq b/src/make_seq index 71ab359c..9949ad80 100644 --- a/src/make_seq +++ b/src/make_seq @@ -16,8 +16,7 @@ MFILES += make_seq $(LASTSEQ) $(PROGSEQ): $(COBJECTS) $(FOBJECTS) $(CC) -o $@ $(LFLAGS) $(COBJECTS) $(FOBJECTS) $(LDLIBS) -# Everything is recompiled (but not dependencies) -# when any of makefiles is changed +# Everything is recompiled (but not dependencies) when any of makefiles is changed) $(COBJECTS): %.o: %.c %.d $(MFILES) $(CC) -c $(CFLAGS) $< @@ -31,7 +30,7 @@ $(LASTSEQ): # we assume that each Fortran file is completely independent $(CDEPEND): %.d: %.c $(MFILES) - $(CC) $(DEPFLAG) $(CFLAGS) $< $(DFFLAG) $@.$$$$; \ + $(CC) $(DEPFLAG) $(CFLAGS) $< > $@.$$$$; \ sed 's,\($*\)\.o[ :]*,\1.o $@ : ,g' < $@.$$$$ > $@; \ rm -f $@.$$$$ diff --git a/src/matvec.c b/src/matvec.c index 7f3dddf9..01dfd8d9 100644 --- a/src/matvec.c +++ b/src/matvec.c @@ -1,11 +1,11 @@ /* FILE: matvec.c * AUTH: Maxim Yurkin * DESCR: calculate local matrix vector product of decomposed interaction - * matrix with r_k or p_k, using a FFT based convolution algorithm + * matrix with rk or pk, using a FFT based convolution algorithm * * Previous version by Michel Grimminck * - * Copyright (C) 2006-2008 University of Amsterdam + * Copyright (C) 2006-2007 University of Amsterdam * This code is covered by the GNU General Public License. */ #include <stdio.h> @@ -20,331 +20,332 @@ #include "function.h" #include "io.h" -// SEMI-GLOBAL VARIABLES +/* SEMI-GLOBAL VARIABLES */ -// defined and initialized in fft.c +/* defined and initialized in fft.c */ extern const doublecomplex *Dmatrix; extern doublecomplex *Xmatrix,*slices,*slices_tr; extern const size_t DsizeY,DsizeZ,DsizeYZ; -//============================================================ +/*============================================================*/ INLINE size_t IndexSliceZY(const size_t y,const size_t z) { - return (z*gridY+y); + return (z*gridY+y); } -//============================================================ +/*============================================================*/ INLINE size_t IndexSliceYZ(const size_t y,const size_t z) { - return(y*gridZ+z); + return(y*gridZ+z); } -//============================================================ +/*============================================================*/ INLINE size_t IndexGarbledX(const size_t x,const size_t y,const size_t z) { #ifdef PARALLEL - return(((z%local_Nz)*smallY+y)*gridX+(z/local_Nz)*local_Nx+x%local_Nx); + return(((z%local_Nz)*smallY+y)*gridX+(z/local_Nz)*local_Nx+x%local_Nx); #else - return((z*smallY+y)*gridX+x); + return((z*smallY+y)*gridX+x); #endif } -//============================================================ +/*============================================================*/ INLINE size_t IndexXmatrix(const size_t x,const size_t y,const size_t z) { - return((z*smallY+y)*gridX+x); + return((z*smallY+y)*gridX+x); } -//============================================================ +/*============================================================*/ INLINE size_t IndexDmatrix_mv(size_t x,size_t y,size_t z,const int transposed) { - if (transposed) { // used only for G_SO - if (x>0) x=gridX-x; - if (y>0) y=gridY-y; - if (z>0) z=gridZ-z; - } - else { - if (y>=DsizeY) y=gridY-y; - if (z>=DsizeZ) z=gridZ-z; - } + if (transposed) { /* used only for G_SO */ + if (x>0) x=gridX-x; + if (y>0) y=gridY-y; + if (z>0) z=gridZ-z; + } + else { + if (y>=DsizeY) y=gridY-y; + if (z>=DsizeZ) z=gridZ-z; + } - return(NDCOMP*(x*DsizeYZ+z*DsizeY+y)); + return(NDCOMP*(x*DsizeYZ+z*DsizeY+y)); } -//============================================================ +/*============================================================*/ -void MatVec (doublecomplex *argvec, // the argument vector - doublecomplex *resultvec, // the result vector - double *inprod, // the resulting inner product - const int her) // 0 for non-Hermitian, 1 for Hermitian -/* This function implements both MatVec_nim and MatVecAndInp_nim. The difference is that when we - * want to calculate the inner product as well, we pass 'inprod' as a non-NULL pointer. if 'inprod' - * is NULL, we don't calculate it. 'argvec' always remains unchanged afterwards, however it is not - * strictly const - some manipulations may occur during the execution. - */ +void MatVec (doublecomplex *argvec, /* the argument vector */ + doublecomplex *resultvec, /* the result vector */ + double *inprod, /* the result inner product */ + const int her) /* 0 for non-hermitic, 1 for hermetic */ +/* This function implements both MatVec_nim and MatVecAndInp_nim. + the difference is that when we want to calculate the inproduct + as well, we pass 'inprod' as a non-null pointer. if 'inprod' is + a NULL, we don't calculate it. + argvec allways remains unchanged afterwards, however it is not + strictly const - some manipulations may occur during the execution */ { - size_t i,j; - doublecomplex fmat[6],xv[3],yv[3]; - doublecomplex temp; - size_t index,x,y,z,Xcomp; - int ipr; - unsigned char mat; - int transposed; - size_t boxY_st=boxY,boxZ_st=boxZ; // copies with different type + size_t i,j; + doublecomplex fmat[6],xv[3],yv[3]; + doublecomplex temp; + size_t index,x,y,z,Xcomp; + int ipr; + unsigned char mat; + int transposed; + size_t boxY_st=boxY,boxZ_st=boxZ; /* copies with different type */ #ifdef PRECISE_TIMING - SYSTEM_TIME tvp[18]; - SYSTEM_TIME Timing_FFTXf,Timing_FFTYf,Timing_FFTZf,Timing_FFTXb,Timing_FFTYb,Timing_FFTZb, - Timing_Mult1,Timing_Mult2,Timing_Mult3,Timing_Mult4,Timing_Mult5, - Timing_BTf,Timing_BTb,Timing_TYZf,Timing_TYZb,Timing_ipr; - double t_FFTXf,t_FFTYf,t_FFTZf,t_FFTXb,t_FFTYb,t_FFTZb, - t_Mult1,t_Mult2,t_Mult3,t_Mult4,t_Mult5,t_ipr, - t_BTf,t_BTb,t_TYZf,t_TYZb,t_Arithm,t_FFT,t_Comm; + SYSTEM_TIME tvp[18]; + SYSTEM_TIME Timing_FFTXf,Timing_FFTYf,Timing_FFTZf,Timing_FFTXb,Timing_FFTYb,Timing_FFTZb, + Timing_Mult1,Timing_Mult2,Timing_Mult3,Timing_Mult4,Timing_Mult5, + Timing_BTf,Timing_BTb,Timing_TYZf,Timing_TYZb,Timing_ipr; + double t_FFTXf,t_FFTYf,t_FFTZf,t_FFTXb,t_FFTYb,t_FFTZb, + t_Mult1,t_Mult2,t_Mult3,t_Mult4,t_Mult5,t_ipr, + t_BTf,t_BTb,t_TYZf,t_TYZb,t_Arithm,t_FFT,t_Comm; #endif -/* A = I + S.D.S - * S = sqrt(C) - * A.x = x + S.D.(S.x) - * A(H).x = x + (S(T).D(T).S(T).x(*))(*) - * C,S - diagonal => symmetric - * (!! will change if tensor (non-diagonal) polarizability is used !!) - * D - symmetric (except for G_SO) - * - * D.x=F(-1)(F(D).F(X)) - * F(D) is just a vector - * - * G_SO: F(D(T)) (k) = F(D) (-k) - * k - vector index - * - * For (her) three additional operations of nConj are used. Should not be a problem, - * but can be avoided by a more complex code. - */ + /* A = I + S.D.S + * S = sqrt(C) + * A.x = x + S.D.(S.x) + * A(H).x = x + (S(T).D(T).S(T).x(*))(*) + * C,S - diagonal => symmetric + * (!! will change if tensor (non-diagonal) polarizability is used !!) + * D - symmetric (except for G_SO) + * + * D.x=F(-1)(F(D).F(X)) + * F(D) is just a vector + * + * G_SO: F(D(T)) (k) = F(D) (-k) + * k - vector index + * + * For (her) three additional operations of nConj are used. Should not be a problem, + * but can be avoided by a more complex code. + */ - transposed=(!reduced_FFT) && her; - if (inprod) ipr=TRUE; - else ipr=FALSE; + transposed=(!reduced_FFT) && her; + if (inprod) ipr=TRUE; + else ipr=FALSE; #ifdef PRECISE_TIMING - InitTime(&Timing_FFTYf); - InitTime(&Timing_FFTZf); - InitTime(&Timing_FFTYb); - InitTime(&Timing_FFTZb); - InitTime(&Timing_Mult2); - InitTime(&Timing_Mult3); - InitTime(&Timing_Mult4); - InitTime(&Timing_TYZf); - InitTime(&Timing_TYZb); - GetTime(tvp); + InitTime(&Timing_FFTYf); + InitTime(&Timing_FFTZf); + InitTime(&Timing_FFTYb); + InitTime(&Timing_FFTZb); + InitTime(&Timing_Mult2); + InitTime(&Timing_Mult3); + InitTime(&Timing_Mult4); + InitTime(&Timing_TYZf); + InitTime(&Timing_TYZb); + GetTime(tvp); #endif - // FFT_matvec code - if (ipr) *inprod = 0.0; + /* FFT_matvec code */ + if (ipr) *inprod = 0.0; - // fill Xmatrix with 0.0 - for (i=0;i<3*local_Nsmall;i++) Xmatrix[i][RE]=Xmatrix[i][IM]=0.0; + /* fill Xmatrix with 0.0 */ + for (i=0;i<3*local_Nsmall;i++) Xmatrix[i][RE]=Xmatrix[i][IM]=0.0; - // transform from coordinates to grid and multiply with coupling constant - if (her) nConj(argvec); // conjugated back afterwards + /* transform from coordinates to grid and multiply with coupling constant */ + if (her) nConj(argvec); /* conjugated back afterwards */ - for (i=0;i<local_nvoid_Ndip;i++) { - // fill grid with argvec*sqrt_cc - j=3*i; - mat=material[i]; - index=IndexXmatrix(position[j],position[j+1],position[j+2]); - for (Xcomp=0;Xcomp<3;Xcomp++) // Xmat=cc_sqrt*argvec - cMult(cc_sqrt[mat][Xcomp],argvec[j+Xcomp],Xmatrix[index+Xcomp*local_Nsmall]); - } + for (i=0;i<local_nvoid_Ndip;i++) { + /* fill grid with argvec*sqrt_cc */ + j=3*i; + mat=material[i]; + index=IndexXmatrix(position[j],position[j+1],position[j+2]); + for (Xcomp=0;Xcomp<3;Xcomp++) /* Xmat=cc_sqrt*argvec */ + cMult(cc_sqrt[mat][Xcomp],argvec[j+Xcomp],Xmatrix[index+Xcomp*local_Nsmall]); + } #ifdef PRECISE_TIMING - GetTime(tvp+1); - elapsed(tvp,tvp+1,&Timing_Mult1); + GetTime(tvp+1); + elapsed(tvp,tvp+1,&Timing_Mult1); #endif - // FFT X - fftX(FFT_FORWARD); // fftX Xmatrix + /* FFT X */ + fftX(FFT_FORWARD); /* fftX Xmatrix */ #ifdef PRECISE_TIMING - GetTime(tvp+2); - elapsed(tvp+1,tvp+2,&Timing_FFTXf); + GetTime(tvp+2); + elapsed(tvp+1,tvp+2,&Timing_FFTXf); #endif - BlockTranspose(Xmatrix); + BlockTranspose(Xmatrix); #ifdef PRECISE_TIMING - GetTime(tvp+3); - elapsed(tvp+2,tvp+3,&Timing_BTf); + GetTime(tvp+3); + elapsed(tvp+2,tvp+3,&Timing_BTf); #endif - // following is done by slices - for(x=local_x0;x<local_x1;x++) { + /* following is done by slices */ + for(x=local_x0;x<local_x1;x++) { #ifdef PRECISE_TIMING - GetTime(tvp+4); + GetTime(tvp+4); #endif - // clear slice - for(i=0;i<3*gridYZ;i++) slices[i][RE]=slices[i][IM]=0.0; + /* clear slice */ + for(i=0;i<3*gridYZ;i++) slices[i][RE]=slices[i][IM]=0.0; - // fill slices with values from Xmatrix - for(y=0;y<boxY_st;y++) for(z=0;z<boxZ_st;z++) { - i=IndexSliceYZ(y,z); - j=IndexGarbledX(x,y,z); - for (Xcomp=0;Xcomp<3;Xcomp++) - cEqual(Xmatrix[j+Xcomp*local_Nsmall],slices[i+Xcomp*gridYZ]); - } + /* fill slices with values from Xmatrix */ + for(y=0;y<boxY_st;y++) for(z=0;z<boxZ_st;z++) { + i=IndexSliceYZ(y,z); + j=IndexGarbledX(x,y,z); + for (Xcomp=0;Xcomp<3;Xcomp++) + cEqual(Xmatrix[j+Xcomp*local_Nsmall],slices[i+Xcomp*gridYZ]); + } #ifdef PRECISE_TIMING - GetTime(tvp+5); - ElapsedInc(tvp+4,tvp+5,&Timing_Mult2); + GetTime(tvp+5); + ElapsedInc(tvp+4,tvp+5,&Timing_Mult2); #endif - // fftZ&Y - fftZ(FFT_FORWARD); // fftZ slices + /* fftZ&Y */ + fftZ(FFT_FORWARD); /* fftZ slices */ #ifdef PRECISE_TIMING - GetTime(tvp+6); - ElapsedInc(tvp+5,tvp+6,&Timing_FFTZf); + GetTime(tvp+6); + ElapsedInc(tvp+5,tvp+6,&Timing_FFTZf); #endif - TransposeYZ(FFT_FORWARD); + TransposeYZ(FFT_FORWARD); #ifdef PRECISE_TIMING - GetTime(tvp+7); - ElapsedInc(tvp+6,tvp+7,&Timing_TYZf); + GetTime(tvp+7); + ElapsedInc(tvp+6,tvp+7,&Timing_TYZf); #endif - fftY(FFT_FORWARD); // fftY slices_tr + fftY(FFT_FORWARD); /* fftY slices_tr */ #ifdef PRECISE_TIMING - GetTime(tvp+8); - ElapsedInc(tvp+7,tvp+8,&Timing_FFTYf); + GetTime(tvp+8); + ElapsedInc(tvp+7,tvp+8,&Timing_FFTYf); #endif - // do the product D~*X~ - for(z=0;z<gridZ;z++) for(y=0;y<gridY;y++) { - i=IndexSliceZY(y,z); - for (Xcomp=0;Xcomp<3;Xcomp++) - cEqual(slices_tr[i+Xcomp*gridYZ],xv[Xcomp]); + /* do the product D~*X~ */ + for(z=0;z<gridZ;z++) for(y=0;y<gridY;y++) { + i=IndexSliceZY(y,z); + for (Xcomp=0;Xcomp<3;Xcomp++) + cEqual(slices_tr[i+Xcomp*gridYZ],xv[Xcomp]); - j=IndexDmatrix_mv(x-local_x0,y,z,transposed); - memcpy(fmat,Dmatrix[j],6*sizeof(doublecomplex)); - if (reduced_FFT) { - if (y>smallY) { - cInvSign(fmat[1]); // fmat[1]*=-1 - if (z>smallZ) cInvSign(fmat[2]); // fmat[2]*=-1 - else cInvSign(fmat[4]); // fmat[4]*=-1 - } - else if (z>smallZ) { - cInvSign(fmat[2]); // fmat[2]*=-1 - cInvSign(fmat[4]); // fmat[4]*=-1 - } - } - cSymMatrVec(fmat,xv,yv); // yv=fmat*xv - for (Xcomp=0;Xcomp<3;Xcomp++) - cEqual(yv[Xcomp],slices_tr[i+Xcomp*gridYZ]); - } + j=IndexDmatrix_mv(x-local_x0,y,z,transposed); + memcpy(fmat,Dmatrix[j],6*sizeof(doublecomplex)); + if (reduced_FFT) { + if (y>smallY) { + cInvSign(fmat[1]); /* fmat[1]*=-1 */ + if (z>smallZ) cInvSign(fmat[2]); /* fmat[2]*=-1 */ + else cInvSign(fmat[4]); /* fmat[4]*=-1 */ + } + else if (z>smallZ) { + cInvSign(fmat[2]); /* fmat[2]*=-1 */ + cInvSign(fmat[4]); /* fmat[4]*=-1 */ + } + } + cSymMatrVec(fmat,xv,yv); /* yv=fmat*xv */ + for (Xcomp=0;Xcomp<3;Xcomp++) + cEqual(yv[Xcomp],slices_tr[i+Xcomp*gridYZ]); + } #ifdef PRECISE_TIMING - GetTime(tvp+9); - ElapsedInc(tvp+8,tvp+9,&Timing_Mult3); + GetTime(tvp+9); + ElapsedInc(tvp+8,tvp+9,&Timing_Mult3); #endif - // fft_invY&Z - fftY(FFT_BACKWARD); // fftY slices_tr + /* fft_invY&Z */ + fftY(FFT_BACKWARD); /* fftY slices_tr */ #ifdef PRECISE_TIMING - GetTime(tvp+10); - ElapsedInc(tvp+9,tvp+10,&Timing_FFTYb); + GetTime(tvp+10); + ElapsedInc(tvp+9,tvp+10,&Timing_FFTYb); #endif - TransposeYZ(FFT_BACKWARD); + TransposeYZ(FFT_BACKWARD); #ifdef PRECISE_TIMING - GetTime(tvp+11); - ElapsedInc(tvp+10,tvp+11,&Timing_TYZb); + GetTime(tvp+11); + ElapsedInc(tvp+10,tvp+11,&Timing_TYZb); #endif - fftZ(FFT_BACKWARD); // fftZ slices + fftZ(FFT_BACKWARD); /* fftZ slices */ #ifdef PRECISE_TIMING - GetTime(tvp+12); - ElapsedInc(tvp+11,tvp+12,&Timing_FFTZb); + GetTime(tvp+12); + ElapsedInc(tvp+11,tvp+12,&Timing_FFTZb); #endif - // copy slice back to Xmatrix - for(y=0;y<boxY_st;y++) for(z=0;z<boxZ_st;z++) { - i=IndexSliceYZ(y,z); - j=IndexGarbledX(x,y,z); - for (Xcomp=0;Xcomp<3;Xcomp++) - cEqual(slices[i+Xcomp*gridYZ],Xmatrix[j+Xcomp*local_Nsmall]); - } + /* copy slice back to Xmatrix */ + for(y=0;y<boxY_st;y++) for(z=0;z<boxZ_st;z++) { + i=IndexSliceYZ(y,z); + j=IndexGarbledX(x,y,z); + for (Xcomp=0;Xcomp<3;Xcomp++) + cEqual(slices[i+Xcomp*gridYZ],Xmatrix[j+Xcomp*local_Nsmall]); + } #ifdef PRECISE_TIMING - GetTime(tvp+13); - ElapsedInc(tvp+12,tvp+13,&Timing_Mult4); + GetTime(tvp+13); + ElapsedInc(tvp+12,tvp+13,&Timing_Mult4); #endif - } // end of loop over slices - // FFT-X back the result - BlockTranspose(Xmatrix); + } /* end of loop over slices */ + /* FFT-X back the result */ + BlockTranspose(Xmatrix); #ifdef PRECISE_TIMING - GetTime(tvp+14); - elapsed(tvp+13,tvp+14,&Timing_BTb); + GetTime(tvp+14); + elapsed(tvp+13,tvp+14,&Timing_BTb); #endif - fftX(FFT_BACKWARD); // fftX Xmatrix + fftX(FFT_BACKWARD); /* fftX Xmatrix */ #ifdef PRECISE_TIMING - GetTime(tvp+15); - elapsed(tvp+14,tvp+15,&Timing_FFTXb); + GetTime(tvp+15); + elapsed(tvp+14,tvp+15,&Timing_FFTXb); #endif - // fill resultvec - for (i=0;i<local_nvoid_Ndip;i++) { - j=3*i; - mat=material[i]; - index=IndexXmatrix(position[j],position[j+1],position[j+2]); - for (Xcomp=0;Xcomp<3;Xcomp++) { - cMult(cc_sqrt[mat][Xcomp],Xmatrix[index+Xcomp*local_Nsmall],temp); - cAdd(argvec[j+Xcomp],temp,resultvec[j+Xcomp]); // result=argvec+cc_sqrt*Xmat - } - // norm is unaffected by conjugation, hence can be computed here - if (ipr) *inprod+=cvNorm2(resultvec+j); - } - if (her) { - nConj(resultvec); - nConj(argvec); // conjugate back argvec, so it remains unchanged after MatVec - } + /* fill resultvec */ + for (i=0;i<local_nvoid_Ndip;i++) { + j=3*i; + mat=material[i]; + index=IndexXmatrix(position[j],position[j+1],position[j+2]); + for (Xcomp=0;Xcomp<3;Xcomp++) { + cMult(cc_sqrt[mat][Xcomp],Xmatrix[index+Xcomp*local_Nsmall],temp); + cAdd(argvec[j+Xcomp],temp,resultvec[j+Xcomp]); /* result=argvec+cc_sqrt*Xmat */ + } + /* norm is unaffected by conjugation, hence can be computed here */ + if (ipr) *inprod+=cvNorm2(resultvec+j); + } + if (her) { + nConj(resultvec); + nConj(argvec); /* conjugate back argvec, so it remains unchanged after MatVec */ + } #ifdef PRECISE_TIMING - GetTime(tvp+16); - elapsed(tvp+15,tvp+16,&Timing_Mult5); + GetTime(tvp+16); + elapsed(tvp+15,tvp+16,&Timing_Mult5); #endif - if (ipr) MyInnerProduct(inprod,double_type,1,&Timing_OneIterComm); + if (ipr) MyInnerProduct(inprod,double_type,1,&Timing_OneIterComm); #ifdef PRECISE_TIMING - GetTime(tvp+17); - elapsed(tvp+16,tvp+17,&Timing_ipr); + GetTime(tvp+17); + elapsed(tvp+16,tvp+17,&Timing_ipr); - SetTimerFreq(); - t_Mult1=TimerToSec(&Timing_Mult1); - t_Mult2=TimerToSec(&Timing_Mult2); - t_Mult3=TimerToSec(&Timing_Mult3); - t_Mult4=TimerToSec(&Timing_Mult4); - t_Mult5=TimerToSec(&Timing_Mult5); - t_TYZf=TimerToSec(&Timing_TYZf); - t_TYZb=TimerToSec(&Timing_TYZb); - t_BTf=TimerToSec(&Timing_BTf); - t_BTb=TimerToSec(&Timing_BTb); - t_FFTXf=TimerToSec(&Timing_FFTXf); - t_FFTXb=TimerToSec(&Timing_FFTXb); - t_FFTYf=TimerToSec(&Timing_FFTYf); - t_FFTYb=TimerToSec(&Timing_FFTYb); - t_FFTZf=TimerToSec(&Timing_FFTZf); - t_FFTZb=TimerToSec(&Timing_FFTZb); - t_ipr=TimerToSec(&Timing_ipr); + SetTimerFreq(); + t_Mult1=TimerToSec(&Timing_Mult1); + t_Mult2=TimerToSec(&Timing_Mult2); + t_Mult3=TimerToSec(&Timing_Mult3); + t_Mult4=TimerToSec(&Timing_Mult4); + t_Mult5=TimerToSec(&Timing_Mult5); + t_TYZf=TimerToSec(&Timing_TYZf); + t_TYZb=TimerToSec(&Timing_TYZb); + t_BTf=TimerToSec(&Timing_BTf); + t_BTb=TimerToSec(&Timing_BTb); + t_FFTXf=TimerToSec(&Timing_FFTXf); + t_FFTXb=TimerToSec(&Timing_FFTXb); + t_FFTYf=TimerToSec(&Timing_FFTYf); + t_FFTYb=TimerToSec(&Timing_FFTYb); + t_FFTZf=TimerToSec(&Timing_FFTZf); + t_FFTZb=TimerToSec(&Timing_FFTZb); + t_ipr=TimerToSec(&Timing_ipr); - t_Arithm=t_Mult1+t_Mult2+t_Mult3+t_Mult4+t_Mult5+t_TYZf+t_TYZb; - t_FFT=t_FFTXf+t_FFTYf+t_FFTZf+t_FFTXb+t_FFTYb+t_FFTZb; - t_Comm=t_BTf+t_BTb+t_ipr; + t_Arithm=t_Mult1+t_Mult2+t_Mult3+t_Mult4+t_Mult5+t_TYZf+t_TYZb; + t_FFT=t_FFTXf+t_FFTYf+t_FFTZf+t_FFTXb+t_FFTYb+t_FFTZb; + t_Comm=t_BTf+t_BTb+t_ipr; - PRINTBOTHZ(logfile, - "~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n" - " MatVec timing \n" - "~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n" - "Arith1 = %4.4f Arithmetics = %4.4f\n" - "FFTXf = %4.4f FFT = %4.4f\n" - "BTf = %4.4f Comm = %4.4f\n" - "Arith2 = %4.4f\n" - "FFTZf = %4.4f Total = %4.4f\n" - "TYZf = %4.4f\n" - "FFTYf = %4.4f\n" - "Arith3 = %4.4f\n" - "FFTYb = %4.4f\n" - "TYZb = %4.4f\n" - "FFTZb = %4.4f\n" - "Arith4 = %4.4f\n" - "BTb = %4.4f\n" - "FFTXb = %4.4f\n" - "Arith5 = %4.4f\n" - "InProd = %4.4f\n\n", - t_Mult1,t_Arithm,t_FFTXf,t_FFT,t_BTf,t_Comm,t_Mult2, - t_FFTZf,DiffSec(tvp,tvp+16),t_TYZf,t_FFTYf,t_Mult3,t_FFTYb,t_TYZb,t_FFTZb, - t_Mult4,t_BTb,t_FFTXb,t_Mult5,t_ipr); - PRINTZ("\nPrecise timing is complete. Finishing execution.\n"); - Stop(0); + PRINTBOTHZ(logfile, + "~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n"\ + " MatVec timing \n"\ + "~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n"\ + "Arith1 = %4.4f Arithmetics = %4.4f\n"\ + "FFTXf = %4.4f FFT = %4.4f\n"\ + "BTf = %4.4f Comm = %4.4f\n"\ + "Arith2 = %4.4f\n"\ + "FFTZf = %4.4f Total = %4.4f\n"\ + "TYZf = %4.4f\n"\ + "FFTYf = %4.4f\n"\ + "Arith3 = %4.4f\n"\ + "FFTYb = %4.4f\n"\ + "TYZb = %4.4f\n"\ + "FFTZb = %4.4f\n"\ + "Arith4 = %4.4f\n"\ + "BTb = %4.4f\n"\ + "FFTXb = %4.4f\n"\ + "Arith5 = %4.4f\n"\ + "InProd = %4.4f\n\n", + t_Mult1,t_Arithm,t_FFTXf,t_FFT,t_BTf,t_Comm,t_Mult2, + t_FFTZf,DiffSec(tvp,tvp+16),t_TYZf,t_FFTYf,t_Mult3,t_FFTYb,t_TYZb,t_FFTZb, + t_Mult4,t_BTb,t_FFTXb,t_Mult5,t_ipr); + PRINTZ("\nPrecise timing is complete. Finishing execution.\n"); + Stop(0); #endif } diff --git a/src/memory.c b/src/memory.c index ddb020cd..c5db3f1c 100644 --- a/src/memory.c +++ b/src/memory.c @@ -6,7 +6,7 @@ * * Previous versions by Alfons Hoekstra * - * Copyright (C) 2006-2008 University of Amsterdam + * Copyright (C) 2006-2007 University of Amsterdam * This code is covered by the GNU General Public License. */ #include <stdio.h> @@ -18,248 +18,236 @@ #include "const.h" #ifdef FFTW3 -# include <fftw3.h> // for fftw_malloc +# include <fftw3.h> /* for fftw_malloc */ #endif -// common error check +/* common error check */ #define MALLOC_ERROR LogError(EC_ERROR,who,fname,line,"Could not malloc %s",name) #define CHECK_NULL(size,v) if ((size)!=0 && (v)==NULL) MALLOC_ERROR #define CHECK_SIZE(size,type) if ((SIZE_MAX/sizeof(type))<(size)) MALLOC_ERROR #define IF_FREE(v) if((v)!=NULL) free(v) #define OVERFLOW LogError(EC_ERROR,who,fname,line,"Integer overflow in '%s'",name); -//============================================================ +/*============================================================*/ void CheckOverflow(const double size,OTHER_ARGUMENTS) -// checks if size can fit into size_t type, otherwise overflow will happen before memory allocation + /* checks if size can fit into size_t type, + otherwise overflow will happen before memory allocation */ { - if (size>SIZE_MAX) OVERFLOW; + if (size>SIZE_MAX) OVERFLOW; } -//============================================================ +/*============================================================*/ size_t MultOverflow(const size_t a,const size_t b,OTHER_ARGUMENTS) -// multiplies two integers and checks for overflow + /* multiplies two integers and checks for overflow */ { - if ((SIZE_MAX/a)<b) OVERFLOW; - return(a*b); + if ((SIZE_MAX/a)<b) OVERFLOW; + return(a*b); } -//============================================================ +/*============================================================*/ doublecomplex *complexVector(const size_t size,OTHER_ARGUMENTS) -// allocates complex vector + /* allocates complex vector */ { - doublecomplex *v; + doublecomplex *v; - CHECK_SIZE(size,doublecomplex); + CHECK_SIZE(size,doublecomplex); #ifdef FFTW3 - v=(doublecomplex *)fftw_malloc(size*sizeof(doublecomplex)); + v=(doublecomplex *)fftw_malloc(size*sizeof(doublecomplex)); #else - v=(doublecomplex *)malloc(size*sizeof(doublecomplex)); + v=(doublecomplex *)malloc(size*sizeof(doublecomplex)); #endif - CHECK_NULL(size,v); - return v; + CHECK_NULL(size,v); + return v; } -//============================================================ +/*============================================================*/ double **doubleMatrix(const size_t rows,const size_t cols,OTHER_ARGUMENTS) -// allocates double matrix (rows x cols) + /* allocates double matrix (rows x cols) */ { - register size_t i; - double **m; - - CHECK_SIZE(rows,double *); - CHECK_SIZE(cols,double); - m=(double **)malloc(rows*sizeof(double *)); - CHECK_NULL(rows,m); - for (i=0;i<rows;i++) { - m[i]=(double *)malloc(cols*sizeof(double)); - CHECK_NULL(cols,m[i]); - } - return m; -} - -//============================================================ - -double *doubleVector(const size_t size,OTHER_ARGUMENTS) -// allocates double vector -{ - double *v; + register size_t i; + double **m; - CHECK_SIZE(size,double); - v=(double *)malloc(size*sizeof(double)); - CHECK_NULL(size,v); - return v; + CHECK_SIZE(rows,double *); + CHECK_SIZE(cols,double); + m=(double **)malloc(rows*sizeof(double *)); + CHECK_NULL(rows,m); + for (i=0;i<rows;i++) { + m[i]=(double *)malloc(cols*sizeof(double)); + CHECK_NULL(cols,m[i]); + } + return m; } -//============================================================ +/*============================================================*/ -double *doubleRealloc(double *ptr,const size_t size,OTHER_ARGUMENTS) -// reallocates double vector ptr to a larger size +double *doubleVector(const size_t size,OTHER_ARGUMENTS) + /* allocates double vector */ { - double *v; + double *v; - CHECK_SIZE(size,double); - v=(double *)realloc(ptr,size*sizeof(double)); - CHECK_NULL(size,v); - return v; + CHECK_SIZE(size,double); + v=(double *)malloc(size*sizeof(double)); + CHECK_NULL(size,v); + return v; } -//============================================================ +/*============================================================*/ double *doubleVector2(const size_t nl,const size_t nh,OTHER_ARGUMENTS) -// allocates double vector with indices from nl to nh; all arguments must be non-negative and nh>=nl + /* allocates double vector with indices from nl to nh; all arguments must be non-negative; + and nh>=nl */ { - double *v; - size_t size; - - if (nh<nl || nh-nl==SIZE_MAX) MALLOC_ERROR; - else size=nh-nl+1; - CHECK_SIZE(size,double); - v=(double *)malloc(size*sizeof(double)); - CHECK_NULL(size,v); - v-=nl; - return v; + double *v; + size_t size; + + if (nh<nl || nh-nl==SIZE_MAX) MALLOC_ERROR; + else size=nh-nl+1; + CHECK_SIZE(size,double); + v=(double *)malloc(size*sizeof(double)); + CHECK_NULL(size,v); + v-=nl; + return v; } -//============================================================ +/*============================================================*/ int **intMatrix(const size_t nrl,const size_t nrh,const size_t ncl,const size_t nch,OTHER_ARGUMENTS) -/* allocates integer matrix with indices [nrl,nrh]x[ncl,nch]; all arguments must be non-negative; - * and nrh>=nrl; nch>=ncl - */ -{ - register size_t i; - size_t rows,cols; - int **m; - - if (nrh<nrl || nrh-nrl==SIZE_MAX) MALLOC_ERROR; - else rows=nrh-nrl+1; - if (nch<ncl || nch-ncl==SIZE_MAX) MALLOC_ERROR; - else cols=nch-ncl+1; - CHECK_SIZE(rows,int *); - CHECK_SIZE(cols,int); - m=(int **)malloc(rows*sizeof(int *)); - CHECK_NULL(rows,m); - m-=nrl; - for (i=nrl;i<=nrh;i++) { - m[i]=(int *)malloc(cols*sizeof(int)); - CHECK_NULL(cols,m[i]); - m[i]-=ncl; - } - return m; -} - -//============================================================ + /* allocates integer matrix with indices [nrl,nrh]x[ncl,nch]; all arguments must be non-negative; + and nrh>=nrl; nch>=ncl */ +{ + register size_t i; + size_t rows,cols; + int **m; + + if (nrh<nrl || nrh-nrl==SIZE_MAX) MALLOC_ERROR; + else rows=nrh-nrl+1; + if (nch<ncl || nch-ncl==SIZE_MAX) MALLOC_ERROR; + else cols=nch-ncl+1; + CHECK_SIZE(rows,int *); + CHECK_SIZE(cols,int); + m=(int **)malloc(rows*sizeof(int *)); + CHECK_NULL(rows,m); + m-=nrl; + for (i=nrl;i<=nrh;i++) { + m[i]=(int *)malloc(cols*sizeof(int)); + CHECK_NULL(cols,m[i]); + m[i]-=ncl; + } + return m; +} + +/*============================================================*/ int *intVector(const size_t size,OTHER_ARGUMENTS) -// allocates integer vector + /* allocates integer vector */ { - int *v; + int *v; - CHECK_SIZE(size,int); - v=(int *)malloc(size*sizeof(int)); - CHECK_NULL(size,v); - return v; + CHECK_SIZE(size,int); + v=(int *)malloc(size*sizeof(int)); + CHECK_NULL(size,v); + return v; } -//============================================================ +/*============================================================*/ unsigned short *ushortVector(const size_t size,OTHER_ARGUMENTS) -// allocates unsigned short vector + /* allocates unsigned short vector */ { - unsigned short *v; + unsigned short *v; - CHECK_SIZE(size,short); - v=(unsigned short *)malloc(size*sizeof(short)); - CHECK_NULL(size,v); - return v; + CHECK_SIZE(size,short); + v=(unsigned short *)malloc(size*sizeof(short)); + CHECK_NULL(size,v); + return v; } -//============================================================ +/*============================================================*/ char *charVector(const size_t size,OTHER_ARGUMENTS) -// allocates unsigned char vector + /* allocates unsigned char vector */ { - char *v; + char *v; - v=(char *)malloc(size); - CHECK_NULL(size,v); - return v; + v=(char *)malloc(size); + CHECK_NULL(size,v); + return v; } -//============================================================ +/*============================================================*/ unsigned char *ucharVector(const size_t size,OTHER_ARGUMENTS) -// allocates unsigned char vector + /* allocates unsigned char vector */ { - unsigned char *v; + unsigned char *v; - v=(unsigned char *)malloc(size); - CHECK_NULL(size,v); - return v; + v=(unsigned char *)malloc(size); + CHECK_NULL(size,v); + return v; } -//============================================================ +/*============================================================*/ void *voidVector(const size_t size,OTHER_ARGUMENTS) -// allocates void vector + /* allocates void vector */ { - void *v; + void *v; - v=malloc(size); - CHECK_NULL(size,v); - return v; + v=malloc(size); + CHECK_NULL(size,v); + return v; } -//============================================================ +/*============================================================*/ void Free_cVector (doublecomplex *v) -// frees complex vector + /* frees complex vector */ { #ifdef FFTW3 - if (v!=NULL) fftw_free(v); + if (v!=NULL) fftw_free(v); #else - IF_FREE(v); + IF_FREE(v); #endif } -//============================================================ +/*============================================================*/ void Free_dMatrix(double **m,const size_t rows) -// frees double matrix (rows x cols) + /* frees double matrix (rows x cols) */ { - register size_t i; + register size_t i; - for (i=0;i<rows;i++) IF_FREE(m[i]); - IF_FREE(m); + for (i=0;i<rows;i++) IF_FREE(m[i]); + IF_FREE(m); } -//============================================================ +/*============================================================*/ void Free_dVector2(double *v,const size_t nl) -// frees double vector with indices from nl; all arguments must be non-negative + /* frees double vector with indices from nl; all arguments must be non-negative */ { - IF_FREE(v+nl); + IF_FREE(v+nl); } -//============================================================ +/*============================================================*/ void Free_iMatrix(int **m,const size_t nrl,const size_t nrh,const size_t ncl) -// frees integer matrix with indices [nrl,nrh]x[ncl,...]; all arguments must be non-negative + /* frees integer matrix with indices [nrl,nrh]x[ncl,...]; all arguments must be non-negative */ { - register size_t i; + register size_t i; - for (i=nrh;i>=nrl;i--) IF_FREE(m[i]+ncl); - IF_FREE(m+nrl); + for (i=nrh;i>=nrl;i--) IF_FREE(m[i]+ncl); + IF_FREE(m+nrl); } -//============================================================ +/*============================================================*/ void Free_general(void *v) -// frees general vector; kept in a special function for future development + /* frees general vector; kept in a special function for future development */ { - IF_FREE(v); + IF_FREE(v); } diff --git a/src/memory.h b/src/memory.h index f3aee821..6f17959a 100644 --- a/src/memory.h +++ b/src/memory.h @@ -4,22 +4,22 @@ * memory allocation and freeing * also includes overflows checks * - * Copyright (C) 2006-2008 University of Amsterdam + * Copyright (C) 2006-2007 University of Amsterdam * This code is covered by the GNU General Public License. */ #ifndef __memory_h #define __memory_h -#include <stddef.h> // for size_t -#include "function.h" // for function attributes +#include <stddef.h> /* for size_t */ +#include "function.h" /* for function attributes */ #define MBYTE 1048576.0 -// for conciseness +/* for conciseness */ #define OTHER_ARGUMENTS const int who,const char *fname,const int line,const char *name void CheckOverflow(double size,OTHER_ARGUMENTS); size_t MultOverflow(size_t a,size_t b,OTHER_ARGUMENTS); -// allocate +/* allocate */ doublecomplex *complexVector(size_t size,OTHER_ARGUMENTS) ATT_MALLOC; double **doubleMatrix(size_t rows,size_t cols,OTHER_ARGUMENTS) ATT_MALLOC; double *doubleVector(size_t size,OTHER_ARGUMENTS) ATT_MALLOC; @@ -30,20 +30,17 @@ unsigned short *ushortVector(size_t size,OTHER_ARGUMENTS) ATT_MALLOC; char *charVector(size_t size,OTHER_ARGUMENTS) ATT_MALLOC; unsigned char *ucharVector(size_t size,OTHER_ARGUMENTS) ATT_MALLOC; void *voidVector(size_t size,OTHER_ARGUMENTS) ATT_MALLOC; -// reallocate -double *doubleRealloc(double *ptr,const size_t size,OTHER_ARGUMENTS) ATT_MALLOC; -// free +/* free */ void Free_cVector(doublecomplex *v); void Free_dMatrix(double **m,size_t rows); void Free_dVector2(double *v,size_t nl); void Free_iMatrix(int **m,size_t nrl,size_t nrh,size_t ncl); void Free_general(void *v); -// macros to use for allocation and reallocation +/* macros to use for allocation */ #define MALLOC_VECTOR(vec,type,size,who) vec=type##Vector(size,who,POSIT,#vec) #define MALLOC_DVECTOR2(vec,nl,nh,who) vec=doubleVector2(nl,nh,who,POSIT,#vec) #define MALLOC_DMATRIX(vec,rows,cols,who) vec=doubleMatrix(rows,cols,who,POSIT,#vec) #define MALLOC_IMATRIX(vec,nrl,nrh,ncl,nch,who) vec=intMatrix(nrl,nrh,ncl,nch,who,POSIT,#vec) -#define REALLOC_DVECTOR(vec,size,who) vec=doubleRealloc(vec,size,who,POSIT,#vec) -#endif //__memory_h +#endif /*__memory_h*/ diff --git a/src/os.h b/src/os.h index 6de078eb..2e53d89e 100644 --- a/src/os.h +++ b/src/os.h @@ -2,23 +2,21 @@ * AUTH: Maxim Yurkin * DESCR: determines which operation system is used * - * Copyright (C) 2006-2008 University of Amsterdam + * Copyright (C) 2006-2007 University of Amsterdam * This code is covered by the GNU General Public License. */ #ifndef __os_h #define __os_h -/* If neither WINDOWS nor POSIX is found, some parts of the program, such as precise timing and - * file locking, will fail to compile - */ +/* If neither WINDOWS nor POSIX is found, some parts of the program, such as Precise Timing and + File Locking, will fail to compile */ #ifdef _WIN32 -# define WINDOWS -# include <windows.h> // all windows functions need this +# define WINDOWS +# include <windows.h> /* all windows functions need this */ /* this list is not exhaustive. gcc always defines __POSIX__ on POSIX-compliant systems, - * however other compilers do not necessarily do the same. You may define it manually - */ + however other compilers do not necessarily do the same. You may define it manually */ #elif defined(__POSIX__) || defined(unix) || defined (__unix) || defined (__unix__) -# define POSIX +# define POSIX #endif -#endif // __os_h +#endif /*__os_h*/ diff --git a/src/param.c b/src/param.c index ed5b7bad..a7493bb6 100644 --- a/src/param.c +++ b/src/param.c @@ -25,189 +25,181 @@ #include "function.h" #include "parbas.h" -// definitions for file locking +/* definitions for file locking */ #ifdef USE_LOCK -# ifdef WINDOWS -# define FILEHANDLE HANDLE -# elif defined(POSIX) -# include <unistd.h> -# include <fcntl.h> -# ifdef LOCK_FOR_NFS -# include <errno.h> // for error handling of fcntl call -# endif -# define FILEHANDLE int -# else -# error *** Unknown operation system. Creation of lock files is not supported. *** -# endif -# define LOCK_WAIT 1 // in seconds -# define MAX_LOCK_WAIT_CYCLES 60 +# ifdef WINDOWS +# define FILEHANDLE HANDLE +# elif defined(POSIX) +# include <unistd.h> +# include <fcntl.h> +# ifdef LOCK_FOR_NFS +# include <errno.h> /* for error handling of fcntl call */ +# endif +# define FILEHANDLE int +# else +# error *** Unknown operation system. Creation of lock files is not supported. *** +# endif +# define LOCK_WAIT 1 /* in seconds */ +# define MAX_LOCK_WAIT_CYCLES 60 #else -# define FILEHANDLE int +# define FILEHANDLE int #endif -// GLOBAL VARIABLES +/* GLOBAL VARIABLES */ -opt_index opt; // main option index +opt_index opt; /* main option index */ -// SEMI-GLOBAL VARIABLES +/* SEMI-GLOBAL VARIABLES */ -// defined and initialized in crosssec.c +/* defined and initialized in crosssec.c */ extern const char avg_string[]; -// defined and initialized in GenerateB.c +/* defined and initialized in GenerateB.c */ extern const char beam_descr[]; -// defined and initialized in make_particle.c +/* defined and initialized in make_particle.c */ extern const int volcor_used; extern const char sh_form_str[]; extern const int gr_N; extern const double gr_vf_real; extern const double mat_count[]; -// used in CalculateE.c -int store_int_field; // save full internal fields to text file -int store_dip_pol; // save dipole polarizations to text file -int store_beam; // save incident beam to file -int store_scat_grid; // Store the scattered field for grid of angles -int calc_Cext; // Calculate the extinction cross-section - always do -int calc_Cabs; // Calculate the absorption cross-section - always do -int calc_Csca; // Calculate the scattering cross-section by integration -int calc_vec; // Calculate the unnormalized asymmetry-parameter -int calc_asym; // Calculate the asymmetry-parameter -int calc_mat_force; // Calculate the scattering force by matrix-evaluation -int store_force; // Write radiation pressure per dipole to file +/* used in CalculateE.c */ +int store_int_field; /* save full internal fields to text file */ +int store_dip_pol; /* save dipole polarizations to text file */ +int store_beam; /* save incident beam to file */ +int store_scat_grid; /* Store the scattered field for grid of angles */ +int calc_Cext; /* Calculate the extinction cross-section - allways do */ +int calc_Cabs; /* Calculate the absorption cross-section - allways do */ +int calc_Csca; /* Calculate the scattering cross-section by integration */ +int calc_vec; /* Calculate the unnormalized asymmetry-parameter */ +int calc_asym; /* Calculate the asymmetry-parameter */ +int calc_mat_force; /* Calculate the scattering force by matrix-evaluation */ +int store_force; /* Write radiation pressure per dipole to file */ int phi_int_type; /* type of phi integration (each bit determines - * whether to calculate with different multipliers) - */ -// used in calculator.c -int avg_inc_pol; // whether to average CC over incident polarization -char alldir_parms[MAX_FNAME]; // name of file with alldir parameters -char scat_grid_parms[MAX_FNAME]; // name of file with parameters of scattering grid -// used in crosssec.c -double prop_0[3]; // initial incident direction (in laboratory reference frame) -double incPolX_0[3],incPolY_0[3]; // initial incident polarizations (in lab RF) -int ScatRelation; // type of formulae for scattering quantities -// used in GenerateB.c + whether to calculate with different multipliers) */ +/* used in calculator.c */ +int avg_inc_pol; /* whether to average CC over incident polarization */ +char alldir_parms[MAX_FNAME]; /* name of file with alldir parameters */ +char scat_grid_parms[MAX_FNAME]; /* name of file with parameters of scattering grid */ +/* used in crosssec.c */ +double prop_0[3]; /* initial incident direction (in laboratory reference frame) */ +double incPolX_0[3],incPolY_0[3]; /* initial incident polarizations (in lab RF)*/ +int ScatRelation; /* type of formulae for scattering quantities */ +/* used in GenerateB.c */ int beam_Npars; -double beam_pars[MAX_N_BEAM_PARMS]; // beam parameters -// used in io.c -char logname[MAX_FNAME]=""; // name of logfile -// used in iterative.c -double eps; // relative error to reach -// used in make_particle.c -int shape; // particle shape definition -int sh_Npars; // number of shape parameters -double sh_pars[MAX_N_SH_PARMS]; // storage for shape parameters -int sym_type; // how to treat particle symmetries -double sizeX; // size of particle along x-axis -double dpl; // number of dipoles per lambda (wavelength) -double lambda; // incident wavelength (in vacuum) -int jagged; // size of big dipoles, used to construct a particle -char shape_fname[MAX_FNAME]; // name of file, defining the shape -char save_geom_fname[MAX_FNAME]; // geometry file name to save dipole configuration -char shapename[MAX_LINE]; // name of the used shape -int volcor; // whether to use volume correction -int save_geom; // whether to save dipole configuration in .geom file -opt_index opt_sh; // option index of shape option used -double gr_vf; // granules volume fraction -double gr_d; // granules diameter -int gr_mat; // domain number to granulate -double a_eq; // volume-equivalent radius of the particle -int sg_format; // format for saving geometry files -int store_grans; // whether to save granule positions to file - -// LOCAL VARIABLES - -static char run_name[MAX_WORD]; // first part of the dir name ('run' or 'test') -static char avg_parms[MAX_FNAME]; // name of file with orientation averaging parameters -static char *exename; // name of executable (adda, adda.exe, adda_mpi,...) -static int Nmat_given; // number of refractive indices given in the command line -// structure definitions +double beam_pars[MAX_N_BEAM_PARMS]; /* beam parameters */ +/* used in io.c */ +char logname[MAX_FNAME]=""; /* name of logfile */ +/* used in iterative.c */ +double eps; /* relative error to reach */ +/* used in make_particle.c */ +int shape; /* particle shape definition */ +int sh_Npars; /* number of shape parameters */ +double sh_pars[MAX_N_SH_PARMS]; /* storage for shape parameters */ +int sym_type; /* how to treat particle symmetries */ +double sizeX; /* size of particle along x-axis */ +double dpl; /* number of dipoles per lambda (wavelength) */ +double lambda; /* incident wavelength (in vacuum) */ +int jagged; /* size of big dipoles, used to construct a particle */ +char aggregate_file[MAX_FNAME]; /* name of aggregate file */ +char save_geom_fname[MAX_FNAME]; /* geometry file name to save dipole configuration */ +char shapename[MAX_LINE]; /* name of the shape used */ +int volcor; /* whether to use volume correction */ +int save_geom; /* whether to save dipole configuration in .geom file */ +opt_index opt_sh; /* option index of shape option used */ +double gr_vf; /* granules volume fraction */ +double gr_d; /* granules diameter */ +int gr_mat; /* domain number to granulate */ +double a_eq; /* volume-equivalent radius of the particle */ +int sg_format; /* format for saving geometry files */ + +/* LOCAL VARIABLES */ + +static char run_name[MAX_WORD]; /* first part of the dir name ('run' or 'test') */ +static char avg_parms[MAX_FNAME]; /* name of file with orientation averaging parameters */ +static char *exename; /* name of executable (adda or adda.exe) */ +static int Nmat_given; /* number of refractive indices given in the command line */ + /* structure definitions */ struct subopt_struct { - const char *name; // name of option - const char *usage; // how to use (argument list) - const char *help; // help string - const int narg; /* possible number of arguments; UNDEF -> should not be checked; - * may contain also some special negative codes, like FNAME_ARG - */ - const int type; // type of suboption + const char *name; /* name of option */ + const char *usage; /* how to use (argument list) */ + const char *help; /* help string */ + const int narg; /* possible number of argumetns ; UNDEF -> should not be checked */ + const int type; /* type of suboption */ }; struct opt_struct { - const char *name; // name of option - void (*func)(int Narg,char **argv); // pointer to a function, that parse this parameter - int used; // flag to indicate, if the option was already used - const char *usage; // how to use (argument list) - const char *help; // help string - const int narg; // possible number of arguments; UNDEF -> should not be checked - const struct subopt_struct *sub; // suboptions + const char *name; /* name of option */ + void (*func)(int Narg,char **argv); /* pointer to a function, that parse this parameter */ + int used; /* flag to indicate, if the option was allready used */ + const char *usage; /* how to use (argument list) */ + const char *help; /* help string */ + const int narg; /* possible number of argumetns ; UNDEF -> should not be checked */ + const struct subopt_struct *sub; /* suboptions */ }; -// const string for usage of ADDA -static const char exeusage[]="[-<opt1> [<args1>] [-<opt2> <args2>]...]]"; -/* initializations of suboptions; should be 'NULL terminated' - * each row contains: suboption name, usage string, help string, number of arguments - * (UNDEF = not checked automatically), identifier (number) - */ + /* const string for usage of ADDA */ +static const char exeusage[]="[-<opt1> [<args1>] [-<opt2> [<args2>]...]]"; + /* initializations of suboptions; should be 'NULL terminated' + each row contains: suboption name, usage string, help string, number of arguments + (UNDEF = not checked automatically, identifier (number) */ static const struct subopt_struct beam_opt[]={ - {"plane","","Infinite plane wave",0,B_PLANE}, - {"lminus","<width> [<x> <y> <z>]","Simplest approximation of the Gaussian beam. The beam " - "width is obligatory and x, y, z coordinates of the center of the beam are optional " - "parameters (all in um). By default beam center coincides with the center of the " - "computational box.",UNDEF,B_LMINUS}, - {"davis3","<width> [<x> <y> <z>]","3rd order approximation of the Gaussian beam (by Davis). " - "The beam width is obligatory and x, y, z coordinates of the center of the beam are " - "optional parameters (all in um). By default beam center coincides with the center of the " - "computational box.",UNDEF,B_DAVIS3}, - {"barton5","<width> [<x> <y> <z>]","5th order approximation of the Gaussian beam (by Barton). " - "The beam width is obligatory and x, y, z coordinates of the center of the beam are " - "optional parameters (all in um). By default beam center coincides with the center of the " - "computational box. This option is recommended for the description of the Gaussian beam.", - UNDEF,B_BARTON5}, - {NULL,NULL,NULL,0,0} + {"plane","","Infinite plane wave",0,B_PLANE}, + {"lminus","<width> [<x> <y> <z>]","Simplest approximation of the Gaussian beam. The beam width "\ + "is obligatory and x, y, z coordinates of the center of the beam are optional parameters "\ + "(all in um). By default beam center coincides with the center of the computational box.", + UNDEF,B_LMINUS}, + {"davis3","<width> [<x> <y> <z>]","3rd order approximation of the Gaussian beam (by Davis). The "\ + "beam width is obligatory and x, y, z coordinates of the center of the beam are optional "\ + "parameters (all in um). By default beam center coincides with the center of the "\ + "computational box.",UNDEF,B_DAVIS3}, + {"barton5","<width> [<x> <y> <z>]","5th order approximation of the Gaussian beam (by Barton). "\ + "The beam width is obligatory and x, y, z coordinates of the center of the beam are "\ + "optional parameters (all in um). By default beam center coincides with the center of the "\ + "computational box. This option is recommended for the description of the Gaussian beam.", + UNDEF,B_BARTON5}, + {NULL,NULL,NULL,0,0} }; static const struct subopt_struct shape_opt[]={ - {"axisymmetric","<filename>","Axisymmetric homogeneous shape, defined by its contour in " - "ro-z plane of the cylindrical coordinate system. Its symmetry axis coincides with the " - "z-axis, and the contour is read from file.",FNAME_ARG,SH_AXISYMMETRIC}, - {"box","[<y/x> <z/x>]","Homogeneous cube (if no arguments are given) or a rectangular " - "parallelepiped with edges x,y,z.",UNDEF,SH_BOX}, - {"capsule","<h/d>","Homogeneous capsule (cylinder with half-spherical end caps) with cylinder " - "height h and diameter d (its axis of symmetry coincides with the z-axis).",1,SH_CAPSULE}, - {"coated","<d_in/d> [<x/d> <y/d> <z/d>]","Sphere with a spherical inclusion; outer sphere has " - "a diameter d (first domain). The included sphere has a diameter d_in (optional position " - "of the center: x,y,z).",UNDEF,SH_COATED}, - {"cylinder","<h/d>","Homogeneous cylinder with height (length) h and diameter d (its axis of " - "symmetry coincides with the z-axis).",1,SH_CYLINDER}, - {"egg","<eps> <nu>","Axisymmetric egg shape given by a^2=r^2+nu*r*z-(1-eps)z^2, where 'a' is " - "scaling factor. Parameters must satisfy 0<eps<=1, 0<=nu<eps.",2,SH_EGG}, - {"ellipsoid","<y/x> <z/x>","Homogeneous general ellipsoid with semi-axes x,y,z",2,SH_ELLIPSOID}, - {"line","","Line along the x-axis with the width of one dipole",0,SH_LINE}, - {"rbc","<h/d> <b/d> <c/d>","Red Blood Cell, an axisymmetric (over z-axis) biconcave " - "homogeneous particle, which is characterized by diameter d, maximum and minimum width h, " - "b, and diameter at the position of the maximum width c. The surface is described by " - "ro^4+2S*ro^2*z^2+z^4+P*ro^2+Q*z^2+R=0, ro^2=x^2+y^2, P,Q,R,S are determined by the " - "described parameters.",3,SH_RBC}, - {"read","<filename>","Read a particle geometry from file <filename>",FNAME_ARG,SH_READ}, - {"sphere","","Homogeneous sphere",0,SH_SPHERE}, - {"spherebox","<d_sph/Dx>","Sphere (diameter d_sph) in a cube (size Dx, first domain)", - 1,SH_SPHEREBOX}, + {"box","[<y/x> <z/x>]","Homogenous cube (if no arguments are given) or a rectangular "\ + "parallelepiped with edges x,y,z.",UNDEF,SH_BOX}, + {"capsule","<h/d>","Homogenous capsule (cylinder with half-spherical end caps) with cylinder "\ + "height h and diameter d (its axis of symmetry coincides with the z-axis).",1,SH_CAPSULE}, + {"coated","<d_in/d> [<x/d> <y/d> <z/d>]","Sphere with a spherical inclusion; outer sphere has a "\ + "diameter d (first domain). The included sphere has a diameter d_in (optional position of "\ + "the center: x,y,z).",UNDEF,SH_COATED}, + {"cylinder","<h/d>","Homogenous cylinder with height (length) h and diameter d (its axis of "\ + "symmetry coincides with the z-axis).",1,SH_CYLINDER}, + {"egg","<eps> <nu>","Axisymmetric egg shape given by a^2=r^2+nu*r*z-(1-eps)z^2, where 'a' "\ + "is scaling factor. Parameters must satisfy 0<eps<=1, 0<=nu<eps.",2,SH_EGG}, + {"ellipsoid","<y/x> <z/x>","Homogenous general ellipsoid with semi-axes x,y,z",2,SH_ELLIPSOID}, + {"line","","Line along the x-axis with the width of one dipole",0,SH_LINE}, + {"rbc","<h/d> <b/d> <c/d>","Red Blood Cell, an axisymmetric (over z-axis) biconcave homogenous "\ + "particle, which is characterized by diameter d, maximum and minimum width h, b, and "\ + "diameter at the position of the maximum width c. The surface is described by "\ + "ro^4+2S*ro^2*z^2+z^4+P*ro^2+Q*z^2+R=0, ro^2=x^2+y^2, P,Q,R,S are determined by the "\ + "described parameters.",3,SH_RBC}, + {"read","<filename>","Read a particle geometry from file <filename>",1,SH_READ}, + {"sphere","","Homogenous sphere",0,SH_SPHERE}, + {"spherebox","<d_sph/Dx>","Sphere (diameter d_sph) in a cube (size Dx, first domain)", + 1,SH_SPHEREBOX}, /* TO ADD NEW SHAPE - * add a row here, before null-terminating element. It contains: - * shape name (used in command line), usage string (what command line parameters can be used - * for this shape), help string (shown when -h option is used), possible number of float parameters, - * shape identifier (constant defined in const.h). Instead of number of parameters UNDEF can be - * used (if shape can accept variable number of parameters, then check it explicitly in - * function InitShape) or FNAME_ARG (if the shape accepts a single string argument with file name). - * Number of parameters should not be greater than MAX_N_SH_PARMS (defined in const.h). It is - * recommended to use dimensionless shape parameters, e.g. aspect ratios. - */ - {NULL,NULL,NULL,0,0} + add a row here, before null-terminating element. It contains: + shape name (used in command line), usage string (what command line parameters can be used + for this shape), help string (shown when -h option is used), possible number of parameters + (use UNDEF if shape can accept different number of parameters, then check it explicitly in + function InitShape), shape identifier (constant defined in const.h). Number of parameters + should not be greater than MAX_N_SH_PARMS (defined in const.h). It is recommended to use + dimensionless shape parameters, e.g. aspect ratios. */ + + {NULL,NULL,NULL,0,0} }; -// EXTERNAL FUNCTIONS +/* EXTERNAL FUNCTIONS */ -// GenerateB.c +/* GenerateB.c */ void InitBeam(void); -//======================================================================== -// declarations of parsing functions; definitions are given below. defines are for conciseness +/*========================================================================*/ + /* declarations of parsing functions; definitions are given below. + defines are for conciseness */ #define PARSE_NAME(a) parse_##a #define PARSE_FUNC(a) void PARSE_NAME(a)(int Narg,char **argv) #define PAR(a) #a,PARSE_NAME(a),FALSE @@ -252,7 +244,6 @@ PARSE_FUNC(size); PARSE_FUNC(store_beam); PARSE_FUNC(store_dip_pol); PARSE_FUNC(store_force); -PARSE_FUNC(store_grans); PARSE_FUNC(store_int_field); PARSE_FUNC(store_scat_grid); PARSE_FUNC(sym); @@ -260,1451 +251,1426 @@ PARSE_FUNC(test); PARSE_FUNC(V) ATT_NORETURN; PARSE_FUNC(vec); PARSE_FUNC(yz); -/* initialization of options, their usage and help; - * each row contains: PAR(option name),usage string, help string, number of arguments - * (UNDEF = not checked automatically),pointer to suboption (if exist) - */ + /* initialization of options, their usage and help; + each row contains: PAR(option name),usage string, help string, number of arguments + (UNDEF = not checked automatically),pointer to suboption (if exist) */ static struct opt_struct options[]={ - {PAR(alldir_inp),"<filename>","Specifies a file with parameters of the grid of scattering " - "angles for calculating integral scattering quantities.\n" - "Default: " FD_ALLDIR_PARMS,1,NULL}, - {PAR(anisotr),"","Specifies that refractive index is anisotropic (its tensor is limited to be " - "diagonal in particle reference frame). '-m' then accepts 6 arguments per each domain. " - "Can not be used with CLDR polarizability and all SO formulations.",0,NULL}, - {PAR(asym),"","Calculate the asymmetry vector. Implies '-Csca' and '-vec'",0,NULL}, - {PAR(beam),"<type> [<arg1>...]","Sets a type of the incident beam. Four other float arguments " - "must be specified for all beam types except 'plane'. These are the width and x, y, z " - "coordinates of the center of the beam respectively (all in um).\n" - "Default: plane",UNDEF,beam_opt}, - {PAR(chp_dir),"<dirname>","Sets directory for the checkpoint (both for saving and loading).\n" - "Default: " FD_CHP_DIR,1,NULL}, - {PAR(chp_load),"","Restart a simulation from a checkpoint",0,NULL}, - {PAR(chp_type),"{normal|regular|always}", - "Sets type of the checkpoint. All types, except 'always', require '-chpoint'.\n" - "Default: normal",1,NULL}, - {PAR(chpoint),"<time>","Specifies the time for checkpoints in format '#d#h#m#s'. " - "All fields are optional, numbers are integers, 's' can be omitted, the format is not " - "case sensitive.\n" - "Examples: 12h30M, 1D10s, 3600",1,NULL}, - {PAR(Cpr_mat),"","Calculate the total radiation force",0,NULL}, - {PAR(Csca),"","Calculate scattering cross section (by integrating the scattered field)",0,NULL}, - {PAR(dir),"<dirname>","Sets directory for output files.\n" - "Default: constructed automatically",1,NULL}, - {PAR(dpl),"<arg>","Sets parameter 'dipoles per lambda', float.\n" - "Default: 10|m|, where |m| is the maximum of all given refractive indices.",1,NULL}, - {PAR(eps),"<arg>","Specifies the stopping criterion for the iterative solver by setting the " - "relative norm of the residual 'epsilon' to reach. <arg> is an exponent of base 10 " - "(float), i.e. epsilon=10^(-<arg>).\n" - "Default: 5 (epsilon=1E-5)",1,NULL}, - {PAR(eq_rad),"<arg>","Sets volume-equivalent radius of the particle in um, float. If default " - "wavelength is used, this option specifies the volume-equivalent size parameter. Can not " - "be used together with '-size'. Size is defined by some shapes themselves, then this " - "option can be used to override the internal specification and scale the shape.\n" - "Default: determined by the value of '-size' or by '-grid', '-dpl', and '-lambda'.",1,NULL}, - {PAR(granul),"<vol_frac> <diam> [<dom_number>]","Specifies that one particle domain should be " - "randomly filled with spherical granules with specified diameter <diam> and volume " - "fraction <vol_frac>. Domain number to fill is given by the last optional argument. " - "Algorithm may fail for volume fractions > 30-50%.\n" - "Default <dom_number>: 1",UNDEF,NULL}, - {PAR(grid),"<nx> [<ny> <nz>]","Sets dimensions of the computation grid. Arguments should be " - "even integers. In most cases <ny> and <nz> can be omitted (they are automatically " - "determined by <nx> based on the proportions of the scatterer). This command line option " - "is not relevant when particle geometry is read from a file ('-shape read'). If '-jagged' " - "option is used the grid dimension is effectively multiplied by the specified number.\n" - "Default: 16 (if neither '-size' nor '-eq_rad' are specified) or defined by\n" - " '-size' or '-eq_rad', '-lambda', and '-dpl'.",UNDEF,NULL}, - {PAR(h),"[<opt> [<subopt>]]","Shows help. If used without arguments, ADDA shows a list of all " - "available command line options. If first argument is specified, help on specific command " - "line option <opt> is shown (only the name of the option should be given without " - "preceding dash). For some options (e.g. '-beam' or '-shape') specific help on a " - "particular suboption <subopt> may be shown.\n" - "Example: shape coated",UNDEF,NULL}, - {PAR(int),"{poi|fcd|fcd_st|so}","Sets prescription to calculate interaction term. 'so' is " - "under development and incompatible with '-anisotr'. 'fcd' requires dpl to be larger " - "than 2.\n" - "Default: poi",1,NULL}, - {PAR(iter),"{cgnr|bicg|bicgstab|qmr}","Sets the iterative solver.\n" - "Default: qmr",1,NULL}, - {PAR(jagged),"<arg>","Sets a size of a big dipole in units of small dipoles, integer. It is " - "used to improve the discretization of the particle without changing the shape.\n" - "Default: 1",1,NULL}, - {PAR(lambda),"<arg>","Sets incident wavelength in um, float.\n" - "Default: 2*pi",1,NULL}, - {PAR(m),"{<m1Re> <m1Im> [...]|<m1xxRe> <m1xxIm> <m1yyRe> <m1yyIm> <m1zzRe> <m1zzIm> [...]}", - "Sets refractive indices, float. Each pair of arguments specifies real and imaginary part " - "of the refractive index of one of the domains. If '-anisotr' is specified, three " - "refractive indices correspond to one domain (diagonal elements of refractive index " - "tensor in particle reference frame). Maximum number of different refractive indices is " - "defined at compilation time by the parameter MAX_NMAT in file const.h (by default, 15). " - "None of the refractive indices can be equal to 1+0i.\n" - "Default: 1.5 0",UNDEF,NULL}, - {PAR(maxiter),"<arg>","Sets the maximum number of iterations of the iterative solver, " - "integer.\n" - "Default: very large, not realistic value",1,NULL}, - {PAR(no_reduced_fft),"","Do not use symmetry of the interaction matrix to reduce the storage " - "space for the Fourier-transformed matrix.",0,NULL}, - {PAR(no_vol_cor),"","Do not use 'dpl (volume) correction'. If this option is given, ADDA will " - "try to match size of the dipole grid along x-axis to that of the particle, either given " - "by '-size' or calculated analytically from '-eq_rad'. Otherwise (by default) ADDA will " - "try to match the volumes, using either '-eq_rad' or the value calculated analytically " - "from '-size'.",0,NULL}, - {PAR(ntheta),"<arg>","Sets the number of intervals, into which the range of scattering angles " - "[0,180] (degrees) is equally divided, integer. This is used for scattering angles in " - "yz-plane. If particle is not symmetric and orientation averaging is not used, the range " - "is extended to 360 degrees (with the same length of elementary interval, i.e. number of " - "intervals is doubled).\n" - "Default: from 90 to 720 depending on the size of the computational grid.",1,NULL}, - {PAR(opt),"{speed|mem}", - "Sets whether ADDA should optimize itself for maximum speed or for minimum memory usage.\n" - "Default: speed",1,NULL}, - {PAR(orient),"{<alpha> <beta> <gamma>|avg [<filename>]}","Either sets an orientation of the " - "particle by three Euler angles 'alpha','beta','gamma' (in degrees) or specifies that " - "orientation averaging should be performed. <filename> sets a file with parameters for " - "orientation averaging. Here zyz-notation (or y-convention) is used for Euler angles.\n" - "Default orientation: 0 0 0\n" - "Default <filename>: " FD_AVG_PARMS,UNDEF,NULL}, - {PAR(phi_integr),"<arg>","Turns on and specifies the type of Mueller matrix integration over " - "azimuthal angle 'phi'. <arg> is an integer from 1 to 31, each bit of which, from lowest " - "to highest, indicates whether the integration should be performed with multipliers 1, " - "cos(2*phi), sin(2*phi), cos(4*phi), and sin(4*phi) respectively.\n" - "Examples: 1 (one integration with no multipliers),\n" - " 6 (two integration with cos(2*phi) and sin(2*phi) multipliers).",1,NULL}, - {PAR(pol),"{cm|rrc|ldr [avgpol]|cldr|so|fcd}","Type of polarization prescription. An optional " - "flag 'avg' can be added for LDR - it specifies that LDR polarizability should be " - "averaged over incident polarizations. 'so' is under development. 'cldr' and 'so' are " - "incompatible with '-anisotr'. 'fcd' requires dpl to be larger than 2.\n" - "Default: ldr (without averaging).",UNDEF,NULL}, - {PAR(prognose),"","Do not actually perform simulation (not even memory allocation) but only " - "estimate the required RAM. Implies '-test'.",0,NULL}, - {PAR(prop),"<x> <y> <z>","Sets propagation direction of incident radiation, float. " - "Normalization (to the unity vector) is performed automatically.\n" - "Default: 0 0 1",3,NULL}, - {PAR(save_geom),"[<filename>]","Saves dipole configuration to a file <filename> (a path " - "relative to the output directory). Can be used with '-prognose'.\n" - "Default: <type>.geom (<type> is a first argument to the '-shape' option; '_gran' is \n" - " added if '-granul' option is used).",UNDEF,NULL}, - {PAR(scat),"{dr|so}","Sets prescription to calculate scattering quantities. 'so' is under " - "development and incompatible with '-anisotr'.\n" - "Default: dr",1,NULL}, - {PAR(scat_grid_inp),"<filename>","Specifies a file with parameters of the grid of scattering " - "angles for calculating Mueller matrix (possibly integrated over 'phi').\n" - "Default: " FD_SCAT_PARMS,1,NULL}, - {PAR(sg_format),"{text|text_ext|ddscat}","Specifies format for saving geometry files. First " - "two are ADDA default formats for single- and multi-domain particles respectively. 'text' " - "is automatically changed to 'text_ext' for multi-domain particles. DDSCAT format " - "corresponds to its shape option FRMFIL and output of 'calltarget' utility " - "(version 6.1).\n" - "Default: text",1,NULL}, - {PAR(shape),"<type> [<arg1>...]","Sets shape of the particle, either predefined or 'read' " - "from file. All the parameters of predefined shapes are floats except for filenames.\n" - "Default: sphere",UNDEF,shape_opt}, - {PAR(size),"<arg>","Sets the size of the computational grid along the x-axis in um, float. If " - "default wavelength is used, this option specifies the 'size parameter' of the " - "computational grid. Can not be used together with '-eq_rad'. Size is defined by some " - "shapes themselves, then this option can be used to override the internal specification " - "and scale the shape.\n" - "Default: determined by the value of '-eq_rad' or by '-grid', '-dpl', and '-lambda'.", - 1,NULL}, - {PAR(store_beam),"","Save incident beam to a file",0,NULL}, - {PAR(store_dip_pol),"","Save dipole polarizations to a file",0,NULL}, - {PAR(store_force),"","Calculate the radiation force on each dipole. Requires '-Cpr_mat'", - 0,NULL}, - {PAR(store_grans),"","Save granule coordinates (placed by '-granul' option) to a file",0,NULL}, - {PAR(store_int_field),"","Save internal fields to a file",0,NULL}, - {PAR(store_scat_grid),"","Calculate Mueller matrix for a grid of scattering angles and save it to a file.",0,NULL}, - {PAR(sym),"{auto|no|enf}","Automatically determine particle symmetries ('auto'), do not take " - "them into account ('no'), or enforce them ('enf').\n" - "Default: auto",1,NULL}, - {PAR(test),"","Begin name of the output directory with 'test' instead of 'run'",0,NULL}, - {PAR(V),"","Show ADDA version, compiler used to build this executable, and copyright " - "information",0,NULL}, - {PAR(vec),"","Calculate the not-normalized asymmetry vector",0,NULL}, - {PAR(yz),"","Calculate the Mueller matrix in yz-plane even if it is calculated for a " - "scattering grid. If the latter option is not enabled, scattering in yz-plane is always " - "calculated.",0,NULL} + {PAR(alldir_inp),"<filename>","Specifies a file with parameters of the grid of scattering "\ + "angles for calculating integral scattering quantities.\n"\ + "Default: " FD_ALLDIR_PARMS,1,NULL}, + {PAR(anisotr),"","Specifies that refractive index is anisotropic (its tensor is limited to be "\ + "diagonal in particle reference frame). '-m' then accepts 6 arguments per each domain. Can "\ + "not be used with CLDR polarizability and all SO formulations.",0,NULL}, + {PAR(asym),"","Calculate the asymmetry vector. Implies '-Csca' and '-vec'",0,NULL}, + {PAR(beam),"<type> [<arg1>...]","Sets a type of the incident beam. Four other float arguments "\ + "must be specified for all beam types except 'plane'. These are the width and x, y, z "\ + "coordinates of the center of the beam respectively (all in um).\n"\ + "Default: plane",UNDEF,beam_opt}, + {PAR(chp_dir),"<dirname>","Sets directory for the checkpoint (both for saving and loading).\n"\ + "Default: " FD_CHP_DIR,1,NULL}, + {PAR(chp_load),"","Restart a simulation from a checkpoint",0,NULL}, + {PAR(chp_type),"{normal|regular|always}", + "Sets type of the checkpoint. All types, except 'always', require '-chpoint'.\n"\ + "Default: normal",1,NULL}, + {PAR(chpoint),"<time>","Specifies the time for checkpoints in format '#d#h#m#s'. All fields are "\ + "optional, numbers are integers, 's' can be omitted, the format is not case sensitive.\n"\ + "Examples: 12h30M, 1D10s, 3600",1,NULL}, + {PAR(Cpr_mat),"","Calculate the total radiation force",0,NULL}, + {PAR(Csca),"","Calculate scattering cross section (by integrating the scattered field)",0,NULL}, + {PAR(dir),"<dirname>","Sets directory for output files.\n"\ + "Default: constructed automatically",1,NULL}, + {PAR(dpl),"<arg>","Sets parameter 'dipoles per lambda', float.\n"\ + "Default: 10|m|, where |m| is the maximum of all given refractive indices.",1,NULL}, + {PAR(eps),"<arg>","Specifies the stopping criterion for the iterative solver by setting the "\ + "relative norm of the residual 'epsilon' to reach. <arg> is an exponent of base 10 (float), "\ + "i.e. epsilon=10^(-<arg>).\n"\ + "Default: 5 (epsilon=1E-5)",1,NULL}, + {PAR(eq_rad),"<arg>","Sets volume-equivalent radius of the particle in um, float. If default "\ + "wavelength is used, this option specifies the volume-equivalent size parameter. Can not be "\ + "used together with '-size'.\n"\ + "Default: determined by the value of '-size' or by '-grid', '-dpl', and '-lambda'.",1,NULL}, + {PAR(granul),"<vol_frac> <diam> [<dom_number>]","Specifies that one particle domain should be "\ + "randomly filled with spherical granules with specified diameter <diam> and volume fraction "\ + "<vol_frac>. Domain number to fill is given by the last optional argument. Algorithm may "\ + "fail for volume fractions > 30-50%.\n"\ + "Default <dom_number>: 1",UNDEF,NULL}, + {PAR(grid),"<nx> [<ny> <nz>]","Sets dimensions of the computation grid. Arguments should be "\ + "even integers. In most cases <ny> and <nz> can be omitted (they are automatically "\ + "determined by <nx> based on the proportions of the scatterer). This command line option is "\ + "not relevant when particle geometry is read from a file ('-shape read'). If '-jagged' "\ + "option is used the grid dimension is effectively multiplied by the specified number.\n"\ + "Default: 16 (if neither '-size' nor '-eq_rad' are specified) or defined by\n"\ + " '-size' or '-eq_rad', '-lambda', and '-dpl'.",UNDEF,NULL}, + {PAR(h),"[<opt> [<subopt>]]","Shows help. If used without arguments, ADDA shows a list of all "\ + "available command line options. If first argument is specified, help on specific command "\ + "line option <opt> is shown (only the name of the option should be given without preceding "\ + "dash). For some options (e.g. '-beam' or '-shape') specific help on a particular suboption "\ + "<subopt> may be shown.\n"\ + "Example: shape coated",UNDEF,NULL}, + {PAR(int),"{poi|fcd|fcd_st|so}","Sets prescription to calculate interaction term. 'so' is under "\ + "development and incompatible with '-anisotr'. 'fcd' requires dpl to be larger than 2.\n"\ + "Default: poi",1,NULL}, + {PAR(iter),"{cgnr|bicg|bicgstab|qmr}","Sets the iterative solver.\n"\ + "Default: qmr",1,NULL}, + {PAR(jagged),"<arg>","Sets a size of a big dipole in units of small dipoles, integer. It is "\ + "used to improve the discretization of the particle without changing the shape.\n"\ + "Default: 1",1,NULL}, + {PAR(lambda),"<arg>","Sets incident wavelength in um, float.\n"\ + "Default: 2*pi",1,NULL}, + {PAR(m),"{<m1Re> <m1Im> [...]|<m1xxRe> <m1xxIm> <m1yyRe> <m1yyIm> <m1zzRe> <m1zzIm> [...]}", + "Sets refractive indices, float. Each pair of arguments specifies real and imaginary part of "\ + "the refractive index of one of the domains. If '-anisotr' is specified, three refractive "\ + "indices correspond to one domain (diagonal elements of refractive index tensor in particle "\ + "reference frame). Maximum number of different refractive indices is defined at compilation "\ + "time by the parameter MAX_NMAT in file const.h (by default, 15). None of the refractive "\ + "indices can be equal to 1+0i.\n"\ + "Default: 1.5 0",UNDEF,NULL}, + {PAR(maxiter),"<arg>","Sets the maximum number of iterations of the iterative solver, integer.\n"\ + "Default: very large, not realistic value",1,NULL}, + {PAR(no_reduced_fft),"","Do not use symmetry of the interaction matrix to reduce the storage "\ + "space for the Fourier-transformed matrix.",0,NULL}, + {PAR(no_vol_cor),"","Do not use 'dpl (volume) correction'. If this option is given, ADDA will "\ + "try to match size of the dipole grid along x-axis to that of the particle, either given by "\ + "'-size' or calculated analytically from '-eq_rad'. Otherwise (by default) ADDA will try to "\ + "match the volumes, using either '-eq_rad' or the value calculated analytically from '-size'.", + 0,NULL}, + {PAR(ntheta),"<arg>","Sets the number of intervals, into which the range of scattering angles "\ + "[0,180] (degrees) is equally divided, integer. This is used for scattering angles in "\ + "yz-plane. If particle is not symmetric and orientation averaging is not used, the range is "\ + "extended to 360 degrees (with the same length of elementary interval, i.e. number of "\ + "intervals is doubled).\n"\ + "Default: from 90 to 720 depending on the size of the computational grid.",1,NULL}, + {PAR(opt),"{speed|mem}", + "Sets whether ADDA should optimize itself for maximum speed or for minimum memory usage.\n"\ + "Default: speed",1,NULL}, + {PAR(orient),"{<alpha> <beta> <gamma>|avg [<filename>]}","Either sets an orientation of the "\ + "particle by three Euler angles 'alpha','beta','gamma' (in degrees) or specifies that "\ + "orientation averaging should be performed. <filename> sets a file with parameters for "\ + "orientation averaging. Here zyz-notation (or y-convention) is used for the Euler angles.\n"\ + "Default orientation: 0 0 0\n"\ + "Default <filename>: " FD_AVG_PARMS,UNDEF,NULL}, + {PAR(phi_integr),"<arg>","Turns on and specifies the type of Mueller matrix integration over "\ + "azimuthal angle 'phi'. <arg> is an integer from 1 to 31, each bit of which, from lowest to "\ + "highest, indicates whether the integration should be performed with multipliers 1, "\ + "cos(2*phi), sin(2*phi), cos(4*phi), and sin(4*phi) respectively.\n"\ + "Examples: 1 (one integration with no multipliers),\n"\ + " 6 (two integration with cos(2*phi) and sin(2*phi) multipliers).",1,NULL}, + {PAR(pol),"{cm|rrc|ldr [avgpol]|cldr|so|fcd}","Type of polarization prescription. An optional "\ + "flag 'avg' can be added for LDR - it specifies that LDR polarizability should be averaged "\ + "over incident polarizations. 'so' is under development. 'cldr' and 'so' are incompatible "\ + "with '-anisotr'. 'fcd' requires dpl to be larger than 2.\n"\ + "Default: ldr (without averaging).",UNDEF,NULL}, + {PAR(prognose),"","Do not actually perform simulation (not even memory allocation) but only "\ + "estimate the required RAM. Implies '-test'.",0,NULL}, + {PAR(prop),"<x> <y> <z>","Sets propagation direction of incident radiation, float. "\ + "Normalization (to the unity vector) is performed automatically.\n"\ + "Default: 0 0 1",3,NULL}, + {PAR(save_geom),"[<filename>]","Saves dipole configuration to a file <filename> (a path "\ + "relative to the output directory). Can be used with '-prognose'.\n"\ + "Default: <type>.geom (<type> is a first argument to the '-shape' option; '_gran' is added\n"\ + " if '-granul' option is used).",UNDEF,NULL}, + {PAR(scat),"{dr|so}","Sets prescription to calculate scattering quantities. 'so' is under "\ + "development and incompatible with '-anisotr'.\n"\ + "Default: dr",1,NULL}, + {PAR(scat_grid_inp),"<filename>","Specifies a file with parameters of the grid of scattering "\ + "angles for calculating Mueller matrix (possibly integrated over 'phi').\n"\ + "Default: " FD_SCAT_PARMS,1,NULL}, + {PAR(sg_format),"{text|text_ext|ddscat}","Specifies format for saving geometry files. First two "\ + "are ADDA default formats for single- and multi-domain particles respectively. 'text' is "\ + "automatically changed to 'text_ext' for multi-domain particles. DDSCAT format corresponds to "\ + "its shape option FRMFIL and output of 'calltarget' utility (version 6.1).\n"\ + "Default: text",1,NULL}, + {PAR(shape),"<type> [<arg1>...]","Sets shape of the particle, either predefined or 'read' from "\ + "file. All the parameters of predefined shapes are floats.\n"\ + "Default: sphere",UNDEF,shape_opt}, + {PAR(size),"<arg>","Sets the size of the computational grid along the x-axis in um, float. If "\ + "default wavelength is used, this option specifies the 'size parameter' of the computational "\ + "grid. Can not be used together with '-eq_rad'.\n"\ + "Default: determined by the value of '-eq_rad' or by '-grid', '-dpl', and '-lambda'.",1,NULL}, + {PAR(store_beam),"","Save incident beam to a file",0,NULL}, + {PAR(store_dip_pol),"","Save dipole polarizations to a file",0,NULL}, + {PAR(store_force),"","Calculate the radiation force on each dipole. Requires '-Cpr_mat'",0,NULL}, + {PAR(store_int_field),"","Save internal fields to a file",0,NULL}, + {PAR(store_scat_grid),"", + "Calculate Mueller matrix for a grid of scattering angles and save it to a file.",0,NULL}, + {PAR(sym),"{auto|no|enf}","Automatically determine particle symmetries ('auto'), do not take "\ + "them into account ('no'), or enforce them ('enf').\n"\ + "Default: auto",1,NULL}, + {PAR(test),"","Begin name of the output directory with 'test' instead of 'run'",0,NULL}, + {PAR(V),"","Show ADDA version, compiler used to build this executable, and copyright information", + 0,NULL}, + {PAR(vec),"","Calculate the not-normalized asymmetry vector",0,NULL}, + {PAR(yz),"","Calculate the Mueller matrix in yz-plane even if it is calculated for a scattering "\ + "grid. If the latter option is not enabled, scattering in yz-plane is always calculated.", + 0,NULL} }; -// auxiliary functions -//============================================================ + /* auxiliary functions */ +/*============================================================*/ static const char *OptionName(void) -// produces full option name for error messages + /* produces full option name for error messages */ { - static char buf[MAX_LINE]; + static char buf[MAX_LINE]; - if (opt.l2==UNDEF) return options[opt.l1].name; - else { - sprintf(buf,"%s %s",options[opt.l1].name,options[opt.l1].sub[opt.l2].name); - return buf; - } + if (opt.l2==UNDEF) return options[opt.l1].name; + else { + sprintf(buf,"%s %s",options[opt.l1].name,options[opt.l1].sub[opt.l2].name); + return buf; + } } -//============================================================ +/*============================================================*/ void PrintErrorHelp(const char *fmt, ... ) -/* print anything to stderr (on root processor), then help on the arguments used, and stop; - * assumes that all processors call it; has line wrapping - */ -{ - va_list args; - const char *optname,*use; - char *pos; - char line[MAX_MESSAGE]; - - if (ringid==ROOT) { - // produce error message - va_start(args,fmt); - strcpy(line,"ERROR: "); - pos=line+strlen(line); - pos+=vsprintf(pos,fmt,args); - strcpy(pos,"\n"); - pos+=strlen(pos); - va_end(args); - // add help message - if (opt.l1==UNDEF) // no option is found - pos+=sprintf(pos,"Usage: %s %s\n" - "Type '%s -h' for help\n",exename,exeusage,exename); - else { // at least option is found - if (opt.l2==UNDEF) use=options[opt.l1].usage; - else use=options[opt.l1].sub[opt.l2].usage; - optname=OptionName(); - pos+=sprintf(pos,"Usage: -%s %s\n" - "Type '%s -h %s' for details\n",optname,use,exename,optname); - } - WrapLines(line); - fprintf(stderr,"%s",line); - fflush(stderr); - } - // wait for root to generate an error message - Synchronize(); - Stop(1); -} - -//============================================================ + /* print anything to stderr (on root processor), then help on the arguments used, and stop; + assumes that all processors call it; has line wrapping */ +{ + va_list args; + const char *optname,*use; + char *pos; + char line[MAX_MESSAGE]; + + if (ringid==ROOT) { + /* produce error message */ + va_start(args,fmt); + strcpy(line,"ERROR: "); + pos=line+strlen(line); + pos+=vsprintf(pos,fmt,args); + strcpy(pos,"\n"); + pos+=strlen(pos); + va_end(args); + /* add help message */ + if (opt.l1==UNDEF) /* no option is found */ + pos+=sprintf(pos,"Usage: %s %s\n"\ + "Type '%s -h' for help\n",exename,exeusage,exename); + else { /* at least option is found */ + if (opt.l2==UNDEF) use=options[opt.l1].usage; + else use=options[opt.l1].sub[opt.l2].usage; + optname=OptionName(); + pos+=sprintf(pos,"Usage: -%s %s\n"\ + "Type '%s -h %s' for details\n",optname,use,exename,optname); + } + WrapLines(line); + fprintf(stderr,"%s",line); + fflush(stderr); + } + /* wait for root to generate an error message */ + Synchronize(); + Stop(1); +} + +/*============================================================*/ void PrintErrorHelpSafe(const char *fmt, ... ) -/* print anything to stderr (on root processor), then help on the arguments used, and stop; - * assumes that all processors call it; same as PrintErrorHelp but uses no internal buffers to be - * safe for any input parameters, which may come from a command line, at a cost of lacking line - * wrapping. - */ -{ - va_list args; - const char *optname,*use; - - if (ringid==ROOT) { - // produce error message - va_start(args,fmt); - fprintf(stderr,"ERROR: "); - vfprintf(stderr,fmt,args); - fprintf(stderr,"\n"); - va_end(args); - // add help message - if (opt.l1==UNDEF) // no option is found - fprintf(stderr,"Usage: %s %s\n" - "Type '%s -h' for help\n",exename,exeusage,exename); - else { // at least option is found - if (opt.l2==UNDEF) use=options[opt.l1].usage; - else use=options[opt.l1].sub[opt.l2].usage; - optname=OptionName(); - fprintf(stderr,"Usage: -%s %s\n" - "Type '%s -h %s' for details\n",optname,use,exename,optname); - } - fflush(stderr); - } - // wait for root to generate an error message - Synchronize(); - Stop(1); -} - -//============================================================ + /* print anything to stderr (on root processor), then help on the arguments used, and stop; + assumes that all processors call it + same as PrintErrorHelp but uses no internal buffers to be safe for any input parameters, + which may come from a command line, at a cost of lacking line wrapping */ +{ + va_list args; + const char *optname,*use; + + if (ringid==ROOT) { + /* produce error message */ + va_start(args,fmt); + fprintf(stderr,"ERROR: "); + vfprintf(stderr,fmt,args); + fprintf(stderr,"\n"); + va_end(args); + /* add help message */ + if (opt.l1==UNDEF) /* no option is found */ + fprintf(stderr,"Usage: %s %s\n"\ + "Type '%s -h' for help\n",exename,exeusage,exename); + else { /* at least option is found */ + if (opt.l2==UNDEF) use=options[opt.l1].usage; + else use=options[opt.l1].sub[opt.l2].usage; + optname=OptionName(); + fprintf(stderr,"Usage: -%s %s\n"\ + "Type '%s -h %s' for details\n",optname,use,exename,optname); + } + fflush(stderr); + } + /* wait for root to generate an error message */ + Synchronize(); + Stop(1); +} + +/*============================================================*/ static void NargError(const int Narg,const char *expec) -/* Print error of illegal number of arguments to an option (suboption) and display correct usage - * information - */ + /* Print error of illegal number of arguments to an option (suboption); + and display correct usage information */ { - char buf[MAX_WORD]; // not to allocate memory if needed + char buf[MAX_WORD]; /* not to allocate memory if needed */ - if (expec==NULL) { - if (opt.l2==UNDEF) sprintf(buf,"%d",options[opt.l1].narg); - else sprintf(buf,"%d",options[opt.l1].sub[opt.l2].narg); - expec=buf; - } - PrintErrorHelp("Illegal number of arguments (%d) to '-%s' option (%s expected)", - Narg,OptionName(),expec); + if (expec==NULL) { + if (opt.l2==UNDEF) sprintf(buf,"%d",options[opt.l1].narg); + else sprintf(buf,"%d",options[opt.l1].sub[opt.l2].narg); + expec=buf; + } + PrintErrorHelp("Illegal number of arguments (%d) to '-%s' option (%s expected)", + Narg,OptionName(),expec); } -//============================================================ -// following two functions are interfaces to NargError - +/*============================================================*/ + /* following two functions are interfaces to NargError */ INLINE void TestNarg(const int Narg) -// check if Narg given to an option is correct + /* check if Narg given to an option is correct */ { - if (options[opt.l1].narg!=UNDEF && Narg!=options[opt.l1].narg) NargError(Narg,NULL); + if (options[opt.l1].narg!=UNDEF && Narg!=options[opt.l1].narg) + NargError(Narg,NULL); } -//============================================================ +/*============================================================*/ INLINE void TestNarg_sub(const int Narg) -// check if Narg given to a suboption is correct + /* check if Narg given to a suboption is correct */ { - int need; - - need=options[opt.l1].sub[opt.l2].narg; - if (need==FNAME_ARG) need=1; - if (need!=UNDEF && Narg!=need) NargError(Narg,NULL); + if (options[opt.l1].sub[opt.l2].narg!=UNDEF && Narg!=options[opt.l1].sub[opt.l2].narg) + NargError(Narg,NULL); } -//============================================================ +/*============================================================*/ static void NotSupported(const char *type,const char *given) -/* print error message that "type 'given' is not supported" - * type should start with a capital letter - */ + /* print error message that "type 'given' is not supported" + type should start with a capital letter */ { - PrintErrorHelpSafe("%s '%s' is not supported",type,given); + PrintErrorHelpSafe("%s '%s' is not supported",type,given); } -//============================================================ +/*============================================================*/ INLINE void TestStrLength(const char *str,const unsigned int size) -/* check if string fits in buffer of size 'size', otherwise produces error message - * 'opt' is command line option that checks its argument - */ + /* check if string fits in buffer of size 'size', otherwise produces error message + 'opt' is command line option that checks its argument */ { - if (strlen(str)>=size) - PrintErrorHelp("Too long argument to '-%s' option (only %ud chars allowed). If you really "\ - "need it you may increase MAX_DIRNAME in const.h and recompile",OptionName(),size-1); + if (strlen(str)>=size) + PrintErrorHelp("Too long argument to '-%s' option (only %ud chars allowed). If you really "\ + "need it you may increase MAX_DIRNAME in const.h and recompile", + OptionName(),size-1); } -//============================================================ +/*============================================================*/ INLINE void ScanfDoubleError(const char *str,double *res) -// scanf an option argument and checks for errors + /* scanf an option argument and checks for errors */ { - if (sscanf(str,"%lf",res)!=1) PrintErrorHelpSafe( - "Non-numeric argument (%s) is given to the option '-%s'",str,OptionName()); + if (sscanf(str,"%lf",res)!=1) + PrintErrorHelpSafe("Non-numeric argument (%s) is given to the option '-%s'",str,OptionName()); } -//============================================================ +/*============================================================*/ INLINE void ScanfIntError(const char *str,int *res) -// scanf an option argument and checks for errors + /* scanf an option argument and checks for errors */ { - double tmp; + double tmp; - if (sscanf(str,"%lf",&tmp)!=1) PrintErrorHelpSafe( - "Non-numeric argument (%s) is given to the option '-%s'",str,OptionName()); - if (tmp <INT_MIN || tmp>INT_MAX) PrintErrorHelpSafe( - "Argumenent value (%s) of the option '-%s' is out of integer bounds",str,OptionName()); - if (sscanf(str,"%d",res)!=1) - PrintErrorHelpSafe("Error reading argument (%s) of the option '-%s'",str,OptionName()); + if (sscanf(str,"%lf",&tmp)!=1) + PrintErrorHelpSafe("Non-numeric argument (%s) is given to the option '-%s'",str,OptionName()); + if (tmp <INT_MIN || tmp>INT_MAX) PrintErrorHelpSafe( + "Argumenent value (%s) of the option '-%s' is out of integer bounds",str,OptionName()); + if (sscanf(str,"%d",res)!=1) + PrintErrorHelpSafe("Error reading argument (%s) of the option '-%s'",str,OptionName()); } -//============================================================ +/*============================================================*/ INLINE int IsOption(const char *str) -/* checks if string is an option. First should be '-' and then letter (any case); - * it enables use of negative numbers as sub-parameters - */ + /* checks if string is an option. First should be '-' and then letter (any case); + it enables use of negative numbers as subparameters */ { - /* conversion to int is needed to remove warnings caused by the fact that str[1] is - * _signed_ char - */ - return (str[0]=='-' && isalpha((int)(str[1]))); + /* conversion to int is needed to remove warnings caused by the fact + that str[1] is _signed_ char */ + return (str[0]=='-' && isalpha((int)(str[1]))); } -//============================================================ +/*============================================================*/ static int TimeField(const char c) -// analyze one time multiplier + /* analyze one time multiplier */ { - if (c=='d' || c=='D') return 86400; - else if (c=='h' || c=='H') return 3600; - else if (c=='m' || c=='M') return 60; - else if (c=='s' || c=='S' || c==0) return 1; - else PrintErrorHelp("Illegal time format specifier (%c)",c); - // never reached - return 0; + if (c=='d' || c=='D') return 86400; + else if (c=='h' || c=='H') return 3600; + else if (c=='m' || c=='M') return 60; + else if (c=='s' || c=='S' || c==0) return 1; + else PrintErrorHelp("Illegal time format specifier (%c)",c); + /* never reached */ + return 0; } -//============================================================ +/*============================================================*/ static int ScanTime(const char *str) -// scans time in seconds from a string "%d[d,D[%d]][h,H[%d]][m,M[%d]][s,S] -{ -#define TIME_N_TYPES 4 // not so easy to change - int tim,t[TIME_N_TYPES],n,i; - char c[TIME_N_TYPES]; - - for (i=0;i<TIME_N_TYPES;i++) c[i]=0; - n=sscanf(str,"%d%c%d%c%d%c%d%c",t,c,t+1,c+1,t+2,c+2,t+3,c+3); - if (n<1) PrintErrorHelpSafe("Wrong time format '%s'",str); - tim=0; - i=0; - while (n>0) { - tim+=t[i]*TimeField(c[i]); - n-=2; - i++; - } - return tim; + /* scans time in seconds from a string "%d[d,D[%d]][h,H[%d]][m,M[%d]][s,S] */ +{ +#define TIME_N_TYPES 4 /* not so easy to change */ + int tim,t[TIME_N_TYPES],n,i; + char c[TIME_N_TYPES]; + + for (i=0;i<TIME_N_TYPES;i++) c[i]=0; + n=sscanf(str,"%d%c%d%c%d%c%d%c",t,c,t+1,c+1,t+2,c+2,t+3,c+3); + if (n<1) PrintErrorHelpSafe("Wrong time format '%s'",str); + tim=0; + i=0; + while (n>0) { + tim+=t[i]*TimeField(c[i]); + n-=2; + i++; + } + return tim; #undef TIME_N_TYPES } -//============================================================ +/*============================================================*/ static void PrintTime(char *s,const time_t *time_ptr) { - struct tm *t; + struct tm *t; - t=gmtime(time_ptr); - s[0]=0; // initialize string - if (t->tm_yday>0) sprintf(s,"%dd ",t->tm_yday); - if (t->tm_hour>0) sprintf(s+strlen(s),"%dh ",t->tm_hour); - if (t->tm_min>0) sprintf(s+strlen(s),"%dm ",t->tm_min); - if (t->tm_sec>0) sprintf(s+strlen(s),"%ds ",t->tm_sec); + t=gmtime(time_ptr); + s[0]=0; /* initialize string */ + if (t->tm_yday>0) sprintf(s,"%dd ",t->tm_yday); + if (t->tm_hour>0) sprintf(s+strlen(s),"%dh ",t->tm_hour); + if (t->tm_min>0) sprintf(s+strlen(s),"%dm ",t->tm_min); + if (t->tm_sec>0) sprintf(s+strlen(s),"%ds ",t->tm_sec); } -//======================================================================== -// parsing functions definitions - +/*========================================================================*/ + /* parsing functions definitions*/ PARSE_FUNC(alldir_inp) { - TestStrLength(argv[1],MAX_FNAME); - strcpy(alldir_parms,argv[1]); + TestStrLength(argv[1],MAX_FNAME); + strcpy(alldir_parms,argv[1]); } PARSE_FUNC(anisotr) { - anisotropy = TRUE; - Ncomp=3; + anisotropy = TRUE; + Ncomp=3; } PARSE_FUNC(asym) { - calc_asym = TRUE; - calc_vec = TRUE; - calc_Csca = TRUE; + calc_asym = TRUE; + calc_vec = TRUE; + calc_Csca = TRUE; } PARSE_FUNC(beam) { - int i,j,found; - - Narg--; - found=FALSE; - i=-1; - while (beam_opt[++i].name!=NULL) if (strcmp(argv[1],beam_opt[i].name)==0) { - // set suboption and beam type - opt.l2=i; - beamtype=beam_opt[i].type; - beam_Npars=Narg; - // check number of arguments - TestNarg_sub(Narg); - if (beamtype!=B_PLANE) { - if (Narg!=1 && Narg!=4) NargError(Narg,"1 or 4"); - } - // parse and check consistency - for (j=0;j<Narg;j++) ScanfDoubleError(argv[j+2],beam_pars+j); - if (Narg>0) TestPositive(beam_pars[0],"beam width"); - // stop search - found=TRUE; - break; - } - if(!found) NotSupported("Beam type",argv[1]); + int i,j,found; + + Narg--; + found=FALSE; + i=-1; + while (beam_opt[++i].name!=NULL) if (strcmp(argv[1],beam_opt[i].name)==0) { + /* set suboption and beamtype */ + opt.l2=i; + beamtype=beam_opt[i].type; + beam_Npars=Narg; + /* check number of arguments */ + TestNarg_sub(Narg); + if (beamtype!=B_PLANE) { + if (Narg!=1 && Narg!=4) NargError(Narg,"1 or 4"); + } + /* parse and check consistency */ + for (j=0;j<Narg;j++) ScanfDoubleError(argv[j+2],beam_pars+j); + if (Narg>0) TestPositive(beam_pars[0],"beam width"); + /* stop search */ + found=TRUE; + break; + } + if(!found) NotSupported("Beam type",argv[1]); } PARSE_FUNC(chp_dir) { - TestStrLength(argv[1],MAX_DIRNAME); - strcpy(chp_dir,argv[1]); + TestStrLength(argv[1],MAX_DIRNAME); + strcpy(chp_dir,argv[1]); } PARSE_FUNC(chp_load) { - load_chpoint = TRUE; + load_chpoint = TRUE; } PARSE_FUNC(chp_type) { - if (strcmp(argv[1],"normal")==0) chp_type=CHP_NORMAL; - else if (strcmp(argv[1],"regular")==0) chp_type=CHP_REGULAR; - else if (strcmp(argv[1],"always")==0) chp_type=CHP_ALWAYS; - else NotSupported("Checkpoint type",argv[1]); + if (strcmp(argv[1],"normal")==0) chp_type=CHP_NORMAL; + else if (strcmp(argv[1],"regular")==0) chp_type=CHP_REGULAR; + else if (strcmp(argv[1],"always")==0) chp_type=CHP_ALWAYS; + else NotSupported("Checkpoint type",argv[1]); } PARSE_FUNC(chpoint) { - chp_time=ScanTime(argv[1]); - if (chp_time<=0) { - chp_time=UNDEF; - if (chp_type==CHP_NONE) chp_type=CHP_ALWAYS; - } - else if (chp_type==CHP_NONE) chp_type=CHP_NORMAL; + chp_time=ScanTime(argv[1]); + if (chp_time<=0) { + chp_time=UNDEF; + if (chp_type==CHP_NONE) chp_type=CHP_ALWAYS; + } + else if (chp_type==CHP_NONE) chp_type=CHP_NORMAL; } PARSE_FUNC(Cpr_mat) { - calc_mat_force = TRUE; + calc_mat_force = TRUE; } PARSE_FUNC(Csca) { - calc_Csca = TRUE; + calc_Csca = TRUE; } PARSE_FUNC(dir) { - TestStrLength(argv[1],MAX_DIRNAME); - strcpy(directory,argv[1]); + TestStrLength(argv[1],MAX_DIRNAME); + strcpy(directory,argv[1]); } PARSE_FUNC(dpl) { - ScanfDoubleError(argv[1],&dpl); - TestPositive(dpl,"dpl"); + ScanfDoubleError(argv[1],&dpl); + TestPositive(dpl,"dpl"); } PARSE_FUNC(eps) { - double tmp; + double tmp; - ScanfDoubleError(argv[1],&tmp); - TestPositive(tmp,"eps exponent"); - eps=pow(10,-tmp); + ScanfDoubleError(argv[1],&tmp); + TestPositive(tmp,"eps exponent"); + eps=pow(10,-tmp); } PARSE_FUNC(eq_rad) { - ScanfDoubleError(argv[1],&a_eq); - TestPositive(a_eq,"dpl"); + ScanfDoubleError(argv[1],&a_eq); + TestPositive(a_eq,"dpl"); } PARSE_FUNC(granul) { - if (Narg!=2 && Narg!=3) NargError(Narg,"2 or 3"); - ScanfDoubleError(argv[1],&gr_vf); - TestRangeII(gr_vf,"volume fraction",0,PI_OVER_SIX); - ScanfDoubleError(argv[2],&gr_d); - TestPositive(gr_d,"diameter"); - if (Narg==3) { - ScanfIntError(argv[3],&gr_mat); - TestPositive_i(gr_mat,"domain number"); - } - else gr_mat=1; - gr_mat--; // converted to usual indexing starting from 0 - sh_granul=TRUE; + if (Narg!=2 && Narg!=3) NargError(Narg,"2 or 3"); + ScanfDoubleError(argv[1],&gr_vf); + TestRangeII(gr_vf,"volume fraction",0,PI_OVER_SIX); + ScanfDoubleError(argv[2],&gr_d); + TestPositive(gr_d,"diameter"); + if (Narg==3) { + ScanfIntError(argv[3],&gr_mat); + TestPositive_i(gr_mat,"domain number"); + } + else gr_mat=1; + gr_mat--; /* converted to usual indexing starting from 0 */ + sh_granul=TRUE; } PARSE_FUNC(grid) { - if (Narg!=1 && Narg!=3) NargError(Narg,"1 or 3"); - ScanfIntError(argv[1],&boxX); // boxes are further multiplied by jagged if needed - TestRange_i(boxX,"gridX",1,BOX_MAX); - if (Narg==3) { - ScanfIntError(argv[2],&boxY); - TestRange_i(boxY,"gridY",1,BOX_MAX); - ScanfIntError(argv[3],&boxZ); - TestRange_i(boxY,"gridY",1,BOX_MAX); - } + if (Narg!=1 && Narg!=3) NargError(Narg,"1 or 3"); + ScanfIntError(argv[1],&boxX); /* boxes are further multiplied by jagged if needed */ + TestRange_i(boxX,"gridX",1,BOX_MAX); + if (Narg==3) { + ScanfIntError(argv[2],&boxY); + TestRange_i(boxY,"gridY",1,BOX_MAX); + ScanfIntError(argv[3],&boxZ); + TestRange_i(boxY,"gridY",1,BOX_MAX); + } } PARSE_FUNC(h) { - int i,j,found; - - if (Narg>2) NargError(Narg,"not more than 2"); - // do all output on root processor - if (ringid==ROOT) { - found=FALSE; - if (Narg>=1) { - for(i=0;i<LENGTH(options);i++) if (strcmp(argv[1],options[i].name)==0) { - if (Narg==2) { - if (options[i].sub==NULL) - printf("No specific help is available for suboptions of this option\n\n"); - else { - j=-1; - while (options[i].sub[++j].name!=NULL) - if (strcmp(argv[2],options[i].sub[j].name)==0) { - printf(" -%s %s %s\n%s\n",options[i].name,options[i].sub[j].name, - options[i].sub[j].usage,WrapLinesCopy(options[i].sub[j].help)); - found=TRUE; - break; - } - if (!found) printf("Unknown suboption '%s'\n\n",argv[2]); - } - } - if (!found) { - printf(" -%s %s\n%s\n",options[i].name,options[i].usage, - WrapLinesCopy(options[i].help)); - if (options[i].sub!=NULL) { - printf("Available suboptions:\n"); - j=-1; - while (options[i].sub[++j].name!=NULL) - printf(" %s %s\n",options[i].sub[j].name,options[i].sub[j].usage); - printf("Type '%s -h %s <subopt>' for details\n",exename,options[i].name); - } - } - found=TRUE; - break; - } - if (!found) printf("Unknown option '%s'\n\n",argv[1]); - } - if (!found) { - printf("Usage: '%s %s'\n" - "Available options:\n",exename,exeusage); - for (i=0;i<LENGTH(options);i++) printf(" -%s %s\n",options[i].name,options[i].usage); - printf("Type '%s -h <opt>' for details\n",exename); - } - } - // exit - Stop(0); + int i,j,found; + + if (Narg>2) NargError(Narg,"not more than 2"); + /* do all output on root processor */ + if (ringid==ROOT) { + found=FALSE; + if (Narg>=1) { + for(i=0;i<LENGTH(options);i++) if (strcmp(argv[1],options[i].name)==0) { + if (Narg==2) { + if (options[i].sub==NULL) + printf("No specific help is available for suboptions of this option\n\n"); + else { + j=-1; + while (options[i].sub[++j].name!=NULL) if (strcmp(argv[2],options[i].sub[j].name)==0) { + printf(" -%s %s %s\n%s\n",options[i].name,options[i].sub[j].name, + options[i].sub[j].usage,WrapLinesCopy(options[i].sub[j].help)); + found=TRUE; + break; + } + if (!found) printf("Unknown suboption '%s'\n\n",argv[2]); + } + } + if (!found) { + printf(" -%s %s\n%s\n",options[i].name,options[i].usage,WrapLinesCopy(options[i].help)); + if (options[i].sub!=NULL) { + printf("Available suboptions:\n"); + j=-1; + while (options[i].sub[++j].name!=NULL) + printf(" %s %s\n",options[i].sub[j].name,options[i].sub[j].usage); + printf("Type '%s -h %s <subopt>' for details\n",exename,options[i].name); + } + } + found=TRUE; + break; + } + if (!found) printf("Unknown option '%s'\n\n",argv[1]); + } + if (!found) { + printf("Usage: '%s %s'\n"\ + "Available options:\n",exename,exeusage); + for (i=0;i<LENGTH(options);i++) printf(" -%s %s\n",options[i].name,options[i].usage); + printf("Type '%s -h <opt>' for details\n",exename); + } + } + /* exit */ + Stop(0); } PARSE_FUNC(int) { - if (strcmp(argv[1],"poi")==0) IntRelation=G_POINT_DIP; - else if (strcmp(argv[1],"fcd")==0) IntRelation=G_FCD; - else if (strcmp(argv[1],"fcd_st")==0) IntRelation=G_FCD_ST; - else if (strcmp(argv[1],"so")==0) IntRelation=G_SO; - else NotSupported("Interaction term prescription",argv[1]); + if (strcmp(argv[1],"poi")==0) IntRelation=G_POINT_DIP; + else if (strcmp(argv[1],"fcd")==0) IntRelation=G_FCD; + else if (strcmp(argv[1],"fcd_st")==0) IntRelation=G_FCD_ST; + else if (strcmp(argv[1],"so")==0) IntRelation=G_SO; + else NotSupported("Interaction term prescription",argv[1]); } PARSE_FUNC(iter) { - if (strcmp(argv[1],"cgnr")==0) IterMethod=IT_CGNR; - else if (strcmp(argv[1],"bicgstab")==0) IterMethod=IT_BICGSTAB; - else if (strcmp(argv[1],"bicg")==0) IterMethod=IT_BICG_CS; - else if (strcmp(argv[1],"qmr")==0) IterMethod=IT_QMR_CS; - else NotSupported("Iterative method",argv[1]); + if (strcmp(argv[1],"cgnr")==0) IterMethod=IT_CGNR; + else if (strcmp(argv[1],"bicgstab")==0) IterMethod=IT_BICGSTAB; + else if (strcmp(argv[1],"bicg")==0) IterMethod=IT_BICG_CS; + else if (strcmp(argv[1],"qmr")==0) IterMethod=IT_QMR_CS; + else NotSupported("Iterative method",argv[1]); } PARSE_FUNC(jagged) { - ScanfIntError(argv[1],&jagged); - TestRange_i(jagged,"jagged",1,BOX_MAX); + ScanfIntError(argv[1],&jagged); + TestRange_i(jagged,"jagged",1,BOX_MAX); } PARSE_FUNC(lambda) { - ScanfDoubleError(argv[1],&lambda); - TestPositive(lambda,"wavelength"); + ScanfDoubleError(argv[1],&lambda); + TestPositive(lambda,"wavelength"); } PARSE_FUNC(m) { - int i; - - if (!IS_EVEN(Narg) || Narg==0) NargError(Narg,"even"); - Nmat=Nmat_given=Narg/2; - if (Nmat>MAX_NMAT) - PrintErrorHelp("Too many materials (%d), maximum %d are supported. You may increase " - "parameter MAX_NMAT in const.h and recompile.",Nmat,MAX_NMAT); - for (i=0;i<Nmat;i++) { - ScanfDoubleError(argv[2*i+1],&ref_index[i][RE]); - ScanfDoubleError(argv[2*i+2],&ref_index[i][IM]); - if (ref_index[i][RE]==1 && ref_index[i][IM]==0) - PrintErrorHelp("Given refractive index #%d is that of vacuum, which is not supported. " - "Consider using, for instance, 1.0001 instead.",i+1); - } + int i; + + if (Narg%2!=0 || Narg==0) NargError(Narg,"even"); + Nmat=Nmat_given=Narg/2; + if (Nmat>MAX_NMAT) + PrintErrorHelp("Too many materials (%d), maximum %d are supported. You may increase parameter "\ + "MAX_NMAT in const.h and recompile.",Nmat,MAX_NMAT); + for (i=0;i<Nmat;i++) { + ScanfDoubleError(argv[2*i+1],&ref_index[i][RE]); + ScanfDoubleError(argv[2*i+2],&ref_index[i][IM]); + if (ref_index[i][RE]==1 && ref_index[i][IM]==0) + PrintErrorHelp("Given refractive index #%d is that of vacuum, which is not supported. "\ + "Consider using, for instance, 1.0001 instead.",i+1); + } } PARSE_FUNC(maxiter) { - ScanfIntError(argv[1],&maxiter); - TestPositive_i(maxiter,"maximum number of iterations"); + ScanfIntError(argv[1],&maxiter); + TestPositive_i(maxiter,"maximum number of iterations"); } PARSE_FUNC(no_reduced_fft) { - reduced_FFT=FALSE; + reduced_FFT=FALSE; } PARSE_FUNC(no_vol_cor) { - volcor=FALSE; + volcor=FALSE; } PARSE_FUNC(ntheta) { - ScanfIntError(argv[1],&nTheta); - TestPositive_i(nTheta,"number of theta intervals"); - nTheta++; + ScanfIntError(argv[1],&nTheta); + TestPositive_i(nTheta,"number of theta intervals"); + nTheta++; } PARSE_FUNC(opt) { - if (strcmp(argv[1],"speed")==0) save_memory=FALSE; - else if (strcmp(argv[1],"mem")==0) save_memory=TRUE; - else NotSupported("Optimization method",argv[1]); + if (strcmp(argv[1],"speed")==0) save_memory=FALSE; + else if (strcmp(argv[1],"mem")==0) save_memory=TRUE; + else NotSupported("Optimization method",argv[1]); } PARSE_FUNC(orient) { - if (Narg==0) NargError(Narg,"at least 1"); - if (strcmp(argv[1],"avg")==0) { - if (Narg>2) PrintErrorHelp( - "Illegal number of arguments (%d) to '-orient avg' option (0 or 1 expected)",Narg-1); - orient_avg=TRUE; - if (Narg==2) { - TestStrLength(argv[2],MAX_FNAME); - strcpy(avg_parms,argv[2]); - } - } - else { - if (Narg!=3) NargError(Narg,"3"); - ScanfDoubleError(argv[1],&alph_deg); - ScanfDoubleError(argv[2],&bet_deg); - ScanfDoubleError(argv[3],&gam_deg); - } + if (Narg==0) NargError(Narg,"at least 1"); + if (strcmp(argv[1],"avg")==0) { + if (Narg>2) PrintErrorHelp( + "Illegal number of arguments (%d) to '-orient avg' option (0 or 1 expected)",Narg-1); + orient_avg=TRUE; + if (Narg==2) { + TestStrLength(argv[2],MAX_FNAME); + strcpy(avg_parms,argv[2]); + } + } + else { + if (Narg!=3) NargError(Narg,"3"); + ScanfDoubleError(argv[1],&alph_deg); + ScanfDoubleError(argv[2],&bet_deg); + ScanfDoubleError(argv[3],&gam_deg); + } } PARSE_FUNC(phi_integr) { - phi_integr = TRUE; - ScanfIntError(argv[1],&phi_int_type); - TestRange_i(phi_int_type,"type of integration over phi",1,31); + phi_integr = TRUE; + ScanfIntError(argv[1],&phi_int_type); + TestRange_i(phi_int_type,"type of integration over phi",1,31); } PARSE_FUNC(pol) { - if (Narg!=1 && Narg!=2) NargError(Narg,"1 or 2"); - if (strcmp(argv[1],"cm")==0) PolRelation=POL_CM; - else if (strcmp(argv[1],"rrc")==0) PolRelation=POL_RR; - else if (strcmp(argv[1],"ldr")==0) PolRelation=POL_LDR; - else if (strcmp(argv[1],"cldr")==0) PolRelation=POL_CLDR; - else if (strcmp(argv[1],"fcd")==0) PolRelation=POL_FCD; - else if (strcmp(argv[1],"so")==0) PolRelation=POL_SO; - else NotSupported("Polarization relation",argv[1]); - if (Narg==2) { - if (strcmp(argv[2],"avgpol")==0) avg_inc_pol=TRUE; - else PrintErrorHelpSafe("Unknown argument '%s' to '-pol %s' option",argv[2],argv[1]); - } + if (Narg!=1 && Narg!=2) NargError(Narg,"1 or 2"); + if (strcmp(argv[1],"cm")==0) PolRelation=POL_CM; + else if (strcmp(argv[1],"rrc")==0) PolRelation=POL_RR; + else if (strcmp(argv[1],"ldr")==0) PolRelation=POL_LDR; + else if (strcmp(argv[1],"cldr")==0) PolRelation=POL_CLDR; + else if (strcmp(argv[1],"fcd")==0) PolRelation=POL_FCD; + else if (strcmp(argv[1],"so")==0) PolRelation=POL_SO; + else NotSupported("Polarization relation",argv[1]); + if (Narg==2) { + if (strcmp(argv[2],"avgpol")==0) avg_inc_pol=TRUE; + else PrintErrorHelpSafe("Unknown argument '%s' to '-pol %s' option",argv[2],argv[1]); + } } PARSE_FUNC(prognose) { - prognose=TRUE; - strcpy(run_name,"test"); + prognose=TRUE; + strcpy(run_name,"test"); } PARSE_FUNC(prop) { - double tmp; + double tmp; - ScanfDoubleError(argv[1],prop_0); - ScanfDoubleError(argv[2],prop_0+1); - ScanfDoubleError(argv[3],prop_0+2); - tmp=DotProd(prop_0,prop_0); - if (tmp==0) PrintErrorHelp("Given propagation vector is null"); - tmp=1/sqrt(tmp); - prop_0[0]*=tmp; - prop_0[1]*=tmp; - prop_0[2]*=tmp; + ScanfDoubleError(argv[1],prop_0); + ScanfDoubleError(argv[2],prop_0+1); + ScanfDoubleError(argv[3],prop_0+2); + tmp=DotProd(prop_0,prop_0); + if (tmp==0) PrintErrorHelp("Given propagation vector is null"); + tmp=1/sqrt(tmp); + prop_0[0]*=tmp; + prop_0[1]*=tmp; + prop_0[2]*=tmp; } PARSE_FUNC(save_geom) { - if (Narg>1) NargError(Narg,"0 or 1"); - save_geom=TRUE; - if (Narg==1) { - TestStrLength(argv[1],MAX_FNAME); - strcpy(save_geom_fname,argv[1]); - } + if (Narg>1) NargError(Narg,"0 or 1"); + save_geom=TRUE; + if (Narg==1) { + TestStrLength(argv[1],MAX_FNAME); + strcpy(save_geom_fname,argv[1]); + } } PARSE_FUNC(scat) { - if (strcmp(argv[1],"dr")==0) ScatRelation=SQ_DRAINE; - else if (strcmp(argv[1],"so")==0) ScatRelation=SQ_SO; - else NotSupported("Scattering quantities relation",argv[1]); + if (strcmp(argv[1],"dr")==0) ScatRelation=SQ_DRAINE; + else if (strcmp(argv[1],"so")==0) ScatRelation=SQ_SO; + else NotSupported("Scattering quantities relation",argv[1]); } PARSE_FUNC(scat_grid_inp) { - TestStrLength(argv[1],MAX_FNAME); - strcpy(scat_grid_parms,argv[1]); + TestStrLength(argv[1],MAX_FNAME); + strcpy(scat_grid_parms,argv[1]); } PARSE_FUNC(sg_format) { - if (strcmp(argv[1],"text")==0) sg_format=SF_TEXT; - else if (strcmp(argv[1],"text_ext")==0) sg_format=SF_TEXT_EXT; - else if (strcmp(argv[1],"ddscat")==0) sg_format=SF_DDSCAT; - else NotSupported("Geometry format",argv[1]); + if (strcmp(argv[1],"text")==0) sg_format=SF_TEXT; + else if (strcmp(argv[1],"text_ext")==0) sg_format=SF_TEXT_EXT; + else if (strcmp(argv[1],"ddscat")==0) sg_format=SF_DDSCAT; + else NotSupported("Geometry format",argv[1]); } PARSE_FUNC(shape) { - int i,j,found; - - Narg--; - found=FALSE; - i=-1; - while (shape_opt[++i].name!=NULL) if (strcmp(argv[1],shape_opt[i].name)==0) { - // set shape and shape option index - shape=shape_opt[i].type; - opt.l2=i; - opt_sh=opt; - sh_Npars=Narg; - // check number of arguments - TestNarg_sub(Narg); - if (shape==SH_COATED) { - if (Narg!=1 && Narg!=4) NargError(Narg,"1 or 4"); - } - else if (shape==SH_BOX) { - if (Narg!=0 && Narg!=2) NargError(Narg,"0 or 2"); - } - // special cases to parse filename - if (shape_opt[i].narg==FNAME_ARG) { - TestStrLength(argv[2],MAX_FNAME); - strcpy(shape_fname,argv[2]); - } - // else parse all parameters as float; their consistency is checked in InitShape() - else for (j=0;j<Narg;j++) ScanfDoubleError(argv[j+2],sh_pars+j); - // stop search - found=TRUE; - break; - } - if(!found) NotSupported("Shape type",argv[1]); - // set shape name; takes place only if shape name was matched above - strcpy(shapename,argv[1]); + int i,j,found; + + Narg--; + found=FALSE; + i=-1; + while (shape_opt[++i].name!=NULL) if (strcmp(argv[1],shape_opt[i].name)==0) { + /* set shape and shape option index */ + shape=shape_opt[i].type; + opt.l2=i; + opt_sh=opt; + sh_Npars=Narg; + /* check number of arguments */ + TestNarg_sub(Narg); + if (shape==SH_COATED) { + if (Narg!=1 && Narg!=4) NargError(Narg,"1 or 4"); + } + else if (shape==SH_BOX) { + if (Narg!=0 && Narg!=2) NargError(Narg,"0 or 2"); + } + /* parse; consistency of shape arguments is checked in InitShape() */ + if (shape==SH_READ) { + TestStrLength(argv[2],MAX_FNAME); + strcpy(aggregate_file,argv[2]); + } + else for (j=0;j<Narg;j++) ScanfDoubleError(argv[j+2],sh_pars+j); + /* stop search */ + found=TRUE; + break; + } + if(!found) NotSupported("Shape type",argv[1]); + /* set shapename; takes place only if shapename was matched above */ + strcpy(shapename,argv[1]); } PARSE_FUNC(size) { - ScanfDoubleError(argv[1],&sizeX); - TestPositive(sizeX,"particle size"); + ScanfDoubleError(argv[1],&sizeX); + TestPositive(sizeX,"particle size"); } PARSE_FUNC(store_beam) { - store_beam = TRUE; + store_beam = TRUE; } PARSE_FUNC(store_dip_pol) { - store_dip_pol=TRUE; + store_dip_pol=TRUE; } PARSE_FUNC(store_force) { - store_force = TRUE; -} -PARSE_FUNC(store_grans) -{ - store_grans=TRUE; + store_force = TRUE; } PARSE_FUNC(store_int_field) { - store_int_field=TRUE; + store_int_field=TRUE; } PARSE_FUNC(store_scat_grid) { - store_scat_grid = TRUE; + store_scat_grid = TRUE; } PARSE_FUNC(sym) { - if (strcmp(argv[1],"auto")==0) sym_type=SYM_AUTO; - else if (strcmp(argv[1],"no")==0) sym_type=SYM_NO; - else if (strcmp(argv[1],"enf")==0) sym_type=SYM_ENF; - else NotSupported("Symmetry option",argv[1]); + if (strcmp(argv[1],"auto")==0) sym_type=SYM_AUTO; + else if (strcmp(argv[1],"no")==0) sym_type=SYM_NO; + else if (strcmp(argv[1],"enf")==0) sym_type=SYM_ENF; + else NotSupported("Symmetry option",argv[1]); } PARSE_FUNC(test) { - strcpy(run_name,"test"); + strcpy(run_name,"test"); } PARSE_FUNC(V) { - char copyright[]="\n\nCopyright (C) 2006-2008 University of Amsterdam\n" - "This program is free software; you can redistribute it and/or modify it under the terms " - "of the GNU General Public License as published by the Free Software Foundation; either " - "version 2 of the License, or (at your option) any later version.\n\n" - "This program is distributed in the hope that it will be useful, but WITHOUT ANY " - "WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A " - "PARTICULAR PURPOSE. See the GNU General Public License for more details.\n\n" - "You should have received a copy of the GNU General Public License along with this " - "program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite " - "330, Boston, MA 02111-1307, USA.\n"; - char ccver_str[MAX_LINE]; + char copyright[]="\n\nCopyright (C) 2006-2008 University of Amsterdam\n"\ + "This program is free software; you can redistribute it and/or modify it under the terms of "\ + "the GNU General Public License as published by the Free Software Foundation; either "\ + "version 2 of the License, or (at your option) any later version.\n\n"\ + "This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; "\ + "without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. "\ + "See the GNU General Public License for more details.\n\n"\ + "You should have received a copy of the GNU General Public License along with this program; "\ + "if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, "\ + "MA 02111-1307, USA.\n"; + char ccver_str[MAX_LINE]; #if defined(__DECC) - char cctype; + char cctype; #elif defined(__BORLANDC__) - int ccver; + int ccver; #endif - if (ringid==ROOT) { - // compiler & version (works only for selected compilers) - // Intel + if (ringid==ROOT) { + /* compiler & version (works only for selected compilers) */ + /* Intel */ #if defined(__ICC) || defined(__INTEL_COMPILER) -# define COMPILER "Intel" -# ifdef __INTEL_COMPILER -# define CCVERSION __INTEL_COMPILER -# else -# define CCVERSION __ICC -# endif - sprintf(ccver_str,"%d.%d",CCVERSION/100,CCVERSION%100); - // DEC (Compaq) +# define COMPILER "Intel" +# ifdef __INTEL_COMPILER +# define CCVERSION __INTEL_COMPILER +# else +# define CCVERSION __ICC +# endif + sprintf(ccver_str,"%d.%d",CCVERSION/100,CCVERSION%100); + /* DEC (Compaq) */ #elif defined(__DECC) -# define COMPILER "DEC (Compaq)" - cctype=(__DECC_VER/10000)%10; - if (cctype==6) cctype='T'; - else if (cctype==8) cctype='S'; - else if (cctype==9) cctype='V'; - else cctype=' '; - sprintf(ccver_str,"%c%d.%d-%d",cctype,__DECC_VER/10000000,(__DECC_VER/100000)%100, - __DECC_VER%1000); - // Borland +# define COMPILER "DEC (Compaq)" + cctype=(__DECC_VER/10000)%10; + if (cctype==6) cctype='T'; + else if (cctype==8) cctype='S'; + else if (cctype==9) cctype='V'; + else cctype=' '; + sprintf(ccver_str,"%c%d.%d-%d",cctype,__DECC_VER/10000000,(__DECC_VER/100000)%100, + __DECC_VER%1000); + /* Borland */ #elif defined(__BORLANDC__) -# define COMPILER "Borland" - sprintf(ccver_str,"%x",__BORLANDC__); - sscanf(ccver_str,"%d",&ccver); - sprintf(ccver_str,"%d.%d",ccver/100,ccver%100); - // Microsoft +# define COMPILER "Borland" + sprintf(ccver_str,"%x",__BORLANDC__); + sscanf(ccver_str,"%d",&ccver); + sprintf(ccver_str,"%d.%d",ccver/100,ccver%100); + /* Microsoft */ #elif defined(_MSC_VER) -# define COMPILER "Microsoft" - sprintf(ccver_str,"%d.%d",_MSC_VER/100,_MSC_VER%100); - // GNU +# define COMPILER "Microsoft" + sprintf(ccver_str,"%d.%d",_MSC_VER/100,_MSC_VER%100); + /* GNU */ #elif defined(__GNUC__) -# define COMPILER "GNU" - sprintf(ccver_str,"%d.%d.%d",__GNUC__,__GNUC_MINOR__,__GNUC_PATCHLEVEL__); - // IBM -#elif defined(__xlc__) -# define COMPILER "IBM" - strncpy(ccver_str,__xlc__,MAX_LINE-1); - // unknown compiler +# define COMPILER "GNU" + sprintf(ccver_str,"%d.%d.%d",__GNUC__,__GNUC_MINOR__,__GNUC_PATCHLEVEL__); + /* unknown compiler */ #else -# define COMPILER_UNKNOWN -# define COMPILER "unknown" +# define COMPILER_UNKNOWN +# define COMPILER "unknown" #endif - // print version, type and compiler information - printf("ADDA v." ADDA_VERSION "\n"); + /* print version, type and compiler information */ + printf("'Amsterdam DDA' v." ADDA_VERSION "\n"); #ifdef MPI - // Version of MPI standard is specified, requires MPI 1.2 - printf("Parallel version conforming to MPI standard %d.%d\n",MPI_VERSION,MPI_SUBVERSION); + /* Version of MPI standard is specified, requires MPI 1.2 */ + printf("Parallel version conforming to MPI standard %d.%d\n", + MPI_VERSION,MPI_SUBVERSION); #else - printf("Sequential version\n"); + printf("Sequential version\n"); #endif - printf("Built with " COMPILER " C compiler"); + printf("Built with " COMPILER " C compiler"); #ifndef COMPILER_UNKNOWN - printf(", version %s",ccver_str); + printf(", version %s",ccver_str); #endif - // print copyright information - WrapLines(copyright); - printf("%s",copyright); - } - // exit - Stop(0); + /* print copyright information */ + WrapLines(copyright); + printf("%s",copyright); + } + /* exit */ + Stop(0); #undef COMPILER #undef CCVERSION #undef COMPILER_UNKNOWN } PARSE_FUNC(vec) { - calc_vec = TRUE; + calc_vec = TRUE; } PARSE_FUNC(yz) { - yzplane = TRUE; + yzplane = TRUE; } #undef PAR #undef PARSE_FUNC #undef PARSE_NAME -// end of parsing functions -//============================================================= + /* end of parsing functions */ +/*=============================================================*/ static FILEHANDLE CreateLockFile(const char *fname) -// create lock file; works only if USE_LOCK is enabled + /* create locks file + works only if USE_LOCK is enabled */ { #ifdef USE_LOCK - FILEHANDLE fd; - int i; - -# ifdef WINDOWS - i=0; - while ((fd=CreateFile(fname,GENERIC_WRITE,FILE_SHARE_WRITE,NULL,CREATE_NEW, - FILE_ATTRIBUTE_NORMAL,NULL))==INVALID_HANDLE_VALUE) { - Sleep(LOCK_WAIT*1000); - if (i++ == MAX_LOCK_WAIT_CYCLES) - LogError(EC_ERROR,ONE_POS,"Lock file %s permanently exists",fname); - } -# elif defined(POSIX) -# ifdef LOCK_FOR_NFS - struct flock lock; -# endif - // open file exclusively - i=0; - while ((fd=open(fname,O_WRONLY | O_CREAT | O_EXCL,0666))==-1) { - sleep(LOCK_WAIT); - if (i++ == MAX_LOCK_WAIT_CYCLES) - LogError(EC_ERROR,ONE_POS,"Lock file %s permanently exists",fname); - } -# ifdef LOCK_FOR_NFS - // specify lock - lock.l_type=F_WRLCK; - lock.l_whence=SEEK_SET; - lock.l_start=0; - lock.l_len=0; - // obtain lock*/ - i=0; - while (fcntl(fd,F_SETLK,&lock)==-1) { - // if locked by another process wait and try again - if (errno==EACCES || errno==EAGAIN) { - sleep(LOCK_WAIT); - if (i++ == MAX_LOCK_WAIT_CYCLES) - LogError(EC_ERROR,ONE_POS,"Lock file %s permanently exists",fname); - } - else { // otherwise produce a message and continue - if (errno==EOPNOTSUPP || errno==ENOLCK) LogError(EC_WARN,ONE_POS, - "Advanced file locking is not supported by the file system"); - else LogError(EC_WARN,ONE_POS,"Unknown problem with file locking ('%s').", - strerror(errno)); - break; - } - } -# endif -# endif - // return file handle - return fd; + FILEHANDLE fd; + int i; + +# ifdef WINDOWS + i=0; + while ((fd=CreateFile(fname,GENERIC_WRITE,FILE_SHARE_WRITE,NULL,CREATE_NEW, + FILE_ATTRIBUTE_NORMAL,NULL))==INVALID_HANDLE_VALUE) { + Sleep(LOCK_WAIT*1000); + if (i++ == MAX_LOCK_WAIT_CYCLES) + LogError(EC_ERROR,ONE_POS,"Lock file %s permanently exists",fname); + } +# elif defined(POSIX) +# ifdef LOCK_FOR_NFS + struct flock lock; +# endif + /* open file exclusively */ + i=0; + while ((fd=open(fname,O_WRONLY | O_CREAT | O_EXCL,0666))==-1) { + sleep(LOCK_WAIT); + if (i++ == MAX_LOCK_WAIT_CYCLES) + LogError(EC_ERROR,ONE_POS,"Lock file %s permanently exists",fname); + } +# ifdef LOCK_FOR_NFS + /* specify lock */ + lock.l_type=F_WRLCK; + lock.l_whence=SEEK_SET; + lock.l_start=0; + lock.l_len=0; + /* obtain lock*/ + i=0; + while (fcntl(fd,F_SETLK,&lock)==-1) { + /* if locked by another process wait and try again */ + if (errno==EACCES || errno==EAGAIN) { + sleep(LOCK_WAIT); + if (i++ == MAX_LOCK_WAIT_CYCLES) + LogError(EC_ERROR,ONE_POS,"Lock file %s permanently exists",fname); + } + else { /* otherwise produce a message and continue */ + if (errno==EOPNOTSUPP || errno==ENOLCK) LogError(EC_WARN,ONE_POS, + "Advanced file locking is not supported by the filesystem"); + else LogError(EC_WARN,ONE_POS,"Unknown problem with file locking ('%s').",strerror(errno)); + break; + } + } +# endif +# endif + /* return file handle */ + return fd; #else - return 0; + return 0; #endif } -//============================================================ +/*============================================================*/ static void RemoveLockFile(FILEHANDLE fd,const char *fname) -// closes and remove lock file; works only if USE_LOCK is enabled + /* closes and remove lock file + works only if USE_LOCK is enabled */ { #ifdef USE_LOCK -# ifdef WINDOWS - // close file - CloseHandle(fd); -# elif defined(POSIX) - // close file; all locks are automatically released - close(fd); -# endif - // remove lock file - RemoveErr(fname,ONE_POS); +# ifdef WINDOWS + /* close file */ + CloseHandle(fd); +# elif defined(POSIX) + /* close file; all locks are automatically released */ + close(fd); +# endif + /* remove lock file */ + RemoveErr(fname,ONE_POS); #endif } -//============================================================ +/*============================================================*/ void InitVariables(void) -// some defaults are specified also in const.h -{ - prop_0[0]=0; // by default beam propagates along z-axis - prop_0[1]=0; - prop_0[2]=1; - directory[0]=0; - lambda=TWO_PI; - // initialize ref_index of scatterer - Nmat=Nmat_given=1; - ref_index[0][RE]=1.5; - ref_index[0][IM]=0.0; - // initialize to null to determine further whether it is initialized - logfile=NULL; - logname[0]=0; - - boxX=boxY=boxZ=UNDEF; - sizeX=UNDEF; - a_eq=UNDEF; - dpl=UNDEF; - strcpy(run_name,"run"); - nTheta=UNDEF; - eps=1E-5; - shape=SH_SPHERE; - strcpy(shapename,"sphere"); - store_int_field=FALSE; - store_dip_pol=FALSE; - PolRelation=POL_LDR; - ScatRelation=SQ_DRAINE; - IntRelation=G_POINT_DIP; - IterMethod=IT_QMR_CS; - sym_type=SYM_AUTO; - prognose=FALSE; - maxiter=UNDEF; - jagged=1; - beamtype=B_PLANE; - strcpy(alldir_parms,FD_ALLDIR_PARMS); - strcpy(avg_parms,FD_AVG_PARMS); - strcpy(scat_grid_parms,FD_SCAT_PARMS); - strcpy(chp_dir,FD_CHP_DIR); - chp_time=UNDEF; - chp_type=CHP_NONE; - orient_avg=FALSE; - alph_deg=bet_deg=gam_deg=0.0; - volcor=TRUE; - reduced_FFT=TRUE; - save_geom=FALSE; - save_geom_fname[0]=0; - yzplane=UNDEF; - all_dir=FALSE; - scat_grid=FALSE; - phi_integr=FALSE; - store_scat_grid=FALSE; - calc_Cext=TRUE; - calc_Cabs=TRUE; - calc_Csca=FALSE; - calc_vec=FALSE; - calc_asym=FALSE; - calc_mat_force=FALSE; - store_force=FALSE; - store_grans=FALSE; - load_chpoint=FALSE; - sh_granul=FALSE; - symX=symY=symZ=symR=TRUE; - anisotropy=FALSE; - save_memory=FALSE; - sg_format=SF_TEXT; - memory=0; - Ncomp=1; -} - -//============================================================ + /* some defaults are specified also in const.h */ +{ + /* defaults */ + prop_0[0]=0; /* by default beam propagates along z-axis */ + prop_0[1]=0; + prop_0[2]=1; + directory[0]=0; + lambda=TWO_PI; + /* initialize ref_index of scatterer */ + Nmat=Nmat_given=1; + ref_index[0][RE]=1.5; + ref_index[0][IM]=0.0; + /* initialize to null to determine further whether it is initialized */ + logfile=NULL; + logname[0]=0; + + boxX=boxY=boxZ=UNDEF; + sizeX=UNDEF; + a_eq=UNDEF; + dpl=UNDEF; + strcpy(run_name,"run"); + nTheta=UNDEF; + eps=1E-5; + shape=SH_SPHERE; + strcpy(shapename,"sphere"); + store_int_field=FALSE; + store_dip_pol=FALSE; + PolRelation=POL_LDR; + ScatRelation=SQ_DRAINE; + IntRelation=G_POINT_DIP; + IterMethod=IT_QMR_CS; + sym_type=SYM_AUTO; + prognose=FALSE; + maxiter=UNDEF; + jagged=1; + beamtype=B_PLANE; + strcpy(alldir_parms,FD_ALLDIR_PARMS); + strcpy(avg_parms,FD_AVG_PARMS); + strcpy(scat_grid_parms,FD_SCAT_PARMS); + strcpy(chp_dir,FD_CHP_DIR); + chp_time=UNDEF; + chp_type=CHP_NONE; + orient_avg=FALSE; + alph_deg=bet_deg=gam_deg=0.0; + volcor=TRUE; + reduced_FFT=TRUE; + save_geom=FALSE; + save_geom_fname[0]=0; + yzplane=UNDEF; + all_dir=FALSE; + scat_grid=FALSE; + phi_integr=FALSE; + store_scat_grid=FALSE; + calc_Cext=TRUE; + calc_Cabs=TRUE; + calc_Csca=FALSE; + calc_vec=FALSE; + calc_asym=FALSE; + calc_mat_force=FALSE; + store_force=FALSE; + load_chpoint=FALSE; + sh_granul=FALSE; + symX=symY=symZ=symR=TRUE; + anisotropy=FALSE; + save_memory=FALSE; + sg_format=SF_TEXT; + memory=0; + Ncomp=1; +} + +/*============================================================*/ void ParseParameters(const int argc,char **argv) -// parses input parameters -{ - int i,j,Narg,tmp; - int found; - char *p1,*p2; - - // try to determine terminal width - if ((p1=getenv("COLUMNS"))!=NULL && sscanf(p1,"%d",&tmp)==1 && tmp>=MIN_TERM_WIDTH) - term_width=tmp; - // get name of executable; remove all path overhead - if ((p1=strrchr(argv[0],'\\'))==NULL) p1=argv[0]; - if ((p2=strrchr(argv[0],'/'))==NULL) p2=argv[0]; - exename=MAX(p1,p2)+1; - // initialize option - opt.l1=UNDEF; - // check first argument - if (argc>1 && !IsOption(argv[1])) - PrintErrorHelpSafe("Illegal format of first argument '%s'",argv[1]); - // read command line - for (i=1;i<argc;i++) { - // get number of arguments - Narg=0; - while ((i+(++Narg))<argc && !IsOption(argv[i+Narg])); - Narg--; - - argv[i]++; // shift to remove "-" in the beginning of the string - found=FALSE; - opt.l1=opt.l2=UNDEF; - for (j=0;j<LENGTH(options);j++) if (strcmp(argv[i],options[j].name)==0) { - opt.l1=j; - // check consistency, if enabled for this parameter - TestNarg(Narg); - // parse this parameter - (*options[j].func)(Narg,argv+i); - // check duplicate options; it is safe since at this point argv[i] is known to be normal - if (options[j].used) PrintError("Option '-%s' is used more than once",argv[i]); - else options[j].used=TRUE; - // stop search - found=TRUE; - break; - } - if(!found) PrintErrorHelpSafe("Unknown option '-%s'",argv[i]); - argv[i]--; // shift back - i+=Narg; - } -} - -//============================================================ + /* parses input parameters */ +{ + int i,j,Narg,tmp; + int found; + char *p1,*p2; + + /* try to determine terminal width */ + if ((p1=getenv("COLUMNS"))!=NULL && sscanf(p1,"%d",&tmp)==1 && tmp>=MIN_TERM_WIDTH) + term_width=tmp; + /* get name of executable; remove all path overhead */ + if ((p1=strrchr(argv[0],'\\'))==NULL) p1=argv[0]; + if ((p2=strrchr(argv[0],'/'))==NULL) p2=argv[0]; + exename=MAX(p1,p2)+1; + /* initialize option */ + opt.l1=UNDEF; + /* check first argument */ + if (argc>1 && !IsOption(argv[1])) + PrintErrorHelpSafe("Illegal format of first argument '%s'",argv[1]); + /* read command line */ + for (i=1;i<argc;i++) { + /* get number of arguments */ + Narg=0; + while ((i+(++Narg))<argc && !IsOption(argv[i+Narg])); + Narg--; + + argv[i]++; /* shift to remove "-" in the beginning of the string */ + found=FALSE; + opt.l1=opt.l2=UNDEF; + for (j=0;j<LENGTH(options);j++) if (strcmp(argv[i],options[j].name)==0) { + opt.l1=j; + /* check consistency, if enabled for this parameter */ + TestNarg(Narg); + /* parse this parameter */ + (*options[j].func)(Narg,argv+i); + /* check duplicate options; it is safe since at this point argv[i] is known to be normal */ + if (options[j].used) PrintError("Option '-%s' is used more than once",argv[i]); + else options[j].used=TRUE; + /* stop search */ + found=TRUE; + break; + } + if(!found) PrintErrorHelpSafe("Unknown option '-%s'",argv[i]); + argv[i]--; /* shift back */ + i+=Narg; + } /* end of reading command line arguments */ +} + +/*============================================================*/ void VariablesInterconnect(void) -// finish parameters initialization based on their interconnections -{ - double temp; - - // initialize WaveNum ASAP - WaveNum = TWO_PI/lambda; - // parameter interconnections - if (IntRelation==G_SO) reduced_FFT=FALSE; - if (calc_Csca || calc_vec) all_dir = TRUE; - if (store_scat_grid || phi_integr) { - scat_grid = TRUE; - if (yzplane==UNDEF) yzplane = FALSE; - } - else if (yzplane==UNDEF) yzplane = TRUE; - // parameter incompatibilities - if (orient_avg) { - if (prop_0[2]!=1) PrintError("'-prop' and '-orient avg' can not be used together"); - if (store_int_field) - PrintError("'-store_int_field' and '-orient avg' can not be used together"); - if (store_dip_pol) - PrintError("'-store_dip_pol' and '-orient avg' can not be used together"); - if (store_beam) PrintError("'-store_beam' and '-orient avg' can not be used together"); - if (scat_grid) PrintError( - "'-orient avg' can not be used with calculation of scattering for a grid of angles"); - // TODO: this limitation should be removed in the future - if (all_dir) - PrintError("Currently '-orient avg' can not be used with calculation of asym or Csca"); - } - if (anisotropy) { - if (PolRelation==POL_CLDR) PrintError("'-anisotr' is incompatible with '-pol cldr'"); - if (PolRelation==POL_SO) PrintError("'-anisotr' is incompatible with '-pol so'"); - if (ScatRelation==SQ_SO) PrintError("'-anisotr' is incompatible with '-scat so'"); - if (IntRelation==G_SO) PrintError("'-anisotr' is incompatible with '-int so'"); - if (Nmat%3!=0) PrintError( - "When '-anisotr' is used 6 numbers (3 complex values) should be given per each domain"); - else Nmat=Nmat/3; - } - if (chp_type!=CHP_NONE) { - if (chp_time==UNDEF && chp_type!=CHP_ALWAYS) - PrintError("You must specify time for this checkpoint type"); - // TODO: this limitation should be removed in the future - if (orient_avg) PrintError("Currently checkpoint is incompatible with '-orient avg'"); - } - if (sizeX!=UNDEF && a_eq!=UNDEF) PrintError("'-size' and '-eq_rad' can not be used together"); - // scale boxes by jagged; should be completely robust to overflows -#define JAGGED_BOX(a) if (a!=UNDEF) { \ - if ((BOX_MAX/(size_t)jagged)<(size_t)a) \ - LogError(EC_ERROR,ONE_POS,"Derived grid size (" #a ") is too large (>%d)",BOX_MAX); \ - else a*=jagged; } - - if (jagged!=1) { - JAGGED_BOX(boxX); - JAGGED_BOX(boxY); - JAGGED_BOX(boxZ); - } + /* finish parameters initialization based on their interconnections */ +{ + double temp; + + /* initialize WaveNum ASAP */ + WaveNum = TWO_PI/lambda; + /* parameter interconnections */ + if (IntRelation==G_SO) reduced_FFT=FALSE; + if (calc_Csca || calc_vec) all_dir = TRUE; + if (store_scat_grid || phi_integr) { + scat_grid = TRUE; + if (yzplane==UNDEF) yzplane = FALSE; + } + else if (yzplane==UNDEF) yzplane = TRUE; + /* parameter incompatibilities */ + if (orient_avg) { + if (prop_0[2]!=1) PrintError("'-prop' and '-orient avg' can not be used together"); + if (store_int_field) + PrintError("'-store_int_field' and '-orient avg' can not be used together"); + if (store_dip_pol) + PrintError("'-store_dip_pol' and '-orient avg' can not be used together"); + if (store_beam) PrintError("'-store_beam' and '-orient avg' can not be used together"); + if (scat_grid) PrintError( + "'-orient avg' can not be used with calculation of scattering for a grid of angles"); + /* this limitation should be removed in the future */ + if (all_dir) + PrintError("Currently '-orient avg' can not be used with calculation of asym or Csca"); + } + if (anisotropy) { + if (PolRelation==POL_CLDR) PrintError("'-anisotr' is incompatible with '-pol cldr'"); + if (PolRelation==POL_SO) PrintError("'-anisotr' is incompatible with '-pol so'"); + if (ScatRelation==SQ_SO) PrintError("'-anisotr' is incompatible with '-scat so'"); + if (IntRelation==G_SO) PrintError("'-anisotr' is incompatible with '-int so'"); + if (Nmat%3!=0) PrintError( + "When '-anisotr' is used 6 numbers (3 complex values) should be given per each domain"); + else Nmat=Nmat/3; + } + if (chp_type!=CHP_NONE) { + if (chp_time==UNDEF && chp_type!=CHP_ALWAYS) + PrintError("You must specify time for this checkpoint type"); + /* this limitation should be removed in the future */ + if (orient_avg) PrintError("Currently checkpoint is incompatible with '-orient avg'"); + } + if (sizeX!=UNDEF && a_eq!=UNDEF) PrintError("'-size' and '-eq_rad' can not be used together"); + /* scale boxes by jagged; should be completely robust to overflows */ +#define JAGGED_BOX(a) if (a!=UNDEF) { if ((BOX_MAX/(size_t)jagged)<(size_t)a) \ + LogError(EC_ERROR,ONE_POS,"Derived grid size (" #a ") is too large (>%d)",BOX_MAX); \ + else a*=jagged; } + if (jagged!=1) { + JAGGED_BOX(boxX); + JAGGED_BOX(boxY); + JAGGED_BOX(boxZ); + } #undef JAGGED_BOX - /* Determine two incident polarizations. Equivalent to rotation of X,Y,Z basis by angles theta - * and phi from (0,0,1) to given propagation vector. - */ - if (fabs(prop_0[2])>=1) { // can not be >1 except for machine precision - incPolX_0[0]=prop_0[2]; - incPolY_0[1]=1; - incPolX_0[1]=incPolX_0[2]=incPolY_0[0]=incPolY_0[2]=0.0; - } - else { - temp=sqrt(1-prop_0[2]*prop_0[2]); - incPolX_0[0]=prop_0[0]*prop_0[2]/temp; - incPolX_0[1]=prop_0[1]*prop_0[2]/temp; - incPolX_0[2]=-temp; - incPolY_0[0]=-prop_0[1]/temp; - incPolY_0[1]=prop_0[0]/temp; - incPolY_0[2]=0.0; - } - // initialize beam description - InitBeam(); - // initialize averaging over orientation - if (orient_avg) { - ReadAvgParms(avg_parms); - if (sym_type==SYM_AUTO) sym_type=SYM_NO; - avg_inc_pol=TRUE; - } - else { - // else - initialize rotation stuff - InitRotation(); - /* if not default incidence, break the symmetry completely. This can be improved to account - * for some special cases, however, then symmetry of Gaussian beam should be treated more - * thoroughly than now. - */ - if (prop[2]!=1 && sym_type==SYM_AUTO) sym_type=SYM_NO; - } -} - -//============================================================ + /*determine two incident polarizations. Equivalent to rotation of X,Y,Z basis by + angles Theta and Phi from (0,0,1) to given propagation vector */ + if (fabs(prop_0[2])>=1) { /* can not be >1 except for machine precision */ + incPolX_0[0]=prop_0[2]; + incPolY_0[1]=1; + incPolX_0[1]=incPolX_0[2]=incPolY_0[0]=incPolY_0[2]=0.0; + } + else { + temp=sqrt(1-prop_0[2]*prop_0[2]); + incPolX_0[0]=prop_0[0]*prop_0[2]/temp; + incPolX_0[1]=prop_0[1]*prop_0[2]/temp; + incPolX_0[2]=-temp; + incPolY_0[0]=-prop_0[1]/temp; + incPolY_0[1]=prop_0[0]/temp; + incPolY_0[2]=0.0; + } + /* initialize beam description */ + InitBeam(); + /* initialize averaging over orientation */ + if (orient_avg) { + ReadAvgParms(avg_parms); + if (sym_type==SYM_AUTO) sym_type=SYM_NO; + avg_inc_pol=TRUE; + } + else { + /* else - initialize rotation stuff */ + InitRotation(); + /* if not default incidence, break the symmetry completely. This can be improved to + account for some special cases, however, then symmetry of Gaussian beam should be + treated more thoroughly than now. */ + if (prop[2]!=1 && sym_type==SYM_AUTO) sym_type=SYM_NO; + } +} + +/*============================================================*/ void DirectoryLog(const int argc,char **argv) -// create input directory and start logfile + /* create input directory and start logfile */ { - int i,Nexp; - FILE *Nexpfile; - char sbuffer[MAX_LINE]; - char *ptmp,*compname; - FILEHANDLE lockid; + int i,Nexp; + FILE *Nexpfile; + char sbuffer[MAX_LINE]; + char *ptmp,*compname; + FILEHANDLE lockid; #ifdef PARALLEL - char *ptmp2; + char *ptmp2; #endif -#ifdef WINDOWS // for obtaining computer name - TCHAR cname[MAX_COMPUTERNAME_LENGTH+1]; - DWORD cname_size=MAX_COMPUTERNAME_LENGTH+1; +#ifdef WINDOWS /* for obtaining computer name */ + TCHAR cname[MAX_COMPUTERNAME_LENGTH+1]; + DWORD cname_size=MAX_COMPUTERNAME_LENGTH+1; #endif - // devise directory name (for output files) - if (directory[0]==0) { - // ROOT processor works with ExpCount - if (ringid==ROOT) { - // lock file - lockid=CreateLockFile(F_EXPCOUNT_LCK); - // read ExpCount - if ((Nexpfile=fopen(F_EXPCOUNT,"r"))!=NULL) { - if (fscanf(Nexpfile,"%i",&Nexp)!=1) Nexp=0; - FCloseErr(Nexpfile,F_EXPCOUNT,ONE_POS); - } - else Nexp=0; - // put new number in Nexpfile - Nexpfile=FOpenErr(F_EXPCOUNT,"w",ONE_POS); - fprintf(Nexpfile,"%i",Nexp+1); - FCloseErr(Nexpfile,F_EXPCOUNT,ONE_POS); - // unlock - RemoveLockFile(lockid,F_EXPCOUNT_LCK); - } - // cast Nexp to all processors - MyBcast(&Nexp,int_type,1,NULL); - // create directory name - sprintf(sbuffer,"m%.4g",ref_index[0][RE]); - ptmp=strchr(sbuffer,'.'); - if (ptmp!=NULL) *ptmp='_'; - sprintf(directory,"%s%03i_%s_g%i%s",run_name,Nexp,shapename,boxX,sbuffer); + /* devise directory name (for output files) */ + if (directory[0]==0) { + /* ROOT processor works with ExpCount */ + if (ringid==ROOT) { + /* lock file */ + lockid=CreateLockFile(F_EXPCOUNT_LCK); + /* read ExpCount */ + if ((Nexpfile=fopen(F_EXPCOUNT,"r"))!=NULL) { + if (fscanf(Nexpfile,"%i",&Nexp)!=1) Nexp=0; + FCloseErr(Nexpfile,F_EXPCOUNT,ONE_POS); + } + else Nexp=0; + /* put new number in Nexpfile */ + Nexpfile=FOpenErr(F_EXPCOUNT,"w",ONE_POS); + fprintf(Nexpfile,"%i",Nexp+1); + FCloseErr(Nexpfile,F_EXPCOUNT,ONE_POS); + /* unlock */ + RemoveLockFile(lockid,F_EXPCOUNT_LCK); + } + /* cast Nexp to all processors */ + MyBcast(&Nexp,int_type,1,NULL); + /* create directory name */ + sprintf(sbuffer,"m%.4g",ref_index[0][RE]); + ptmp=strchr(sbuffer,'.'); + if (ptmp!=NULL) *ptmp='_'; + sprintf(directory,"%s%03i_%s_g%i%s",run_name,Nexp,shapename,boxX,sbuffer); #ifdef PARALLEL - // add PBS, SGE or SLURM job id to the directory name if available - if ((ptmp=getenv("PBS_JOBID"))!=NULL || (ptmp=getenv("JOB_ID"))!=NULL - || (ptmp=getenv("SLURM_JOBID"))!=NULL) { - // job ID is truncated at first ".", probably can happen only for PBS - if ((ptmp2=strchr(ptmp,'.'))!=NULL) *ptmp2=0; - sprintf(directory+strlen(directory),"id%s",ptmp); - } + /* add PBS or SGE job id to the directory name if available */ + if ((ptmp=getenv("PBS_JOBID"))!=NULL) { + /* jobid is truncated at first "." */ + if ((ptmp2=strchr(ptmp,'.'))!=NULL) *ptmp2=0; + } + else ptmp=getenv("JOB_ID"); + if (ptmp!=NULL) sprintf(directory+strlen(directory),"id%s",ptmp); #endif - } - // make new directory and print info - if (ringid==ROOT) { - MkDirErr(directory,ONE_POS); - printf("all data is saved in '%s'\n",directory); - } - // make logname; do it for all processors to enable additional logging in LogError - if (ringid==ROOT) sprintf(logname,"%s/" F_LOG,directory); - else sprintf(logname,"%s/" F_LOG_ERR,directory,ringid); - // start logfile - if (ringid==ROOT) { - // open logfile - logfile=FOpenErr(logname,"w",ONE_POS); - // log version number - fprintf(logfile,"Generated by ADDA v." ADDA_VERSION "\n"); - // get computer name + } + /* make new directory and print info */ + if (ringid==ROOT) { + MkDirErr(directory,ONE_POS); + printf("all data is saved in '%s'\n",directory); + } + /* make logname; do it for all processors to enable additional logging in LogError */ + if (ringid==ROOT) sprintf(logname,"%s/" F_LOG,directory); + else sprintf(logname,"%s/" F_LOG_ERR,directory,ringid); + /* start logfile */ + if (ringid==ROOT) { + /* open logfille */ + logfile=FOpenErr(logname,"w",ONE_POS); + /* log version number */ + fprintf(logfile,"Generated by ADDA v." ADDA_VERSION "\n"); + /* get computer name */ #ifdef WINDOWS - GetComputerName(cname,&cname_size); - compname=cname; -#else // POSIX and others - compname=getenv("HOST"); + GetComputerName(cname,&cname_size); + compname=cname; +#else /* POSIX and others */ + compname=getenv("HOST"); #endif - // write number of processors and computer name + /* write number of processors and computer name */ #ifdef PARALLEL - // write number of processors - fprintf(logfile,"The program was run on: %d processors (cores)",nprocs); - // add PBS or SGE host name if present, otherwise use compname - if ((ptmp=getenv("PBS_O_HOST"))!=NULL || (ptmp=getenv("SGE_O_HOST"))!=NULL) - fprintf(logfile," from %s\n",ptmp); - else if (compname!=NULL) fprintf(logfile," from %s\n",compname); - else fprintf(logfile,"\n"); -#else // sequential - if (compname!=NULL) fprintf(logfile,"The program was run on: %s\n",compname); + /* write number of processors */ + fprintf(logfile,"The program was run on: %d processors (cores)",nprocs); + /* add PBS or SGE host name if present, otherwise use compname */ + if ((ptmp=getenv("PBS_O_HOST"))!=NULL || (ptmp=getenv("SGE_O_HOST"))!=NULL) + fprintf(logfile," from %s\n",ptmp); + else if (compname!=NULL) fprintf(logfile," from %s\n",compname); + else fprintf(logfile,"\n"); +#else /* sequential */ + if (compname!=NULL) fprintf(logfile,"The program was run on: %s\n",compname); #endif - // log command line - fprintf(logfile,"command: '"); - for(i=0;i<argc;i++) fprintf(logfile,"%s ",argv[i]); - fprintf(logfile,"'\n"); - } - Synchronize(); // needed to wait for creation of the output directory - LogPending(); + /* log command line */ + fprintf(logfile,"command: '"); + for(i=0;i<argc;i++) fprintf(logfile,"%s ",argv[i]); + fprintf(logfile,"'\n"); + } + Synchronize(); /* needed to wait for creation of the output directory */ + LogPending(); } -//============================================================ +/*============================================================*/ void PrintInfo(void) -// print info to stdout and logfile -{ - int i; - char sbuffer[MAX_LINE]; - - if (ringid==ROOT) { - // print basic parameters - printf("lambda: %.10g Dipoles/lambda: %g\n",lambda,dpl); - printf("Required relative residual norm: %g\n",eps); - printf("Total number of occupied dipoles: %.0f\n",nvoid_Ndip); - // log basic parameters - fprintf(logfile,"lambda: %.10g\n",lambda); - fprintf(logfile,"shape: "); - fprintf(logfile,sh_form_str,sizeX); - if (sh_granul) fprintf(logfile, - "\n domain %d is filled with %d granules of diameter %g\n" - " volume fraction: specified - %g, actual - %g",gr_mat+1,gr_N,gr_d,gr_vf,gr_vf_real); - fprintf(logfile,"\nbox dimensions: %ix%ix%i\n",boxX,boxY,boxZ); - if (anisotropy) { - fprintf(logfile,"refractive index (diagonal elements of the tensor):\n"); - if (Nmat==1) fprintf(logfile," (%.10g%+.10gi,%.10g%+.10gi,%.10g%+.10gi)\n", - ref_index[0][RE],ref_index[0][IM],ref_index[1][RE],ref_index[1][IM], - ref_index[2][RE],ref_index[2][IM]); - else { - for (i=0;i<Nmat;i++) { - if (i<Nmat_given) fprintf(logfile, - " %d. (%.10g%+.10gi,%.10g%+.10gi,%.10g%+.10gi)\n", - i+1,ref_index[3*i][RE],ref_index[3*i][IM],ref_index[3*i+1][RE], - ref_index[3*i+1][IM],ref_index[3*i+2][RE],ref_index[3*i+2][IM]); - else fprintf(logfile," %d. not specified\n",i+1); - } - } - } - else { - fprintf(logfile,"refractive index: "); - if (Nmat==1) fprintf(logfile,"%.10g%+.10gi\n",ref_index[0][RE],ref_index[0][IM]); - else { - fprintf(logfile,"1. %.10g%+.10gi\n",ref_index[0][RE],ref_index[0][IM]); - for (i=1;i<Nmat;i++) { - if (i<Nmat_given) fprintf(logfile," %d. %.10g%+.10gi\n", - i+1,ref_index[i][RE],ref_index[i][IM]); - else fprintf(logfile," %d. not specified\n",i+1); - } - } - } - fprintf(logfile,"Dipoles/lambda: %g\n",dpl); - if (volcor_used) fprintf(logfile,"\t(Volume correction used)\n"); - fprintf(logfile,"Required relative residual norm: %g\n",eps); - fprintf(logfile,"Total number of occupied dipoles: %.0f\n",nvoid_Ndip); - if (Nmat>1) { - fprintf(logfile," per domain: 1. %.0f\n",mat_count[0]); - for (i=1;i<Nmat;i++) fprintf(logfile," %d. %.0f\n",i+1,mat_count[i]); - } - fprintf(logfile,"Volume-equivalent size parameter: %.10g\n",ka_eq); - // log incident beam and polarization polarization - fprintf(logfile,"\n---In laboratory reference frame:---\nIncident beam: %s\n",beam_descr); - fprintf(logfile,"Incident propagation vector: (%g,%g,%g)\n",prop_0[0],prop_0[1],prop_0[2]); - fprintf(logfile,"Incident polarization Y(par): (%g,%g,%g)\n", - incPolY_0[0],incPolY_0[1],incPolY_0[2]); - fprintf(logfile,"Incident polarization X(per): (%g,%g,%g)\n\n", - incPolX_0[0],incPolX_0[1],incPolX_0[2]); - // log particle orientation - if (orient_avg) fprintf(logfile,"Particle orientation - averaged\n%s\n",avg_string); - else { - // log incident polarization after transformation - if (alph_deg!=0 || bet_deg!=0 || gam_deg!=0) { - fprintf(logfile,"Particle orientation (deg): alpha=%g, beta=%g, gamma=%g\n\n" - "---In particle reference frame:---\n",alph_deg,bet_deg,gam_deg); - if (beam_asym) fprintf(logfile,"Incident Beam center position: (%g,%g,%g)\n", - beam_center[0],beam_center[1],beam_center[2]); - fprintf(logfile,"Incident propagation vector: (%g,%g,%g)\n", - prop[0],prop[1],prop[2]); - fprintf(logfile,"Incident polarization Y(par): (%g,%g,%g)\n", - incPolY[0],incPolY[1],incPolY[2]); - fprintf(logfile,"Incident polarization X(per): (%g,%g,%g)\n\n", - incPolX[0],incPolX[1],incPolX[2]); - } - else fprintf(logfile,"Particle orientation: default\n\n"); - } - // log Polarization relation - fprintf(logfile,"Polarization relation: "); - if (PolRelation==POL_CM) fprintf(logfile,"'Clausius-Mossotti'\n"); - else if (PolRelation==POL_RR) fprintf(logfile,"'Radiative Reaction Correction'\n"); - else if (PolRelation==POL_LDR) { - fprintf(logfile,"'Lattice Dispersion Relation'"); - if (avg_inc_pol) fprintf(logfile," (averaged over incident polarization)"); - fprintf(logfile,"\n"); - } - else if (PolRelation==POL_CLDR) - fprintf(logfile,"'Corrected Lattice Dispersion Relation'\n"); - else if (PolRelation==POL_FCD) fprintf(logfile,"'Filtered Coupled Dipoles'\n"); - else if (PolRelation==POL_SO) fprintf(logfile,"'Second Order'\n"); - // log Scattering Quantities formulae - fprintf(logfile,"Scattering quantities formulae: "); - if (ScatRelation==SQ_DRAINE) fprintf(logfile,"'by Draine'\n"); - else if (ScatRelation==SQ_SO) fprintf(logfile,"'Second Order'\n"); - // log Interaction term prescription - fprintf(logfile,"Interaction term prescription: "); - if (IntRelation==G_POINT_DIP) fprintf(logfile,"'as Point dipoles'\n"); - else if (IntRelation==G_FCD) fprintf(logfile,"'Filtered Green's tensor'\n"); - else if (IntRelation==G_FCD_ST) - fprintf(logfile,"'Filtered Green's tensor (quasistatic)'\n"); - else if (IntRelation==G_SO) fprintf(logfile,"'Second Order'\n"); - // log FFT method - fprintf(logfile,"FFT algorithm: "); + /* print info to stdout and logfile */ +{ + int i; + char sbuffer[MAX_LINE]; + + if (ringid==ROOT) { + /* print basic parameters */ + printf("lambda: %.10g Dipoles/lambda: %g\n",lambda,dpl); + printf("Required relative residual norm: %g\n",eps); + printf("Total number of occupied dipoles: %.0f\n",nvoid_Ndip); + /* log basic parameters */ + fprintf(logfile,"lambda: %.10g\n",lambda); + fprintf(logfile,"shape: "); + fprintf(logfile,sh_form_str,sizeX); + if (sh_granul) fprintf(logfile, + "\n domain %d is filled with %d granules of diameter %g\n"\ + " volume fraction: specified - %g, actual - %g",gr_mat+1,gr_N,gr_d,gr_vf,gr_vf_real); + fprintf(logfile,"\nbox dimensions: %ix%ix%i\n",boxX,boxY,boxZ); + if (anisotropy) { + fprintf(logfile,"refractive index (diagonal elements of the tensor):\n"); + if (Nmat==1) fprintf(logfile," (%.10g%+.10gi,%.10g%+.10gi,%.10g%+.10gi)\n", + ref_index[0][RE],ref_index[0][IM],ref_index[1][RE],ref_index[1][IM], + ref_index[2][RE],ref_index[2][IM]); + else { + for (i=0;i<Nmat;i++) { + if (i<Nmat_given) fprintf(logfile," %d. (%.10g%+.10gi,%.10g%+.10gi,%.10g%+.10gi)\n", + i+1,ref_index[3*i][RE],ref_index[3*i][IM],ref_index[3*i+1][RE], + ref_index[3*i+1][IM],ref_index[3*i+2][RE],ref_index[3*i+2][IM]); + else fprintf(logfile," %d. not specified\n",i+1); + } + } + } + else { + fprintf(logfile,"refractive index: "); + if (Nmat==1) fprintf(logfile,"%.10g%+.10gi\n",ref_index[0][RE],ref_index[0][IM]); + else { + fprintf(logfile,"1. %.10g%+.10gi\n",ref_index[0][RE],ref_index[0][IM]); + for (i=1;i<Nmat;i++) { + if (i<Nmat_given) fprintf(logfile, + " %d. %.10g%+.10gi\n",i+1,ref_index[i][RE],ref_index[i][IM]); + else fprintf(logfile," %d. not specified\n",i+1); + } + } + } + fprintf(logfile,"Dipoles/lambda: %g\n",dpl); + if (volcor_used) fprintf(logfile,"\t(Volume correction used)\n"); + fprintf(logfile,"Required relative residual norm: %g\n",eps); + fprintf(logfile,"Total number of occupied dipoles: %.0f\n",nvoid_Ndip); + if (Nmat>1) { + fprintf(logfile," per domain: 1. %.0f\n",mat_count[0]); + for (i=1;i<Nmat;i++) fprintf(logfile," %d. %.0f\n",i+1,mat_count[i]); + } + fprintf(logfile, + "Volume-equivalent size parameter: %.10g\n",ka_eq); + /* log incident beam and polarization polarization */ + fprintf(logfile,"\n---In laboratory reference frame:---\nIncident beam: %s\n",beam_descr); + fprintf(logfile,"Incident propagation vector: (%g,%g,%g)\n", + prop_0[0],prop_0[1],prop_0[2]); + fprintf(logfile,"Incident polarization Y(par): (%g,%g,%g)\n", + incPolY_0[0],incPolY_0[1],incPolY_0[2]); + fprintf(logfile,"Incident polarization X(per): (%g,%g,%g)\n\n", + incPolX_0[0],incPolX_0[1],incPolX_0[2]); + /* log particle orientation */ + if (orient_avg) fprintf(logfile,"Particle orientation - averaged\n%s\n",avg_string); + else { + /* log incident polarization after transformation */ + if (alph_deg!=0 || bet_deg!=0 || gam_deg!=0) { + fprintf(logfile,"Particle orientation (deg): alpha=%g, beta=%g, gamma=%g\n\n"\ + "---In particle reference frame:---\n",alph_deg,bet_deg,gam_deg); + if (beam_asym) fprintf(logfile,"Incident Beam center position: (%g,%g,%g)\n", + beam_center[0],beam_center[1],beam_center[2]); + fprintf(logfile,"Incident propagation vector: (%g,%g,%g)\n", + prop[0],prop[1],prop[2]); + fprintf(logfile,"Incident polarization Y(par): (%g,%g,%g)\n", + incPolY[0],incPolY[1],incPolY[2]); + fprintf(logfile,"Incident polarization X(per): (%g,%g,%g)\n\n", + incPolX[0],incPolX[1],incPolX[2]); + } + else fprintf(logfile,"Particle orientation: default\n\n"); + } + /* log Polarization relation */ + if (PolRelation==POL_CM) + fprintf(logfile,"Polarization relation: 'Clausius-Mossotti'\n"); + else if (PolRelation==POL_RR) + fprintf(logfile,"Polarization relation: 'Radiative Reaction Correction'\n"); + else if (PolRelation==POL_LDR) { + fprintf(logfile,"Polarization relation: 'Lattice Dispersion Relation'"); + if (avg_inc_pol) fprintf(logfile," (averaged over incident polarization)"); + fprintf(logfile,"\n"); + } + else if (PolRelation==POL_CLDR) + fprintf(logfile,"Polarization relation: 'Corrected Lattice Dispersion Relation'\n"); + else if (PolRelation==POL_FCD) + fprintf(logfile,"Polarization relation: 'Filtered Coupled Dipoles'\n"); + else if (PolRelation==POL_SO) + fprintf(logfile,"Polarization relation: 'Second Order'\n"); + /* log Scattering Quantities formulae */ + if (ScatRelation==SQ_DRAINE) + fprintf(logfile,"Scattering quantities formulae: 'by Draine'\n"); + else if (ScatRelation==SQ_SO) + fprintf(logfile,"Scattering quantities formulae: 'Second Order'\n"); + /* log Interaction term prescription */ + if (IntRelation==G_POINT_DIP) + fprintf(logfile,"Interaction term prescription: 'as Point dipoles'\n"); + else if (IntRelation==G_FCD) + fprintf(logfile,"Interaction term prescription: 'Filtered Green's tensor'\n"); + else if (IntRelation==G_FCD_ST) + fprintf(logfile,"Interaction term prescription: 'Filtered Green's tensor (quasistatic)'\n"); + else if (IntRelation==G_SO) + fprintf(logfile,"Interaction term prescription: 'Second Order'\n"); + /* log FFT method */ #ifdef FFTW3 - fprintf(logfile,"FFTW3\n"); + fprintf(logfile,"FFT algorithm: FFTW3\n"); #elif defined(FFT_TEMPERTON) - fprintf(logfile,"by C.Temperton\n"); + fprintf(logfile,"FFT algorithm: by C.Temperton\n"); #endif - // log Iterative Method - fprintf(logfile,"Iterative Method: "); - if (IterMethod==IT_CGNR) fprintf(logfile,"CGNR\n"); - else if (IterMethod==IT_BICGSTAB) fprintf(logfile,"Bi-CG Stabilized\n"); - else if (IterMethod==IT_BICG_CS) fprintf(logfile,"Bi-CG (complex symmetric)\n"); - else if (IterMethod==IT_QMR_CS) fprintf(logfile,"QMR (complex symmetric)\n"); - // log Symmetry options; do not print anything in case of SYM_AUTO - if (sym_type==SYM_NO) fprintf(logfile,"No symmetries are used\n"); - else if (sym_type==SYM_ENF) fprintf(logfile,"Symmetry is enforced by user (warning!)\n"); - // log optimization method - if (save_memory) fprintf(logfile,"Optimization is done for minimum memory usage\n"); - else fprintf(logfile,"Optimization is done for maximum speed\n"); - // log Checkpoint options - if (load_chpoint) fprintf(logfile,"Simulation is continued from a checkpoint\n"); - if (chp_type!=CHP_NONE) { - fprintf(logfile,"Checkpoint is turned on:\n"); - if (chp_type==CHP_NORMAL) fprintf(logfile," type = normal\n"); - else if (chp_type==CHP_REGULAR) fprintf(logfile," type = regular\n"); - else if (chp_type==CHP_ALWAYS) fprintf(logfile," type = always\n"); - if (chp_time==UNDEF) fprintf(logfile," time = no limit\n"); - else { - PrintTime(sbuffer,&chp_time); - /* chp_time is converted to long to avoid problems with definition of time_t - * (can be either int or long) - */ - fprintf(logfile," time = %s(%ld sec)\n",sbuffer,(long)chp_time); - } - } - if (load_chpoint || chp_type!=CHP_NONE) - fprintf(logfile," directory = '%s'\n",chp_dir); - } + /* log Iterative Method */ + if (IterMethod==IT_CGNR) + fprintf(logfile,"Iterative Method: CGNR\n"); + else if (IterMethod==IT_BICGSTAB) + fprintf(logfile,"Iterative Method: Bi-CG Stabilized\n"); + else if (IterMethod==IT_BICG_CS) + fprintf(logfile,"Iterative Method: Bi-CG (complex symmetric)\n"); + else if (IterMethod==IT_QMR_CS) + fprintf(logfile,"Iterative Method: QMR (complex symmetric)\n"); + /* log Symmetry options; do not print anything in case of SYM_AUTO */ + if (sym_type==SYM_NO) fprintf(logfile,"No symmetries are used\n"); + else if (sym_type==SYM_ENF) fprintf(logfile,"Symmetry is enforced by user (warning!)\n"); + /* log optimization method */ + if (save_memory) fprintf(logfile,"Optimization is done for minimum memory usage\n"); + else fprintf(logfile,"Optimization is done for maximum speed\n"); + /* log Checkpoint options */ + if (load_chpoint) fprintf(logfile,"Simulation is continued from a checkpoint\n"); + if (chp_type!=CHP_NONE) { + fprintf(logfile,"Checkpoint is turned on:\n"); + if (chp_type==CHP_NORMAL) fprintf(logfile," type = normal\n"); + else if (chp_type==CHP_REGULAR) fprintf(logfile," type = regular\n"); + else if (chp_type==CHP_ALWAYS) fprintf(logfile," type = always\n"); + if (chp_time==UNDEF) fprintf(logfile," time = no limit\n"); + else { + PrintTime(sbuffer,&chp_time); + /* chp_time is converted to long to avoid problems with definition of time_t + (can be either int or long) */ + fprintf(logfile," time = %s(%ld sec)\n",sbuffer,(long)chp_time); + } + } + if (load_chpoint || chp_type!=CHP_NONE) + fprintf(logfile," directory = '%s'\n",chp_dir); + } } diff --git a/src/param.h b/src/param.h index 2c155f50..620c9184 100644 --- a/src/param.h +++ b/src/param.h @@ -2,93 +2,93 @@ * AUTH: Maxim Yurkin * DESCR: INLINE routines for testing of input parameters * - * Copyright (C) 2006,2008 University of Amsterdam + * Copyright (C) 2006 University of Amsterdam * This code is covered by the GNU General Public License. */ #ifndef __param_h #define __param_h -#include "function.h" // needed for INLINE and function attributes +#include "function.h" /* needed for INLINE and function attributes */ typedef struct { - int l1; // first level index - int l2; // second level index + int l1; /* first level index */ + int l2; /* second level index */ } opt_index; -extern opt_index opt; // defined in param.c +extern opt_index opt; /* defined in param.c */ void PrintErrorHelp(const char *fmt, ... ) ATT_PRINTF(1,2) ATT_NORETURN; -//============================================================ +/*============================================================*/ INLINE void TestPositive(const double val,const char *name) -// check if val is positive, otherwise produces error message + /* check if val is positive, otherwise produces error message */ { - if (val<=0) PrintErrorHelp("Illegal %s (%g), must be positive",name,val); + if (val<=0) PrintErrorHelp("Illegal %s (%g), must be positive",name,val); } -//============================================================ +/*============================================================*/ INLINE void TestNonNegative(const double val,const char *name) -// check if val is positive, otherwise produces error message + /* check if val is positive, otherwise produces error message */ { - if (val<0) PrintErrorHelp("Illegal %s (%g), must be nonnegative",name,val); + if (val<0) PrintErrorHelp("Illegal %s (%g), must be nonnegative",name,val); } -//============================================================ +/*============================================================*/ INLINE void TestPositive_i(const int val,const char *name) -// check if val (int) is positive, otherwise produces error message + /* check if val (int) is positive, otherwise produces error message */ { - if (val<=0) PrintErrorHelp("Illegal %s (%d), must be positive",name,val); + if (val<=0) PrintErrorHelp("Illegal %s (%d), must be positive",name,val); } -//============================================================ -/* In following 4 functions, one of two letters means either Including or Not-including (left and - * right point of the interval respectively) - */ +/*============================================================*/ +/* In following 4 functions, one of two letters means either Including or Notincluding (left and + right point of the interval respectively) */ INLINE void TestRangeII(const double val,const char *name,const double min,const double max) -// check if val is in interval [min,max], otherwise produces error message + /* check if val is in interval [min,max], otherwise produces error message */ { - if (val<min || val>max) - PrintErrorHelp("Illegal %s (%g), must belong to the interval [%g,%g]",name,val,min,max); + if (val<min || val>max) PrintErrorHelp("Illegal %s (%g), must belong to the interval [%g,%g]", + name,val,min,max); } -//============================================================ +/*============================================================*/ INLINE void TestRangeNI(const double val,const char *name,const double min,const double max) -// checks if val is in interval (min,max], otherwise produces error message + /* check if val is in interval (min,max], otherwise produces error message */ { - if (val<=min || val>max) - PrintErrorHelp("Illegal %s (%g), must belong to the interval (%g,%g]",name,val,min,max); + if (val<=min || val>max) PrintErrorHelp("Illegal %s (%g), must belong to the interval (%g,%g]", + name,val,min,max); } -//============================================================ +/*============================================================*/ INLINE void TestRangeIN(const double val,const char *name,const double min,const double max) -// checks if val is in interval [min,max), otherwise produces error message + /* check if val is in interval [min,max), otherwise produces error message */ { - if (val<min || val>=max) - PrintErrorHelp("Illegal %s (%g), must belong to the interval [%g,%g)",name,val,min,max); + if (val<min || val>=max) PrintErrorHelp("Illegal %s (%g), must belong to the interval [%g,%g)", + name,val,min,max); } -//============================================================ +/*============================================================*/ INLINE void TestRangeNN(const double val,const char *name,const double min,const double max) -// checks if val is in interval (min,max), otherwise produces error message + /* check if val is in interval (min,max), otherwise produces error message */ { - if (val<=min || val>=max) - PrintErrorHelp("Illegal %s (%g), must belong to the interval (%g,%g)",name,val,min,max); + if (val<=min || val>=max) PrintErrorHelp("Illegal %s (%g), must belong to the interval (%g,%g)", + name,val,min,max); } -//============================================================ +/*============================================================*/ INLINE void TestRange_i(const int val,const char *name,const int min,const int max) -// checks if val (int) is in interval [min,max], otherwise produces error message + /* check if val (int) is in interval [min,max], otherwise produces error message */ { - if (val<min || val>max) - PrintErrorHelp("Illegal %s (%d), must belong to the interval [%d,%d]",name,val,min,max); + if (val<min || val>max) PrintErrorHelp("Illegal %s (%d), must belong to the interval [%d,%d]", + name,val,min,max); } -#endif // __param_h +#endif /* __param_h */ + diff --git a/src/parbas.h b/src/parbas.h index a60ef3ad..7addddea 100644 --- a/src/parbas.h +++ b/src/parbas.h @@ -2,23 +2,23 @@ * AUTH: Maxim Yurkin * DESCR: Parallel basics. Include necessary headers and checks version of the standard. * - * Copyright (C) 2007,2008 University of Amsterdam + * Copyright (C) 2007 University of Amsterdam * This code is covered by the GNU General Public License. */ #ifndef __parbas_h #define __parbas_h #ifdef MPI -# include <mpi.h> -// define required version of MPI -# define MPI_VER_REQ 1 -# define MPI_SUBVER_REQ 2 -// check MPI version for conformity during compilation -# if !defined(MPI_VERSION) || !defined(MPI_SUBVERSION) -# error *** Can not determine MPI version, hence MPI is too old. *** -# elif (MPI_VERSION<MPI_VER_REQ) || ((MPI_VERSION==MPI_VER_REQ) && (MPI_SUBVERSION<MPI_SUBVER_REQ)) -# error *** MPI version is too old. *** -# endif +# include <mpi.h> +/* define required version of MPI */ +# define MPI_VER_REQ 1 +# define MPI_SUBVER_REQ 2 +/* check MPI version for conformity during compilation */ +# if !defined(MPI_VERSION) || !defined(MPI_SUBVERSION) +# error *** Can not determine MPI version, hence MPI is too old. *** +# elif (MPI_VERSION<MPI_VER_REQ) || ((MPI_VERSION==MPI_VER_REQ) && (MPI_SUBVERSION<MPI_SUBVER_REQ)) +# error *** MPI version is too old. *** +# endif #endif -#endif // __parbas_h +#endif /*__parbas_h*/ diff --git a/src/prec_time.c b/src/prec_time.c index c5479c04..6d3abf18 100644 --- a/src/prec_time.c +++ b/src/prec_time.c @@ -1,21 +1,21 @@ /* FILE: prec_time.c * AUTH: Maxim Yurkin * DESCR: precision timing routines (OS dependent) - * definitions (including inline ones) are in prec_time.h + * definitions and inlines are in prec_time.h * - * Copyright (C) 2006,2008 University of Amsterdam + * Copyright (C) 2006 University of Amsterdam * This code is covered by the GNU General Public License. */ -#include "prec_time.h" // here all other needed includes are added +#include "prec_time.h" /* here all other needed includes are added */ -// this is to eliminate warnings about empty source file -#include "function.h" // for function attributes +/* this is to eliminate warnings about empty source file */ +#include "function.h" /* for function attributes */ void void_function(void) ATT_UNUSED; void void_function(void) {} -#ifdef PRECISE_TIMING // following is only for precise timing +#ifdef PRECISE_TIMING /* following is only for precise timing */ -// LOCAL VARIABLES +/* LOCAL VARIABLES */ #ifdef WINDOWS static double inv_freq; @@ -23,52 +23,54 @@ static double inv_freq; # define MICRO 1E-6 #endif -//============================================================ +/*============================================================*/ void InitTime(SYSTEM_TIME *t) -// set time to zero + /* set time to zero */ { #ifdef WINDOWS - t->QuadPart=0; + t->QuadPart=0; #elif defined(POSIX) - t->tv_sec=t->tv_usec=0; + t->tv_sec=t->tv_usec=0; #endif } -//============================================================ +/*============================================================*/ void SetTimerFreq(void) -// set frequency of windows timer; should be called once before running TimerToSec or DiffSec + /* set frequency of windows timer; + should be called once before running TimerToSec or DiffSec*/ { #ifdef WINDOWS - LARGE_INTEGER freq; + LARGE_INTEGER freq; - QueryPerformanceFrequency(&freq); - inv_freq=1/(double)freq.QuadPart; + QueryPerformanceFrequency(&freq); + inv_freq=1/(double)freq.QuadPart; #endif } -//============================================================ +/*============================================================*/ double TimerToSec(const SYSTEM_TIME *t) -// timer to seconds + /* timer to seconds */ { #ifdef WINDOWS - return (inv_freq*t->QuadPart); + return (inv_freq*t->QuadPart); #elif defined(POSIX) - return (t->tv_sec+MICRO*t->tv_usec); + return (t->tv_sec+MICRO*t->tv_usec); #endif } -//============================================================ +/*============================================================*/ double DiffSec(const SYSTEM_TIME *t1,const SYSTEM_TIME *t2) -// difference between two times in seconds + /* difference between two times in seconds */ { - SYSTEM_TIME res; + SYSTEM_TIME res; - elapsed(t1,t2,&res); - return TimerToSec(&res); + elapsed(t1,t2,&res); + return TimerToSec(&res); } -#endif // PRECISE_TIMING +#endif /* PRECISE_TIMING */ + diff --git a/src/prec_time.h b/src/prec_time.h index 5c5e0425..030405e2 100644 --- a/src/prec_time.h +++ b/src/prec_time.h @@ -1,35 +1,35 @@ /* FILE: prec_time.h * AUTH: Maxim Yurkin - * DESCR: definitions of inline functions for precise timing + * DESCR: definitions of inline functions for + * precise timing * - * Copyright (C) 2006-2008 University of Amsterdam + * Copyright (C) University of Amsterdam * This code is covered by the GNU General Public License. */ #ifndef __prec_time_h #define __prec_time_h -/* Precise timing gives an accuracy of order micro_sec. It gives extensive information on timing of - * FFT initialization, D-matrix initialization, and Matrix Vector multiplication. It is optimized - * to consume as little time as possible by itself. It is used mostly for locating and optimizing - * the bottlenecks of the code execution. It is not ANSI C, therefore is system dependent, though - * is expected to work for most. - */ +/* Precise timing gives an accuracy of order micro_sec. It gives extensive information + on timing of FFT init, Dmatrix init, and Matrix Vector multiplication. + It is optimized not to take by itself as little time as possible. + It is used mostly for locating and optimizing the bottlenecks of the code execution. + It is not ANSI C, therefore is system dependent, though is expected to work for most. */ -//#define PRECISE_TIMING // uncomment to perform precise timing +/*#define PRECISE_TIMING /* uncomment to conduct precise timing */ #ifdef PRECISE_TIMING -#include "os.h" // for OS definitions -#include "function.h" // for INLINE and function attributes +#include "os.h" /* for OS definitions */ +#include "function.h" /* for INLINE and function attributes */ #ifdef WINDOWS -# define SYSTEM_TIME LARGE_INTEGER +# define SYSTEM_TIME LARGE_INTEGER #elif defined(POSIX) -# include <sys/time.h> // for timeval and gettimeofday -# include <stdio.h> // needed for definition of NULL -# define SYSTEM_TIME struct timeval +# include <sys/time.h> /* for timeval and gettimeofday */ +# include <stdio.h> /* needed for definition of NULL */ +# define SYSTEM_TIME struct timeval #else -# error *** Unknown operation system. Precise timing is not supported. *** +# error *** Unknown operation system. Precise timing is not supported. *** #endif void InitTime(SYSTEM_TIME *t); @@ -37,45 +37,46 @@ void SetTimerFreq(void); double TimerToSec(const SYSTEM_TIME *t) ATT_PURE; double DiffSec(const SYSTEM_TIME *t1,const SYSTEM_TIME *t2) ATT_PURE; -//============================================================ +/*============================================================*/ INLINE void elapsed(const SYSTEM_TIME *t1,const SYSTEM_TIME *t2,SYSTEM_TIME *res) -// compute time difference + /* compute time difference */ { #ifdef WINDOWS - res->QuadPart=t2->QuadPart-t1->QuadPart; + res->QuadPart=t2->QuadPart-t1->QuadPart; #elif defined(POSIX) - res->tv_sec=t2->tv_sec-t1->tv_sec; - res->tv_usec=t2->tv_usec-t1->tv_usec; + res->tv_sec=t2->tv_sec-t1->tv_sec; + res->tv_usec=t2->tv_usec-t1->tv_usec; #endif } -//============================================================ +/*============================================================*/ INLINE void ElapsedInc(const SYSTEM_TIME *t1,const SYSTEM_TIME *t2,SYSTEM_TIME *res) -// compute time difference, increment result by this value + /* compute time difference, increment result by this value */ { #ifdef WINDOWS - res->QuadPart+=(t2->QuadPart-t1->QuadPart); + res->QuadPart+=(t2->QuadPart-t1->QuadPart); #elif defined(POSIX) - res->tv_sec+=(t2->tv_sec-t1->tv_sec); - res->tv_usec+=(t2->tv_usec-t1->tv_usec); + res->tv_sec+=(t2->tv_sec-t1->tv_sec); + res->tv_usec+=(t2->tv_usec-t1->tv_usec); #endif } -//============================================================ +/*============================================================*/ INLINE void GetTime(SYSTEM_TIME *t) -// get current time + /* compute time difference, increment result by this value */ { #ifdef WINDOWS - QueryPerformanceCounter(t); + QueryPerformanceCounter(t); #elif defined(POSIX) - // gettimeofday is described only in POSIX 1003.1-2001, but it should work for many other systems - gettimeofday(t,NULL); +/* gettimeofday is described only in POSIX 1003.1-2001, but it should work for + many other systems */ + gettimeofday(t,NULL); #endif } -#endif // PRECISE_TIMING +#endif /* PRECISE_TIMING */ -#endif // __prec_time_h +#endif /*__prec_time_h*/ diff --git a/src/sinint.c b/src/sinint.c index dc488dfd..d3c43411 100644 --- a/src/sinint.c +++ b/src/sinint.c @@ -1,107 +1,106 @@ -/* FILE: sinint.c - * AUTH: Maxim Yurkin - * DESCR: Function for calculating sine and cosine integrals. It was originaly based on - * routine given in "Numerical Recipes in C" 2nd ed. and then was slightly corrected - * according to the 3rd ed. of the same book. - * - * Copyright (C) 2007-2008 University of Amsterdam - * This code is covered by the GNU General Public License. - */ -#include <math.h> -#include <float.h> // for DBL_MIN and DBL_MAX -#include "const.h" -#include "cmplx.h" -#include "io.h" - -#define EPS DBL_EPSILON // Relative error, or absolute error near a zero of Ci(x) -#define MAXIT 100 // Maximum number of iterations allowed -#define TMIN 2.0 // Dividing line between using the series and the continued fraction -#define BIG DBL_MAX*EPS // A number near machine overflow limit -#define FPMIN DBL_MIN*4 // Number close to the smallest representable number - -//============================================================ - -void cisi(const double x,double *ci,double *si) -/* Computes the cosine and sine integrals Ci(x) and Si(x). Ci(0) is returned as a large negative - * number and no error message is generated. For x<0 routine returns -Si(-x) [correct] and Ci(-x), - * while actually Ci(x)=Ci(-x)-i*pi. - */ - -{ - int i,k,odd; - double a,err,fact,sign,sum,sumc,sums,t,term; - doublecomplex h,b,c,d,del,tmp; - - t=fabs(x); - // special case - if (x==0) { - *si=0; - *ci=-BIG; - return; - } - // Evaluate continued fraction by modified Lentz's method - if (t>TMIN) { - b[RE]=1; - b[IM]=t; - c[RE]=BIG; - c[IM]=0; - cInv(b,d); - cEqual(d,h); - for (i=1;i<MAXIT;i++) { - a=-i*i; - b[RE]+=2; - // d=1/(a*d+b) - cMultReal(a,d,d); - cAdd(b,d,d); - cInv(d,d); - // c=b+a/c; for i=1 c=+inf, so careful division should be performed to avoid overflows - cInv(c,c); - cMultReal(a,c,c); - cAdd(b,c,c); - // del=c*d, h*=del - cMult(c,d,del); - cMultSelf(h,del); - if (fabs(del[RE]-1)+fabs(del[IM])<=EPS) break; - } - if (i>=MAXIT) LogError(EC_ERROR,ALL_POS, - "Failed to converge during calculation of sine integral of %g",x); - imExp(-t,tmp); - cMultSelf(h,tmp); - *ci=-h[RE]; - *si=PI_OVER_TWO+h[IM]; - } - else { // Evaluate both series simultaneously - // Special case: avoid failure of convergence test because of underflow - if (t<sqrt(FPMIN)) { - sumc=0; - sums=t; - } - else { - sum=sums=sumc=0; - sign=fact=1; - odd=TRUE; - for (k=1;k<=MAXIT;k++) { - fact*=t/k; - term=fact/k; - sum+=sign*term; - err=term/fabs(sum); - if (odd) { - sign=-sign; - sums=sum; - sum=sumc; - } - else { - sumc=sum; - sum=sums; - } - if (err<EPS) break; - odd=!odd; - } - if (k>MAXIT) LogError(EC_ERROR,ALL_POS, - "Failed to converge during calculation of sine integral of %g",x); - } - *si=sums; - *ci=sumc+log(t)+EULER; - } - if (x<0) *si=-(*si); -} +/* FILE: sinint.c + * AUTH: Maxim Yurkin + * DESCR: Function for calculating sine and cosine integrals. It was originaly based on + * routine given in "Numerical Recipes in C" 2nd ed. and then was slightly corrected + * according to the 3rd ed. of the same book. + * + * Copyright (C) 2007-2008 University of Amsterdam + * This code is covered by the GNU General Public License. + */ +#include <math.h> +#include <float.h> /* for DBL_MIN and DBL_MAX */ +#include "const.h" +#include "cmplx.h" +#include "io.h" + +#define EPS DBL_EPSILON /* Relative error, or absolute error near a zero of Ci(x) */ +#define MAXIT 100 /* Maximum number of iterations allowed */ +#define TMIN 2.0 /* Dividing line between using the series and the continued fraction */ +#define BIG DBL_MAX*EPS /* A number near machine overflow limit */ +#define FPMIN DBL_MIN*4 /* Number close smallest representative number */ + +/*============================================================*/ + +void cisi(const double x,double *ci,double *si) + /* Computes the cosine and sine integrals Ci(x) and Si(x). Ci(0) is returned + as a large negative number and no error message is generated. For x<0 + routine returns -Si(-x) [correct] and Ci(-x), while actually Ci(x)=Ci(-x)-i*pi. */ + +{ + int i,k,odd; + double a,err,fact,sign,sum,sumc,sums,t,term; + doublecomplex h,b,c,d,del,tmp; + + t=fabs(x); + /* special case */ + if (x==0) { + *si=0; + *ci=-BIG; + return; + } + /* Evaluate continued fraction by modified Lentz's method */ + if (t>TMIN) { + b[RE]=1; + b[IM]=t; + c[RE]=BIG; + c[IM]=0; + cInv(b,d); + cEqual(d,h); + for (i=1;i<MAXIT;i++) { + a=-i*i; + b[RE]+=2; + /* d=1/(a*d+b) */ + cMultReal(a,d,d); + cAdd(b,d,d); + cInv(d,d); + /* c=b+a/c; for i=1 c=+inf, so careful division should be performed to avoid overflows */ + cInv(c,c); + cMultReal(a,c,c); + cAdd(b,c,c); + /* del=c*d, h*=del */ + cMult(c,d,del); + cMultSelf(h,del); + if (fabs(del[RE]-1)+fabs(del[IM])<=EPS) break; + } + if (i>=MAXIT) LogError(EC_ERROR,ALL_POS, + "Failed to converge during calculation of sine integral of %g",x); + imExp(-t,tmp); + cMultSelf(h,tmp); + *ci=-h[RE]; + *si=PI_OVER_TWO+h[IM]; + } + else { /* Evaluate both series simultaneously */ + /* Special case: avoid failure of convergence test because of underflow */ + if (t<sqrt(FPMIN)) { + sumc=0; + sums=t; + } + else { + sum=sums=sumc=0; + sign=fact=1; + odd=TRUE; + for (k=1;k<=MAXIT;k++) { + fact*=t/k; + term=fact/k; + sum+=sign*term; + err=term/fabs(sum); + if (odd) { + sign=-sign; + sums=sum; + sum=sumc; + } + else { + sumc=sum; + sum=sums; + } + if (err<EPS) break; + odd=!odd; + } + if (k>MAXIT) LogError(EC_ERROR,ALL_POS, + "Failed to converge during calculation of sine integral of %g",x); + } + *si=sums; + *ci=sumc+log(t)+EULER; + } + if (x<0) *si=-(*si); +} diff --git a/src/timing.c b/src/timing.c index 6684232a..21892dbb 100644 --- a/src/timing.c +++ b/src/timing.c @@ -2,7 +2,7 @@ * AUTH: Maxim Yurkin * DESCR: Basic timing and statistics routines * - * Copyright (C) 2006,2008 University of Amsterdam + * Copyright (C) 2006 University of Amsterdam * This code is covered by the GNU General Public License. */ #include <stdio.h> @@ -14,176 +14,177 @@ #include "timing.h" #ifdef MPI -# define TO_SEC(p) (p) +#define TO_SEC(p) (p) #else -# define TO_SEC(p) ((p) / (double) CLOCKS_PER_SEC) +#define TO_SEC(p) ((p) / (double) CLOCKS_PER_SEC) #endif -// SEMI-GLOBAL VARIABLES +/* SEMI-GLOBAL VARIABLES */ -// used in CalculateE.c -TIME_TYPE Timing_EFieldPlane,Timing_comm_EField, // for Eplane calculation: total and comm - Timing_IntField,Timing_IntFieldOne, // for internal fields: total & one calculation - Timing_ScatQuan; // for integral scattering quantities -unsigned long TotalEFieldPlane; // total number of planes for scattered field calculations -// used in calculator.c -TIME_TYPE Timing_Init; // for total initialization of the program (before CalculateE) -unsigned long TotalEval; // total number of orientation evaluations -// used in comm.c -TIME_TYPE Timing_Dm_Init_comm; // communication time for initialization of D-matrix -// used in crosssec.c -// total time for all_dir and scat_grid calculations -TIME_TYPE Timing_EField_ad,Timing_comm_EField_ad, // time for all_dir: total & comm - Timing_EField_sg,Timing_comm_EField_sg, // time for scat_dir: total & comm - Timing_ScatQuan_comm; // time for comm of scat.quantities -// used in iterative.c -TIME_TYPE Timing_OneIter, // total for one iteration - Timing_InitIter,Timing_InitIter_comm; // for initialization of iterations: total & comm -unsigned long TotalIter; // total number of iterations performed -// used in fft.c -TIME_TYPE Timing_FFT_Init, // for initialization of FFT routines - Timing_Dm_Init; // for building Dmatrix -// used in make_particle.c -TIME_TYPE Timing_Particle, // for particle construction - Timing_Granul,Timing_Granul_comm; // for granule generation: total & comm +/* used in CalculateE.c */ +TIME_TYPE Timing_EFieldPlane,Timing_comm_EField, /* for Eplane calculation: total and comm */ + Timing_IntField,Timing_IntFieldOne, /* for internal fields: total & one calculation */ + Timing_ScatQuan; /* for integral scattering quantities */ +unsigned long TotalEFieldPlane; /* total number of Efield planes calculations */ +/* used in calculator.c */ +TIME_TYPE Timing_Init; /* for total initialization of the program (before CalculateE) */ +unsigned long TotalEval; /* total number of orientation evaluations */ +/* used in comm.c */ +TIME_TYPE Timing_Dm_Init_comm; /* communication time for initialization of D-matrix */ +/* used in crosssec.c */ + /* total time for all_dir and scat_grid calculations */ +TIME_TYPE Timing_EField_ad,Timing_comm_EField_ad, /* time for all_dir: total & comm */ + Timing_EField_sg,Timing_comm_EField_sg, /* time for scat_dir: total & comm */ + Timing_ScatQuan_comm; /* time for comm of scat.quantities */ +/* used in iterative.c */ +TIME_TYPE Timing_OneIter, /* total for one iteration */ + Timing_InitIter,Timing_InitIter_comm; /* for initialization of iterative solver: + total & comm */ +unsigned long TotalIter; /* total number of iterations performed */ +/* used in fft.c */ +TIME_TYPE Timing_FFT_Init, /* for initialization of FFT routines */ + Timing_Dm_Init; /* for building Dmatrix */ +/* used in make_particle.c */ +TIME_TYPE Timing_Particle, /* for particle construction */ + Timing_Granul,Timing_Granul_comm; /* for granule generation: total & comm */ -//============================================================ +/*============================================================*/ void StartTime(void) -// start global time + /* start global time */ { - time(&wt_start); - last_chp_wt=wt_start; -#ifndef MPI // otherwise this initialization is performed immediately after MPI_Init - tstart_main = GET_TIME(); + time(&wt_start); + last_chp_wt=wt_start; +#ifndef MPI /* otherwise this initialization is performed immediately after MPI_Init */ + tstart_main = GET_TIME(); #endif } -//============================================================ +/*============================================================*/ void InitTiming(void) -// init timing variables and counters + /* init timing variables and counters */ { - TotalIter=TotalEval=TotalEFieldPlane=0; - Timing_EField=Timing_FileIO=Timing_IntField=Timing_ScatQuan=Timing_Integration=0; - Timing_ScatQuan_comm=Timing_Dm_Init_comm=0; + TotalIter=TotalEval=TotalEFieldPlane=0; + Timing_EField=Timing_FileIO=Timing_IntField=Timing_ScatQuan=Timing_Integration=0; + Timing_ScatQuan_comm=Timing_Dm_Init_comm=0; } -//============================================================ +/*============================================================*/ void FinalStatistics(void) -// print final output and statistics + /* print final output and statistics */ { - time_t wt_end; - TIME_TYPE Timing_TotalTime; + time_t wt_end; + TIME_TYPE Timing_TotalTime; - // wait for all processes to show correct execution time - Synchronize(); - if (ringid==ROOT) { - // last time measurements - Timing_TotalTime = GET_TIME() - tstart_main; - time(&wt_end); - // log statistics - fprintf(logfile, - "\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n"\ - " Timing Results \n"\ - "~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n"); - if (!prognose) { - if (orient_avg) fprintf(logfile, - "Total number of single particle evaluations: %lu\n",TotalEval); - fprintf(logfile, - "Total number of iterations: %lu\n"\ - "Total planes of E field calculation (each %d points): %lu\n\n", - TotalIter,nTheta,TotalEFieldPlane); - } - fprintf(logfile, - "Total wall time: %.0f\n",difftime(wt_end,wt_start)); - fprintf(logfile, + /* wait for all processes to show correct execution time */ + Synchronize(); + if (ringid==ROOT) { + /* last time measurements */ + Timing_TotalTime = GET_TIME() - tstart_main; + time(&wt_end); + /* log statistics */ + fprintf(logfile, + "\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n"\ + " Timing Results \n"\ + "~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n"); + if (!prognose) { + if (orient_avg) fprintf(logfile, + "Total number of single particle evaluations: %lu\n",TotalEval); + fprintf(logfile, + "Total number of iterations: %lu\n"\ + "Total planes of E field calculation (each %d points): %lu\n\n", + TotalIter,nTheta,TotalEFieldPlane); + } + fprintf(logfile, + "Total wall time: %.0f\n",difftime(wt_end,wt_start)); + fprintf(logfile, #ifdef MPI - "--Everything below is also wall times--\n"\ - "Time since MPI_Init: %.4f\n", + "--Everything below is also wall times--\n"\ + "Time since MPI_Init: %.4f\n", #else - "--Everything below is processor times--\n"\ - "Total time: %.4f\n", + "--Everything below is processor times--\n"\ + "Total time: %.4f\n", #endif - TO_SEC(Timing_TotalTime)); - fprintf(logfile, - " Initialization time: %.4f\n",TO_SEC(Timing_Init)); - if (!prognose) { - fprintf(logfile, - " init Dmatrix %.4f\n",TO_SEC(Timing_Dm_Init)); + TO_SEC(Timing_TotalTime)); + fprintf(logfile, + " Initialization time: %.4f\n",TO_SEC(Timing_Init)); + if (!prognose) { + fprintf(logfile, + " init Dmatrix %.4f\n",TO_SEC(Timing_Dm_Init)); #ifdef PARALLEL - fprintf(logfile, - " communication: %.4f\n",TO_SEC(Timing_Dm_Init_comm)); + fprintf(logfile, + " communication: %.4f\n",TO_SEC(Timing_Dm_Init_comm)); #endif - fprintf(logfile, - " FFT setup: %.4f\n",TO_SEC(Timing_FFT_Init)); - } - fprintf(logfile, - " make particle: %.4f\n",TO_SEC(Timing_Particle)); - if (sh_granul) { - fprintf(logfile, - " granule generator: %.4f\n",TO_SEC(Timing_Granul)); + fprintf(logfile, + " FFT setup: %.4f\n",TO_SEC(Timing_FFT_Init)); + } + fprintf(logfile, + " make particle: %.4f\n",TO_SEC(Timing_Particle)); + if (sh_granul) { + fprintf(logfile, + " granule generator: %.4f\n",TO_SEC(Timing_Granul)); #ifdef PARALLEL - fprintf(logfile, - " communication: %.4f\n",TO_SEC(Timing_Granul_comm)); + fprintf(logfile, + " communication: %.4f\n",TO_SEC(Timing_Granul_comm)); #endif - } - if (!prognose) { - fprintf(logfile, - " Internal fields: %.4f\n"\ - " one solution: %.4f\n"\ - " init solver: %.4f\n", - TO_SEC(Timing_IntField),TO_SEC(Timing_IntFieldOne),TO_SEC(Timing_InitIter)); + } + if (!prognose) { + fprintf(logfile, + " Internal fields: %.4f\n"\ + " one solution: %.4f\n"\ + " init solver: %.4f\n", + TO_SEC(Timing_IntField),TO_SEC(Timing_IntFieldOne),TO_SEC(Timing_InitIter)); #ifdef PARALLEL - fprintf(logfile, - " communication: %.4f\n",TO_SEC(Timing_InitIter_comm)); + fprintf(logfile, + " communication: %.4f\n",TO_SEC(Timing_InitIter_comm)); #endif - fprintf(logfile, - " one iteration: %.4f\n",TO_SEC(Timing_OneIter)); + fprintf(logfile, + " one iteration: %.4f\n",TO_SEC(Timing_OneIter)); #ifdef PARALLEL - fprintf(logfile, - " communication: %.4f\n",TO_SEC(Timing_OneIterComm)); + fprintf(logfile, + " communication: %.4f\n",TO_SEC(Timing_OneIterComm)); #endif - fprintf(logfile, - " Scattered fields: %.4f\n",TO_SEC(Timing_EField)); - if (yzplane) { - fprintf(logfile, - " one plane: %.4f\n",TO_SEC(Timing_EFieldPlane)); + fprintf(logfile, + " Scattered fields: %.4f\n",TO_SEC(Timing_EField)); + if (yzplane) { + fprintf(logfile, + " one plane: %.4f\n",TO_SEC(Timing_EFieldPlane)); #ifdef PARALLEL - fprintf(logfile, - " communication: %.4f\n",TO_SEC(Timing_comm_EField)); + fprintf(logfile, + " communication: %.4f\n",TO_SEC(Timing_comm_EField)); #endif - } - if (all_dir) { - fprintf(logfile, - " one alldir: %.4f\n",TO_SEC(Timing_EField_ad)); + } + if (all_dir) { + fprintf(logfile, + " one alldir: %.4f\n",TO_SEC(Timing_EField_ad)); #ifdef PARALLEL - fprintf(logfile, - " communication: %.4f\n",TO_SEC(Timing_comm_EField_ad)); + fprintf(logfile, + " communication: %.4f\n",TO_SEC(Timing_comm_EField_ad)); #endif - } - if (scat_grid) { - fprintf(logfile, - " one scat_grid: %.4f\n",TO_SEC(Timing_EField_sg)); + } + if (scat_grid) { + fprintf(logfile, + " one scat_grid: %.4f\n",TO_SEC(Timing_EField_sg)); #ifdef PARALLEL - fprintf(logfile, - " communication: %.4f\n",TO_SEC(Timing_comm_EField_sg)); + fprintf(logfile, + " communication: %.4f\n",TO_SEC(Timing_comm_EField_sg)); #endif - } - fprintf (logfile, - " Other sc.quantities: %.4f\n",TO_SEC(Timing_ScatQuan)); + } + fprintf (logfile, + " Other sc.quantities: %.4f\n",TO_SEC(Timing_ScatQuan)); #ifdef PARALLEL - fprintf(logfile, - " communication: %.4f\n",TO_SEC(Timing_ScatQuan_comm)); + fprintf(logfile, + " communication: %.4f\n",TO_SEC(Timing_ScatQuan_comm)); #endif - fprintf (logfile, - " File I/O: %.4f\n"\ - "Integration: %.4f\n", - TO_SEC(Timing_FileIO),TO_SEC(Timing_Integration)); - } - // close logfile - FCloseErr(logfile,F_LOG,ONE_POS); - } + fprintf (logfile, + " File I/O: %.4f\n"\ + "Integration: %.4f\n", + TO_SEC(Timing_FileIO),TO_SEC(Timing_Integration)); + } + /* close logfile */ + FCloseErr(logfile,F_LOG,ONE_POS); + } } diff --git a/src/timing.h b/src/timing.h index ad6f555c..060c2ef8 100644 --- a/src/timing.h +++ b/src/timing.h @@ -2,7 +2,7 @@ * AUTH: Maxim Yurkin * DESCR: Definitions for usual timing; should be completely portable * - * Copyright (C) 2006,2008 University of Amsterdam + * Copyright (C) 2006 University of Amsterdam * This code is covered by the GNU General Public License. */ #ifndef __timing_h @@ -11,16 +11,16 @@ #include "parbas.h" #ifdef MPI -# define TIME_TYPE double -# define GET_TIME MPI_Wtime +#define TIME_TYPE double +#define GET_TIME MPI_Wtime #else -# include <time.h> -# define TIME_TYPE clock_t -# define GET_TIME clock +#include <time.h> +#define TIME_TYPE clock_t +#define GET_TIME clock #endif void StartTime(void); void InitTiming(void); void FinalStatistics(void); -#endif // __timing_h +#endif /* __timing_h */ diff --git a/src/types.h b/src/types.h index 4d4e956e..508846f3 100644 --- a/src/types.h +++ b/src/types.h @@ -2,43 +2,43 @@ * AUTH: Maxim Yurkin * DESCR: definitions of various structures * - * Copyright (C) 2006-2008 University of Amsterdam + * Copyright (C) 2006-2007 University of Amsterdam * This code is covered by the GNU General Public License. */ #ifndef __types_h #define __types_h -// complex numbers -typedef double doublecomplex[2]; // complies with FFTW3 definition +/* complex numbers */ +typedef double doublecomplex[2]; /* complies with FFTW3 definition */ #define RE 0 #define IM 1 -typedef struct // integration parameters +typedef struct /* integration parameters */ { - double eps; // convergence criterion - int Jmin; // minimal number of refinements - int Jmax; // maximal number of refinements - double min; // minimum - double max; // maximum - size_t Grid_size; // number of grid points - int equival; // whether max and min points are equivalent - int periodic; // whether integrated function is periodic + double eps; /* convergence criterium */ + int Jmin; /* minimal number of refinements */ + int Jmax; /* maximal number of refinements */ + double min; /* minimum */ + double max; /* maximum */ + size_t Grid_size; /* number of gridpoints */ + int equival; /* whether max and min points are equivalent */ + int periodic; /* whether integrated function is periodic */ } Parms_1D; -typedef struct // values of angles -{ // !!! All angles are in degrees - double min; // minimum; for convenience (not really needed) - double max; // maximum; for convenience (not really needed) - size_t N; // number of points - double *val; // values of points +typedef struct /* values of angles */ +{ /* !!! All angles are in degrees */ + double min; /* minimum; for convenience (not really needed) */ + double max; /* maximum; for convenience (not really needed) */ + size_t N; /* number of points */ + double *val; /* values of points*/ } angle_set; -typedef struct // integration parameters -{ // !!! All angles are in degrees - int type; // if pairs are used or grid - size_t N; // total number of pairs (grid points) - angle_set theta; // values of theta - angle_set phi; // values of phi +typedef struct /* integration parameters */ +{ /* !!! All angles are in degrees */ + int type; /* if pairs are used or grid */ + size_t N; /* total number of pairs (grid points) */ + angle_set theta; /* values of theta */ + angle_set phi; /* values of phi */ } scat_grid_angles; -#endif // __types_h +#endif /*__types_h*/ diff --git a/src/vars.c b/src/vars.c index 204fb9b1..f6705251 100644 --- a/src/vars.c +++ b/src/vars.c @@ -2,132 +2,121 @@ * AUTH: Maxim Yurkin * DESCR: All the global variables are defined here * Global means: used in three or more source files. - * Variables that are used in only two source files are called 'semi-global' + * Variables that are used in only two source files are calles 'semi-global' * and not listed here. They are defined in one file and referenced with * 'extern' in another one. * * Copyright (C) 2006-2008 University of Amsterdam * This code is covered by the GNU General Public License. */ -#include <stdio.h> // for FILE and size_t -#include <time.h> // for time_t -#include "const.h" // for MAX_NMAT, MAX_DIRNAME -#include "types.h" // for doublecomplex, angle_set, scat_grid_angles -#include "timing.h" // for TIME_TYPE +#include <stdio.h> /* for FILE and size_t */ +#include <time.h> /* for time_t */ +#include "const.h" /* for MAX_NMAT, MAX_DIRNAME */ +#include "types.h" /* for doublecomplex, angle_set, scat_grid_angles */ +#include "timing.h" /* for TIME_TYPE */ -// basic variables -int boxX,boxY,boxZ; // sizes of box enclosing the particle -double gridspace; // inter-dipole distance -double kd; // k*d=2*PI/dpl -double ka_eq; // volume-equivalent size parameter -double inv_G; // inverse of equivalent cross section -double WaveNum; // wavenumber of incident light -double *DipoleCoord; // vector to hold the coordinates of the dipoles +/* basic variables */ +int boxX,boxY,boxZ; /* sizes of box enclosing the particle */ +double gridspace; /* inter-dipole distance */ +double kd; /* k*d=2*PI/dpl */ +double ka_eq; /* volume-equivalent size parameter */ +double inv_G; /* inverse of equivalent cross section */ +double WaveNum; /* wavenumber of incident light */ +double *DipoleCoord; /* vector to hold the coordinates of the dipoles */ unsigned short *position; /* position of the dipoles; in the very end of make_particle() - * z-components are adjusted to be relative to the local_z0 - */ -double memory; // total memory usage in bytes -int IntRelation; // type of formula for interaction term -int PolRelation; // type of formula for self-term (polarization relation) -int beamtype; // type of incident beam + z-components are adjusted to be relative to the local_z0 */ +double memory; /* total memory usage in bytes */ +int IntRelation; /* type of formula for interaction term */ +int PolRelation; /* type of formula for self-term (polarization relation) */ +int beamtype; /* type of incident beam */ -// symmetries -int symX,symY,symZ; /* symmetries of reflection relative to the planes perpendicular to x, y, and - * z axes. Only Y is actually used - */ -int symR; // symmetry of 90-degrees rotation about z axes +/* symmetries */ +int symX,symY,symZ; /* symmetries of reflection relative to the planes + perpendicular to x, y, and z axes. Only Y is actually used */ +int symR; /* symmetry of 90 deg. rotation about z axes */ -// flags (TRUE or FALSE) -int prognose; // make a prognosis about needed ram -int yzplane; // Calculate the field in the yz-plane +/* flags (TRUE or FALSE) */ +int prognose; /* make a prognose about needed ram */ +int yzplane; /* Calculate the field in the yz-plane */ int all_dir; /* Calculate the field for all directions on a theta-phi grid (internal - * parameter - initialized by other options: calculation of Csca and asym) - */ -int scat_grid; // calculate field on a grid of scattering angles -int phi_integr; // integrate over the phi angle -int reduced_FFT; // reduced number of storage for FFT, when matrix is symmetric -int orient_avg; // whether to use orientation averaging -int load_chpoint; // whether to load checkpoint -int beam_asym; // whether the beam center is shifted relative to the origin -int sh_granul; // whether to fill one domain with granules -int anisotropy; // whether the scattering medium is anisotropic -int save_memory; // whether to sacrifice some speed for memory + parameter - initialized by other options: calculation of Csca and asym) */ +int scat_grid; /* calculate field on a grid of scattering angles */ +int phi_integr; /* integrate over the phi angle */ +int reduced_FFT; /* reduced number of storage for FFT, when matrix is symmetric */ +int orient_avg; /* whether to use orientation averaging*/ +int load_chpoint; /* whether to load checkpoint */ +int beam_asym; /* whether the beam center is shifted relative to the origin */ +int sh_granul; /* whether to fill one domain with granules */ +int anisotropy; /* whether the scattering medium is anisotropic */ +int save_memory; /* whether to sacrifice some speed for memory */ -// 3D vectors (in particle reference frame) -double prop[3]; // incident direction (in particle reference frame) -double incPolX[3],incPolY[3]; // incident polarizations (in particle RF) -double beam_center[3]; // coordinates of the beam center -double box_origin_unif[3]; /* coordinates of the center of the first dipole in the local - * computational box (after uniform distribution of non-void dipoles - * among all processors) - */ +/* 3D vectors (in particle reference frame) */ +double prop[3]; /* incident direction (in particle reference frame) */ +double incPolX[3],incPolY[3]; /* incident polariztions (in particle RF) */ +double beam_center[3]; /* coordinates of the beam center */ +double box_origin_unif[3]; /* coordinates of the center of the first dipole in the local + computational box (after uniform distribution of non-void dipoles + among all processors) */ -// file info -char directory[MAX_DIRNAME]; // directory to save data in -FILE *logfile; // file where all the information about the run is saved -int term_width; // width of the terminal to which ADDA produces output +/* file info */ +char directory[MAX_DIRNAME]; /* directory to save data in */ +FILE *logfile; /* file where all the information about the run is saved */ +int term_width; /* width of the terminal to which ADDA produces output */ -// refractive index +/* refractive index */ int Nmat; /* number of different domains (for each either scalar or - * tensor refractive index is specified - */ -int Ncomp; // number of components of each refractive index (1 or 3) -doublecomplex ref_index[MAX_NMAT]; // a set of refractive indexes -doublecomplex cc_sqrt[MAX_NMAT][3]; // sqrt of couple constants -unsigned char *material; // material: index for cc + tensor refractive index is specified */ +int Ncomp; /* number of components of each refractive index (1 or 3) */ +doublecomplex ref_index[MAX_NMAT]; /* a set of refractive indexes */ +doublecomplex cc_sqrt[MAX_NMAT][3]; /* sqrt of couple constants */ +unsigned char *material; /* material: index for cc */ -// iterative solver -int IterMethod; // iterative method to use -int maxiter; // maximum number of iterations -doublecomplex *xvec; // total electric field on the dipoles -doublecomplex *pvec; // polarization of dipoles -doublecomplex *Einc; // incident field on dipoles +/* iterative solver */ +int IterMethod; /* iterative method to use */ +int maxiter; /* maximum number of iterations */ +doublecomplex *xvec; /* total electric field on the dipoles */ +doublecomplex *pvec; /* polarization of dipoles */ +doublecomplex *Einc; /* incident field on dipoles */ -// scattering at different angles -int nTheta; // number of angles in scattering profile -double alph_deg, bet_deg, gam_deg; // Euler angles of particle orientation in degrees -angle_set alpha_int; // sets of angles -scat_grid_angles angles; // angle sets for scat_grid -doublecomplex *EgridX,*EgridY; /* E calculated on a grid for many different directions (holds - * Eper and Epar) for two incident polarizations - */ -double *Egrid_buffer; // buffer to accumulate Egrid +/* scattering at different angles */ +int nTheta; /* number of angles in scattering profile */ +double alph_deg, bet_deg, gam_deg; /* Euler angles of particle orientation in degrees */ +angle_set alpha_int; /* sets of angles */ +scat_grid_angles angles; /* angle sets for scat_grid */ +doublecomplex *EgridX,*EgridY; /* E calculated on a grid for many different directions + (holds Eper and Epar) for two incident polarizations */ +double *Egrid_buffer; /* buffer to accumulate Egrid */ -// checkpoint -int chp_type; // type of checkpoint (to save) -time_t chp_time; // time of checkpoint (in sec) -char chp_dir[MAX_DIRNAME]; // directory name to save/load checkpoint +/* checkpoint */ +int chp_type; /* type of checkpoint (to save) */ +time_t chp_time; /* time of checkpoint (in sec) */ +char chp_dir[MAX_DIRNAME]; /* directory name to save/load checkpoint */ -// auxiliary grids and their partition over processors -size_t gridX,gridY,gridZ; /* sizes of the 'matrix' X, size_t - to remove type conversions - * we assume that 'int' is enough for it, but this declaration is - * to avoid type casting in calculations - */ -size_t gridYZ; // gridY*gridZ -size_t smallY,smallZ; // the size of the reduced matrix X -size_t local_Nsmall; // number of points of expanded grid per one processor -int nprocs; // total number of processes -int ringid; // ID of current process -int local_z0,local_z1; // starting and ending z for current processor -size_t local_Nz; // number of z layers (based on the division of smallZ) -int local_Nz_unif; /* number of z layers (distance between max and min values), - * belonging to this processor, after all non_void dipoles are - * uniformly distributed between all processors - */ -int local_z1_coer; // ending z, coerced to be not greater than boxZ -size_t local_x0,local_x1,local_Nx; /* starting, ending x for current processor and number of x - * layers (based on the division of smallX) - */ -size_t local_Ndip; // number of local total dipoles -size_t local_nvoid_Ndip; // number of local and ... -double nvoid_Ndip; // ... total non-void dipoles -size_t nlocalRows; // number of local rows of decomposition (only real dipoles) +/* auxillary grids and their partition over processors */ +size_t gridX,gridY,gridZ; /* sizes of the 'matrix' X, size_t - to remove type conversions */ +size_t gridYZ; /* gridY*gridZ */ +size_t smallY,smallZ; /* the size of the reduced matrix X */ +size_t local_Nsmall; /* number of points of expanded grid per one processor */ +int nprocs; /* total number of processes */ +int ringid; /* id of current process */ +int local_z0,local_z1; /* starting and ending z for current processor*/ +size_t local_Nz; /* number of z layers (based on the division of smallZ) */ +int local_Nz_unif; /* number of z layers (distance between max and min values), + belonging to this processor, after all non_void dipoles are + uniformly distributed between all processors */ +int local_z1_coer; /* ending z, coerced to be not greater than boxZ */ +size_t local_x0,local_x1,local_Nx; /* starting, ending x for current processor and + number of x layers (based on the division of smallX) */ +size_t local_Ndip; /* number of local total dipoles */ +size_t local_nvoid_Ndip; /* number of local and ... */ +double nvoid_Ndip; /* ... total non-void dipoles */ +size_t nlocalRows; /* number of local rows of decomposition (only real dipoles) */ -// timing -time_t wt_start, // starting wall time - last_chp_wt; // wall time of the last checkpoint -TIME_TYPE Timing_EField, // time for calculating scattered fields - Timing_FileIO, // time for input and output - Timing_Integration, // time for all integrations (with precomputed values) - Timing_OneIterComm, // communication time during one iteration - tstart_main; // starting time of the program (after MPI_Init in parallel) +/* timing */ +time_t wt_start, /* starting wall time */ + last_chp_wt; /* wall time of the last checkpoint */ +TIME_TYPE Timing_EField, /* time for calculating scattered fields */ + Timing_FileIO, /* time for input and output */ + Timing_Integration, /* time for all integrations (with precomputed values) */ + Timing_OneIterComm, /* communication time during one iteration */ + tstart_main; /* starting time of the program (after MPI_Init in parallel) */ diff --git a/src/vars.h b/src/vars.h index 57f0474b..b89f781d 100644 --- a/src/vars.h +++ b/src/vars.h @@ -2,7 +2,7 @@ * AUTH: Maxim Yurkin * DESCR: All the global variables are declared here. * Global means: used in three or more source files. - * Variables that are used in only two source files are called 'semi-global' + * Variables that are used in only two source files are calles 'semi-global' * and not listed here. They are defined in one file and referenced with * 'extern' in another one. * @@ -12,13 +12,13 @@ #ifndef __vars_h #define __vars_h -#include <stdio.h> // for FILE and size_t -#include <time.h> // for time_t -#include "const.h" // for MAX_NMAT -#include "types.h" // for doublecomplex, angle_set, scat_grid_angles -#include "timing.h" // for TIME_TYPE +#include <stdio.h> /* for FILE and size_t */ +#include <time.h> /* for time_t */ +#include "const.h" /* for MAX_NMAT */ +#include "types.h" /* for doublecomplex, angle_set, scat_grid_angles */ +#include "timing.h" /* for TIME_TYPE */ -// basic variables +/* basic variables */ extern int boxX,boxY,boxZ; extern double gridspace,kd,ka_eq,inv_G,WaveNum; extern double *DipoleCoord; @@ -28,31 +28,31 @@ extern int IntRelation; extern int PolRelation; extern int beamtype; -// symmetries +/* symmetries */ extern int symX,symY,symZ,symR; -// flags +/* flags (TRUE or FALSE) */ extern int prognose,yzplane,all_dir,scat_grid,phi_integr,sh_granul,reduced_FFT,orient_avg, load_chpoint,beam_asym,anisotropy,save_memory; -// 3D vectors +/* 3D vectors */ extern double prop[3],incPolX[3],incPolY[3],beam_center[3],box_origin_unif[3]; -// file info +/* file info */ extern char directory[]; extern FILE *logfile; -extern int term_width; +extern int term_width; -// refractive index +/* refractive index */ extern int Nmat,Ncomp; extern doublecomplex ref_index[MAX_NMAT]; extern doublecomplex cc_sqrt[MAX_NMAT][3]; extern unsigned char *material; -// iterative solver +/* iterative solver */ extern int IterMethod,maxiter; extern doublecomplex *xvec,*pvec,*Einc; -// scattering at different angles +/* scattering at different angles */ extern int nTheta; extern double alph_deg, bet_deg, gam_deg; extern angle_set alpha_int; @@ -60,12 +60,12 @@ extern scat_grid_angles angles; extern doublecomplex *EgridX,*EgridY; extern double *Egrid_buffer; -// checkpoint +/* checkpoint */ extern int chp_type; extern time_t chp_time; extern char chp_dir[]; -// auxiliary grids and their partition over processors +/* auxillary grids and their partition over processors */ extern size_t gridX,gridY,gridZ; extern size_t gridYZ; extern size_t smallY,smallZ; @@ -76,8 +76,8 @@ extern size_t local_Nz,local_x0,local_x1,local_Nx; extern size_t local_Ndip,local_nvoid_Ndip,nlocalRows; extern double nvoid_Ndip; -// timing +/* timing */ extern time_t wt_start,last_chp_wt; extern TIME_TYPE Timing_EField,Timing_FileIO,Timing_Integration,Timing_OneIterComm,tstart_main; -#endif // __vars_h +#endif /*__vars_h*/ diff --git a/win32/adda.exe b/win32/adda.exe index 32a868c5..a482c51b 100644 Binary files a/win32/adda.exe and b/win32/adda.exe differ diff --git a/win32/adda_mpi.exe b/win32/adda_mpi.exe index 9c7264dd..63626cb7 100644 Binary files a/win32/adda_mpi.exe and b/win32/adda_mpi.exe differ diff --git a/win32/readme.txt b/win32/readme.txt index 35ae3f72..45053a03 100644 --- a/win32/readme.txt +++ b/win32/readme.txt @@ -1,3 +1,27 @@ + ADDA 0.78 + *********** + "Amsterdam DDA" + + Maxim A. Yurkin(1,2) and Alfons G. Hoekstra(1) + + (1) Faculty of Science, Section Computational Science, + of the University of Amsterdam, + Kruislaan 403, 1098 SJ, Amsterdam, The Netherlands, + tel: +31-20-525-7530, fax: +31-20-525-7490 + + (2) Institute of Chemical Kinetics and Combustion, + Siberian Branch of the Russian Academy of Sciences, + Institutskaya 3, Novosibirsk, 630090, Russia, + tel: +7-383-333-3240, fax: +7-383-334-2350 + + email: adda@science.uva.nl + + last revised: 19 March 2008 + + Copyright (C) 2006-2008 University of Amsterdam + This software package is covered by the GNU General Public License. + + ## ##### ## ##### ## ## /#### /##### /## /##### /## /#### / ### // / / ### // / / ### / ### @@ -17,31 +41,6 @@ ## ## ## ## - - Maxim A. Yurkin - - Institute of Chemical Kinetics and Combustion, - Siberian Branch of the Russian Academy of Sciences, - Institutskaya 3, Novosibirsk, 630090, Russia, - tel: +7-383-333-3240, fax: +7-383-334-2350 - - Alfons G. Hoekstra - - Faculty of Science, Section Computational Science, - of the University of Amsterdam, - Kruislaan 403, 1098 SJ, Amsterdam, The Netherlands, - tel: +31-20-525-7530, fax: +31-20-525-7490 - - - email: adda@science.uva.nl - - $Date:: $ - - Copyright (C) 2006-2008 University of Amsterdam - This software package is covered by the GNU General Public License. - - - WINDOWS 32 EXECUTABLES **********************