Sunday 3 February 2013

How do I run large jobs in ABAQUS?

Posted By Admin On 22:56

Running Long Programs

To run long jobs, just log into ts-access and run them. Some familiarity with running things from the Unix/MacOS command line is essential (see A short Unix crib). Familiarity with shell scripts is an advantage. You can run the programs as you normally would from the Linux/MacOS command line, but when running long programs note that:

You probably won't be able to type input to your program while it runs.
If you put & at the end of your command, your program runs "in the background", freeing up the command line for other things.
If you put nohup at the start of your command, the program won't be killed when you log out.

So once you're logged into a Linux/Unix/MacOS machine at CUED, typing

  slogin ts-access
  ...
  nohup my/program &

will start running my/program and continue running it after you log out of ts-access. Output that would normally appear onscreen goes into a file called nohup.out.

But before you run your program in the background, first check that it starts OK when run normally.
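Because a background job can't read from the keyboard, one common approach is to feed it input from a file and send its output to a log you name yourself rather than to nohup.out. The following is only a minimal sketch: the inline sh -c loop is a hypothetical stand-in for my/program, and input.txt and run.log are assumed file names.

```shell
# input.txt and run.log are hypothetical file names chosen for this sketch.
printf 'first line\nsecond line\n' > input.txt

# The "sh -c" loop stands in for my/program: it echoes back each line it reads.
# "< input.txt" supplies input from a file, "> run.log 2>&1" sends normal
# output and error messages to the same log, and "&" runs the job in the background.
nohup sh -c 'while read -r line; do echo "got: $line"; done' \
  < input.txt > run.log 2>&1 &
job_pid=$!

# In real use you would log out at this point; here we wait so the log can be shown.
wait "$job_pid"
cat run.log
```

With the output redirected like this, each job can have its own log file, which helps when several jobs are started from the same directory.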

Matlab

If you run a Matlab job, remember to exit from Matlab at the end of the script or function, because Matlab won't exit automatically. If, for example, you have a file in your home directory called roll2dice.m containing

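function answer=roll2dice
  % No semicolon, so the result is printed (and ends up in nohup.out)
  answer = randi(6) + randi(6)
  % Quit Matlab so the job doesn't stay running on ts-access
  exit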

you could log into ts-access, type

  nohup matlab -r roll2dice &

then log out of ts-access. Soon in your home directory on CUED's central system you'll have a file called nohup.out containing the output of your program. Matlab will no longer be running on the ts-access machine.

Preparing your code

Try to write your code so that it saves results periodically, and the program can re-start by loading in those results, carrying on from that stage. In this way you can still make progress even if your programs are interrupted by power-cuts, etc.
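The save-and-resume idea can be sketched in a few lines of shell. This is an illustration under stated assumptions rather than a recipe: checkpoint.txt is a hypothetical file name, and adding up numbers stands in for the real work of each step.

```shell
#!/bin/sh
# checkpoint.txt is a hypothetical file name chosen for this sketch.
checkpoint=checkpoint.txt
total_steps=10

# Resume from the last saved step if a checkpoint exists, otherwise start from 0.
if [ -f "$checkpoint" ]; then
  read -r step sum < "$checkpoint"
else
  step=0
  sum=0
fi

while [ "$step" -lt "$total_steps" ]; do
  step=$((step + 1))
  sum=$((sum + step))   # stand-in for one unit of real work
  # Write to a temporary file first, then rename it, so an interruption
  # never leaves a half-written checkpoint behind.
  printf '%s %s\n' "$step" "$sum" > "$checkpoint.tmp"
  mv "$checkpoint.tmp" "$checkpoint"
done

echo "finished: step=$step sum=$sum"
```

If the script is killed part-way through and then started again, it picks up at the step after the last one recorded in the checkpoint file instead of starting from scratch.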

Many programs will run much faster if a little thought is given to optimising the code. Once programs run for days, even an improvement of a few percent becomes significant. See

for ways of speeding your programs up.

If your program requires interaction you may need to rewrite it so that interaction isn't required. See the Command line options section for help.

Troubleshooting

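Your program may fail for several reasons:

Using too much CPU time - the system should be set up so that there's no limit on your CPU time. Confirm that by typing
```
ulimit -t
```
You should get the reply "unlimited" (in bash and most other shells, ulimit with no option reports the file-size limit rather than the CPU-time limit). If you type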
```
ulimit -a
```
you'll get a list of other limits.
Using too much memory - maybe you have a "memory leak". Each time your program goes round a loop it may ask for more memory until finally there's none left. You can use the "top" program to monitor memory usage. See the Big Processes - Memory issues page for details.
The machine was rebooted. For details about when the machine was last rebooted, type
```
uptime
```
There's a bug in your code that's only triggered after a certain number of iterations or when arrays reach a certain size (because of an unexpected divide-by-zero, or a value that becomes too big to fit in a variable of that type, etc.).
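As an alternative to watching "top" by hand, the CPU limit and memory checks from the list above can be recorded in a file while the job runs. This is a rough sketch, assuming a ps that accepts the -o rss= and -p options (as the Linux and MacOS versions do); "sleep 3" is a hypothetical stand-in for your program and memory.log is an assumed file name.

```shell
# "sleep 3" is a hypothetical stand-in for your long-running program.
sleep 3 &
job_pid=$!

# Record the CPU-time limit (in seconds, or "unlimited") that applies to this shell.
echo "CPU time limit: $(ulimit -t)" > memory.log

# While the job is alive, append its resident memory (in kilobytes) once a second.
while kill -0 "$job_pid" 2>/dev/null; do
  rss_kb=$(ps -o rss= -p "$job_pid" 2>/dev/null | tr -d ' ')
  echo "$(date '+%H:%M:%S') pid=$job_pid rss_kb=${rss_kb:-unknown}" >> memory.log
  sleep 1
done

cat memory.log
```

A resident-memory figure that keeps climbing from one line to the next is a hint that the program may be leaking memory.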

Signals are messages that are sent to processes. Typing

    man 7 signal

will show you a list of them. If your process receives a "SIGSEGV" signal, for example, it has generally tried to access a piece of memory it's not allowed to, which typically indicates a bug in the code (most often dereferencing a null or invalid pointer). Some signals (e.g. "SIGINT") can be caught or ignored if you choose, but "SIGKILL" and "SIGSTOP" can't: "SIGKILL" always terminates your program and "SIGSTOP" always suspends it. You can add a signal handler to your code to deal with the signals that can be caught. Even if you can't protect your program from being stopped, you might be able to record why it stopped. The Unix Signals and Forking page has some information for C/C++ users.
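For jobs driven by a shell script, the "record why it stopped" idea can be sketched with the shell's trap command. The looping subshell below is a hypothetical stand-in for a real job, and stop_reason.log is an assumed file name; a C or C++ program would install its own handlers instead.

```shell
log=stop_reason.log
rm -f "$log"

# Hypothetical long-running job: a subshell that loops until it is signalled.
(
  # Record which signal arrived, then exit with the conventional 128+N status.
  trap 'echo "stopped by SIGTERM" >> "$log"; exit 143' TERM
  trap 'echo "stopped by SIGHUP" >> "$log"; exit 129' HUP
  while :; do sleep 1; done
) &
job_pid=$!

sleep 2                 # give the job time to install its handlers
kill -TERM "$job_pid"   # simulate the system asking the job to stop

# Collect the job's exit status without aborting if it is non-zero.
status=0
wait "$job_pid" || status=$?
echo "job exited with status $status"
cat "$log"
```

A handler like this can't help with "SIGKILL", which never reaches the script, so an empty log after a job disappears is itself a clue about what happened.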

By tl136 and js138

MechDocs Blog
