Verification Guild
A Community of Verification Professionals

Verification Guild: Forums


Distributed simulations?
alexg
Senior
Joined: Jan 07, 2004
Posts: 586
Location: Ottawa

Posted: Fri Feb 11, 2011 9:54 am    Post subject: Distributed simulations?

I am looking at distributed simulation as one possible way to speed up simulations of large designs. As a simple example, the design and the testbench may run in parallel on 2 computers, using data structures to communicate with each other. It would be interesting to know if such expertise exists in the industry.

Regards,
-Alex
qwk000
Senior
Joined: Oct 13, 2004
Posts: 66
Location: Fort Worth, TX

Posted: Fri Feb 11, 2011 10:29 am

I don't know of any simulators that support multi-machine computing. VCS supports multi-processor (parallel) computing, but you are looking beyond just multi-processor, right?
alexg
Senior
Joined: Jan 07, 2004
Posts: 586
Location: Ottawa

Posted: Fri Feb 11, 2011 11:05 am

A few more words about distributed simulation as I see it.

There is no need for a specific simulator - any simulator will work fine.
To further clarify my intent, I'll give another example.

Assume we have 2 design blocks. These blocks communicate with each other using data frames and, together, perform some function F which has to be verified. Assume also that both blocks are quite large and, together with their block-level testbenches, already consume a significant amount of simulation time. To verify function F, there are 2 possible solutions:

1. Put them together and combine their testbenches.
2. Let the 2 block-level testbenches run in parallel on 2 different machines, supplying the output transactions of the first block to the input of the second one.

The 2nd approach is obviously faster than the 1st one. The block-level testbenches remain almost intact. There is just a need to implement data communication between the two testbenches. Using untimed transactions for communication may significantly reduce the communication frequency.
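
As a rough illustration of option 2, here is a minimal sketch in Python rather than SystemVerilog. Two threads stand in for the two block-level testbenches, and a one-slot queue stands in for the link between the machines. All names (block_a_testbench, block_b_testbench, run_pair) and the frame layout are assumptions made for this sketch, not part of any simulator or of the setup described above.

```python
import queue
import threading


def block_a_testbench(out_channel, n_frames):
    """Stand-in for the first block-level testbench: emits output frames."""
    for i in range(n_frames):
        frame = {"id": i, "payload": bytes([i] * 4)}  # untimed transaction
        out_channel.put(frame)  # blocks while the single slot is full
    out_channel.put(None)  # end-of-test marker (an assumption of this sketch)


def block_b_testbench(in_channel, results):
    """Stand-in for the second testbench: uses received frames as stimulus."""
    while True:
        frame = in_channel.get()  # blocks until a frame is available
        if frame is None:
            break
        results.append((frame["id"], sum(frame["payload"])))


def run_pair(n_frames=3):
    """Run both stand-in testbenches concurrently and return B's results."""
    channel = queue.Queue(maxsize=1)  # one frame in flight at a time
    results = []
    a = threading.Thread(target=block_a_testbench, args=(channel, n_frames))
    b = threading.Thread(target=block_b_testbench, args=(channel, results))
    a.start()
    b.start()
    a.join()
    b.join()
    return results
```

Only whole frames cross the channel, so the two sides synchronize once per frame rather than once per clock edge, which is the point of using untimed transactions.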

-Alex
srini
Senior
Joined: Jan 23, 2004
Posts: 430
Location: Bengaluru, India

Posted: Fri Feb 11, 2011 3:31 pm

How about using socket-based communication across the 2 machines (running the 2 blocks)? If this is an option, I recall VCS had an example, perhaps in $VCS_HOME; we even ported it to use SV-DPI.

Srini
www.cvcblr.com/blog
_________________
Srinivasan Venkataramanan
Chief Technology Officer, CVC www.cvcblr.com
A Pragmatic Approach to VMM Adoption
SystemVerilog Assertions Handbook
Using PSL/SUGAR 2nd Edition.
Contributor: The functional verification of electronic systems
pavanshanbhag
Senior
Joined: Mar 25, 2009
Posts: 380
Location: Bangalore, India

Posted: Sat Feb 12, 2011 11:38 am

AXIOM-EDA has a solution for your problem. It has a multi-CPU architecture with graphical debugging, using a single kernel.
MPSim is the simulator designed by the Axiom folks. It was designed from the beginning to address RTL, testbenches, assertions, coverage and debugging in a single-kernel architecture for maximum performance, productivity and predictability.

It's better to check with them:
info@axiom-da.com
_________________
-Pavan K Shanbhag

“The difference between genius and stupidity, genius knows his limits.” - Albert Einstein
peterpb
Junior
Joined: Mar 18, 2004
Posts: 9

Posted: Sun Feb 13, 2011 10:10 am

You might take a look at SimCluster from Avery Design:
http://www.avery-design.com/files/docs/SimClusterDS2010.pdf

FYI.
I don't (and didn't) work for Avery-Design. I only know that, a few years back, some engineers tried it. And I heard positive comments about the tool.
pavanshanbhag
Senior
Joined: Mar 25, 2009
Posts: 380
Location: Bangalore, India

Posted: Sun Feb 13, 2011 12:28 pm

Quote:

DISTRIBUTED SIMULATION SUPPORTS SYSTEMVERILOG, VERILOG, VHDL, C/C++

Today’s SOCs and embedded systems integrate 3rd party and proprietary hardware and software. System-level verification requires integration of these models into an overall simulation model. Often models come in the form of HDLs, ANSI C/C++, or specialized C++ class libraries such as SystemC. Avery’s distributed simulation supports a heterogeneous environment enabling all model types to be integrated and simulated in a distributed environment.


I found this pretty impressive about Avery Design.
alexg
Senior
Joined: Jan 07, 2004
Posts: 586
Location: Ottawa

Posted: Sun Feb 13, 2011 12:44 pm

Peter, Pavan,
Thank you for the link to the Avery Design Solution.

I have 2 problems with it:
1. I don't want to use automated partitioning of the design/testbench. In other words, I would prefer to partition the design and testbench myself.
2. I would like to create my own means to compress/send/decompress signal-level data. I believe I can do it better than any tool can ;-)

So, I am thinking more about the method mentioned by Srini.

Recently, I was playing with Verilog file I/O, and it looks like it provides a relatively clean way to transfer SV packet structures between 2 simulations. Here is the basic idea:

1. Simulators connect through "channels", each able to transfer 1 structure at a time (a "blocking" type of communication).
2. Each "channel" is just a file. Using the functions and tasks can_get, get, can_put and put, 2 simulators can communicate through the file.
You can download a working example of such communication (there is a README file in the tarball).
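
For readers without the tarball, here is a minimal sketch of what a one-slot, file-backed channel with can_put/put/can_get/get might look like. It is written in Python rather than Verilog and is not the code from the linked example. The FileChannel class, the pickle encoding and the write-then-rename step are all assumptions of this sketch. The rename keeps a reader from seeing a half-written file on a local POSIX file system; behaviour on a network file system may differ.

```python
import os
import pickle
import tempfile


class FileChannel:
    """One-slot channel backed by a single file (hypothetical sketch)."""

    def __init__(self, path):
        self.path = path

    def can_put(self):
        # The slot is free when the channel file does not exist.
        return not os.path.exists(self.path)

    def put(self, item):
        """Store one item; return False if the slot is still occupied."""
        if not self.can_put():
            return False
        tmp = self.path + ".tmp"
        with open(tmp, "wb") as f:
            pickle.dump(item, f)
        # Rename into place only after the write is complete, so a reader
        # never observes a partially written channel file.
        os.replace(tmp, self.path)
        return True

    def can_get(self):
        return os.path.exists(self.path)

    def get(self):
        """Return the stored item and free the slot, or None if empty."""
        if not self.can_get():
            return None
        with open(self.path, "rb") as f:
            item = pickle.load(f)
        os.remove(self.path)  # free the slot for the next put
        return item


def demo():
    """Exercise the channel in a temporary directory."""
    with tempfile.TemporaryDirectory() as d:
        ch = FileChannel(os.path.join(d, "a_to_b.chan"))
        first = ch.put({"id": 0, "data": [1, 2, 3]})
        second = ch.put({"id": 1, "data": []})  # refused: slot still full
        item = ch.get()
        return first, second, item, ch.can_put()
```

A real two-simulator setup would poll can_put and can_get in a loop on each side. The sketch leaves that out to keep the four calls visible.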

Here is the link to the file:
http://www.box.net/shared/u7ssqrk2an

It would be good to know your opinion about this method.

Regards,
-Alex
srini
Senior
Joined: Jan 23, 2004
Posts: 430
Location: Bengaluru, India

Posted: Sun Feb 13, 2011 11:55 pm

Hi Alex,

alexg wrote:
Peter, Pavan,
Thank you for the link to the Avery Design Solution.


So, I am thinking more about the method mentioned by Srini.

Here is the link to the file:
http://www.box.net/shared/u7ssqrk2an

It would be good to know your opinion about this method.

Regards,
-Alex


Took a very quick look at it. Have you considered using an OVM/UVM TLM-like interface rather than inventing your own (though simple)?

Regards
Srini
chm
Senior
Joined: Nov 22, 2004
Posts: 43
Location: Unterpremstaetten, Austria

Posted: Mon Feb 14, 2011 4:06 am

Hi Alex,

alexg wrote:
Here is the link to the file:
http://www.box.net/shared/u7ssqrk2an


If I understand you correctly, you want to use a networked file system (presumably NFS) for inter-process communication between two simulators running on different machines (not just different CPUs on one machine).

Technically, your approach is certainly possible; however:

1) Your implementation is flawed, as there is no file locking, and there is no guarantee that your heuristic "if I can read more than 1 byte, most likely the whole file has been written already" will work.

2) The efficiency will be so terrible that this will only benefit your simulation time if there is almost no inter-process communication and lots of functionality to simulate on both sides. Remember that in NFS there is no caching permitted, which means that a file must be written to the disk before the sender task may return.

If you want to pursue a DIY approach, I suggest developing a DPI library and implementing TCP sockets in C.
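
The suggestion above is a C library behind DPI. The sketch below uses Python only to show the framing logic such a library would need, since a stream socket delivers bytes, not packets. The helper names (send_frame, recv_exact, recv_frame) and the 4-byte length prefix are assumptions of this sketch, and a connected local socket pair stands in for a TCP connection between two machines.

```python
import socket
import struct


def send_frame(sock, payload):
    """Send one frame: 4-byte big-endian length, then the payload bytes."""
    sock.sendall(struct.pack("!I", len(payload)) + payload)


def recv_exact(sock, n):
    """Read exactly n bytes; a stream socket may deliver data in pieces."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection mid-frame")
        buf.extend(chunk)
    return bytes(buf)


def recv_frame(sock):
    """Receive one frame written by send_frame."""
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return recv_exact(sock, length)


def demo():
    """Round-trip two frames over a connected local socket pair."""
    a, b = socket.socketpair()  # stand-in for a TCP connection
    with a, b:
        send_frame(a, b"frame-0")
        send_frame(a, b"")  # an empty frame is still a valid frame
        return [recv_frame(b), recv_frame(b)]
```

The explicit length prefix lets the receiver know when a whole frame has arrived, which is exactly the guarantee the "more than 1 byte" heuristic in the file-based version lacks.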
chrisspear
Senior
Joined: Jun 15, 2004
Posts: 202
Location: Marlboro, MA

Posted: Mon Feb 14, 2011 10:58 am

Parallel simulation has been tried for decades, and partitioning is always the biggest problem. You need multiple partitions with the following requirements:
- Parallel activity: Dividing the design into two blocks does no good if block B can only run after block A completes.
- Equal sizes: If block A is more than 3x block B, you'll get little benefit from running them on separate processors.
- Low communication: If the blocks are constantly sharing large amounts of data (a serial activity), you'll get little benefit from running them in parallel.
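
A back-of-the-envelope model makes the size and communication requirements concrete. This is a simplified sketch, not a description of how any simulator schedules work: it assumes the blocks run fully concurrently, that the slowest block sets the pace, and that communication cost is a single additive term. The function name parallel_speedup is made up for this example.

```python
def parallel_speedup(block_times, comm_time=0.0):
    """Estimate the speedup of running blocks in parallel.

    block_times: per-block simulation time when run alone (any unit).
    comm_time:   extra time spent exchanging data (same unit).
    """
    serial = sum(block_times)                 # everything on one machine
    parallel = max(block_times) + comm_time   # slowest block plus the link
    return serial / parallel


# Two equal blocks, free communication: the ideal 2x.
balanced = parallel_speedup([1.0, 1.0])

# Block A is 3x block B: total work 4, slowest block 3, about 1.33x.
unbalanced = parallel_speedup([3.0, 1.0])

# Equal blocks, but communication costs as much as one block: no gain.
chatty = parallel_speedup([1.0, 1.0], comm_time=1.0)
```

Under these assumptions a 3:1 split caps the gain at about 1.33x on two processors, which matches the "little benefit" remark above.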

The best designs today for parallel simulation are multi-core CPUs. They offer equal sizes and parallel activity, but still have a lot of shared resources, so they may still not give a great speedup.

VCS has automatic partitioning, and the ability to run common activities such as waveform dumping in parallel with the rest of simulation. Look into this tried and tested solution before you go off and try to build your own. (Yes, I do work for Synopsys.)
_________________
Chris Spear
Co-Author: SystemVerilog for Verification - 3rd edition!
http://chris.spear.net/systemverilog
alexg
Senior
Joined: Jan 07, 2004
Posts: 586
Location: Ottawa

Posted: Mon Feb 14, 2011 11:38 am

Hi Chris,

Thank you for your answer. Please see my comments on the requirements you've mentioned:

Quote:
-Parallel activity. Dividing the design in two blocks does no good if block B can only run after block A completes
-Equal sizes: If block A is more than 3x block B, you'll get little benefit from running them on separate processors


Parallelism in activity as well as block sizes are design architecture issues. It is the task of the chip architect to reduce idle time, so parallel processing is a must for good architectures. When block B runs after block A completes, block A immediately starts processing a new data chunk, and so on. So, the better the architecture, the less time parallel simulations take.

Block sizing is more of an issue, since simulation time is not the same as latency in data processing. However, manual division with simulation profiling may help here too.

Quote:
-Low communication: If the blocks are constantly sharing large amounts of data (a serial activity), you'll get little benefit from running them in parallel


This is an issue that usually reduces the speed-up effect of hardware accelerators and emulators.

To solve it, there is a need to develop verification components that convert serial activity into parallel activity and vice versa, and then send only the parallel data structure through the link. These verification components (monitors and drivers) may be instantiated either between the design and the testbench or between two blocks in the design (here, it heavily depends on a "parallel simulation-friendly" SOC architecture). Also, block-level simulations may be used to simulate the complete SOC datapath if such data communication is set up between them.
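
The serial-to-parallel idea can be sketched in a few lines. This is Python rather than a SystemVerilog monitor and driver, and it assumes fixed 8-bit frames sent MSB first. The names monitor_collect and driver_replay are invented for the sketch.

```python
def monitor_collect(bit_stream, frame_bits=8):
    """Monitor side: group a serial bit stream into whole frames.

    Bits are packed MSB first. Trailing bits that do not fill a whole
    frame are dropped in this sketch.
    """
    frames = []
    value, count = 0, 0
    for bit in bit_stream:
        value = (value << 1) | (bit & 1)
        count += 1
        if count == frame_bits:
            frames.append(value)
            value, count = 0, 0
    return frames


def driver_replay(frames, frame_bits=8):
    """Driver side: expand frames back into the serial bit stream."""
    bits = []
    for value in frames:
        for shift in range(frame_bits - 1, -1, -1):
            bits.append((value >> shift) & 1)
    return bits


serial_in = [1, 0, 1, 0, 0, 1, 0, 1,
             1, 1, 1, 1, 0, 0, 0, 0]
frames = monitor_collect(serial_in)   # two transfers instead of sixteen
serial_out = driver_replay(frames)
```

Here sixteen bit-level events become two frame-level transfers across the link, and the receiving side regenerates the serial activity locally.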

So, I don't believe tools can help here. It is all about friendly SOC architectures, manual partitions and communication hooks. And the benefit is the capability to simulate larger designs in less time using the same computer network.

Regards,
-Alex
cabriggs
Senior
Joined: Jan 12, 2004
Posts: 96
Location: Massachusetts

Posted: Wed Feb 23, 2011 4:22 pm

Ed Arthur of Cisco talked about this at a DV Club meeting a few years ago: http://www.dvclub.org/images/Presentations/Arthur_Q207.pdf

They built a custom layer on top of MPI and the result is somewhat similar to Avery's SimCluster. This shows you one way to do it yourself.
alexg
Senior
Joined: Jan 07, 2004
Posts: 586
Location: Ottawa

Posted: Wed Feb 23, 2011 6:26 pm

Thank you. It's an interesting presentation.

-Alex.
jmcneal
Senior
Joined: Jan 12, 2004
Posts: 34
Location: Hillsboro, Oregon

Posted: Tue Mar 01, 2011 2:56 pm

Alex -

As has been pointed out, your NFS solution would be vastly slower than the supported options already listed.

Many years ago I worked at Avery Design and installed SimCluster at several customer sites. You do not have to go with automatic partitioning, but that often makes partitioning much easier, especially for flat gate level designs. If you already have several blocks that consume roughly the same compute resources, and communication between them is slow or infrequent, you could see some significant speed up.

I've used VCS's parallel simulation recently as well (Currently working at Synopsys).

You get the best bang for buck when you have a process that is really large, or simulations that are really slow. These tools aren't a way to take a 2-hr simulation and make it run in 30 mins, but more of a way to take a 6 day simulation and run it in 12 hrs. Several times at Avery we'd get an N-1 (N= # of processors) speedup for very large/very slow simulations.

Quote:

So, I don't believe tools can help here. It is all about friendly SOC architectures, manual partitions and communication hooks. And the benefit is - capability to simulate larger designs with less time using the same computer network.

You're right about friendly architectures making the partitioning easier. But given a waveform of the SOC running, an auto-partitioner can figure out where the best places to break the design are, since it can analyze which interfaces have lower data rates, etc. So by adding your serial-to-parallel-to-serial block, the auto-partitioner can identify that interface as a good candidate for partitioning.
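
As a toy illustration of "analyze which interfaces have lower data rates", the sketch below ranks interfaces by how often their sampled value changes. It is not how SimCluster or any other tool works. The function name, the sampled-trace format and the toggle-rate metric are all assumptions chosen to keep the example small.

```python
def rank_cut_candidates(traces):
    """Rank interfaces by toggle rate, least active first.

    traces maps an interface name to a list of sampled values, one per
    clock cycle. A low toggle rate suggests less traffic across a cut.
    """
    rates = {}
    for name, samples in traces.items():
        toggles = sum(
            1 for prev, cur in zip(samples, samples[1:]) if prev != cur
        )
        rates[name] = toggles / max(len(samples) - 1, 1)
    return sorted(rates.items(), key=lambda kv: kv[1])


# Hypothetical sampled traces for three interfaces.
traces = {
    "cpu_bus": [0, 1, 0, 1, 0, 1],   # changes every cycle
    "frame_if": [0, 0, 0, 1, 1, 1],  # changes once
    "irq": [0, 0, 0, 0, 0, 0],       # never changes
}
ranking = rank_cut_candidates(traces)
best_cut = ranking[0][0]
```

In this toy data the frame-level interface ranks far below the cycle-by-cycle bus, which is the effect the serial-to-parallel block is meant to produce.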

-jeff
All times are GMT - 5 Hours
Page 1 of 2

 
Verification Guild © 2006 Janick Bergeron