Ticket #708 (new Bug)

Opened 3 years ago

Last modified 3 years ago

crash when generating pdf

Reported by: billy.chan@… Owned by:
Priority: Critical Milestone:
Component: Version: 2.2
Keywords: Cc:

Description

[jira2trac import : issue created on May 8, 2009 3:44:29 AM CEST http://issues.cocoondev.org/browse/DSY-708 ]

When generating a PDF, the Tomcat web application crashes: it stops responding entirely (CPU usage is 0%). From then on, any request to Daisy just waits for a response until I reload the application or restart Tomcat. (Note: the other application in the same Tomcat is still running and working fine.)

I have checked the error log; there is no entry for this error. I lowered the log level to DEBUG and then got the following messages, repeated:
[DEBUG] org.apache.activemq.transport.InactivityMonitor(79) - No message sent since last write check, sending a KeepAliveInfo
[DEBUG] org.apache.activemq.transport.InactivityMonitor(103) - Message received since last read check, resetting flag:
[DEBUG] org.apache.activemq.transport.InactivityMonitor(79) - No message sent since last write check, sending a KeepAliveInfo
[DEBUG] org.apache.activemq.transport.InactivityMonitor(79) - No message sent since last write check, sending a KeepAliveInfo
[DEBUG] org.apache.activemq.transport.InactivityMonitor(103) - Message received since last read check, resetting flag:
[DEBUG] org.apache.catalina.session.ManagerBase(677) - Start expire sessions StandardManager at 1240454605740 sessioncount 2
[DEBUG] org.apache.catalina.session.ManagerBase(685) - End expire sessions StandardManager processingTime 0 expired sessions: 0

I have done some testing on this issue:
Case 1: The document contains 60 images (each one references a different image). The error occurs.
Case 2: Remove ANY one image, whether the smallest or the largest (i.e. 59 images). Runs fine.
Case 3: Still 59 images, but with some of them replaced by a huge image (in my test case a 2.3 MB BMP). Runs fine.
Case 4: Some images replaced by a small one (in my case a 1 KB JPEG), but the number of images is 60. The error occurs.

Please help.

Attachments

10951_run_ok.log (24.6 KB) - added by paul 3 years ago.
run_ok.log
10952_runcrash.log (111.2 KB) - added by paul 3 years ago.
runcrash.log

Change History

Changed 3 years ago by paul

runcrash.log

comment:1 Changed 3 years ago by paul

[jira2trac import : comment created by karel on May 8, 2009 8:15:06 AM CEST]

The repeated activemq messages don't seem related.
In what logfile are you looking? If not cocoon.log, can you check there as well? Can you include the (relevant parts of the) repository logs as well?

comment:2 Changed 3 years ago by paul

[jira2trac import : comment created by billyhokin on May 8, 2009 9:19:50 AM CEST]

I changed the log level in {applicationRoot}/classes/log4j.properties to DEBUG.
All other log levels are at their defaults.

On the Tomcat side:
{daisyDataRoot}/logs/cocoon.log: no entries
{tomcatRoot}/logs/catalina.out: in debug mode, only the ActiveMQ messages are logged.

On the repository side:
{repositoryDataRoot}/logs/daisy: no entries
{repositoryDataRoot}/logs/daisy-request-errors: no entries
{repositoryDataRoot}/logs/daisy-repository-server-service.log: no entries

What else can I do to help you gather data?

comment:3 Changed 3 years ago by paul

[jira2trac import : comment created by karel on May 8, 2009 9:56:30 AM CEST]

That's all I can think of when it comes to logfiles.

The fact that it works with (some) large images and not with 60 smallish images
makes me wonder if there is a specific image that it doesn't work for.
Can you try nailing down which image(s) cause the problem?

Another trail to investigate would be to have a look at the stacktraces.
On Linux you get them by sending the QUIT signal (kill -QUIT <pid>); you can find the pid using ps (or jps).
On Windows I believe pressing Ctrl-Break in the command window should do it (this requires running Daisy from the command line, i.e. not as a service).
I have a gut feeling that we need to focus on the wiki, but if possible, please include stacktraces from repository and wiki.
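
For reference, a JVM can also produce the same kind of per-thread stack listing from inside the process. The following is a minimal sketch using the standard java.lang.management API; the class name ThreadDumpSketch is made up for illustration and is not part of Daisy. It only helps if it can be triggered inside the hanging JVM, so kill -QUIT remains the simpler route here.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

/** Hypothetical helper, not part of Daisy: lists every live thread with its stack. */
final class ThreadDumpSketch {

    static String dumpAllThreads() {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        StringBuilder out = new StringBuilder();
        // false, false: skip locked-monitor and synchronizer details,
        // which not every JVM supports.
        for (ThreadInfo info : bean.dumpAllThreads(false, false)) {
            out.append('"').append(info.getThreadName()).append("\" ")
               .append(info.getThreadState()).append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                out.append("    at ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(dumpAllThreads());
    }
}
```

In a dump taken during a lockup, the interesting threads are the ones whose state is WAITING or BLOCKED and whose top frames show what they are waiting for.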

comment:4 Changed 3 years ago by paul

[jira2trac import : comment created by billyhokin on May 8, 2009 11:07:57 AM CEST]

First of all, thank you for your quick response!

When I reduced the document to 59 images, I tried several times, removing a different image each time. So I think the problem is not caused by one specific image.

I ran kill -QUIT <pid> twice:
first while the application was running fine (run_ok.log),
and again after the application crashed (runcrash.log).

During the kill -QUIT on Tomcat, the repository did not log anything about an error.

Thanks.

comment:5 Changed 3 years ago by paul

[jira2trac import : comment created by karel on May 8, 2009 1:51:35 PM CEST]

From runcrash.log I gather that the wiki is hanging because its (HTTP) connection pool runs out. At least one of the threads is (and it looks like most relevant threads are) waiting for HTTP connections to become available.
Increasing the connection pool size could be a workaround (need to investigate how) but it's probably better to find the cause.

There may be various reasons for the connection pool running out.

  • Are the repository and the wiki on the same server?
  • Could there be a firewall or a choke blocking the connections?
  • Can you give an estimate of the number of documents in your book definition? (just an order of magnitude to have an idea)

More information gathering:

  • Can you do a kill -QUIT <pid> on the repository process as well? (After a lockup)

Thanks to you too for reporting and being quick on the responses as well btw :)

comment:6 Changed 3 years ago by paul

[jira2trac import : comment created by karel on May 8, 2009 1:52:39 PM CEST]

Oh and one more thing to try: can you run the wiki without Tomcat? (i.e. using Jetty)

comment:7 Changed 3 years ago by paul

[jira2trac import : comment created by bruno on May 8, 2009 2:17:18 PM CEST]

I didn't look at the logs, but this sounds like this is about the connection pool for connections between the remote repository client and the repository server, which happens to have a default limit of 60, configurable in the daisy.xconf:

<maxHttpConnections>60</maxHttpConnections>

So it sounds like FOP (the PDF engine) is not closing connections after loading images (or is loading all images concurrently, but that seems less likely).
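
To illustrate why a limit of 60 would produce exactly the 59-works/60-hangs pattern from the report, here is a toy model of a bounded pool whose connections are never returned. It is only a sketch under that assumption: it uses a plain java.util.concurrent.Semaphore, not Daisy's or FOP's actual code, and the names PoolExhaustionSketch and nextRequestSucceeds are invented for the example. A short timeout stands in for the indefinite wait seen in the real hang.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

/** Toy model of a bounded connection pool; not Daisy or FOP code. */
final class PoolExhaustionSketch {

    /**
     * Simulates loading imageCount images through a pool of poolSize connections,
     * then tries to obtain one more connection for the next request.
     * Returns true if that next request gets a connection within the timeout.
     */
    static boolean nextRequestSucceeds(int poolSize, int imageCount, boolean releaseAfterLoad)
            throws InterruptedException {
        Semaphore pool = new Semaphore(poolSize);
        for (int i = 0; i < imageCount; i++) {
            // Each image load borrows one connection; give up quickly if none is free.
            if (!pool.tryAcquire(50, TimeUnit.MILLISECONDS)) {
                return false; // pool ran dry while still loading images
            }
            if (releaseAfterLoad) {
                pool.release(); // a well-behaved client returns the connection
            }
            // A leaky client never calls release(), so the permit is lost.
        }
        // Any later request (for example the next page view) needs one more connection.
        boolean acquired = pool.tryAcquire(50, TimeUnit.MILLISECONDS);
        if (acquired) {
            pool.release();
        }
        return acquired;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("59 images, leaky:   " + nextRequestSucceeds(60, 59, false));
        System.out.println("60 images, leaky:   " + nextRequestSucceeds(60, 60, false));
        System.out.println("60 images, closing: " + nextRequestSucceeds(60, 60, true));
    }
}
```

With a pool of 60, the leaky run with 59 images still leaves one connection free, while the leaky run with 60 images leaves none, so every later request waits. In a real client without a timeout that wait never ends, which would match a hang at 0% CPU.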

comment:8 Changed 3 years ago by paul

[jira2trac import : comment created by karel on May 8, 2009 3:45:44 PM CEST]

I can't reproduce this btw:

  • Using daisy 2.2 (as reported)
  • Using about 150 images (they're identical but different documents)
  • repository and wiki are on the same physical machine
  • with and without tomcat

Anyway, FOP went from 0.94 to 0.95beta since daisy 2.2 and at least some of the relevant classes in FOP were changed.
Can you test if this still occurs in daisy 2.3-RC?

comment:9 Changed 3 years ago by paul

[jira2trac import : comment created by billyhokin on May 11, 2009 3:28:51 AM CEST]

Thank you for your help in fixing my problem!

Thanks Bruno, now I know where the magic number 60 comes from.

Thanks Karel, for all your help in debugging this issue.
