Dogtag by itself can handle far more certificates than this, by several orders of magnitude, but it is a matter of sizing and tuning in various places.
There are several possible actions: revoke the certs that are no longer needed, possibly delete carefully selected LDAP entries by hand, or keep everything and try tuning one knob at a time to see if performance improves.
None of this will remove the error "getEntries: exception java.lang.ClassCastException"; that still needs a review of the LDAP server errors log file, and possibly the CA's /var/log/pki/pki-tomcat/ca/system
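
To line up the two logs by time, something like the following Python sketch may help. It is only an illustration: the helper names parse_stamp and lines_near are made up here, it assumes both logs start each line with a "[17/Nov/2017:10:16:20" style timestamp, that both are in the same time zone, and that month names are in English. Anything after the seconds (fractions, UTC offset) is ignored.

```python
import re
from datetime import datetime, timedelta

# Assumed common prefix of both logs: "[17/Nov/2017:10:16:20"
STAMP = re.compile(r"\[(\d{2}/[A-Za-z]{3}/\d{4}:\d{2}:\d{2}:\d{2})")

def parse_stamp(line):
    """Return the leading timestamp of a log line as a datetime, or None."""
    m = STAMP.match(line)
    if not m:
        return None
    return datetime.strptime(m.group(1), "%d/%b/%Y:%H:%M:%S")

def lines_near(lines, when, window_seconds=5):
    """Keep the lines whose timestamp is within window_seconds of when."""
    window = timedelta(seconds=window_seconds)
    hits = []
    for line in lines:
        ts = parse_stamp(line)
        if ts is not None and abs(ts - when) <= window:
            hits.append(line)
    return hits

# CA debug log entry quoted in this thread
ca_line = ("[17/Nov/2017:10:16:20][http-bio-8080-exec-2]: getEntries: "
           "exception java.lang.ClassCastException")
when = parse_stamp(ca_line)

# On the server, feed it the LDAP errors log, for example:
#   with open("/var/log/dirsrv/slapd-EXAMPLE-COM/errors") as f:
#       for hit in lines_near(f, when):
#           print(hit, end="")
```

The 5 second window is an arbitrary starting point; widen it if nothing shows up.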

In the provided CA debug log, the embedded CA pages through the LDAP server 2000 entries at a time. That matches the 41 occurrences of "DBVirtualList: entries: 2000", which is what you would expect for 79958 + 1229 = 81187 cert entries (81187 / 2000 rounds up to 41).
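
As a quick check of that arithmetic, here is a minimal Python sketch. The function name pages_needed is only illustrative, and it assumes one "DBVirtualList: entries: 2000" log line per page fetched.

```python
import math

def pages_needed(total_entries, page_size):
    """Number of fixed-size pages required to walk total_entries records."""
    return math.ceil(total_entries / page_size)

total = 79958 + 1229               # cert entries, from the counts above
print(total)                       # 81187
print(pages_needed(total, 2000))   # 41
```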

In the file
/etc/pki/pki-tomcat/ca/CS.cfg
check the parameter
ca.crl.pageSize
Is it set to 2000?
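
If it is easier than opening the file by hand, a small Python sketch along these lines can pull the value out. The helper name read_cs_cfg_value is made up for this example, and it assumes CS.cfg is plain "name=value" text, one setting per line; the sample string below is invented for illustration and is not taken from a real CS.cfg.

```python
def read_cs_cfg_value(text, key):
    """Return the value for key in 'name=value' text, or None if absent."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if sep and name.strip() == key:
            return value.strip()
    return None

# Invented sample; on the CA host read the real file instead:
#   with open("/etc/pki/pki-tomcat/ca/CS.cfg") as f:
#       text = f.read()
sample = "some.other.setting=1\nca.crl.pageSize=2000\n"
print(read_cs_cfg_value(sample, "ca.crl.pageSize"))
```

A return value of None would mean the parameter is not present in the text that was read.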

(Jack, is this the same ca.crl.pageSize that is used in ./base/server/cmscore/src/com/netscape/cmscore/dbs/DBVirtualList.java?)

Thanks,
M.

On Fri, Nov 17, 2017 at 12:58 PM, Mike Johnson <m.d.johnson@kuub.org> wrote:
I'm working on doing this.  While doing so, I note I have two hosts,
seemingly with 40,000 certificates each.  I'm assuming this is not
great for performance.

On 17 November 2017 at 18:56, Marc Sauton <msauton@redhat.com> wrote:
> Yes, try to match the timestamps. For example, match the pki-tomcat CA debug
> log entry
> [17/Nov/2017:10:16:20][http-bio-8080-exec-2]: getEntries: exception
> java.lang.ClassCastException
> to the LDAP server errors log file, at a path similar to
> /var/log/dirsrv/slapd-EXAMPLE-COM/errors
> Thanks,
> M.
>
> On Fri, Nov 17, 2017 at 10:32 AM, John Magne <jmagne@redhat.com> wrote:
>>
>> Hello:
>>
>> After taking a quick look at your logs, it appears there is some issue
>> with dogtag simply
>> reading records from the ldap db at a pretty low level. Is there a chance
>> the db became
>> corrupted at some point or something? Or was this a brand new install of
>> the dogtag server?
>>
>> Sorry I could not be more help, but these symptoms are something I have not
>> seen before.
>>
>>
>>
>> ----- Original Message -----
>> > From: "Mike Johnson" <m.d.johnson@kuub.org>
>> > To: pki-users@redhat.com
>> > Sent: Friday, November 17, 2017 3:15:23 AM
>> > Subject: [Pki-users] Slowness and java.lang.ClassCastException
>> >
>> > Hi.  I am running a DogTag server as part of a FreeIPA install.  I
>> > have an issue which has persisted following some directory replication
>> > issues, now resolved.  I had initially put the slowness down to the
>> > replication issues but now I find errors in the PKI logs and the
>> > slowness has persisted.
>> >
>> > Though the services are working as intended, it's very slow and in
>> > particular the API is sitting at 100% load (user) while performing
>> > operations.
>> >
>> > Extracts from the logs are pasted at
>> > https://paste.fedoraproject.org/paste/7pz2eBm6KzItXZFFoYpldQ
>> >
>> > I'd be very grateful for any guidance as to how to investigate further.
>> > Thanks!
>> >
>> > Mike
>> >
>> > Name        : pki-base
>> > Version     : 10.4.1
>> > Release     : 13.el7_4
>> > Kernel: 3.10.0-693.5.2.el7.x86_64
>> > CentOS 7.4.1708
>> > FreeIPA v4.5, Domain Level 1.
>> >
>> > _______________________________________________
>> > Pki-users mailing list
>> > Pki-users@redhat.com
>> > https://www.redhat.com/mailman/listinfo/pki-users
>> >
>>
>
>