Removing the excess certificate entries seems to have resolved the
slowness, and I'm not seeing the ClassCastException messages anymore.
I've carried out a few operations successfully.
The entries I removed were ones like
'cn=81178,ou=certificateRepository,ou=ca,o=ipaca'. To find them, I
listed certs with a subject matching CN=fqdn,* and grepped out the DNs.
I should note that the machines were de-enrolled first, which I suspect
reduces the risk of inconsistency?
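For anyone repeating the list-and-grep step, here is a rough sketch of the filtering in Python. It assumes the cert entries have already been exported as plain LDIF text and that the subject is stored in an attribute called subjectName; both the attribute name and the sample host names are assumptions for illustration, not something confirmed in this thread. It does not handle folded or base64-encoded LDIF lines, and it only lists DNs; it deletes nothing.

```python
import re


def dns_for_subject_cn(ldif_text, fqdn, subject_attr="subjectName"):
    """Return the DNs of LDIF entries whose subject starts with CN=<fqdn>.

    subject_attr is an assumed attribute name; adjust it to match the export.
    """
    wanted = "cn=" + fqdn.lower()
    matches = []
    # LDIF separates entries with one or more blank lines.
    for block in re.split(r"\n\s*\n", ldif_text.strip()):
        dn = None
        subject = None
        for line in block.splitlines():
            key, sep, value = line.partition(": ")
            if not sep:
                continue  # skip lines that are not simple "attr: value" pairs
            if key.lower() == "dn":
                dn = value.strip()
            elif key.lower() == subject_attr.lower():
                subject = value.strip()
        if dn is None or subject is None:
            continue
        # Compare only the first RDN of the subject, case-insensitively.
        first_rdn = subject.split(",", 1)[0].strip().lower()
        if first_rdn == wanted:
            matches.append(dn)
    return matches


# Hypothetical sample data, shaped like the entries described above.
SAMPLE_LDIF = """\
dn: cn=81178,ou=certificateRepository,ou=ca,o=ipaca
subjectName: CN=host1.example.com,O=EXAMPLE.COM

dn: cn=81179,ou=certificateRepository,ou=ca,o=ipaca
subjectName: CN=host2.example.com,O=EXAMPLE.COM
"""

for dn in dns_for_subject_cn(SAMPLE_LDIF, "host1.example.com"):
    print(dn)
```

Reviewing the resulting DN list by hand before deleting anything, and taking a backup first, still seems like the sensible order of operations.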
I suspect that although Dogtag can handle many entries, the FreeIPA
web interface or API cannot.
Will monitor over the next 24 hours and roll back if necessary. Many
thanks for your helpful advice.
Mike
On 17 November 2017 at 23:36, Mike Johnson <m.d.johnson(a)kuub.org> wrote:
The ca.crl.pageSize entry is 100, which is unchanged from the default
set up by the IPA installer.
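A minimal sketch of how that check could be scripted, assuming CS.cfg is a flat file of key=value lines. The helper names and the sample text are made up for illustration; on a real system the text would come from /etc/pki/pki-tomcat/ca/CS.cfg. The second helper only reproduces the page arithmetic from the quoted debug log (2000 entries per page over 79958 + 1229 entries). Whether ca.crl.pageSize is the setting that drives those pages is still an open question in this thread.

```python
def read_cfg_value(cfg_text, key):
    """Return the value for key in key=value config text, or None if absent."""
    for line in cfg_text.splitlines():
        line = line.strip()
        # Skipping blank and '#' lines is a precaution, not a documented rule.
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if sep and name.strip() == key:
            return value.strip()
    return None


def pages_needed(total_entries, page_size):
    """Number of pages required to walk total_entries at page_size per page."""
    return (total_entries + page_size - 1) // page_size


# Hypothetical excerpt; a real CS.cfg contains many more settings.
SAMPLE_CFG = """\
ca.crl.pageSize=100
"""

page_size = read_cfg_value(SAMPLE_CFG, "ca.crl.pageSize")
print("ca.crl.pageSize =", page_size)

# Entry counts taken from the quoted debug log analysis.
total = 79958 + 1229
print("pages at 2000 per page:", pages_needed(total, 2000))
```

With the figures from the debug log, the page count comes out at 41, which agrees with the number of "DBVirtualList: entries: 2000" lines reported there.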
I think what I'm going to do is list the entries to delete, grab a
backup and VMware snapshot, then try the revoke/delete approach.
Worst case is I have to roll it back, and it's a small environment so
a few minutes offline to give that a go is unlikely to hurt. It's
only used 9-5 and our product uses other AAA/cert management tools to
do its stuff.
Thanks for the helpful responses!
Mike
On 17 November 2017 at 23:19, Marc Sauton <msauton(a)redhat.com> wrote:
> Dogtag by itself can handle much more, by several orders of magnitude, but
> it is a matter of sizing and tuning in various places.
> There may be several possible actions: revoke the unneeded certs, possibly
> manually delete carefully selected LDAP entries, or keep them all and try
> tuning one knob to see if performance is better.
> This will not remove the error "getEntries: exception
> java.lang.ClassCastException"; that still needs a review of the LDAP server
> errors log file, and possibly the CA's /var/log/pki/pki-tomcat/ca/system
>
> In the provided CA debug log, the embedded CA goes through 2000 entries at a
> time in the LDAP server; this matches the 41 occurrences of
> "DBVirtualList: entries: 2000", for 79958 + 1229 = 81187 cert entries.
>
> In the file
> /etc/pki/pki-tomcat/ca/CS.cfg
> check the parameter
> ca.crl.pageSize
> Is it set to 2000?
>
> ( Jack, is it the same ca.crl.pageSize used in
> ./base/server/cmscore/src/com/netscape/cmscore/dbs/DBVirtualList.java ? )
>
> Thanks,
> M.
>
> On Fri, Nov 17, 2017 at 12:58 PM, Mike Johnson <m.d.johnson(a)kuub.org> wrote:
>>
>> I'm working on doing this. While doing so, I note I have two hosts,
>> seemingly with 40,000 certificates each. I'm assuming this is not
>> great for performance.
>>
>> On 17 November 2017 at 18:56, Marc Sauton <msauton(a)redhat.com> wrote:
>> > Yes, try to match the timestamps, for example the pki-tomcat CA debug
>> > log entry
>> > [17/Nov/2017:10:16:20][http-bio-8080-exec-2]: getEntries: exception
>> > java.lang.ClassCastException
>> > to the LDAP server errors log file, at a path similar to
>> > /var/log/dirsrv/slapd-EXAMPLE-COM/errors
>> > Thanks,
>> > M.
>> >
>> > On Fri, Nov 17, 2017 at 10:32 AM, John Magne <jmagne(a)redhat.com> wrote:
>> >>
>> >> Hello:
>> >>
>> >> After taking a quick look at your logs, it appears there is some issue
>> >> with dogtag simply reading records from the ldap db at a pretty low
>> >> level. Is there a chance the db became corrupted at some point or
>> >> something? Or was this a brand new install of the dogtag server?
>> >>
>> >> Sorry could not be more help but these symptoms are something I have
>> >> not seen before.
>> >>
>> >>
>> >>
>> >> ----- Original Message -----
>> >> > From: "Mike Johnson" <m.d.johnson(a)kuub.org>
>> >> > To: pki-users(a)redhat.com
>> >> > Sent: Friday, November 17, 2017 3:15:23 AM
>> >> > Subject: [Pki-users] Slowness and java.lang.ClassCastException
>> >> >
>> >> > Hi. I am running a DogTag server as part of a FreeIPA install. I
>> >> > have an issue which has persisted following some directory
>> >> > replication issues, now resolved. I had initially put the slowness
>> >> > down to the replication issues but now I find errors in the PKI
>> >> > logs and the slowness has persisted.
>> >> >
>> >> > Though the services are working as intended, it's very slow and
>> >> > in particular the API is sitting at 100% load (user) while
>> >> > performing operations.
>> >> >
>> >> > Extracts from the logs are pasted at
>> >> > https://paste.fedoraproject.org/paste/7pz2eBm6KzItXZFFoYpldQ
>> >> >
>> >> > I'd be very grateful for any guidance as to how to investigate
>> >> > further.
>> >> > Thanks!
>> >> >
>> >> > Mike
>> >> >
>> >> > Name : pki-base
>> >> > Version : 10.4.1
>> >> > Release : 13.el7_4
>> >> > Kernel: 3.10.0-693.5.2.el7.x86_64
>> >> > CentOS 7.4.1708
>> >> > FreeIPA v4.5, Domain Level 1.
>> >> >
>> >> > _______________________________________________
>> >> > Pki-users mailing list
>> >> > Pki-users(a)redhat.com
>> >> > https://www.redhat.com/mailman/listinfo/pki-users
>> >> >
>> >>
>> >
>> >
>
>