Known Issues in GemFire 8.0.0

Last updated: August 26, 2014

Id Bugnote title Bugnote description Workaround
#51204 Functions are not registered if they implement Declarable When a JAR file is deployed using the gfsh "deploy" command, any non-abstract Functions in the JAR file should automatically be registered with the Function Manager. However, if a Function also implements Declarable, registration does not occur. Do not deploy JAR files with Functions that also implement Declarable.
#51201 The 'start server' command's --spring-xml-location option prevents SDG from fully and properly configuring a GemFire Server data node instance with Spring config. A bug introduced in the GemFire 8.0 release prevents a Spring context configuration file from properly configuring and bootstrapping a GemFire Server data node when it is launched through Gfsh using the 'start server' command's new --spring-xml-location option. As a result, a GemFire Server cannot be fully and properly configured using Spring config, because Spring Data GemFire (SDG) first performs a lookup for any existing Cache instance in the JVM. Consequently, any Cache-specific configuration (e.g. PDX) or DistributionConfig properties (e.g. log-level, ports, etc.) specified in Spring config are effectively ignored. For instance, in the following Spring config...
<beans ...>
  <context:property-placeholder location="classpath:server.properties"/>
  <!--
  <util:properties id="gemfireCacheConfigurationSettings"
  location="classpath:gemfire.properties"/>
  -->
  <util:properties id="gemfireProperties">
    <prop key="name">SpringGemFirePeerCacheWithFunctions</prop>
    <prop key="mcast-port">0</prop>
    <prop key="log-file">./SpringGemFirePeerCacheWithFunctions.log</prop>
    <prop key="log-level">config</prop>
    <prop key="jmx-manager">true</prop>
    <prop key="jmx-manager-http-port">9090</prop>
    <prop key="jmx-manager-port">1199</prop>
    <prop key="jmx-manager-start">true</prop>
    <prop key="groups">testGroup</prop>
    <!--
    <prop key="locators">localhost[11235]</prop>
    <prop key="start-locator">localhost[11235]</prop>
    -->
  </util:properties>

  <gfe:cache properties-ref="gemfireProperties" pdx-serializer-ref=".."
   pdx-persistent="true" pdx-read-serialized="true"
   pdx-ignore-unread-fields="false"/>

  <gfe:cache-server auto-startup="true" bind-address="${server.bind.address}"
   host-name-for-clients="${server.hostname.for.clients}"
   port="${server.port}" max-connections="${server.max.connections}"/>

  <gfe:replicated-region id="AppData" persistent="false"/>

  <gfe:annotation-driven/>

  <bean class="org.spring.data.gemfire.cache.execute.RegionFunctions"/>

</beans>
The gemfireProperties bean, which specifies the GemFire DistributionConfig properties passed to the Cache instance when the distributed system (DS) is created, will be ignored, as will any Cache-specific attribute settings, such as the PDX attributes above. This can cause surprising behavior on application deployment, since SDG will find the unconfigured, prematurely created Cache instance (which is by design). However, the 'AppData' Region will still be created, and the GemFire Functions defined in the org.spring.data.gemfire.cache.execute.RegionFunctions class in this example will still be registered. In addition, an actual "Cache Server" will be started on the configured port as long as the system port is available for use. Most things beyond the basic Cache configuration (attributes and properties) should still work.
The only workaround in 8.0.0 is to augment the Spring config with a cache.xml file, for example to configure GemFire Distributed System properties or PDX. In the Spring config:
<gfe:cache cache-xml-location="/path/to/cache.xml" ..>
Any attribute on the <gfe:cache> SDG XML namespace element must instead be specified in cache.xml, and any of GemFire's DistributionConfig properties must be specified as GemFire Java System properties. The only other option is to start Spring-configured GemFire Server data nodes externally (not with Gfsh's 'start server' command and its --spring-xml-location option) using a simple Java class with a main method like so...
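import org.springframework.context.support.ClassPathXmlApplicationContext;

public class SpringGemFirePeerCacheApp {
  public static void main(final String... args) {
    // Bootstrapping the Spring context configures and starts the GemFire peer cache.
    new ClassPathXmlApplicationContext(getSpringXmlConfigurationFile(args));
  }
  // Hypothetical helper (not defined in the original): reads the config location from the first argument.
  private static String getSpringXmlConfigurationFile(final String... args) { return args[0]; }
}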
You can be as sophisticated or simple as you like with your "launcher" class.
#51083 Querying non-primitive fields in PDX serialized objects returns wrong results If a query is executed on PDX serialized objects and the where clause contains a comparison involving non-primitive fields (nonPrimitiveFieldObject = objToBeCompared), the query returns incorrect results. Use the equals method instead of the = operator.
#51120 Locator fails to start properly if ssl-enabled is set to true Locator fails to start properly if the GemFire property ssl-enabled is set to true and jmx-manager-ssl is also set to true. Note that the "ssl-enabled" property has been deprecated in favor of the "cluster-ssl-enabled" property in 8.0. Use "jmx-manager-ssl-enabled" instead of "jmx-manager-ssl" and replace "ssl-enabled" with "cluster-ssl-enabled".
#51111 GemFire redirects Tomcat/application log When GemFire is embedded in a container such as Tomcat, it will redirect all JDK logging output to the configured gemfire log. Thus, output which would typically go to the 'catalina' log, will not appear there anymore but it will appear in the gemfire log. Configure Tomcat to use log4j as the logging framework.
#51103 SerializationException: Could not create an instance of com.gemstone.gemfire.internal.cache.tier.sockets.HAEventWrapper Product logs may show an exception string that reads, "SerializationException: Could not create an instance of com.gemstone.gemfire.internal.cache.tier.sockets.HAEventWrapper" The exception is harmless and can be ignored.
#51078 Backup on multi-host Windows platforms fails Due to a race condition while creating directories, a multi-host backup on Windows platforms may fail with "IOException: Could not create directory". Use a directory that is local to all host machines in the system. See the Pivotal GemFire User's Guide.
#51034 Due to host mapping issues, destroy region command fails validation due to empty response Depending on the configuration of /etc/hosts, a user may encounter this issue, which looks very similar to #46580 and #47645. These issues occur when the host-IP mappings in the /etc/hosts file are missing or incorrect. JMX federation was failing due to #47645, which was resolved by removing the host name from the unique identifier. A similar fix needs to be worked out here for determining the members that host a particular region. This issue will most likely go away with a proper host-IP mapping. Specify correct host-IP mappings in /etc/hosts.
#51024 Spurious warning: Message deserialization of <MessageType> ... did not read <XXX> bytes You may see the following warning in the gemfire log: Message deserialization of <MessageType> ... did not read <XXX> bytes. Some messages do not read all their data when they detect some other condition that causes them to stop early. This warning can be ignored.
#51020 If the field object implements Struct then the OQL query may not get a result if indexes are enabled. If the object being put in the region has a field that implements com.gemstone.gemfire.cache.query.Struct and a query is executed with the field as a projection attribute, the query does not return any result. This happens only if an index is used by the query. Do not implement the com.gemstone.gemfire.cache.query.Struct interface. This interface is used only to iterate over query results.
#50931 GFSH does not support semi-colon (;) in parameter values GFSH does not support semi-colons within command arguments. Do not use semi-colons when specifying the classpath option on Windows. Instead, create a manifest-only JAR file with a Class-Path attribute listing the JARs (dependencies) the application requires, and specify that single JAR as the classpath. This solution may not be applicable in all scenarios. For example, a user may require JARs from different directory locations (which may depend on the deployment environment) and will not be able to list them all in one manifest JAR. In such cases, the user may need to create a new manifest JAR for each deployment.
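As an illustration of the manifest-only JAR workaround for #50931, the following is a minimal sketch that uses only the standard java.util.jar API. The class name ManifestJarBuilder and the JAR paths are hypothetical; Class-Path entries are space-separated and resolved relative to the location of the manifest JAR.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.jar.Attributes;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;

final class ManifestJarBuilder {

    // Builds the bytes of a JAR whose only content is a manifest with a Class-Path attribute.
    static byte[] build(String... classPathEntries) throws IOException {
        Manifest manifest = new Manifest();
        Attributes main = manifest.getMainAttributes();
        // Without Manifest-Version the manifest is written out empty.
        main.put(Attributes.Name.MANIFEST_VERSION, "1.0");
        main.put(Attributes.Name.CLASS_PATH, String.join(" ", classPathEntries));

        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (JarOutputStream jar = new JarOutputStream(bytes, manifest)) {
            // No further entries: the constructor already wrote META-INF/MANIFEST.MF.
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical dependency locations, relative to where the manifest JAR is placed.
        byte[] jar = build("lib/app.jar", "lib/dependency.jar");
        Path target = Files.createTempFile("classpath", ".jar");
        Files.write(target, jar);
        System.out.println("Wrote manifest-only JAR to " + target);
    }
}
```

The resulting single JAR can then be passed as the classpath value, which avoids the semi-colon separator altogether.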
#50920 Fatal error from asynchronous flusher thread when attempting to write an entry with keyId=0 to oplog This error occurs when the region is still initializing (GII from another member) and ongoing operations are retried. It only happens when using the async disk writer. Hold operations until regions are initialized when the regions use the async disk writer.
#50779 Improperly formatted query results in serialization exception on the parser exception If a query executed remotely from a client on a server does not have correct syntax, a SerializationException is returned instead of an error message string. Use correct query syntax.
#50773 Setting socket-lease-time too low can result in members being forced out of the distributed system. If the socket-lease-time gemfire property is set to a small number, it may cause unexpected connectivity problems. For example, it may cause ForcedDisconnectExceptions. Set socket-lease-time to a larger value. A safe minimum has not yet been determined, but problems have been seen when it is set to a value lower than 1000.
#50513 ClassCastException (Class cannot be cast to VersionResponse) occurs when a Locator is configured with SSL and a client (e.g. Gfsh) attempts to connect without SSL. When a Locator is started in Gfsh, configured with SSL, perhaps like so...
gfsh>start locator --name=LocatorWithSSL --port=12480 --log-level=config
--properties-file=./conf/gemfire.properties
--security-properties-file=./conf/gemfire-security.properties
And a client subsequently attempts to connect without SSL, then the following Exception is thrown from GemFire...
[severe 2014/05/13 18:00:30.684 PDT Gfsh Launcher tid=0xb] (msgTID=11 msgSN=106)
java.lang.ClassCastException: java.lang.Class cannot be cast to
com.gemstone.org.jgroups.stack.tcpserver.VersionResponse
java.lang.IllegalStateException: java.lang.ClassCastException: java.lang.Class
cannot be cast to com.gemstone.org.jgroups
.stack.tcpserver.VersionResponse
at com.gemstone.gemfire.management.internal
.JmxManagerLocatorRequest.send(JmxManagerLocatorRequest.java:93)
at com.gemstone.gemfire.management.internal.cli.commands
.ShellCommands.connectToLocator(ShellCommands.java:516)
at com.gemstone.gemfire.management.internal.cli.commands
.ShellCommands.connect(ShellCommands.java:341)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl
.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at com.gemstone.gemfire.management.internal.cli.util.spring
.ReflectionUtils.invokeMethod(ReflectionUtils.java:44)
at com.gemstone.gemfire.management.internal.cli.shell
.GfshExecutionStrategy.execute(GfshExecutionStrategy.java:104)
at org.springframework.shell.core.AbstractShell
.executeCommand(AbstractShell.java:223)
at com.gemstone.gemfire.management.internal.cli.shell.Gfsh
.executeCommand(Gfsh.java:414)
at com.gemstone.gemfire.management.internal.cli.shell.Gfsh
.promptLoop(Gfsh.java:864)
at org.springframework.shell.core.JLineShell.run(JLineShell.java:158)
at java.lang.Thread.run(Thread.java:695)
Caused by: java.lang.ClassCastException: java.lang.Class
cannot be cast to com.gemstone.org.jgroups.stack.tcpserver
.VersionResponse
at com.gemstone.org.jgroups.stack.tcpserver.TcpClient
.requestToServer(TcpClient.java:90)
at com.gemstone.org.jgroups.stack.tcpserver.TcpClient
.requestToServer(TcpClient.java:73)
at com.gemstone.gemfire.management.internal
.JmxManagerLocatorRequest.send(JmxManagerLocatorRequest.java:84)

        ... 13 more
Clients must connect to the Locator using SSL when the Locator was started with SSL configured.
#50322 Unexpected EOF when gunzipping a gfs.gz file VSD currently requires that statistic archives be uncompressed before they can be loaded. However, you may see "unexpected end of file" from the gunzip command when you try to uncompress a gfs.gz archive. This happens if the gfs.gz file was not cleanly closed, which is the case if the server writing it is still running or if the server was killed or crashed. Use "gunzip -c stats.gfs.gz >stats.gfs" to uncompress. You will still see a message about an unexpected EOF, but it can be ignored and you can now load "stats.gfs" into VSD.
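Where gunzip is not available, the same idea in #50322 (keep whatever decompresses and ignore the premature end of file) can be sketched with the JDK's java.util.zip classes. This is an illustrative sketch, not a GemFire utility: the class name TolerantGunzip and its methods are hypothetical, and the truncated archive in main is simulated in memory rather than read from a real stats.gfs.gz file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

final class TolerantGunzip {

    // Decompresses as much as possible; a premature end of the gzip stream is treated as end of data.
    static byte[] gunzipTolerant(InputStream compressed) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        try (GZIPInputStream in = new GZIPInputStream(compressed)) {
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
        } catch (EOFException unexpectedEnd) {
            // Archive was not cleanly closed: keep what was decompressed so far.
        }
        return out.toByteArray();
    }

    // Helper used only to produce sample input for the demonstration.
    static byte[] gzip(byte[] data) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(bytes)) {
            out.write(data);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] original = "sample statistics payload".getBytes(StandardCharsets.UTF_8);
        byte[] archive = gzip(original);
        // Simulate an archive that was cut off before the gzip trailer was fully written.
        byte[] truncated = Arrays.copyOf(archive, archive.length - 4);
        byte[] recovered = gunzipTolerant(new ByteArrayInputStream(truncated));
        System.out.println("recovered " + recovered.length + " of " + original.length + " bytes");
    }
}
```

For a real archive, wrap a FileInputStream on the gfs.gz file and write the returned bytes to the uncompressed file to load into VSD.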
#50065 Data inconsistency in client with concurrent ops (destroy or invalidate + create) and concurrencyChecksEnabled An operation is in progress in a PR server node that is applied to the cache, but before the operation can be distributed to clients, the VM's shutdown hook starts to close the cache. This preempts messaging and keeps the event from reaching clients. The server then restarts and recovers from disk, so it has the entry but any clients having subscription queues in other server nodes do not see the event. It only happens when redundancy=0. Use redundancy=1 to resolve the issue.
#49520 Performance degradation with SSL enabled WAN GatewaySender When using SSL enabled with SerialGatewaySender, the performance degrades to some extent. Either use a cipher which is far less expensive or shift to Parallel WAN which is available from 7.0.
#48141 AsyncEventQueue does not process events with Local regions When AsyncEventQueue is attached to a local region, the events on the region are filtered internally and not processed by the AsyncEventQueue. Use local region with persistence.
#48123 Deploying a new function to GemFire with Declarable interface and no properties fails. When deploying a new Function to GemFire with a Declarable interface and no properties (propertiesList has 0 elements), the deployment fails. Remove the Declarable interface on new functions.
#47790 Event loss in remote site in case GatewayReceiver started before user region is created On remote site, if GatewayReceiver is started before creating the user region, it may cause loss of events on remote site. Create user regions on remote site before starting the GatewayReceivers.
#47733 GatewayReceiver started before creating user region can cause RegionDestroyedException On remote WAN site, if GatewayReceiver is started before creating user region, it can cause RegionDestroyedExceptions. On remote WAN site, create user regions before starting the GatewayReceiver.
#47676 Join queries take a very long time to execute Queries using joins among multiple regions may take a long time to execute. Use indexes on the fields used in the join.
#47390 cacheClientProxyStats:messageQueueSize does not take into account 'most recent events dispatched' to client. Events that have already been dispatched are removed from the queue during subsequent dispatch. So at any given point in time, there will be some events in the queue which are dispatched to the client and acks received for them from the client but are not yet removed from the queue. These are removed during the next dispatch of a subsequent event. Customers may use clientSubscriptionStats: (eventsQueued-eventsDispatched-eventsRemovedByQRM-eventsExpired-eventsConflated) to find out the queue size. eventsConflated will be zero if conflation is off which is the default. eventsRemovedByQRM and eventsExpired will be zero if the server has been primary for this client throughout.
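The queue-size calculation described for #47390 is plain arithmetic on the clientSubscriptionStats counters. A minimal sketch follows; the class and method names are hypothetical and the sample values are made up, so only the formula itself comes from the note above.

```java
final class SubscriptionQueueSize {

    // eventsQueued - eventsDispatched - eventsRemovedByQRM - eventsExpired - eventsConflated
    static long estimate(long eventsQueued, long eventsDispatched, long eventsRemovedByQRM,
                         long eventsExpired, long eventsConflated) {
        return eventsQueued - eventsDispatched - eventsRemovedByQRM - eventsExpired - eventsConflated;
    }

    public static void main(String[] args) {
        // Made-up sample counters; conflation is off, so eventsConflated is 0.
        long size = estimate(1000, 900, 40, 10, 0);
        System.out.println(size); // prints 50
    }
}
```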
#46878 ^Z will kill gfsh and any servers started from that gfsh If you type ^Z from gfsh, it will kill your gfsh process and any locator/server processes that you started from that gfsh process. Note that this will also happen if you are running a shell script that is executing gfsh when you type ^Z. Use ^C instead of ^Z to interrupt a long running gfsh command. This will cause gfsh to quit waiting but leave the child processes running.
#46230 JLine DLL issue for multiple instances of gfsh started simultaneously. JLine uses DLLs on Windows to interact with the operating system to read special keys (such as the arrow keys) which are otherwise inaccessible when using the System.in stream. There is a rare possibility of two instances of gfsh trying to load the same DLL at the same time, which causes the second instance to fail. Restart the second instance of gfsh.
#46112 Moving gfsh to the background is not supported. On Unix systems, jobs can be moved to background (using Ctrl-Z) and then later moved back to the foreground (using fg). This behavior is not supported for gfsh. When you type Ctrl-Z, gfsh exits with a "FATAL Exit". Do not move gfsh jobs to the background.
#45964 Hang doing distributed region destroy during persistent recovery If a running member initiates a distributed destroy of a persistent region using Region.destroyRegion at the same time another member is trying to recover the region from disk, there is a slight chance the distributed destroy and the member recovery will hang. Wait until all members are running before doing a distributed destroy. If this hang is encountered, kill the member that is trying to recover from disk.
#45093 Clients may throw a Null Pointer Exception without a message if the client's server runs out of file descriptors Clients may throw a Null Pointer Exception that has no message if the client's server runs out of file descriptors. The exception may also be reported in the server's log. Increase the file descriptor limit to the appropriate level.
#44710 A region configured with persist-backup="true" and data-policy="persistent-partition" throws IllegalStateException A region configured with persist-backup="true" and data-policy="persistent-partition" throws IllegalStateException. Do not set the deprecated persist-backup attribute.
#44606 Registration of instantiators can cause Gateway deadlock Gateways experience deadlock when trying to register instantiators. Register the instantiators in the hubs prior to creating the cache using the serialization-registration cache xml element. This prevents the InternalInstantiator.sendRegistrationMessageToServers call.
#44558 Gateway.stop() does not cleanup/destroy the region for the Gateway Event Queue Manually stopping a gateway using the API doesn't close the region backing the queue. This causes unnecessary event replication to the JVM containing the stopped gateway. The region is internal, but it can be retrieved and closed manually. The region is named: gatewayHubId + "_" + gatewayId + "_EVENT_QUEUE"
// Build the name of the internal region that backs the gateway's event queue.
String gatewayRegionName = gatewayHubId + "_" + gatewayId + "_EVENT_QUEUE";
Region region = cache.getRegion(gatewayRegionName);
// Close (do not destroy) the region.
region.close();
The region should only be closed, not destroyed, so that any persistent data is not deleted.
#44411 Querying on an enum field always returns an empty result set Querying on enum fields returns an empty result set even when there are qualifying rows. The only workaround available for this issue is to use a bind parameter for the enum field in the query. For example: This query fails: select distinct * from /QueryRegion0 where aDay = Day.Wednesday The query succeeds when the query is rewritten as follows: select distinct * from /QueryRegion0 where aDay = $1 and Day.Wednesday is passed as an execution parameter.
#44410 The load-conditioning-interval property does not work as expected when connecting to explicit endpoints When the load-conditioning-interval property is used with explicit servers instead of with locators, connections are still recycled after 5 minutes. The property works as expected when you are using locators to obtain connections for server communication. Use locators to obtain connections for server communication.
#44404 Partitioned region single hop may fail to direct load balance requests to a newly joined server when optimize-for-write is set to false This problem is caused by stale metadata on the client. The problem occurs when the client is only performing read operations, and a Function has optimize-for-write set to false. Any write operations into the region will fix the problem. Perform a write operation into the region to fix the problem.
#44399 Changing the distributed-system-id can cause PDX failures If the distributed-system-id is changed and a previously used one is re-used, then PdxType conflicts can occur. Do not change the distributed-system-id after it has been set.
#44229 Destroy operation on a region causes offline member to become unusable When some members are offline, a destroy (or local destroy) operation on a persistent region causes the offline member to be unable to start. Start all offline members before destroying a persistent region.
#43904 WAN Gateways started before regions are created can cause updates to be lost If gateways are restarted and connected to remote sites before the local regions are created, then any events received by those gateways will cause exceptions and be dropped. In the case where gateways are defined in the same JVMs as the regions using xml, proper startup order is maintained and this will not happen. In the case where gateways are created and started in JVMs separate from those where regions are created, startup ordering may not be correct. Make sure that gateways are started after regions are created and initialized. In the case where gateways are created and started in JVMs separate from those where the regions are created, they should be manually started after the regions are created. A RegionMembershipListener can be used to facilitate this.
#43866 Cache plugins may fail if read-serialized is true If read-serialized is set to true on your cache and you have plugin classes (for example CacheListener, CacheWriter, CacheLoader), when those plugins are serialized as a PDX, the plugin fails because GemFire sees the plugin as an instance of PdxInstance. This problem only occurs if your plugins are serialized as PDX because you have implemented PdxSerializable or have a PdxSerializer that serializes the plugin class. Note that classes that implement Function are never passed to a PdxSerializer but can still implement PdxSerializable and then fail just like the other plugins. Do not implement PdxSerializable or change your PdxSerializer to serialize that plugin class. Instead, make your plugin class implement DataSerializable. This prevents the plugin from being serialized by a PdxSerializer.
#43849 Attempts to use a writable-working-dir over NFS may result in hangs involving NIO file locking Attempting to use a writable-working-dir over NFS may result in hangs involving NIO file locking. Licensing uses java.nio.channels.FileChannel.lock to lock the license state and events files that are persisted to writable-working-dir. The call to FileChannel lock may hang in the JVM native layer. The stack dump of the hung thread may look similar to the following:
java.lang.Thread.State: RUNNABLE
 at sun.nio.ch.FileChannelImpl.lock0(Native Method)
 at sun.nio.ch.FileChannelImpl.lock(FileChannelImpl.java:832)
 at java.nio.channels.FileChannel.lock(FileChannel.java:860)
 at com.springsource.vfabric.licensing.events.EventManager
 .saveEvents(EventManager.java:61)
- locked <0xe02f5a80> (a java.lang.Object)
at com.springsource.vfabric.licensing.events.EventManager
.saveEvent(EventManager.java:45)
at com.springsource.vfabric.licensing.events.EventManager.<init>
(EventManager.java:37)
at com.springsource.vfabric.licensing.client.LicenseManagerEnvironment
.<init>(LicenseManagerEnvironment.java:61)
at com.springsource.vfabric.licensing.client.LicenseManagerFactory
.getLicenseManager(LicenseManagerFactory.java:80)
at com.gemstone.gemfire.internal.licensing.VFabricLicenseEngine
.getLicenseManager(VFabricLicenseEngine.java:398)
at com.gemstone.gemfire.internal.licensing.VFabricLicenseEngine
.acquireLicense(VFabricLicenseEngine.java:93)
at com.gemstone.gemfire.internal.licensing.CacheLicenseChecker
.acquireLicense(CacheLicenseChecker.java:76)
at com.gemstone.gemfire.internal.licensing.LicenseChecker
.acquireLicense(LicenseChecker.java:251)
- locked <0xe02604c8> (a com.gemstone.gemfire.internal.licensing.ServerLicenseChecker)
at com.gemstone.gemfire.distributed.internal.InternalDistributedSystem
.getLicenseChecker(InternalDistributedSystem.java:635)
- locked <0xdfd56c28> (a java.util.concurrent.atomic.AtomicReference)
at com.gemstone.gemfire.distributed.internal.InternalDistributedSystem
.initialize(InternalDistributedSystem.java:470)
at com.gemstone.gemfire.distributed.internal.InternalDistributedSystem
.newInstance(InternalDistributedSystem.java:223)
at com.gemstone.gemfire.distributed.DistributedSystem
.connect(DistributedSystem.java:932)
Specify a directory on a local drive for writable-working-dir instead of a directory that is accessed through NFS. The property writable-working-dir is specified in gemfire.properties.
#43781 Region put may do multiple serializations of the value A Region put invocation may serialize the value multiple times. If the region being put on has a DataPolicy of EMPTY and it is in a cache server that clients have subscriptions on, then one serialization will be done to push the value to peers of the server and another serialization will be done to push the value to the subscribed clients. If the region is using a disk store and it is not partitioned, then the value may be serialized twice: once to distribute it to peers and once to write it to disk. You could preserialize the value into a byte[] and put the byte[] in the region. But in this case all readers of the cache need to be changed to deserialize the byte[]. See the internal class com.gemstone.gemfire.internal.util.BlobHelper. You can use its static methods serializeToBlob and deserializeBlob.
#43758 Suspended transaction from function execution unusable after primary rebalancing When multiple invocations of a function participate in a single transaction (suspending and resuming transactions for each invocation), a high-availability event may rebalance the primaries, which makes it impossible to target the original transactional node for function execution. Use the system property gemfire.DISABLE_MOVE_PRIMARIES_ON_STARTUP to allow function execution to target the same member.
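To illustrate the preserialization pattern from #43781, here is a minimal sketch that swaps in standard java.io serialization for GemFire's BlobHelper and a plain HashMap for the Region, so that it runs without GemFire on the classpath. The class name PreSerializedValue and its methods are hypothetical. In a real application the writer would put the byte[] into the region, and every reader would deserialize it after the get.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

final class PreSerializedValue {

    // Serializes the value once, up front, so the store only ever handles a byte[].
    static byte[] toBytes(Serializable value) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        return bytes.toByteArray();
    }

    // Every reader must apply the matching deserialization to the stored byte[].
    static Object fromBytes(byte[] blob) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(blob))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for a Region<String, byte[]>; an assumption made for this sketch.
        Map<String, byte[]> region = new HashMap<>();
        ArrayList<String> order = new ArrayList<>(Arrays.asList("item-1", "item-2"));

        region.put("order-42", toBytes(order));            // writer side
        Object restored = fromBytes(region.get("order-42")); // reader side
        System.out.println(order.equals(restored));
    }
}
```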
#43750 Gateway toString erroneously indicates that the Gateway is connected The Gateway toString message indicates that the Gateway is connected to the remote site even when it is not connected. For example: [info 2011/07/26 15:46:57.598 EDT <main> tid=0x1] Started Primary Gateway to LN connected to [LN-1=ln_host_1:6622, LN-2=ln_host_2:6622] To determine whether the Gateway failed to connect to the remote site, look for a warning similar to the following: [warning 2011/07/26 15:46:57.527 <main> tid=0x1] Primary Gateway to LN not connected to [LN-1=ln_host_1:6622, LN-2=ln_host_2:6622]: Could not connect. To determine when the Gateway successfully connects to the remote site, look for a message similar to the following: [info 2011/07/26 16:07:36.187 EDT <Gateway Event Processor from NY to LN> tid=0x154] Primary Gateway to LN connected to [LN-1=ln_host_1:6622, LN-2=ln_host_2:6622]: Using com.gemstone.gemfire.cache.client.internal.pooling.PooledConnection@1bb1849: Connection[ln_host_2:6622] after 81 failed connect attempts
#43713 JRockit may crash with an illegal memory access JRockit may crash with an illegal memory access. The specific version we saw this with during testing was: BEA JRockit(R) R27.6.5-32_o-121899-1.6.0_14-20091001-2107-windows-ia32. The call stack looked like this:
Thread Stack Trace:
at findNext+288()@0xffffffff7ddbc9f4
at findNextToReturn+32()@0xffffffff7ddbca94
at refIterFillFromFrame+248()@0xffffffff7ddbcd2c
at trProcessLocksForThread+52()@0xffffffff7ddcb1c0
at get_all_locks+88()@0xffffffff7dcee638
at javaLockConvertLazyToThin+88()@0xffffffff7dcee730
at RJNI_jrockit_vm_Locks_checkLazyLocked+584()@0xffffffff7dcf01d8
In this case, the following might help:
- Turn off optimizations with the -XnoOpt option, which disables adaptive optimization. While optimized code generally runs faster than code that hasn't been optimized, the time required to optimize code occasionally results in undesirable processing delays; -XnoOpt avoids these delays. The option is also helpful when you suspect that a JVM or application problem, such as a system crash or poor startup performance, might be related to optimization. Turn optimization off and retry your application. If it then runs successfully, you can assume that the problem lies with code optimization. For more information, see the Oracle documentation topic -XnoOpt.
- Upgrade to the latest JRockit version, as most of these problems are fixed by upgrading.
- As a last resort, contact the Oracle WebLogic Support team.
#43673 Using query "select * from /exampleRegion.entrySet" fails in a client-server topology and/or in a PartitionedRegion. Using query "select * from /exampleRegion.entrySet" fails in a client-server topology and/or in a PartitionedRegion. The following exception is thrown:
Exception in thread "main" com.gemstone.gemfire.cache.client.ServerOperationException: com.gemstone.gemfire.SerializationException: failed serializing object
at com.gemstone.gemfire.cache.client.internal.OpExecutorImpl.handleException(OpExecutorImpl.java:530)
...
Caused by: com.gemstone.gemfire.SerializationException: failed serializing object
at com.gemstone.gemfire.internal.cache.tier.sockets.BaseCommand.writeQueryResponseChunk(BaseCommand.java:750)
...
Caused by: java.io.NotSerializableException: com.gemstone.gemfire.internal.cache.LocalRegion$NonTXEntry
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1164)
at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:330)
at com.gemstone.gemfire.internal.InternalDataSerializer.writeSerializableObject(InternalDataSerializer.java:2032)
...
Use "select e.key, e.value from /exampleRegion.entrySet e" and construct the entry object in the application that is using GemFire.
#43607 DynamicRegionFactory with registerInterest on a client may cause dynamic subregions to be lost DynamicRegionFactory with registerInterest on a client may cause dynamic subregions to be lost. If the client loses redundancy registerInterest will destroy any of the dynamic subregions. To avoid this problem set subscription-redundancy to a non-zero value or disable registerInterest on DynamicRegionFactory.
#43545 Cache close on client will wait until all operations in progress have been completed Cache close on a client waits until all operations in progress have completed. This is because operations like putAll take the timeout value as an input parameter and may not close the sockets while operations are in progress. This is a corner case. If the user encounters this, they should ensure that their putAll operations are small or allow for a longer wait time to shut down the client.
#43536 Function API classes must be included in the CLASSPATH The function APIs perform early deserialization during messaging of function results, filters, arguments, and the functions themselves. Therefore, the class for these objects must be included in the JVM's classpath. It is not possible to define your own class loader just before you read a function result or pass the arguments to your code. Add the classes for functions, function arguments, function filters, and function results to the CLASSPATH.
#42452 In case of client server function execution, Execution.execute() becomes a blocking call waiting for ResultCollector to get populated with all results In client-server function execution, Execution.execute() is a blocking call that waits for the ResultCollector to be populated with all results. In the peer-to-peer case, it is a non-blocking call. The client-side function execution needs to be made non-blocking. For example:
List futures = null;
try {
    futures = execService.invokeAll(callableTasks);
}
catch (RejectedExecutionException rejectedExecutionEx) {
    throw rejectedExecutionEx;
}
catch (InterruptedException e) {
    throw new InternalGemFireException(e.getMessage());
}
if (futures != null) {
    Iterator itr = futures.iterator();
    while (itr.hasNext() && !execService.isShutdown()
            && !execService.isTerminated()) {
        Future fut = (Future)itr.next();
        try {
            fut.get();
        }
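As a self-contained illustration of the same idea for #42452, the sketch below moves a blocking call onto a worker thread so that the caller is not blocked. It uses only java.util.concurrent. The blockingCall parameter stands in for a client-side function execution that waits for all results, which is an assumption, and AsyncInvoker is a hypothetical helper name, not a GemFire class.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical helper: not part of the GemFire API.
final class AsyncInvoker {
    private AsyncInvoker() {}

    // Runs a blocking call on the given pool and returns immediately with a future.
    static <T> CompletableFuture<T> invokeAsync(Callable<T> blockingCall, ExecutorService pool) {
        return CompletableFuture.supplyAsync(() -> {
            try {
                return blockingCall.call();
            } catch (Exception e) {
                // Surface the failure when the caller reads the future.
                throw new CompletionException(e);
            }
        }, pool);
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            CompletableFuture<String> result = invokeAsync(() -> {
                Thread.sleep(100); // simulates waiting for all function results
                return "done";
            }, pool);
            System.out.println("caller continues while the call runs");
            System.out.println(result.get()); // blocks only when the result is needed
        } finally {
            pool.shutdown();
        }
    }
}
```

The calling thread blocks only at the point where it reads the future, so it can do other work or start further executions in the meantime. The pool size bounds how many blocking calls are in flight at once.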
#42432 Java arguments passed to the gemfire.bat script are not passed to the JVM (Windows) Arguments passed to the gemfire script with the '-J' prefix are expected to be passed to the Java VM process. This works on Unix/Linux systems when using the bin/gemfire shell script. However, on Windows systems using gemfire.bat, it does not work and is not supported. On Windows systems, set Java arguments using the JAVA_ARGS environment variable. When you run the gemfire.bat script, the arguments are read from the environment at run time. There are two types of arguments: (1) Java VM switches (such as -Xmx512m): use the JAVA_ARGS environment variable. (2) GemFire properties that are to be set as system properties of the Java VM: use gemfire.properties, or use the JAVA_ARGS environment variable without the '-J' prefix. For example, to set gemfire.mcast-port, use -Dgemfire.mcast-port=15001.
#42431 Region expiration may take longer than expected GemFire uses only a single thread to process expired region entries. This can cause expiration to take longer than expected, because scheduled expirations queue up waiting for this single thread to process them. The problem is worse if the expiration needs to remove the entry from disk or make a network hop. If you are not using GemFire transactions, or are willing for a transaction to fail because of a conflict caused by a concurrent expiration, then in GemFire 6.6 you can set -Dgemfire.EXPIRATIONS_CAUSE_CONFLICTS=true. This allows expirations to take advantage of multiple threads. If you then set -Dgemfire.EXPIRY_THREADS=XXX, where XXX is the number of threads to use for expiration, multiple threads perform expirations concurrently.
#42381 Cache creation does not fail if an index configured in cache.xml cannot be created If a failure occurs while creating an index during cache creation (for example, when GemFire starts up using a cache.xml file), the cache should not be created. This would prevent users from trying to query indexes that do not exist. To verify, check the statistics file to see whether all the indexes were created. Alternatively, if a query takes longer than expected, analyze it to see whether it is using the expected index.
#42041 Calling Function.onServer repeatedly can cause socket exhaustion Heavy use of Function.onServers from a client can cause sockets to churn and can cause "Too many open files" errors on the locator. If you see "Too many open files" errors when repeatedly calling Function.onServers(), increase the ulimit settings on the host. On Windows, change TcpIP/Parameters/NumConnections in the registry.
#40791 Applications that use GemFire cache client processes should call Cache.close followed by DistributedSystem.disconnect If applications using a client cache do not call DistributedSystem.disconnect(), stale data may be encountered when the application reopens the cache and subscribes to updates. Applications that use GemFire client caches should call Cache.close() followed by DistributedSystem.disconnect().
#40693 An explicit cache destroy of an entry will be lost (to the backend database). An explicit cache destroy of an entry will be lost (to the backend database) if the entry has already been destroyed by eviction or expiration. In that case, region.destroy(key) throws EntryNotFoundException. The application can then load the entry and retry the destroy operation to destroy the entry in the database.
#40624 The EnforceUniqueHostStorageAllocation feature requires no two systems share IpAddresses Using the EnforceUniqueHostStorageAllocation feature requires that no two systems hosting members in a DistributedSystem share the same IpAddress. This is true even if the network adapter is in a "DOWN" state. The exceptions to this rule are the loopback address and the "is any" address (aka 127.0.0.1 and 0.0.0.0 respectively). The symptom when two members do share an IpAddress and the EnforceUniqueHostStorageAllocation system property is set to "true" is a message in the logs similar to the following: system.log: [warning 2009/04/21 10:00:41.290 PDT gemfire1_10503 <thread 1> tid=0x79] Unable to find sufficient members to host a bucket in the partitioned region. Region name = /partitionedRegion Current number of available data stores: 10 number successfully allocated = 3 number needed = 4 Data stores available: [ptestg(13629):58399/50210, lewis(10584):42395/52373, ptestg(13632):58401/50211, ptesth(8852):57714/32881, king(10497):37041/62411, lewis(10582):42398/52374, king(10501):37037/62412, ptesth(8850):57715/32882, king(10499):37039/62407, king(10503):37044/62414] Data stores successfully allocated: [king(10497):37041/62411, lewis(10582):42398/52374, ptesth(8850):57715/32882] Consider starting another member Remove duplicate IP addresses.
#39977 NoSubscriptionServersAvailableException while creating a client with security On some platforms, calling getCredentials on the provided PKCSAuthInit template can be slow the first time it is called. This can cause a timeout on the server while creating a connection, resulting in a NoSubscriptionServersAvailableException on the client. Set the system property BridgeServer.acceptTimeout to a higher value. The default is 9900 milliseconds.
#39541 Threads hang while blocking for synchronization in JRockit On Java SE 6 versions of JRockit JVM, one or more threads appear to hang while blocking for a synchronization that is not held by any other thread. We have found that this problem can be avoided by disabling lazyUnlocking using: -XXlazyUnlocking:enable=false According to the JRockit documentation: "In R27.5 lazy unlocking is enabled by default in Java SE 6 versions of JRockit JVM on all platforms except IA64 and with all garbage collection modes except the deterministic garbage collection mode." Disabling JRockit's lazyUnlocking seems to prevent these hangs.
#39139 Lease expiration causes locking to hang Lease expiration can cause all other lock requests on the DistributedLockService to hang. Global Region operations may hang for the same reasons. Use -1 for lock-lease to prevent lease expiration.
#38250 NotSerializableException can block cache access if it occurs in a region with global or d-ack scope If the application tries to put an instance that is not serializable into the cache, the application blocks and does not recover if the region scope is global or d-ack. Before any put or create operation, check that the object in question is an instance of java.io.Serializable.
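A minimal sketch of such a pre-put check for #38250, using only the JDK. An instanceof test alone does not catch a Serializable object that holds a non-serializable field, so this version also attempts a trial serialization into a stream that discards its output. SerializableCheck and isSerializable are hypothetical names, and the check covers plain Java serialization only, which is an assumption about how the values are serialized.

```java
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
import java.io.Serializable;

// Hypothetical helper: not part of the GemFire API.
final class SerializableCheck {
    private SerializableCheck() {}

    // Returns true only if the whole object graph can be written with Java serialization.
    static boolean isSerializable(Object value) {
        if (!(value instanceof Serializable)) {
            return false; // also rejects null
        }
        try (ObjectOutputStream out = new ObjectOutputStream(new DiscardingStream())) {
            out.writeObject(value);
            return true;
        } catch (IOException e) {
            // NotSerializableException is a subclass of IOException.
            return false;
        }
    }

    // Output stream that throws away everything written to it.
    private static final class DiscardingStream extends OutputStream {
        @Override
        public void write(int b) {
            // intentionally empty
        }
    }

    public static void main(String[] args) {
        System.out.println(isSerializable("text"));
        System.out.println(isSerializable(new Object()));
    }
}
```

The trial serialization costs roughly one extra serialization per put, so an application might apply it only in development or to value types it does not control.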
#37943 Concurrent creation and destruction of a Partitioned Region may cause a distributed deadlock If a Partitioned Region is simultaneously created in one VM and destroyed in another, there is a window of time in which a distributed deadlock can occur, causing both the creating thread and the destroying thread to hang. During testing, we found this bug difficult to reproduce, implying a low likelihood of occurrence. For a Partitioned Region of the same name, ensure that no Partitioned Region destruction runs at the same time as its creation.
#37476 Large messages will not conflate If p2p slow receiver is enabled with conflation and a large message is sent to the slow receiver, that message is not conflated. Large messages are those larger than the socket-buffer-size. The socket-buffer-size can be increased with a GemFire property, but the operating system has its own limit on how large it can be made. After setting the property, check your log for messages about the actual buffer allocated being smaller than requested.
#37158 Interrupting threads using DistributedLockService causes other members to hang or generate large log files Some indications that this problem has occurred include statements in the log such as: "Grantor is still initializing" "Grantor creation was aborted but grantor was not destroyed" If these appear in the log, then a thread was interrupted while using the DistributedLockService and the member must be disconnected from the DistributedSystem. Other members may actually hang and possibly produce very large log files. Disconnecting this member from the DistributedSystem will allow other members to continue working without any further problems. Do not interrupt any thread that may be using the DistributedLockService API. Use waitTimeMillis to specify how long the lock request will wait. The thread will not continue to wait after the request times out. Disconnecting from the DistributedSystem will cause any waiting threads to return.
#35816 Time stamps not taken into account between BridgeClient and BridgeServer GemFire attempts to compensate for clock skew when sending updates between distributed system members. Between tiers in client/server installations, however, no such compensatory work is done. (Clients and servers should never be in the same distributed system.) The time compensation is particularly important when expiration is enabled on a region (the creation/update stamp should be copied from the origin's). Make sure the clocks are synchronized between the machines where your clients and servers run. Best practice is to always make sure your clocks are synchronized, as this helps with log analysis for troubleshooting.
#35706 Low readTimeout may cause client to prematurely add server to dead list The readTimeout property is used both for Region operations (get, put, etc.) and to determine when a server is dead. Setting it too low causes a client to prematurely add servers to the dead list, whereas setting it too high may cause Region operations to take longer than desired to detect a non-responsive server. Use care when setting the readTimeout property. You want to set a reasonable cap on the time for a given Region operation. However, if the application receives CacheLoaderExceptions with "No active servers" messages when servers are available, this may indicate a readTimeout setting that is too low.
#35646 DataSerializable instantiator use breaks data propagation through gateway Use of com.gemstone.gemfire.Instantiators to speed data deserialization can prevent data from propagating through a Gateway hub if you have installed an event listener in the hub that accesses the data. This is due to a defect in the product that keeps instantiator registrations from being propagated through the hub. In your event listener's Declarable.init(Properties) method, register all instantiators that will be encountered. This allows for proper deserialization of data for access by the listener.
#35373 WAN performance is not optimized for high-bandwidth connections The gateway hub functionality for multi-site installations was implemented to handle the difficulties of communication over lower-bandwidth WAN connections. It was not designed for optimum performance over high-bandwidth connections. The implementation needs further optimization for high-bandwidth use. If performance is an issue, here are some strategies to try: 1. Try conflation, which increases throughput for large queues. Conflation was implemented for multi-site installations in version 4.2.3 (see Multi-Site Queue Conflation in the 4.2.3 release notes for details). 2. If you use conflation, you might be able to increase performance further, if necessary, through the client/server architecture. The potential for improvement depends on your latency requirements and the network bandwidth between sites.