Release Notes for Patches for the MapR 5.1.0 Release

Release Notes for the December 2016 Patch
Released 12/09/2016

These release notes describe the fixes that are included in this patch.

Packages
Red Hat Server: mapr-patch-5.1.0.37549.GA-40890.x86_64.rpm
Red Hat Client: mapr-patch-client-5.1.0.37549.GA-40890.x86_64.rpm
Red Hat Loopbacknfs: mapr-patch-loopbacknfs-5.1.0.37549.GA-40890.x86_64.rpm
Red Hat Posix-client-basic: mapr-patch-posix-client-basic-5.1.0.37549.GA-40890.x86_64.rpm
Red Hat Posix-client-platinum: mapr-patch-posix-client-platinum-5.1.0.37549.GA-40890.x86_64.rpm
Ubuntu Server: mapr-patch-5.1.0.37549.GA-40890.x86_64.deb
Ubuntu Client: mapr-patch-client-5.1.0.37549.GA-40890.x86_64.deb
Ubuntu Loopbacknfs: mapr-patch-loopbacknfs-5.1.0.37549.GA-40890.x86_64.deb
Ubuntu Posix-client-basic: mapr-patch-posix-client-basic-5.1.0.37549.GA-40890.x86_64.deb
Ubuntu Posix-client-platinum: mapr-patch-posix-client-platinum-5.1.0.37549.GA-40890.x86_64.deb
Win32 Client: mapr-client-5.1.0.40890GA-1.win32.zip
Win64 Client: mapr-client-5.1.0.40890GA-1.amd64.zip
Mac Client: mapr-client-5.1.0.40890GA-1.x86_64.tar.gz

Fixes

Bug 13187
Description: The maprcli volume create command did not set group ownership to the user's primary group.
Resolution: With this fix, the maprcli volume create command sets group ownership to the user's primary group.

Bug 20965
Description: When working with multiple clusters, synchronization issues caused MapRFileSystem to return a NullPointerException.
Resolution: With this fix, MapRFileSystem better supports working with multiple clusters and includes fixes for the synchronization issues.

Bug 23257
Description: In MCS, new NFS VIPs were visible in the NFS HA > VIP Assignments tab, but not in the NFS HA > NFS Setup tab.
Resolution: With this fix, NFS VIPs are available in both the NFS HA > VIP Assignments tab and the NFS HA > NFS Setup tab.
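A hypothetical verification sketch for the Bug 13187 fix (the volume name and path below are made up, and maprcli/hadoop are available only on a cluster node, so those commands are shown commented out):

```shell
# The fix makes `maprcli volume create` assign the creating user's
# primary group to the new volume. The primary group itself can be
# checked anywhere with the standard `id` utility:
id -gn
# On a cluster node (hypothetical volume name and mount path):
#   maprcli volume create -name testvol -path /testvol
#   hadoop fs -ls -d /testvol   # group column should show the primary group
```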
Bug 23975
Description: In version 5.1, MFS failed to start in some Docker containers because it tried to determine the number of NUMA nodes from /sys/devices/system/node.
Resolution: With this fix, MFS works in Docker containers.

Bug 24139
Description: If limit spread was enabled and the nodes were more than 85% full, CLDB did not allocate containers for I/O on non-local volumes.
Resolution: With this fix, CLDB allocates new containers to ensure that the I/O does not fail.

Bug 24155
Description: Disk setup timed out if running trim on flash drives took a long time.
Resolution: With this fix, disk setup completes successfully, and the warning message ("Starting Trim of SSD drives, it may take a long time to complete") is entered in the log file.

Bug 24249
Description: When running MapReduce jobs with older versions of the MapR classes, the system hung because the older classes linked against the native library installed on cluster nodes that had been updated to a newer MapR version.
Resolution: With this fix, the new fs.mapr.bailout.on.library.mismatch parameter detects mismatched libraries, fails the MapReduce job, and logs an error message. The parameter is enabled by default. You can disable the parameter on all the TaskTracker nodes and resubmit the job for the task to continue to run. To disable the parameter, set it to false in the core-site.xml file.

Bug 24585
Description: Excessive logging in the CLDB audit caused the cldbaudit.log file to grow very large.
Resolution: With this fix, to reduce the size of the cldbaudit.log file, queries to CLDB for the ZooKeeper (ZK) string are no longer logged for auditing.

Bug 24610
Description: In a secure cluster, when there were intermittent connection drops (between MFS and MFS, or between client and MFS), the client and/or server could crash during authentication.
Resolution: With this fix, the client and server no longer crash during authentication when there are intermittent connection drops.
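As a sketch, the two core-site.xml parameters introduced in this patch (fs.mapr.bailout.on.library.mismatch above, and fs.mapr.bind.retries from Bug 24965) might be set like this; the values shown are illustrative overrides of the defaults, not recommendations:

```xml
<!-- core-site.xml: illustrative settings for this patch's new parameters -->
<configuration>
  <!-- Bug 24249: enabled (true) by default; set to false on all
       TaskTracker nodes to let a job with mismatched libraries continue -->
  <property>
    <name>fs.mapr.bailout.on.library.mismatch</name>
    <value>false</value>
  </property>
  <!-- Bug 24965: disabled (false) by default; set to true to retry the
       bind during client initialization for 5 minutes before failing -->
  <property>
    <name>fs.mapr.bind.retries</name>
    <value>true</value>
  </property>
</configuration>
```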
Bug 24812
Description: Apache Hadoop could not look up the status of a finished job because job.xml had already been removed from the search directory. Hive interpreted the job as failed and generated an exception.
Resolution: With this fix, Apache Hadoop correctly reports the status of the finished job.

Bug 24965
Description: On large clusters, the bind sometimes failed with a message indicating that no port was available when running MapReduce jobs, specifically reducer tasks.
Resolution: With this fix, the new fs.mapr.bind.retries configuration parameter in the core-site.xml file, if set to true, retries the bind during client initialization for 5 minutes before failing. By default, fs.mapr.bind.retries is set to false.

Bug 24915
Description: In version 5.1, running the expandaudit utility on volumes could produce very large (more than 1 GB) audit log files due to incorrect GETATTR (get attributes) cache handling.
Resolution: With this fix, the expandaudit utility no longer performs subsequent GETATTR calls if the original call to the same file identifier failed.

Bug 25003
Description: When a specific queue used all of its resources, the UsedResources tab in the ResourceManager UI could show a greater value than the MaxResources tab. This happened when another application was submitted and the application master container size was added in.
Resolution: With this fix, no more containers can be assigned to a queue once its UsedResources has reached the MaxResources limit.

Bug 25177
Description: When using the FairScheduler with maxAMShare enabled, the total amResourceUsage per queue was not calculated properly, which could cause applications to hang in the ACCEPTED state.
Resolution: AM resource usage is now calculated as expected, and YARN jobs no longer get stuck in the ACCEPTED state.

Release Notes for the October 2016 Patch
Released 10/24/2016

These release notes describe the fixes that are included in this patch.
Packages
Red Hat Server: mapr-patch-5.1.0.37549.GA-40163.x86_64.rpm
Red Hat Client: mapr-patch-client-5.1.0.37549.GA-40163.x86_64.rpm
Red Hat Loopbacknfs: mapr-patch-loopbacknfs-5.1.0.37549.GA-40163.x86_64.rpm
Red Hat Posix-client-basic: mapr-patch-posix-client-basic-5.1.0.37549.GA-40163.x86_64.rpm
Red Hat Posix-client-platinum: mapr-patch-posix-client-platinum-5.1.0.37549.GA-40163.x86_64.rpm
Ubuntu Server: mapr-patch-5.1.0.37549.GA-40163.x86_64.deb
Ubuntu Client: mapr-patch-client-5.1.0.37549.GA-40163.x86_64.deb
Ubuntu Loopbacknfs: mapr-patch-loopbacknfs-5.1.0.37549.GA-40163.x86_64.deb
Ubuntu Posix-client-basic: mapr-patch-posix-client-basic-5.1.0.37549.GA-40163.x86_64.deb
Ubuntu Posix-client-platinum: mapr-patch-posix-client-platinum-5.1.0.37549.GA-40163.x86_64.deb
Win32 Client: mapr-client-5.1.0.40163GA-1.win32.zip
Win64 Client: mapr-client-5.1.0.40163GA-1.amd64.zip
Mac Client: mapr-client-5.1.0.40163GA-1.x86_64.tar.gz

Fixes

Bug 14105
Description: When nodes attempted to register with duplicate IDs, CLDB did not register the nodes or log meaningful error messages.
Resolution: With this fix, when nodes attempt to register with duplicate IDs, CLDB logs appropriate error messages.

Bug 24408
Description: When running multiple producers as separate threads within a process with a very small value for buffer.memory (say, 1 KB), some producers could stall due to a lack of buffer memory.
Resolution: With this fix, the default value for the minimum buffer memory is increased to 10 KB.

Bug 24477
Description: Jobs failed if a local volume was not available and directories for MapReduce could not be initialized.
Resolution: With this fix, jobs no longer fail, and local volume recovery is enhanced.

Bug 24563
Description: The ResourceManager failed to load or recover after an attempt to read the first byte of a data stream that had no first byte or no data.
Resolution: With this fix, the ResourceManager recovers after attempting to read a data stream with no first byte or no data.
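For context on Bug 24408, buffer.memory is the Kafka-compatible producer setting that bounds how much memory a producer may use for buffering records. A minimal, illustrative producer configuration fragment (the value shown is only an example at the new 10 KB minimum, not a recommendation):

```
# producer.properties (illustrative)
# Values below ~10 KB could stall producers before this fix.
buffer.memory=10240
```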
Bug 24566
Description: An older version of the aws-sdk JAR was built with MapR.
Resolution: With this fix, MapR upgraded the aws-sdk JAR from version 1.7.4 to 1.7.15.

Bug 24630
Description: The timestamps on files showed negative numbers of nanoseconds after applying mapr-patch 5.2 build 39544.
Resolution: With this fix, if the nanosecond timestamp overflows because of the old NFS bug, the nanoseconds value is set to 9999.

Bug 24658
Description: CLDB returned "no master" and an empty list for a container lookup, which the NFS server could not handle; when multiple servers are down, there can be no master for a container.
Resolution: With this fix, the NFS server handles an empty node list for a container lookup.

Bug 24505
Description: A job failed when the JvmManager went into an inconsistent state.
Resolution: With this fix, jobs no longer fail as a result of the JvmManager entering an inconsistent state.

Bug 24562
Description: CLDB (container location database) performance suffered because Warden gave the CLDB service a lower CPU priority.
Resolution: With this fix, Warden uses a new algorithm to set the correct CPU priority for the CLDB service.

Bug 24656
Description: MFS churned CPU while taking a snapshot because of debug code in the builds.
Resolution: With this fix, MFS no longer churns CPU, as the debug code has been disabled.

Bug 24700
Description: The JobTracker user interface failed with a NullPointerException when a user submitted a Hive job with a null value in a method.
Resolution: With this fix, the JobTracker interface does not fail when a Hive job is run with a null value in a method.

Bug 24992
Description: Installing a MapR patch caused JAR files to be removed from under the drill/drill-1.4.0/jars/ directory.
Resolution: JAR files are no longer incorrectly removed.

Bug 24618
Description: Remote mirror volumes could not be created on secure clusters using MCS, even when the appropriate tickets were present.
Resolution: With this fix, remote mirror volumes can now be created on secure clusters using MCS.

Bug 24971
Description: When a mirroring operation started after a CLDB failover, it sometimes sent requests to a slave CLDB where the data was stale, causing the mirroring operation to hang. If another CLDB failover happened during this time, the new CLDB master discarded the data resynchronized by the old mirroring operation but marked the mirroring operation as successful. This resulted in a data mismatch between the source and the destination.
Resolution: With this fix, mirroring requests are sent only to the master CLDB node.

Bug 25041
Description: Whenever a newly added node was made the master of the name container, MFS crashed while deleting files in the background.
Resolution: With this fix, MFS does not crash when a newly added node is made the master of the name container.

Release Notes for the September 2016 Patch
Released 9/23/2016

These release notes describe the fixes that are included in this patch.
Packages
Red Hat Server: mapr-patch-5.1.0.37549.GA-39728.x86_64.rpm
Red Hat Client: mapr-patch-client-5.1.0.37549.GA-39728.x86_64.rpm
Red Hat Loopbacknfs: mapr-patch-loopbacknfs-5.1.0.37549.GA-39728.x86_64.rpm
Red Hat Posix-client-basic: mapr-patch-posix-client-basic-5.1.0.37549.GA-39728.x86_64.rpm
Red Hat Posix-client-platinum: mapr-patch-posix-client-platinum-5.1.0.37549.GA-39728.x86_64.rpm
Ubuntu Server: mapr-patch-5.1.0.37549.GA-39728.x86_64.deb
Ubuntu Client: mapr-patch-client-5.1.0.37549.GA-39728.x86_64.deb
Ubuntu Loopbacknfs: mapr-patch-loopbacknfs-5.1.0.37549.GA-39728.x86_64.deb
Ubuntu Posix-client-basic: mapr-patch-posix-client-basic-5.1.0.37549.GA-39728.x86_64.deb
Ubuntu Posix-client-platinum: mapr-patch-posix-client-platinum-5.1.0.37549.GA-39728.x86_64.deb
Win32 Client: mapr-client-5.1.0.39728GA-1.win32.zip
Win64 Client: mapr-client-5.1.0.39728GA-1.amd64.zip
Mac Client: mapr-client-5.1.0.39728GA-1.x86_64.tar.gz

Fixes

Bug 23652
Description: The POSIX loopbacknfs client did not automatically refresh renewed service tickets.
Resolution: With this fix, the POSIX loopbacknfs client will:
* Automatically use the renewed service ticket without requiring a restart if the ticket is replaced before expiration (ticket expiry time + grace period of 55 minutes). If the ticket is replaced after expiration, the POSIX loopbacknfs client will not refresh the ticket, as the mount will have become stale.
* Allow impersonation if a service ticket is replaced before expiration with a servicewithimpersonation ticket.
* Honor all changes to the user/group IDs of the renewed ticket.

Bug 24053
Description: The client crashed if there was an error during client initialization.
Resolution: With this fix, the client does not crash if there is an error during initialization.
Bug 24057
Description: Only SPs in the average and above-average buckets were considered for balancing, because the disk balancer had to spread containers for optimal SP utilization.
Resolution: With this fix, the disk balancer balances SPs in all bins, giving preference to the most highly utilized SPs. SPs in lower bins are balanced periodically (configurable), when there is not too much balancing activity on the highly utilized SPs. By default, SPs in lower bins are balanced every 60 minutes. To set a lower value, reset dbal.below.avg.bins.balancing.frequency using the maprcli config save command.

The default size of the above-average, average, and below-average bins is 20 (percent). To allow more granular and aggressive balancing of storage pools across bins, reduce the size of each bin by setting the following parameters with the maprcli config save command:
* dbal.above.avg.bin.size: the bin size (%) of SPs whose usage is above the cluster average
* dbal.avg.bin.size: the bin size (%) of SPs whose usage is in the average range
* dbal.below.avg.bin.size: the bin size (%) of SPs whose usage is below the average range

For example, to reduce the size of the below-average bin to 10%, run the following command:
maprcli config save -values {"dbal.below.avg.bin.size":"10"}

Periodically, the disk balancer logs information about all the bins and the storage pools in those bins in the file /opt/mapr/logs/cldbdiskbalancer.log. By default, the information is logged every 10 minutes; the interval can be configured with the dbal.loadtracker.info.log.frequency parameter.

Bug 24119
Description: Warden adjusted the FileServer (MFS) and NodeManager (NM) memory incorrectly when NM and TaskTracker (TT) were on the same node. This could result in too much memory being allocated to MFS.
Resolution: With this fix, Warden does not adjust MFS memory when NM and TT are on the same node.
Memory adjustment is implemented only when TT and MapR-FS (but not NM) are on the same node.

Bug 24159
Description: The mtime was updated whenever a hard link was created. Also, when a hard link was created from the FUSE mount point, although the ctime was updated, the timestamp showed only minutes and seconds, not nanoseconds.
Resolution: With this fix, mtime does not change on the hard link, and when a hard link is created from the FUSE mount point, the ctime timestamp includes nanoseconds.

Bug 24232
Description: In certain cases, files were created with stale chunk IDs, which prevented users from accessing files in the parent directory.
Resolution: With this fix, files are no longer created with stale chunk IDs.

Bug 24324
Description: A "disk not found" error was thrown because the script used to list disks looked up disks in every instance of the fileserver process.
Resolution: With this fix, the script looks for disks only in the specific instance of the fileserver process.

Bug 24280
Description: Running the maprcli dashboard info command occasionally threw a TimeoutException error.
Resolution: With this fix, the internal command timeout was increased to allow more time for command processing.

Release Notes for the August 2016 Patch
Released 8/27/2016

These release notes describe the fixes that are included in this patch.
Packages
Red Hat Server: mapr-patch-5.1.0.37549.GA-39353.x86_64.rpm
Red Hat Client: mapr-patch-client-5.1.0.37549.GA-39353.x86_64.rpm
Red Hat Loopbacknfs: mapr-patch-loopbacknfs-5.1.0.37549.GA-39353.x86_64.rpm
Ubuntu Server: mapr-patch-5.1.0.37549.GA-39353.x86_64.deb
Ubuntu Client: mapr-patch-client-5.1.0.37549.GA-39353.x86_64.deb
Ubuntu Loopbacknfs: mapr-patch-loopbacknfs-5.1.0.37549.GA-39353.x86_64.deb
Win32 Client: mapr-client-5.1.0.39353GA-1.win32.zip
Win64 Client: mapr-client-5.1.0.39353GA-1.amd64.zip
Mac OS X Client: mapr-client-5.1.0.39353GA-1.x86_64.tar.gz

Fixes

Bugs 20498 and 24143
Details: JobTracker attempts to restart TaskTrackers caused the job configuration object to be loaded multiple times. This caused JobTracker lock contention and made the JobTracker service unresponsive.
Resolution: With this fix, the JobTracker caches the job configuration object for each job and uses the cached configuration object for all of the job's task completion events.

Bug 23776
Details: A client using the Java API crashed because the closed_ flag in the Inode close() method was not set to true when there was an error (such as EACCES).
Resolution: With this fix, the closed_ flag is set to true regardless of the error.

Bug 23931
Details: Nested user queues did not inherit labels from their parent queue. As a result, no labels were configured for nested user queues.
Resolution: With this fix, each nested user queue inherits the label and label policy from its parent queue.

Bug 23933
Details: When an OJAI Document was created by parsing a JSON string containing a list of maps, for example [{}, {}, ...], an extraneous null element was added to the list in the resulting document, for example [{}, null, {}, ...].
Resolution: With this fix, the extraneous null element does not occur.
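The corrected behavior for Bug 23933 can be illustrated with any standard JSON parser; this sketch uses Python's json module (not the OJAI parser itself) to show the expected result: parsing a list of maps yields exactly those maps, with no null elements interleaved.

```python
import json

# A JSON list of maps, as in the bug report: [{}, {}, ...]
doc = json.loads('[{"a": 1}, {"b": 2}]')

# The fixed behavior: the parsed list contains only the original maps,
# with no extraneous null (None) elements added between them.
assert None not in doc
print(doc)
```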
Bug 24018
Details: YARN jobs failed with the error "Rename cannot overwrite non empty destination directory".
Resolution: With this fix, these failures no longer occur.

Bug 24022
Details: Mirroring a volume on a container that had no master container caused the mirror thread to hang.
Resolution: With this fix, mirroring does not hang when the container associated with the volume has no master.

Bug 24025
Details: When the HistoryServer read the job history file for a job that was not initialized correctly, it read "-" as a delimiter. This caused the job start time to have an empty value. As a result, the following warning displayed: